| Preface | p. xi |
| Introduction | p. 1 |
| A Quick View of Technological Advances | p. 2 |
| Performance Metrics | p. 6 |
| Performance Evaluation | p. 12 |
| Summary | p. 22 |
| Further Reading and Bibliographical Notes | p. 23 |
| Exercises | p. 24 |
| References | p. 28 |
| The Basics | p. 29 |
| Pipelining | p. 29 |
| Caches | p. 46 |
| Virtual Memory and Paging | p. 59 |
| Summary | p. 68 |
| Further Reading and Bibliographical Notes | p. 68 |
| Exercises | p. 69 |
| References | p. 73 |
| Superscalar Processors | p. 75 |
| From Scalar to Superscalar Processors | p. 75 |
| Overview of the Instruction Pipeline of the DEC Alpha 21164 | p. 78 |
| Introducing Register Renaming, Reorder Buffer, and Reservation Stations | p. 89 |
| Overview of the Pentium P6 Microarchitecture | p. 102 |
| VLIW/EPIC Processors | p. 111 |
| Summary | p. 121 |
| Further Reading and Bibliographical Notes | p. 122 |
| Exercises | p. 124 |
| References | p. 126 |
| Front-End: Branch Prediction, Instruction Fetching, and Register Renaming | p. 129 |
| Branch Prediction | p. 130 |
| Sidebar: The DEC Alpha 21264 Branch Predictor | p. 157 |
| Instruction Fetching | p. 158 |
| Decoding | p. 164 |
| Register Renaming (a Second Look) | p. 165 |
| Summary | p. 170 |
| Further Reading and Bibliographical Notes | p. 170 |
| Exercises | p. 171 |
| Programming Projects | p. 174 |
| References | p. 174 |
| Back-End: Instruction Scheduling, Memory Access Instructions, and Clusters | p. 177 |
| Instruction Issue and Scheduling (Wakeup and Select) | p. 178 |
| Memory-Accessing Instructions | p. 184 |
| Back-End Optimizations | p. 195 |
| Summary | p. 203 |
| Further Reading and Bibliographical Notes | p. 204 |
| Exercises | p. 205 |
| Programming Project | p. 206 |
| References | p. 206 |
| The Cache Hierarchy | p. 208 |
| Improving Access to L1 Caches | p. 209 |
| Hiding Memory Latencies | p. 218 |
| Design Issues for Large Higher-Level Caches | p. 232 |
| Main Memory | p. 245 |
| Summary | p. 253 |
| Further Reading and Bibliographical Notes | p. 254 |
| Exercises | p. 255 |
| Programming Projects | p. 257 |
| References | p. 258 |
| Multiprocessors | p. 260 |
| Multiprocessor Organization | p. 261 |
| Cache Coherence | p. 269 |
| Synchronization | p. 281 |
| Relaxed Memory Models | p. 290 |
| Multimedia Instruction Set Extensions | p. 294 |
| Summary | p. 296 |
| Further Reading and Bibliographical Notes | p. 297 |
| Exercises | p. 298 |
| References | p. 300 |
| Multithreading and (Chip) Multiprocessing | p. 303 |
| Single-Processor Multithreading | p. 304 |
| General-Purpose Multithreaded Chip Multiprocessors | p. 318 |
| Special-Purpose Multithreaded Chip Multiprocessors | p. 324 |
| Summary | p. 330 |
| Further Reading and Bibliographical Notes | p. 331 |
| Exercises | p. 332 |
| References | p. 333 |
| Current Limitations and Future Challenges | p. 335 |
| Power and Thermal Management | p. 336 |
| Technological Limitations: Wire Delays and Pipeline Depths | p. 343 |
| Challenges for Chip Multiprocessors | p. 346 |
| Summary | p. 348 |
| Further Reading and Bibliographical Notes | p. 349 |
| References | p. 349 |
| Bibliography | p. 351 |
| Index | p. 361 |