A fast-paced technical guide to modern processor microarchitecture covering pipelining, superscalar execution, VLIW, out-of-order execution, branch prediction, predication, SMT/hyper-threading, multi-core design, and the power and ILP walls. Explains why clock speed alone doesn't determine performance, how x86 processors use internal RISC-like micro-ops to stay competitive, and the tradeoffs between brainiac (OOO-heavy) and speed-demon (simpler, higher-clocked) design philosophies. Includes concrete pipeline depth and issue-width tables for processors from MIPS R10000 to Apple M-series and AMD Zen 5.
Table of contents
More Than Just Megahertz
Pipelining & Instruction-Level Parallelism
Deeper Pipelines – Superpipelining
Multiple Issue – Superscalar
Explicit Parallelism – VLIW
Instruction Dependencies & Latencies
Branches & Branch Prediction
Eliminating Branches with Predication
Instruction Scheduling, Register Renaming & OOO
The Brainiac Debate
The Power Wall & The ILP Wall
What About x86?
Threads – SMT, Hyper-Threading & Multi-Core
More Cores or Wider Cores?
Data Parallelism – SIMD Vector Instructions
Memory & The Memory Wall
Caches & The Memory Hierarchy
Cache Conflicts & Associativity
Memory Bandwidth vs Latency
Acknowledgments
More Information?