CA Reading Notes: Chapter 1

Parallelism & Parallel Architectures

Data-level parallelism
Task-level parallelism

(P10)

Instruction-level parallelism
Vector architectures, graphics processing units (GPUs), and multimedia instruction sets
Thread-level parallelism
Request-level parallelism

SISD: single instruction stream, single data stream
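SIMD: single instruction stream, multiple data streams
MISD: multiple instruction streams, single data stream
MIMD: multiple instruction streams, multiple data streams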

ISA

class of ISA: register-memory (80x86), load-store (RISC-V)…

memory addressing (P13)
addressing modes
types and sizes of operands
operations
control flow instructions
ISA encoding

Design Principles

(P18)

Power and energy

Energy is now the major constraint.

maximum power
voltage indexing methods
Thermal design power (TDP)
Use energy, not power, to compare processors on a fixed task.
Power is useful mainly as a constraint, e.g. when the maximum power is fixed.

Reduce the voltage (about 1 V by now, hard to reduce further).
Caution: lowering the clock frequency reduces power, but not the energy consumed by a fixed task.
Ways to improve energy efficiency despite flat clock rates and constant voltage (P27):

Stop the clock of idle modules
Dynamic voltage-frequency scaling(DVFS)
Design for typical case
Overclocking
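
The caution above about lowering the frequency can be checked with a small sketch. It assumes the usual CMOS proportionalities (dynamic power ∝ 1/2 × capacitive load × voltage² × frequency, and energy = power × time); the function names and the unit-free numbers are made up for illustration only.

```python
def dynamic_power(capacitive_load, voltage, frequency):
    """Dynamic power up to a constant factor: 1/2 * C * V^2 * f."""
    return 0.5 * capacitive_load * voltage ** 2 * frequency


def task_energy(capacitive_load, voltage, frequency, cycles):
    """Energy for a fixed task: power * time, with time = cycles / frequency."""
    run_time = cycles / frequency
    return dynamic_power(capacitive_load, voltage, frequency) * run_time


# Hypothetical task of 1e9 cycles, arbitrary units for C and V.
fast = task_energy(1.0, 1.0, 2e9, 1e9)
slow = task_energy(1.0, 1.0, 1e9, 1e9)

# Halving the frequency halves the power but doubles the run time,
# so the energy for the task stays the same.
print(dynamic_power(1.0, 1.0, 1e9) / dynamic_power(1.0, 1.0, 2e9))  # 0.5
print(slow / fast)                                                  # 1.0

# Lowering the voltage by 15% cuts the task energy to about 0.72x.
print(task_energy(1.0, 0.85, 1e9, 1e9) / slow)
```

Only the voltage term changes the energy per task here, which is why voltage reduction and DVFS matter more than frequency reduction alone.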

static power

transistor size ↓ → leakage current ↑

race-to-halt(P28)

Shift in design
Combination of general-purpose cores and specialized accelerators

Cost

Omitted.
Cost of an integrated circuit(P31)
Yield Calculation(P31, P34)
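
A minimal sketch of the die cost and yield calculation, assuming the textbook model: dies per wafer = wafer area / die area minus an edge-loss term, and die yield = wafer yield / (1 + defect density × die area)^N. The function names, the wafer cost, and the sample inputs are illustrative assumptions, not values recorded in these notes.

```python
import math


def dies_per_wafer(wafer_diameter_cm, die_area_cm2):
    """Usable dies: wafer area / die area, minus dies lost along the round edge."""
    wafer_area = math.pi * (wafer_diameter_cm / 2) ** 2
    edge_loss = math.pi * wafer_diameter_cm / math.sqrt(2 * die_area_cm2)
    return wafer_area / die_area_cm2 - edge_loss


def die_yield(defects_per_cm2, die_area_cm2, n, wafer_yield=1.0):
    """Fraction of good dies; n is the process-complexity factor."""
    return wafer_yield / (1 + defects_per_cm2 * die_area_cm2) ** n


def die_cost(wafer_cost, wafer_diameter_cm, die_area_cm2, defects_per_cm2, n):
    """Cost per good die = wafer cost / (dies per wafer * die yield)."""
    good_dies = (dies_per_wafer(wafer_diameter_cm, die_area_cm2)
                 * die_yield(defects_per_cm2, die_area_cm2, n))
    return wafer_cost / good_dies


# Illustrative inputs: 30 cm wafer, 1.5 cm x 1.5 cm die,
# 0.047 defects per cm^2, n = 12, hypothetical wafer cost of 5000.
print(dies_per_wafer(30, 2.25))        # roughly 270
print(die_yield(0.047, 2.25, 12))      # roughly 0.30
print(die_cost(5000, 30, 2.25, 0.047, 12))
```

Because the die area appears both in the dies-per-wafer term and inside the yield exponent, cost grows much faster than linearly with die size.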

Dependability

The recursive, layered structure of a computer makes it easier to pin down where a fault lies.
reliability, availability(quantifiable metrics, P37)
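
The two metrics can be sketched as follows, assuming the standard definitions: availability = MTTF / (MTTF + MTTR), and failure rates of independent components with exponentially distributed lifetimes simply add. The function names and sample values are hypothetical.

```python
def availability(mttf_hours, mttr_hours):
    """Fraction of time the module delivers service."""
    return mttf_hours / (mttf_hours + mttr_hours)


def system_mttf(component_mttfs_hours):
    """MTTF of a system that fails when any component fails.

    Assumes independent failures and exponentially distributed lifetimes,
    so the system failure rate is the sum of the component failure rates.
    """
    failure_rate = sum(1.0 / mttf for mttf in component_mttfs_hours)
    return 1.0 / failure_rate


def fit(mttf_hours):
    """Failures in time: expected failures per billion hours of operation."""
    return 1e9 / mttf_hours


# Hypothetical module: fails every 999 hours on average, 1 hour to repair.
print(availability(999, 1))          # 0.999
# Two hypothetical components, each with an MTTF of 1,000,000 hours.
print(system_mttf([1e6, 1e6]))       # 500000.0
print(fit(1e6))                      # 1000.0
```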

Measuring, Reporting, and Summarizing Performance

Performance is the reciprocal of execution time.

Benchmarks
Out of date:
- Kernels
- Toy programs
- Synthetic benchmarks

Benchmark suites:
- EEMBC
- *SPEC
Report
Omitted.
Summarize
$\frac{\sqrt[n]{\prod_{i=1}^n \text{SPECRatio A}_i}}{\sqrt[n]{\prod_{i=1}^n \text{SPECRatio B}_i}}=\sqrt[n]{\prod_{i=1}^n \frac{\text{SPECRatio A}_i}{\text{SPECRatio B}_i}}=\sqrt[n]{\prod_{i=1}^n \frac{\frac{\text{Execution time}_{\text{reference}_i}}{\text{Execution time}_{\mathrm{A}_i}}}{\frac{\text{Execution time}_{\text{reference}_i}}{\text{Execution time}_{\mathrm{B}_i}}}}=\sqrt[n]{\prod_{i=1}^n \frac{\text{Execution time}_{\mathrm{B}_i}}{\text{Execution time}_{\mathrm{A}_i}}}=\sqrt[n]{\prod_{i=1}^n \frac{\text{Performance}_{\mathrm{A}_i}}{\text{Performance}_{\mathrm{B}_i}}}$
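
The identity above can be checked numerically. This is a small sketch with made-up execution times; `geometric_mean` and `spec_ratios` are names chosen here, not part of any SPEC tooling.

```python
import math


def geometric_mean(values):
    """n-th root of the product of n values."""
    return math.prod(values) ** (1.0 / len(values))


def spec_ratios(reference_times, machine_times):
    """SPECRatio per benchmark: reference time / measured time."""
    return [ref / t for ref, t in zip(reference_times, machine_times)]


# Hypothetical execution times (seconds) on three benchmarks.
reference = [100.0, 200.0, 400.0]
times_a = [10.0, 40.0, 50.0]
times_b = [20.0, 20.0, 100.0]

ratio_of_means = (geometric_mean(spec_ratios(reference, times_a))
                  / geometric_mean(spec_ratios(reference, times_b)))
mean_of_ratios = geometric_mean([b / a for a, b in zip(times_a, times_b)])

# Both print the same value (the cube root of 2, about 1.26).
print(ratio_of_means, mean_of_ratios)

# A different reference machine leaves the comparison unchanged.
other_reference = [50.0, 500.0, 80.0]
print(geometric_mean(spec_ratios(other_reference, times_a))
      / geometric_mean(spec_ratios(other_reference, times_b)))
```

The reference times cancel out, so the choice of reference machine does not affect the comparison of A and B.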

Quantitative Principles

Principle of Locality
A program spends 90% of its execution time in 10% of its code.
- Temporal Locality
- Spatial Locality
Amdahl’s Law
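
A quick sketch of the law, assuming the usual form $\text{Speedup}_{\text{overall}} = \frac{1}{(1-F) + F/S}$, where $F$ is the fraction of execution time that can use the enhancement and $S$ is the speedup of that fraction. The function name and the sample numbers are illustrative.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when only part of the execution time is sped up."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0:
        raise ValueError("speedup_enhanced must be positive")
    unenhanced = 1.0 - fraction_enhanced
    return 1.0 / (unenhanced + fraction_enhanced / speedup_enhanced)


# 40% of the time sped up 10x gives only about 1.56x overall.
print(amdahl_speedup(0.4, 10))

# Even an enormous speedup of that 40% cannot exceed 1 / (1 - 0.4), about 1.67x.
print(amdahl_speedup(0.4, 1e12))
```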
CPI & IPC
- clock cycle time: Hardware and Organization
- CPI: Organization and ISA
- Instruction count: ISA and compiler
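
The three factors combine into the processor performance equation, CPU time = instruction count × CPI × clock cycle time. A minimal sketch with a hypothetical instruction mix (all names and numbers are made up for illustration):

```python
def overall_cpi(instruction_mix):
    """Weighted CPI from a list of (instruction_count, cpi) pairs."""
    total_instructions = sum(count for count, _ in instruction_mix)
    total_cycles = sum(count * cpi for count, cpi in instruction_mix)
    return total_cycles / total_instructions


def cpu_time(instruction_count, cpi, clock_cycle_time_s):
    """CPU time = instruction count * CPI * clock cycle time."""
    return instruction_count * cpi * clock_cycle_time_s


def ipc(cpi):
    """Instructions per clock is the inverse of CPI."""
    return 1.0 / cpi


# Hypothetical mix: 50% of instructions at CPI 1, 30% at CPI 2, 20% at CPI 3.
mix = [(50, 1.0), (30, 2.0), (20, 3.0)]
cpi = overall_cpi(mix)
print(cpi)                          # 1.7
print(ipc(cpi))
# 1e9 instructions on a 2 GHz clock (0.5 ns cycle): about 0.85 s.
print(cpu_time(1e9, cpi, 0.5e-9))
```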
Fallacies and Pitfalls
The real-world MTTF of disks is about 2-10 times worse than the manufacturer’s MTTF.
Fault detection can lower availability.
Others omitted. (P59)

Summary for chapters and appendices
Omitted.(P66)