
CSCE430/830 Computer Architecture

Instruction Set Architecture: An Introduction


Instructor: Hong Jiang
Courtesy of Prof. Yifeng Zhu @ U. of Maine

Fall, 2006

CSCE430/830

Portions of these slides are derived from: Dave Patterson UCB

ISA

Review
Amdahl's Law:

$$
\text{Speedup}(E) = \frac{\text{Execution time without enhancement } E}{\text{Execution time with enhancement } E} = \frac{1}{(1 - F) + F/S}
$$

where F is the fraction of execution time that benefits from enhancement E and S is the speedup of that fraction.

CPU Time & CPI:
CPU time = Instruction count x CPI x clock cycle time
CPU time = Instruction count x CPI / clock rate
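
As a quick check of the two review formulas, the sketch below evaluates them in Python. The function names and the sample numbers (40% of the run time enhanced tenfold; 10^9 instructions at a CPI of 2 on a 1 GHz clock) are illustrative assumptions, not values from the course.

```python
def amdahl_speedup(f: float, s: float) -> float:
    """Overall speedup when a fraction f of the execution time is sped up by a factor s."""
    return 1.0 / ((1.0 - f) + f / s)


def cpu_time(instruction_count: float, cpi: float, clock_rate_hz: float) -> float:
    """CPU time in seconds: instruction count x CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


# Hypothetical enhancement: 40% of the run time becomes 10x faster.
print(amdahl_speedup(0.4, 10.0))   # evaluates 1 / (0.6 + 0.04)

# Hypothetical program: 1e9 instructions, CPI of 2, 1 GHz clock.
print(cpu_time(1e9, 2.0, 1e9))     # 2.0 seconds
```

However large S becomes, the overall speedup stays below 1 / (1 - F), which is 1 / 0.6 for these sample numbers.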


Outline
Instruction Set Overview
- Classifying Instruction Set Architectures (ISAs)
- Memory Addressing
- Types of Instructions

MIPS Instruction Set (Topic of next lecture)


Instruction Set Architecture (ISA)


Serves as an interface between software and hardware.
Provides a mechanism by which the software tells the hardware what should be done.

High-level language code (C, C++, Java, Fortran)
  -> compiler ->
Assembly language code (architecture-specific statements)
  -> assembler ->
Machine language code (architecture-specific bit patterns)

The instruction set is the boundary between software (above) and hardware (below).

Interface Design
A good interface:
- Lasts through many implementations (portability, compatibility)
- Is used in many different ways (generality)
- Provides convenient functionality to higher levels
- Permits an efficient implementation at lower levels

[Figure: several uses sit above the interface; implementations imp 1, imp 2, and imp 3 sit below it along a time axis.]

Instruction Set Design Issues


Instruction set design issues include:
- Where are operands stored? registers, memory, stack, accumulator
- How many explicit operands are there? 0, 1, 2, or 3
- How is the operand location specified? register, immediate, indirect, ...
- What type & size of operands are supported? byte, int, float, double, string, vector, ...
- What operations are supported? add, sub, mul, move, compare, ...

Evolution of Instruction Sets


Single Accumulator (EDSAC 1950, Maurice Wilkes)
Accumulator + Index Registers (Manchester Mark I, IBM 700 series 1953)
Separation of Programming Model from Implementation
- High-level Language Based (B5000 1963)
- Concept of a Family (IBM 360 1964)
General Purpose Register Machines
- Complex Instruction Sets (VAX, Intel 432 1977-80), leading to CISC (Intel x86, Pentium)
- Load/Store Architecture (CDC 6600, Cray 1 1963-76), leading to RISC (MIPS, SPARC, HP-PA, IBM RS6000, PowerPC, ... 1987)

Classifying ISAs
Accumulator (before 1960, e.g. 68HC11):
  1-address    add A            acc <- acc + mem[A]

Stack (1960s to 1970s):
  0-address    add              tos <- tos + next

Memory-Memory (1970s to 1980s):
  2-address    add A, B         mem[A] <- mem[A] + mem[B]
  3-address    add A, B, C      mem[A] <- mem[B] + mem[C]

Register-Memory (1970s to present, e.g. 80x86):
  2-address    add R1, A        R1 <- R1 + mem[A]
               load R1, A       R1 <- mem[A]

Register-Register (Load/Store) (1960s to present, e.g. MIPS):
  3-address    add R1, R2, R3   R1 <- R2 + R3
               load R1, R2      R1 <- mem[R2]
               store R1, R2     mem[R1] <- R2

Operand Locations in Four ISA Classes


[Figure: operand locations in the four ISA classes; the register-based classes are marked GPR.]

Code Sequence C = A + B for Four Instruction Sets


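Stack      Accumulator   Register (register-memory)   Register (load-store)
Push A     Load A        Load R1, A                   Load R1, A
Push B     Add B         Add R1, B                    Load R2, B
Add        Store C       Store C, R1                  Add R3, R1, R2
Pop C                                                 Store C, R3

What the add does in each register-style sequence:
- Accumulator: acc = acc + mem[B]
- Register-memory: R1 = R1 + mem[B] (one operand comes from memory)
- Load-store: R3 = R1 + R2 (only loads and stores touch memory)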

More About General Purpose Registers


Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait (stall)
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy, from the processor's registers through cache and memory to disk.]

Stack Architectures
Instruction set:
  add, sub, mul, div, ...
  push A, pop A

Example: A*B - (A+C*B)

Instruction   Stack contents after execution (top first)
push A        A
push B        B, A
mul           A*B
push A        A, A*B
push C        C, A, A*B
push B        B, C, A, A*B
mul           B*C, A, A*B
add           A+B*C, A*B
sub           result
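
The stack example can be traced with a toy interpreter. This is a minimal sketch, not a model of any real machine: the tuple encoding of instructions, the `run_stack` name, and the sample values for A, B, and C are assumptions made for illustration. It also assumes that `sub` computes next - tos, which is what the example needs to produce A*B - (A+C*B).

```python
def run_stack(program, memory):
    """Run (opcode, operand) pairs on a toy 0-address stack machine and return the stack."""
    stack = []
    for op, arg in program:
        if op == "push":
            stack.append(memory[arg])      # copy a memory operand onto the stack
        elif op == "pop":
            memory[arg] = stack.pop()      # store the top of stack to memory
        else:
            tos = stack.pop()              # top of stack
            nxt = stack.pop()              # element just below the top
            if op == "add":
                stack.append(nxt + tos)
            elif op == "sub":
                stack.append(nxt - tos)    # assumed operand order: next - tos
            elif op == "mul":
                stack.append(nxt * tos)
            else:
                raise ValueError(f"unknown opcode: {op}")
    return stack


# Hypothetical operand values.
memory = {"A": 2, "B": 3, "C": 4}

program = [
    ("push", "A"), ("push", "B"), ("mul", None),
    ("push", "A"), ("push", "C"), ("push", "B"),
    ("mul", None), ("add", None), ("sub", None),
]

print(run_stack(program, memory))  # [-8], since 2*3 - (2 + 4*3) = -8
```

No arithmetic instruction names an operand, which is where the good code density comes from; it is also why every operation funnels through the top of the stack.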


Stacks: Pros and Cons


Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architectures
Instruction set:
  add A, sub A, mul A, div A, ...
  load A, store A

  acc = acc +,-,*,/ mem[A]

Example: A*B - (A+C*B)

Instruction   Accumulator after execution
load B        B
mul C         B*C
add A         A+B*C
store D       A+B*C
load A        A
mul B         A*B
sub D         result
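
A small simulation makes the accumulator example concrete. The sketch below is illustrative only: `run_accumulator`, the tuple encoding, and the sample operand values are assumptions. It also counts data-memory accesses, under the simplifying assumption that each instruction touches memory exactly once.

```python
def run_accumulator(program, memory):
    """Run (opcode, address) pairs; return (accumulator, number of data-memory accesses)."""
    acc = 0
    accesses = 0
    for op, addr in program:
        accesses += 1                  # every instruction here reads or writes memory once
        if op == "load":
            acc = memory[addr]
        elif op == "store":
            memory[addr] = acc
        elif op == "add":
            acc = acc + memory[addr]
        elif op == "sub":
            acc = acc - memory[addr]
        elif op == "mul":
            acc = acc * memory[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return acc, accesses


# Hypothetical operand values; D is the temporary used by the example.
memory = {"A": 2, "B": 3, "C": 4, "D": 0}

program = [
    ("load", "B"), ("mul", "C"), ("add", "A"), ("store", "D"),
    ("load", "A"), ("mul", "B"), ("sub", "D"),
]

print(run_accumulator(program, memory))  # (-8, 7)
```

Seven instructions cause seven data accesses, including a store and reload of the temporary D, because the single accumulator cannot hold two intermediate values at once.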


Accumulators: Pros and Cons


Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architectures
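Instruction set:
  (3 operands)    (2 operands)
  add A, B, C     add A, B
  sub A, B, C     sub A, B
  mul A, B, C     mul A, B

Example: A*B - (A+C*B)

  3 operands      2 operands
  mul D, A, B     mov D, A
  mul E, C, B     mul D, B
  add E, A, E     mov E, C
  sub E, D, E     mul E, B
                  add E, A
                  sub D, E

The result is left in E (3 operands) or D (2 operands). In the 2-operand form the first operand is both a source and the destination, so the final subtraction must target D to compute A*B - (A+C*B).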

Memory-Memory: Pros and Cons


Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Register-Memory Architectures
Instruction set:
  add R1, A      load R1, A
  sub R1, A      store R1, A
  mul R1, B

  R1 = R1 +,-,*,/ mem[B]

Example: A*B - (A+C*B)

  load R1, A
  mul R1, B      /* A*B */
  load R2, C
  mul R2, B      /* C*B */
  add R2, A      /* A + C*B */
  store R2, D
  sub R1, D      /* A*B - (A + C*B) */

Register-Memory: Pros and Cons


Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Load-Store Architectures
Instruction set:
  add R1, R2, R3    load R1, &A
  sub R1, R2, R3    store R1, &A
  mul R1, R2, R3    move R1, R2

  R3 = R1 +,-,*,/ R2

Example: A*B - (A+C*B)

  load R1, &A
  load R2, &B
  load R3, &C
  mul R7, R3, R2     /* C*B */
  add R8, R7, R1     /* A + C*B */
  mul R9, R1, R2     /* A*B */
  sub R10, R9, R8    /* A*B - (A+C*B) */
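
The load-store sequence can be checked the same way. The sketch below assumes a toy register file held in a dictionary, symbolic names standing in for the addresses &A, &B, and &C, and made-up operand values; `run_load_store` is a hypothetical helper, not part of any real toolchain.

```python
import operator

# Three-operand arithmetic works on registers only.
ARITH = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def run_load_store(program, memory):
    """Run a toy load-store program and return the final register file."""
    regs = {}
    for op, *args in program:
        if op == "load":                   # load Rd, &X : Rd <- mem[X]
            rd, addr = args
            regs[rd] = memory[addr]
        elif op == "store":                # store Rs, &X : mem[X] <- Rs
            rs, addr = args
            memory[addr] = regs[rs]
        elif op in ARITH:                  # op Rd, Ra, Rb : Rd <- Ra op Rb
            rd, ra, rb = args
            regs[rd] = ARITH[op](regs[ra], regs[rb])
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs


# Hypothetical operand values.
memory = {"A": 2, "B": 3, "C": 4}

program = [
    ("load", "R1", "A"), ("load", "R2", "B"), ("load", "R3", "C"),
    ("mul", "R7", "R3", "R2"),     # C*B
    ("add", "R8", "R7", "R1"),     # A + C*B
    ("mul", "R9", "R1", "R2"),     # A*B
    ("sub", "R10", "R9", "R8"),    # A*B - (A + C*B)
]

regs = run_load_store(program, memory)
print(regs["R10"])  # -8
```

Only the three loads touch memory; every intermediate value stays in a register, at the cost of a longer instruction sequence.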


Load-Store: Pros and Cons


Pros
- Simple, fixed-length instruction encodings
- Instructions take similar number of cycles
- Relatively easy to pipeline and make superscalar

Cons
- Higher instruction count
- Not all instructions need three operands
- Dependent on good compiler

Registers: Advantages and Disadvantages


Advantages
- Faster than cache or main memory (no addressing mode or tags)
- Deterministic (no misses)
- Can replicate (multiple read ports)
- Short identifier (typically 3 to 8 bits)
- Reduce memory traffic

Disadvantages
- Need to save and restore on procedure calls and context switches
- Can't take the address of a register (for pointers)
- Fixed size (can't store strings or structures efficiently)
- Compiler must manage
- Limited number

Virtually every ISA designed after 1980 is a load-store architecture (i.e., RISC, to simplify CPU design).

Word-Oriented Memory Organization


Memory is byte addressed and provides access for bytes (8 bits), half words (16 bits), words (32 bits), and double words (64 bits).

Addresses Specify Byte Locations
- Address of first byte in word
- Addresses of successive words differ by 4 (32-bit) or 8 (64-bit)

Byte addresses   32-bit word    64-bit word
0000-0003        Addr = 0000    Addr = 0000
0004-0007        Addr = 0004
0008-0011        Addr = 0008    Addr = 0008
0012-0015        Addr = 0012
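
The word addresses for bytes 0 through 15 can be computed rather than read off a diagram. This is a small sketch assuming aligned words; the helper name `word_address` is made up for the example.

```python
def word_address(byte_addr: int, word_bytes: int) -> int:
    """Address of the first byte of the aligned word that contains byte_addr."""
    return byte_addr - byte_addr % word_bytes


for word_bytes in (4, 8):
    # Collect the distinct word addresses covering bytes 0..15.
    starts = sorted({word_address(a, word_bytes) for a in range(16)})
    print(f"{word_bytes * 8}-bit words start at", starts)

# Prints:
# 32-bit words start at [0, 4, 8, 12]
# 64-bit words start at [0, 8]
```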


Byte Ordering
How should bytes within a multi-byte word be ordered in memory?

Conventions
- Suns and PowerPC Macs are Big Endian machines: least significant byte has highest address
- Alphas and PCs are Little Endian machines: least significant byte has lowest address

Byte Ordering Example


Big Endian
Least significant byte has highest address

Little Endian
Least significant byte has lowest address

Example
Variable x has 4-byte representation 0x01234567
Address given by &x is 0x100

Address         0x100  0x101  0x102  0x103
Big Endian      01     23     45     67
Little Endian   67     45     23     01
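
The two layouts of x can be reproduced with Python's built-in `int.to_bytes`. The `layout` helper and its default base address are illustrative assumptions that mirror the example (&x = 0x100).

```python
x = 0x01234567


def layout(data: bytes, base: int = 0x100) -> dict:
    """Map each byte address (as a hex string) to the byte stored there."""
    return {hex(base + i): f"{b:02x}" for i, b in enumerate(data)}


big = x.to_bytes(4, byteorder="big")        # most significant byte first
little = x.to_bytes(4, byteorder="little")  # least significant byte first

print(layout(big))     # {'0x100': '01', '0x101': '23', '0x102': '45', '0x103': '67'}
print(layout(little))  # {'0x100': '67', '0x101': '45', '0x102': '23', '0x103': '01'}
```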

Reading Byte-Reversed Listings


Disassembly
- Text representation of binary machine code
- Generated by program that reads the machine code

Example Fragment

Address     Instruction Code        Assembly Rendition
8048365:    5b                      pop %ebx
8048366:    81 c3 ab 12 00 00       add $0x12ab,%ebx
804836c:    83 bb 28 00 00 00 00    cmpl $0x0,0x28(%ebx)

Deciphering Numbers

Value:              0x12ab
Pad to 4 bytes:     0x000012ab
Split into bytes:   00 00 12 ab
Reverse:            ab 12 00 00
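
The pad, split, and reverse steps amount to a little-endian encoding, which the sketch below checks against the instruction bytes in the fragment. The helper name `value_le` is hypothetical, and the byte offset of 2 is an assumption read off the listing (each 4-byte value follows the first two bytes of its instruction), not a general x86 decoder.

```python
def value_le(code: bytes, offset: int, size: int = 4) -> int:
    """Decode a little-endian integer of `size` bytes starting at `offset`."""
    return int.from_bytes(code[offset:offset + size], byteorder="little")


add_code = bytes.fromhex("81 c3 ab 12 00 00")        # add $0x12ab,%ebx
cmpl_code = bytes.fromhex("83 bb 28 00 00 00 00")    # cmpl $0x0,0x28(%ebx)

print(hex(value_le(add_code, 2)))     # 0x12ab
print(hex(value_le(cmpl_code, 2)))    # 0x28

# Forward direction: pad 0x12ab to 4 bytes, split into bytes, and reverse.
stored = (0x12AB).to_bytes(4, byteorder="little")
print(" ".join(f"{b:02x}" for b in stored))  # ab 12 00 00
```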

Types of Addressing Modes (VAX)


Addressing Mode         Example                Action
1. Register direct      Add R4, R3             R4 <- R4 + R3
2. Immediate            Add R4, #3             R4 <- R4 + 3
3. Displacement         Add R4, 100(R1)        R4 <- R4 + M[100 + R1]
4. Register indirect    Add R4, (R1)           R4 <- R4 + M[R1]
5. Indexed              Add R4, (R1 + R2)      R4 <- R4 + M[R1 + R2]
6. Direct               Add R4, (1000)         R4 <- R4 + M[1000]
7. Memory indirect      Add R4, @(R3)          R4 <- R4 + M[M[R3]]
8. Autoincrement        Add R4, (R2)+          R4 <- R4 + M[R2]; R2 <- R2 + d
9. Autodecrement        Add R4, -(R2)          R2 <- R2 - d; R4 <- R4 + M[R2]
10. Scaled              Add R4, 100(R2)[R3]    R4 <- R4 + M[100 + R2 + R3*d]

Here d is the size of the operand in bytes.

Studies by [Clark and Emer] indicate that modes 1-4 account for 93% of all operands on the VAX.
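
To make the ten modes concrete, the sketch below fetches the source operand of `Add R4, <operand>` under each mode. Everything in it is an illustrative assumption: the function name, the mode names, the dictionary-based registers and memory, the operand size d = 4, and the sample contents. It also assumes autodecrement updates the register before the access and autoincrement after it, the usual VAX convention.

```python
def fetch_operand(mode, regs, mem, base=None, index=None, const=0, d=4):
    """Return the source operand of `Add R4, <operand>` for one addressing mode."""
    if mode == "register":             # Add R4, R3
        return regs[base]
    if mode == "immediate":            # Add R4, #3
        return const
    if mode == "displacement":         # Add R4, 100(R1)
        return mem[const + regs[base]]
    if mode == "register_indirect":    # Add R4, (R1)
        return mem[regs[base]]
    if mode == "indexed":              # Add R4, (R1 + R2)
        return mem[regs[base] + regs[index]]
    if mode == "direct":               # Add R4, (1000)
        return mem[const]
    if mode == "memory_indirect":      # Add R4, @(R3)
        return mem[mem[regs[base]]]
    if mode == "autoincrement":        # Add R4, (R2)+ : use the address, then advance
        value = mem[regs[base]]
        regs[base] += d
        return value
    if mode == "autodecrement":        # Add R4, -(R2) : step back, then use the address
        regs[base] -= d
        return mem[regs[base]]
    if mode == "scaled":               # Add R4, 100(R2)[R3]
        return mem[const + regs[base] + regs[index] * d]
    raise ValueError(f"unknown addressing mode: {mode}")


# Hypothetical machine state.
regs = {"R1": 8, "R2": 16, "R3": 2, "R4": 0}
mem = {2: 8, 8: 11, 16: 22, 24: 33, 108: 44, 124: 66, 1000: 55}

regs["R4"] += fetch_operand("displacement", regs, mem, base="R1", const=100)
print(regs["R4"])  # 44, the value stored at M[100 + 8]

regs["R4"] += fetch_operand("autoincrement", regs, mem, base="R2")
print(regs["R4"], regs["R2"])  # 66 20
```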

Types of Operations
Arithmetic and Logic:   AND, ADD
Data Transfer:          MOVE, LOAD, STORE
Control:                BRANCH, JUMP, CALL
System:                 OS CALL, VM
Floating Point:         ADDF, MULF, DIVF
Decimal:                ADDD, CONVERT
String:                 MOVE, COMPARE
Graphics:               (DE)COMPRESS

80x86 Instruction Frequency


Rank    Instruction      Frequency
1       load             22%
2       branch           20%
3       compare          16%
4       store            12%
5       add              8%
6       and              6%
7       sub              5%
8       register move    4%
9       call             1%
10      return           1%
Total                    96%

Relative Frequency of Control Instructions


Operation      SPECint92   SPECfp92
Call/Return    13%         11%
Jumps          6%          4%
Branches       81%         87%

Design hardware to handle branches quickly, since these occur most frequently.

Summary
Instruction Set Overview
- Classifying Instruction Set Architectures (ISAs)
- Memory Addressing
- Types of Instructions

MIPS Instruction Set (Topic of next class)
- Overview
- Registers and Memory
- Instructions
