Reduced Instruction Set Computer (RISC)

This set of slides accompanies the book: https://www.elsevier.com/books-and-journals/book-companion/9780128122754/lecture-slides#Lecture%20Slides

RISC-V ISA

There are multiple families of RISC-V, as you can see in the table below:

Family	# of Registers
RV32I	32
RV32E	16
RV64I	32
RV128I	32

In this course, I learned RV32I, where

RV = RISC-V
32 = width of the registers in bits (also the size of the address space)
I = the base integer instruction set, with 32 registers (E is the embedded variant, with only 16)

Information about RISC-V

RISC-V is treated here as a Harvard architecture (separate instruction memory and data memory)
RISC-V assembly language notation uses 64-bit registers
There are 32 registers, namely x0-x31, where x0 is always zero
RISC-V instructions are 32 bits wide: instruction[31:0]

Other Terminology:

Word = 32 bits = 4 bytes
Doubleword = 64 bits = 8 bytes

Memory contains $2^{61}$ doublewords, accessed using load/store instructions: addresses are 64 bits wide (bits 63 downto 0), and 3 of those bits are used for byte addressing within a doubleword, leaving 61 bits
Data memory is byte-addressable: data is 64 bits long (so increment the address by 8 to get the next sequential data value)
- Instruction memory is byte-addressable: instructions are 32 bits long (increment the address by 4 bytes to get the next instruction)
- Byte addressing was chosen so that 16-bit instructions can be supported as well

Other Notation:

R[rs1] refers to contents of register rs1
M[addr] refers to value stored at address addr in memory
IMPORTANT: {imm, 1'b0} means concatenating the immediate imm with one bit of value 0
- {imm, 12'b0} means concatenating imm with 12 zero bits
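
The concatenation notation above is the same thing as a left shift. Here is a minimal Python sketch of that idea; the helper name `concat_zeros` is made up for illustration and is not from the book or the green card.

```python
def concat_zeros(imm: int, n_zeros: int) -> int:
    """Model Verilog's {imm, n'b0}: append n_zeros zero bits to the right of imm."""
    return imm << n_zeros


# {imm, 1'b0}: used by branches and jal, so the byte offset is always even
print(bin(concat_zeros(0b101, 1)))

# {imm, 12'b0}: used by lui, so the immediate lands in bits 31 downto 12
print(hex(concat_zeros(0x3D0, 12)))
```

So appending one zero bit doubles the immediate, and appending 12 zero bits multiplies it by $2^{12}$.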

Convert RISC-V to Machine Code

Ahh, this is actually really easy. Look at this RISC-V green card for reference.

Actually note, there are some problems with having to Sign-Extend, which I always get confused about.

To convert to machine code, you look up the instruction you want (e.g. add) and then follow its core instruction format to turn it into machine code (binary).

Instructions have similar formats, with fields of the same length in the same places:

opcode, funct3, funct7 (these identify the specific instruction)
source registers: rs1, rs2
destination register: rd
immediate field: imm (a constant, used for example to jump a certain number of bytes in certain instructions)

To understand why these instructions are designed the way they are, see RISC-V Implementation.

More notation

$R[rs1]$ = contents of register $rs1$
$M[R[rs1] + imm]$ = contents of memory at address $R[rs1] + imm$

Below is a summary table of what the commands actually do.

MNEMONIC	FMT	NAME	DESCRIPTION (in Verilog)
add, addw	R	ADD (Word)	R[rd] = R[rs1] + R[rs2]
addi, addiw	I	ADD Immediate (Word)	R[rd] = R[rs1] + imm
jal	UJ	Jump and Link	R[rd] = PC + 4; PC = PC + {imm, 1'b0}
jalr	I	Jump and Link Register	R[rd] = PC + 4; PC = R[rs1] + imm
ld	I	Load Doubleword	R[rd] = M[R[rs1] + imm](63:0)
sd	S	Store Doubleword	M[R[rs1] + imm](63:0) = R[rs2](63:0)
beq	SB	Branch Equal	if (R[rs1] == R[rs2]) PC = PC + {imm, 1'b0}
bne	SB	Branch Not Equal	if (R[rs1] != R[rs2]) PC = PC + {imm, 1'b0}
lui	U	Load Upper Immediate	R[rd] = {32b'imm<31>, imm, 12'b0}

Below is a table showing the expected machine-language format for each instruction type.

Notes on all the types

I-Type: We need to sign-extend the immediate, since you can't directly add a 12-bit value (immediate) to a 64-bit value (register)
S-Type: Notice how the 12-bit immediate is split into 2 parts, because we also need $rs2$ for this type of instruction
- this is to minimize hardware complexity, as you will come to understand when you look at the RISC-V Implementation
SB-Type: This looks weird at first: why do we set the last bit to 0?
- Answer: Because it is only possible to branch to even addresses.
- But why is the layout in machine language so weird? Don't worry about it, it's for performance reasons. You get all the bits you need, just in a different order, and you still get 12 encoded bits
- So when you want a branch offset of 2 bytes, for example, the encoded immediate is 1 and not binary 10, because that trailing 0 is added automatically. This means you can reach twice as many addresses with the same number of bits. This is a smart design. But what if you want to go farther? You need lui.
UJ-Type for jal: Same comment
- it cannot encode odd addresses
U-Type: Used by lui, where you want to set the upper bits (31 downto 12) of a register to an immediate value
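
The notes on the I, S and SB types above can be sketched in Python. This is only an illustration under my own assumptions: the function names are hypothetical, and the SB bit positions follow the standard RISC-V layout (instruction bits 31:25 hold imm[12|10:5], bits 11:7 hold imm[4:1|11]), which is worth checking against the green card.

```python
from typing import Tuple


def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):      # top bit set, so the number is negative
        return value - (1 << bits)
    return value


def split_s_imm(imm: int) -> Tuple[int, int]:
    """S-type: split a 12-bit immediate into (imm[11:5], imm[4:0])."""
    imm &= 0xFFF
    return imm >> 5, imm & 0x1F


def branch_imm_fields(offset: int) -> Tuple[int, int]:
    """SB-type: return (instruction bits 31:25, instruction bits 11:7) for a byte offset."""
    if offset % 2 != 0:
        raise ValueError("branch offsets must be even; bit 0 is never stored")
    if not -4096 <= offset <= 4094:
        raise ValueError("offset does not fit in a 13-bit signed value")
    imm = offset & 0x1FFF              # 13-bit offset whose bit 0 is always 0
    bit12 = (imm >> 12) & 0x1
    bit11 = (imm >> 11) & 0x1
    bits10_5 = (imm >> 5) & 0x3F
    bits4_1 = (imm >> 1) & 0xF
    high = (bit12 << 6) | bits10_5     # imm[12|10:5]
    low = (bits4_1 << 1) | bit11       # imm[4:1|11]
    return high, low


print(sign_extend(0xFFF, 12))          # 12-bit all-ones immediate
print(split_s_imm(240))                # the two halves of the S-type immediate 240
print(branch_imm_fields(2))            # smallest forward branch offset
```

Note that a byte offset of 2 stores the value 1 in the imm[4:1] field, which is exactly the "trailing 0 is added automatically" point.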

This is important to know because in the exam, you might actually need to write out the binary.

Commands

Loading a particular value into a register

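Write assembly for RV32I which puts 0xffffefff into register x5:

Use lui for the upper 20 bits (31 downto 12) and addi for the lower 12 bits (11 downto 0)

Because bit 11 of the constant is 1, addi sign-extends its immediate 0xfff to -1, so we compensate by adding 1 to the upper 20 bits

0xffffe + 0x00001 = 0xfffff

lui x5, 0xfffff # upper 20 bits plus 1, so x5 = 0xfffff000
addi x5, x5, -1 # 0xfff sign-extended is -1, so x5 = 0xffffefff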

ahh, i finally get it with the sign extension.

The idea is this: say you want to set a register to a certain value. You can't do it in one go like you can in MIPS with the LIS command. Say you want to put 00000000 00111101 00000101 00000000 into x19. You start with lui, which loads bits 12-31 from the immediate value.

lui x19, 976 # 976 decimal = 0000 0000 0011 1101 0000

So then register 19 is 00000000 00111101 00000000 00000000

We then do

addi x19, x19, 1280 # 1280 decimal = 0101 0000 0000

The final value in register x19 is the desired value, 00000000 00111101 00000101 00000000

However, if bit 11 is 1, then the 12-bit immediate is sign-extended, so the addend would have been negative. This means that in addition to adding in the rightmost 11 bits of the constant, we would have also subtracted $2^{12}$ . To compensate for this error, it suffices to add 1 to the constant loaded with lui, since the lui constant is scaled by $2^{12}$ .
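
The compensation rule above can be checked with a short Python sketch. It assumes a 32-bit register (RV32I), and the helper names `split_constant` and `rebuild` are made up for illustration.

```python
def split_constant(value: int):
    """Split a 32-bit constant into (lui immediate, addi immediate)."""
    value &= 0xFFFFFFFF
    lower = value & 0xFFF              # bits 11 downto 0
    upper = value >> 12                # bits 31 downto 12
    if lower & 0x800:                  # bit 11 set: addi will see a negative number
        lower -= 0x1000                # the value addi actually adds after sign extension
        upper = (upper + 1) & 0xFFFFF  # add 1 to the lui constant to compensate
    return upper, lower


def rebuild(upper: int, lower: int) -> int:
    """What lui followed by addi leaves in a 32-bit register."""
    return ((upper << 12) + lower) & 0xFFFFFFFF


for constant in (0x003D0500, 0xFFFFEFFF):
    upper, lower = split_constant(constant)
    print(f"lui {hex(upper)}, addi {lower} gives {hex(rebuild(upper, lower))}")
```

The first constant has bit 11 clear, so nothing changes; the second has bit 11 set, so the lui immediate is bumped by 1 and the addi immediate becomes -1.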

Some Examples

$d = b + c - e$: we can implement this in assembly by assigning variables to registers (variables $b, c, d, e$ are assigned to registers x6, x2, x3, x4):

add x5,x6,x2 # a=b+c , or x5 = x6 + x2, a is a temporary variable stored in x5 
sub x3,x5,x4 # d=a-e , or x3 = x5 - x4

$g = h + A [8]$

where $A[]$ is an array of 100 doublewords with its base address stored in x22 (in the code, $g$ is in x20 and $h$ is in x21)

ld x9, 64(x22) # 8x8 = 64 bytes
add x20, x21, x9

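Notice that we use an offset of 64 because RISC-V is byte-addressed.

A doubleword is 8 bytes, and we want the doubleword at index 8, so the offset is 8 x 8 = 64 bytes

I-type instructions: $A[30] = h + A[30] + 1$ (in the code, the base address of $A$ is in x10 and $h$ is in x21)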

ld x9, 240(x10) # get A[30]
add x9, x9, x21 # Add h
addi x9, x9, 1 # Add 1
sd x9, 240(x10) # Store to A[30]
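
To see what the four instructions above do, here is a rough Python model. It is a sketch under assumptions: registers and memory are plain dictionaries, and the base address 1000 and the starting values are made up for the example.

```python
MASK64 = (1 << 64) - 1      # registers hold 64-bit values
DOUBLEWORD_BYTES = 8


def increment_a30(regs, memory):
    """Mirror the sequence ld / add / addi / sd for A[30] = h + A[30] + 1."""
    addr = regs[10] + 30 * DOUBLEWORD_BYTES     # 240(x10): 30 doublewords past the base
    regs[9] = memory[addr]                      # ld   x9, 240(x10)
    regs[9] = (regs[9] + regs[21]) & MASK64     # add  x9, x9, x21
    regs[9] = (regs[9] + 1) & MASK64            # addi x9, x9, 1
    memory[addr] = regs[9]                      # sd   x9, 240(x10)


regs = {9: 0, 10: 1000, 21: 5}                  # x10 = base of A, x21 = h (made-up values)
memory = {1000 + 30 * DOUBLEWORD_BYTES: 7}      # A[30] = 7, stored at byte address 1240
increment_a30(regs, memory)
print(memory[1240])                             # 7 + 5 + 1 = 13
```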

Conversions

Conversion between Assembly and Machine Language. You should probably practice this by hand at first and check yourself.

ld x9, 64(x22)
0000 0100 0000 10110 011 01001 0000011

Notice that we need to sign-extend the immediate to go from its 12 bits to the 64-bit doubleword size
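
One way to check a hand conversion like the ld above is to pack the I-type fields with shifts. This is a sketch; `encode_i_type` is a hypothetical helper, and the field positions are the standard I-type layout (imm[11:0], rs1, funct3, rd, opcode).

```python
def encode_i_type(imm: int, rs1: int, funct3: int, rd: int, opcode: int) -> int:
    """Pack the I-type fields into one 32-bit instruction word."""
    return (
        ((imm & 0xFFF) << 20)   # imm[11:0] in bits 31 downto 20
        | (rs1 << 15)           # rs1 in bits 19 downto 15
        | (funct3 << 12)        # funct3 in bits 14 downto 12
        | (rd << 7)             # rd in bits 11 downto 7
        | opcode                # opcode in bits 6 downto 0
    )


# ld x9, 64(x22): imm = 64, rs1 = x22, funct3 = 011, rd = x9, opcode = 0000011
word = encode_i_type(64, 22, 0b011, 9, 0b0000011)
print(f"{word:032b}")
```

The printed bits should match the hand-written fields 0000 0100 0000 | 10110 | 011 | 01001 | 0000011.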

Store doubleword

sd x9,240(x10)
0000111 01001 01010 011 10000 0100011
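
The same check works for the store, with the immediate split in two. Again a sketch: `encode_s_type` is a hypothetical helper using the standard S-type layout (imm[11:5], rs2, rs1, funct3, imm[4:0], opcode).

```python
def encode_s_type(imm: int, rs2: int, rs1: int, funct3: int, opcode: int) -> int:
    """Pack the S-type fields into one 32-bit instruction word."""
    imm &= 0xFFF
    imm_high = imm >> 5         # imm[11:5]
    imm_low = imm & 0x1F        # imm[4:0]
    return (
        (imm_high << 25)        # bits 31 downto 25
        | (rs2 << 20)           # bits 24 downto 20
        | (rs1 << 15)           # bits 19 downto 15
        | (funct3 << 12)        # bits 14 downto 12
        | (imm_low << 7)        # bits 11 downto 7
        | opcode                # bits 6 downto 0
    )


# sd x9, 240(x10): imm = 240, rs2 = x9, rs1 = x10, funct3 = 011, opcode = 0100011
sd_word = encode_s_type(240, 9, 10, 0b011, 0b0100011)
print(f"{sd_word:032b}")
```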

#gap-in-knowledge We use sd instead of sw because we are working with the 64-bit registers of RV64. What would happen if we used the sw command?

🛠️ Steven Gong
