Agenda
This article shows how to add a new instruction to RISC-V and simulate it.
These topics are covered along the way:
- Whole GNU riscv toolchain installation;
- Implementation of a new instruction for spike RISC-V ISA simulator;
- Manual instruction encoding in C/C++;
- Custom instruction simulation (with visible output);
- [riscv32-]GCC plugin development;
You may find useful.
Many things can go wrong. Be prepared to fix upcoming issues by yourself.
The final result is very rewarding, I promise.
Toolchain installation
Choose an installation directory. Call it RISCV.
Add these lines to your ~/.bashrc:
# Directory which will contain everything we need.
export RISCV_HOME=~/riscv-home
# $RISCV will point to toolchain install location.
export RISCV="${RISCV_HOME}/riscv"
export PATH="${PATH}:${RISCV}/bin"
Run mkdir -p "${RISCV_HOME}" "${RISCV}".
Use script to clone all required repositories.
If you wish to save some time and traffic, avoid a recursive clone of the toolchain repository. Instead, clone the sub-modules by hand. You may exclude “riscv-glibc”.
Be warned: I have not tested a partial toolchain build, caveat emptor.
Satisfy prerequisites by installing all required packages. In addition, spike requires the device-tree-compiler package.
We choose:
- RISCV32 over RISCV64
- newlib over glibc
Repositories must be built in this order:
- riscv-gnu-toolchain
- riscv-fesvr, riscv-pk
- riscv-isa-sim
You can use script as a guideline.
To check installation, use .
Custom instruction description
In this article, we will implement a mac (multiply-accumulate) instruction.
rv32im has mul and add instructions; mac combines them.
In general, mac rd, rs1, rs2 computes rd := rd + rs1 * rs2; for instance, a0 := a0 + a1 * a2 (an ordinary 3-address instruction).
# Without mac (preserve registers):
mv t0, a1       # addi t0, a1, 0
mul a1, a2, a3
add a1, a1, t0
# With mac:
mac a1, a2, a3
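To make the semantics concrete, here is a minimal C model of what mac computes, assuming 32-bit registers with wraparound arithmetic. The function names (mac32, mul_then_add32, check_mac_model) are made up for this sketch and are not part of the toolchain or the simulator.

```c
#include <assert.h>
#include <stdint.h>

/* Reference model of "mac rd, rs1, rs2" on a 32-bit machine.
 * Hypothetical helper: unsigned arithmetic wraps modulo 2^32,
 * which is what a 32-bit register would hold. */
uint32_t mac32(uint32_t rd, uint32_t rs1, uint32_t rs2) {
    return rd + rs1 * rs2;
}

/* The same result computed the long way: save rd, multiply, add. */
uint32_t mul_then_add32(uint32_t rd, uint32_t rs1, uint32_t rs2) {
    uint32_t t0 = rd;   /* mv  t0, rd       */
    rd = rs1 * rs2;     /* mul rd, rs1, rs2 */
    return rd + t0;     /* add rd, rd, t0   */
}

/* Small self-check; call it from your own main(). */
void check_mac_model(void) {
    assert(mac32(2, 3, 4) == 14);
    assert(mac32(2, 3, 4) == mul_then_add32(2, 3, 4));
    assert(mac32(1, 0x10000u, 0x10000u) == 1); /* product wraps to 0 */
}
```

Both functions return the same value for any inputs; the single mac instruction simply replaces the three-instruction sequence.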
Adding “mac” instruction to the rv32im
To add an instruction to the simulator:
1. Describe the instruction’s functional behavior;
2. Add the opcode and opcode mask to “riscv/opcodes.h”;
First step is accomplished by adding a riscv/insns/mac.h file:
/* file "$RISCV_HOME/riscv-isa-sim/riscv/insns/mac.h" */
// 'M' extension means we require integer mul/div standard extension.
require_extension('M');
// RD = RD + RS1 * RS2
reg_t tmp = sext_xlen(RS1 * RS2);
WRITE_RD(sext_xlen(READ_REG(insn.rd()) + tmp));
For the second step, we use the riscv-opcodes repository.
cd "${RISCV_HOME}/riscv-opcodes"
echo -e "mac rd rs1 rs2 31..25=1 14..12=0 6..2=0x1A 1..0=3\n" >> opcodes
make install
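The opcodes line lists which bit ranges are fixed and what values they hold. From that, an "opcode and opcode mask" pair can be derived. The sketch below does the arithmetic by hand for the mac line; the helper names (bit_range, mac_mask, mac_match, is_mac) are hypothetical, and it is an assumption that the generated header carries equivalent constants for mac.

```c
#include <assert.h>
#include <stdint.h>

/* Ones in bit positions hi..lo (inclusive) of a 32-bit word. */
static uint32_t bit_range(unsigned hi, unsigned lo) {
    unsigned width = hi - lo + 1;
    uint32_t ones = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return ones << lo;
}

/* Mask: every bit position constrained by
 * "31..25=1 14..12=0 6..2=0x1A 1..0=3". */
uint32_t mac_mask(void) {
    return bit_range(31, 25) | bit_range(14, 12) |
           bit_range(6, 2) | bit_range(1, 0);
}

/* Match: the values those constrained bits must hold. */
uint32_t mac_match(void) {
    return (1u << 25) | (0u << 12) | (0x1Au << 2) | (3u << 0);
}

/* A word is a "mac" when its constrained bits equal the match value;
 * the register fields (rd, rs1, rs2) are ignored by the mask. */
int is_mac(uint32_t insn) {
    return (insn & mac_mask()) == mac_match();
}

/* Small self-check; call it from your own main(). */
void check_mac_match(void) {
    assert(mac_mask() == 0xFE00707Fu);
    assert(mac_match() == 0x0200006Bu);
}
```

With these values, the word 0x02C5856B used later in this article is recognized as mac, while 0x02C58533 (the mul encoding with the same registers) is not.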
It turns out there is a third step which is not documented: a new entry must be added to the riscv_insn_list.
sed -i 's/riscv_insn_list = \\/riscv_insn_list = mac\\/g' \
    "${RISCV_HOME}/riscv-isa-sim/riscv/riscv.mk.in"
Rebuild the simulator.
cd "${RISCV_HOME}/riscv-isa-sim/build"
sudo make install
Testing rv32im brand new instruction
At this stage:
- Compiler knows nothing about mac. It can not emit that instruction;
- Assembler knows nothing about mac. We can not use mac in inline assembly;
Our last resort is manual encoding.
#include <stdio.h>

// Needed to verify results.
int mac_c(int a, int b, int c) {
    a += b * c; // Semantically, it is "mac"
    return a;
}

// Should not be inlined, because we expect arguments
// in particular registers.
__attribute__((noinline))
int mac_asm(int a, int b, int c) {
    asm __volatile__ (".word 0x02C5856B\n");
    return a;
}

int main(int argc, char** argv) {
    int a = 2, b = 3, c = 4;
    printf("%d =?= %d\n", mac_c(a, b, c), mac_asm(a, b, c));
}
Save test program as test_mac.c.
riscv32-unknown-elf-gcc test_mac.c -O1 -march=rv32im -o test_mac
spike --isa=RV32IM "${RISCV_PK}" test_mac
You should see 14 =?= 14 printed to stdout.
riscv32-unknown-elf-gdb can help you in troubleshooting.
Mac encoding explained
Be sure to look at if you aim for precise descriptions.
mac mimics the mul encoding, but uses a different opcode.
# file "riscv-opcodes/opcodes"
#                                       differs
#                                       |
#                                       v
mac rd rs1 rs2 31..25=1 14..12=0 6..2=0x1A 1..0=3
mul rd rs1 rs2 31..25=1 14..12=0 6..2=0x0C 1..0=3
#   ^  ^   ^   ^        ^        ^         ^
#   |  |   |   |        |        |         |
#   |  |   |   |        |        |         also opcode 2 bits
#   |  |   |   |        |        opcode 5 bits
#   |  |   |   |        funct3 3 bits
#   |  |   |   funct7 7 bits
#   |  |   rs2 (src2) 5 bits
#   |  rs1 (src1) 5 bits
#   dest 5 bits
The actual encoding orders these components differently, and the opcode is really a single 7-bit segment (bits 6..0).
5 bits per register operand means that we have 32 addressable registers.
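As a rough illustration of that actual field order, the sketch below packs an R-type word from its parts. The helper names (encode_rtype, mac_opcode, encode_mac) are invented for this example; the field positions follow the bit ranges from the opcodes file plus the standard R-type register slots.

```c
#include <assert.h>
#include <stdint.h>

/* Pack R-type fields in their real bit order:
 * funct7[31:25] rs2[24:20] rs1[19:15] funct3[14:12] rd[11:7] opcode[6:0]. */
uint32_t encode_rtype(uint32_t funct7, uint32_t rs2, uint32_t rs1,
                      uint32_t funct3, uint32_t rd, uint32_t opcode) {
    return ((funct7 & 0x7Fu) << 25) | ((rs2 & 0x1Fu) << 20) |
           ((rs1 & 0x1Fu) << 15) | ((funct3 & 0x07u) << 12) |
           ((rd & 0x1Fu) << 7) | (opcode & 0x7Fu);
}

/* The 7-bit opcode: bits 6..2 (0x1A) followed by bits 1..0 (3). */
uint32_t mac_opcode(void) {
    return (0x1Au << 2) | 3u; /* 0x6B */
}

/* "mac rd, rs1, rs2" with register numbers 0..31. */
uint32_t encode_mac(uint32_t rd, uint32_t rs1, uint32_t rs2) {
    return encode_rtype(1, rs2, rs1, 0, rd, mac_opcode());
}

/* Small self-check; call it from your own main().
 * a0, a1, a2 are registers x10, x11, x12. */
void check_mac_encoding(void) {
    assert(encode_mac(10, 11, 12) == 0x02C5856Bu);
}
```

Calling encode_mac(10, 11, 12) reproduces the .word constant used in the test program, which is a handy way to generate encodings for other register combinations.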
# Encoding used for "mac a0, a1, a2"
0x02C5856B [base 16]
==
10110001011000010101101011 [base 2]
==
00000010110001011000010101101011 [base 2]
# Group by related bit chunks:
0000001 01100 01011 000 01010 1101011
^       ^     ^     ^   ^     ^
|       |     |     |   |     |
|       |     |     |   |     opcode (6..2=0x1A 1..0=3)
|       |     |     |   dest (10 : a0)
|       |     |     funct3 (14..12=0)
|       |     src1 (11 : a1)
|       src2 (12 : a2)
funct7 (31..25=1)
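The same grouping can be double-checked in code. This is a minimal sketch that splits a 32-bit word into R-type fields; the struct and function names (rtype_fields, bits, decode_rtype) are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of an R-type instruction (struct name is illustrative). */
struct rtype_fields {
    uint32_t funct7, rs2, rs1, funct3, rd, opcode;
};

/* Extract bits hi..lo (inclusive); used here for widths up to 7. */
static uint32_t bits(uint32_t word, unsigned hi, unsigned lo) {
    return (word >> lo) & ((1u << (hi - lo + 1)) - 1u);
}

struct rtype_fields decode_rtype(uint32_t insn) {
    struct rtype_fields f;
    f.funct7 = bits(insn, 31, 25);
    f.rs2    = bits(insn, 24, 20);
    f.rs1    = bits(insn, 19, 15);
    f.funct3 = bits(insn, 14, 12);
    f.rd     = bits(insn, 11, 7);
    f.opcode = bits(insn, 6, 0);
    return f;
}

/* Small self-check against the word used in this article. */
void check_mac_decoding(void) {
    struct rtype_fields f = decode_rtype(0x02C5856Bu);
    assert(f.funct7 == 1 && f.funct3 == 0);
    assert(f.rd == 10 && f.rs1 == 11 && f.rs2 == 12); /* a0, a1, a2 */
    assert(f.opcode == 0x6B); /* (0x1A << 2) | 3 */
}
```

If a hand-written .word misbehaves in the simulator, decoding it this way is a quick check that the register numbers landed in the intended slots.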
Plugin vs patch
There are two ways to extend GCC:
- Patch GCC itself
- Write loadable plugin for GCC
Prefer plugins to GCC patches whenever possible.
The GCC wiki page describes the advantages in its “Background” section.
In this guide, both methods will be covered.
Useful links:
- series of posts
GCC “rv32imMac” plugin
TODO