Building a custom SoC to run DOOM¶
This page will describe my experience building a custom RISC-V CPU for the sole purpose of running DOOM on a small FPGA (the Gowin Tang Nano 20K).
I will be creating a custom instruction set (RV32doom?) that augments the base RV32IM with instructions useful for running DOOM at full resolution and framerate on this FPGA.
I will also be implementing all of the Uncore components necessary for the CPU to function.
Overview¶
[TODO (it isn't finished yet)]
Implementation details¶
Full in-depth writeups will be added here as I complete components:
The development process:¶
To start this project, the naive first step would be to compile DOOM for RV32IM, profile it, and see which existing instructions I should include and which new ones I should invent. However, since I'm short on resources and I actually want to make sure this will work, I'm instead going to create a minimum working example to prove that I can boot DOOM on this FPGA at all before I start LARPing as a computer architect.
It turns out that building real hardware requires a lot of other logic just to move/store data and interface with the real world. I'll be referring to all of this "stuff" as the Uncore.
Building the display driver¶
Perhaps the most interesting feature of the Tang Nano 20K, and the one I am most unfamiliar with, is the HDMI port. Driving signals over HDMI (really DVI-D) isn't too complicated. The only potential issue is the size of the frame buffer. I don't want to put any more load on the DRAM than necessary, but I have limited on-chip memory available. Fortunately, "full resolution" for DOOM was only 320x200 in 8-bit color. This means I can store one full frame (64,000 bytes) and a copy of the color palette on chip, but not much more than that.
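To make the memory budget concrete, here is a back-of-the-envelope calculation in Python. The frame size follows directly from the numbers above, and the palette is taken to be 256 entries of 24-bit RGB. The block RAM capacity `BSRAM_BITS` (828 Kbit) is an assumption about the FPGA on this board and should be checked against the Gowin datasheet.

```python
# Rough on-chip memory budget for one frame plus one palette.
FRAME_WIDTH = 320
FRAME_HEIGHT = 200
BITS_PER_PIXEL = 8         # each pixel is an index into the palette

PALETTE_ENTRIES = 256      # one entry per possible 8-bit pixel value
PALETTE_ENTRY_BITS = 24    # 8 bits each for red, green, blue

# Assumption: total block RAM on the FPGA; verify against the datasheet.
BSRAM_BITS = 828 * 1024

frame_bits = FRAME_WIDTH * FRAME_HEIGHT * BITS_PER_PIXEL
palette_bits = PALETTE_ENTRIES * PALETTE_ENTRY_BITS
used_bits = frame_bits + palette_bits
free_bits = BSRAM_BITS - used_bits

print(f"frame buffer: {frame_bits // 8} bytes")
print(f"palette:      {palette_bits // 8} bytes")
print(f"left over:    {free_bits // 8} bytes")
print(f"room for a second frame: {free_bits >= frame_bits}")
```

Under that capacity assumption, the leftover block RAM is smaller than a second frame, which is consistent with "one full frame and a palette, but not much more".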
The frame buffer and color palette will be connected to the system through the memory bus (AXI interconnect) with the frames being copied from DRAM via the CPU or DMA.
The display driver uses the data in the frame buffer and palette to calculate pixel values and upscale the frame to 640x480 before passing the pixels through a TMDS encoder, a serializer, and out to the display.
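For reference, the 8b/10b TMDS encoding that DVI uses for pixel data can be modeled in a few lines of Python. This is a software sketch of the standard algorithm, not the RTL in this project. The names `tmds_encode` and `tmds_decode` are made up for illustration, and the control symbols sent during blanking (hsync/vsync) are left out.

```python
def popcount(x: int) -> int:
    """Number of 1 bits in x."""
    return bin(x).count("1")


def tmds_encode(d: int, cnt: int) -> tuple[int, int]:
    """Encode one 8-bit value d given the running disparity cnt.

    Returns (10-bit symbol, updated running disparity).
    """
    # Stage 1: pick XOR or XNOR chaining to minimize transitions.
    n1 = popcount(d)
    use_xnor = n1 > 4 or (n1 == 4 and (d & 1) == 0)
    q_m = d & 1
    for i in range(1, 8):
        bit = ((q_m >> (i - 1)) ^ (d >> i)) & 1
        if use_xnor:
            bit ^= 1
        q_m |= bit << i
    q_m8 = 0 if use_xnor else 1   # bit 8 records which chaining was used

    # Stage 2: optionally invert the low 8 bits to keep the link DC balanced.
    n1q = popcount(q_m)
    n0q = 8 - n1q
    if cnt == 0 or n1q == n0q:
        invert = q_m8 == 0
        cnt += (n0q - n1q) if invert else (n1q - n0q)
    elif (cnt > 0 and n1q > n0q) or (cnt < 0 and n0q > n1q):
        invert = True
        cnt += 2 * q_m8 + (n0q - n1q)
    else:
        invert = False
        cnt += -2 * (1 - q_m8) + (n1q - n0q)

    low = q_m ^ 0xFF if invert else q_m
    symbol = (int(invert) << 9) | (q_m8 << 8) | low
    return symbol, cnt


def tmds_decode(symbol: int) -> int:
    """Recover the 8-bit value from a 10-bit data symbol."""
    low = symbol & 0xFF
    if (symbol >> 9) & 1:          # bit 9 set: low byte was inverted
        low ^= 0xFF
    used_xor = (symbol >> 8) & 1
    d = low & 1
    for i in range(1, 8):
        bit = ((low >> i) ^ (low >> (i - 1))) & 1
        if not used_xor:
            bit ^= 1
        d |= bit << i
    return d


# Round-trip every byte value while carrying the disparity along.
disparity = 0
for value in range(256):
    symbol, disparity = tmds_encode(value, disparity)
    assert tmds_decode(symbol) == value
```

A model like this is handy as a golden reference when simulating the hardware encoder: feed both the same pixel stream and compare the 10-bit symbols before they reach the serializer.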
Full Details¶
Building the memory controller¶
The Tang Nano 20K includes 8MB of SDRAM in the FPGA package. A basic memory controller acts as the interface between the memory bus (AXI) and the SDRAM. The SDRAM is configured with a burst length of 8 for both reads and writes. However, variable-length bursts (up to 8) and single-byte accesses are supported via the AXI interface.
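One way to support shorter AXI bursts and byte accesses on top of a fixed burst length of 8 is to always issue full, aligned SDRAM bursts and mask off the beats or byte lanes that were not requested. The Python sketch below illustrates that idea only. It assumes a 32-bit data bus (`BEAT_BYTES = 4`), and the helper names `plan_bursts` and `byte_strobe` are hypothetical; the actual controller may handle this differently.

```python
BEAT_BYTES = 4     # assumption: 32-bit data bus, 4 bytes per beat
BURST_BEATS = 8    # from the text: SDRAM burst length of 8


def plan_bursts(addr: int, beats: int) -> list[tuple[int, list[bool]]]:
    """Map an incrementing AXI burst onto aligned 8-beat SDRAM bursts.

    Returns a list of (burst base byte address, per-beat enable flags).
    Beats with a False flag are masked on writes or dropped on reads.
    """
    if not 1 <= beats <= BURST_BEATS:
        raise ValueError("this sketch only handles 1 to 8 beats")
    first = addr // BEAT_BYTES          # index of the first requested beat
    last = first + beats - 1            # index of the last requested beat
    plans = []
    for block in range(first // BURST_BEATS, last // BURST_BEATS + 1):
        base = block * BURST_BEATS
        enable = [first <= base + i <= last for i in range(BURST_BEATS)]
        plans.append((base * BEAT_BYTES, enable))
    return plans


def byte_strobe(addr: int, nbytes: int = 1) -> int:
    """Byte-lane mask (like AXI WSTRB) for a narrow access within one beat."""
    lane = addr % BEAT_BYTES
    if nbytes < 1 or lane + nbytes > BEAT_BYTES:
        raise ValueError("access must fit inside a single beat")
    return ((1 << nbytes) - 1) << lane


# A 3-beat burst starting at byte address 0x10 fits in one SDRAM burst.
for base, enable in plan_bursts(0x10, 3):
    print(hex(base), enable)

# A single-byte write to address 0x13 touches only the top byte lane.
print(bin(byte_strobe(0x13)))
```

The trade-off of this approach is wasted bandwidth on short accesses in exchange for a simpler state machine, since the SDRAM mode register never has to change.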
Full Details¶
Implementing the base core¶
Initially, I planned to base the core on the two-cycle RV32E CPU I previously built: tiny-riscv.
As it turns out, that core is basically worthless for any useful implementation. DOOMCore (v1) required a pretty drastic rewrite. This involved much more than I cared to write about.
Full Details¶
Implementing the memory bus¶
I decided early on that every component of this SoC would be connected via an AXI bus. This means I need an AXI interconnect to connect my CPU to all the peripherals. So far, the AXI interfaces are the only RTL I haven't written entirely myself, and I don't wish to change that any time soon. So instead I will use the AXI crossbar written by PULP. They have a paper which I haven't read, but their RTL seems to work alright.
At first glance, instantiating the axi_xbar is a nightmare. It kind of is. Luckily, these days we have LLMs to help drag us out of dependency hell.
The crossbar connects the two master ports on the CPU to four slaves: the bootrom, the SD card interface, SDRAM, and the frame buffer.
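A crossbar routes each transaction by comparing its address against a table of slave address ranges. The Python sketch below models that decode step for the four slaves listed above. Every base address and the bootrom and SD card window sizes are placeholders invented for illustration; only the 8MB SDRAM size comes from this page, and the 64 KiB frame buffer window is just a round size large enough for the frame plus palette.

```python
# Hypothetical address map: (name, base address, size in bytes).
# Base addresses are placeholders, not the real SoC memory map.
ADDRESS_MAP = [
    ("bootrom",     0x0000_0000, 4 * 1024),         # size is a placeholder
    ("sdcard",      0x1000_0000, 4 * 1024),         # size is a placeholder
    ("framebuffer", 0x4000_0000, 64 * 1024),        # frame + palette window
    ("sdram",       0x8000_0000, 8 * 1024 * 1024),  # 8MB, from the text
]


def check_no_overlap(address_map) -> None:
    """Raise if any two slave windows overlap."""
    ordered = sorted(address_map, key=lambda entry: entry[1])
    for (name_a, base_a, size_a), (name_b, base_b, _) in zip(ordered, ordered[1:]):
        if base_a + size_a > base_b:
            raise ValueError(f"{name_a} overlaps {name_b}")


def decode(addr: int):
    """Return the name of the slave that owns addr, or None if unmapped."""
    for name, base, size in ADDRESS_MAP:
        if base <= addr < base + size:
            return name
    return None  # a real crossbar would answer with a decode error


check_no_overlap(ADDRESS_MAP)
for addr in (0x0000_0000, 0x4000_FA00, 0x8000_1234, 0x9000_0000):
    print(hex(addr), "->", decode(addr))
```

Keeping the map in one table like this makes it easy to generate both the crossbar address rules and the matching linker script from a single source.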
TODOs:¶
- Connect framebuffer to memory bus
- Write SD card interface
- Connect SD card to memory bus
- Write Bootloader
- Port DOOM
- Attempt to run
- Fix bugs
- Optimize