FuryGpu – Custom PCIe FPGA GPU

argulane | 446 points

So, this is my project! Was somewhat hoping to wait until there was a bit more content up on the site before it started doing the rounds, but here we are! :)

To answer what seems to be the most common question I get asked about this, I am intending on open-sourcing the entire stack (PCB schematic/layout, all the HDL, Windows WDDM drivers, API runtime drivers, and Quake ported to use the API) at some point, but there are a number of legal issues that need to be cleared (with respect to my job) and I need to decide the rest of the particulars (license, etc.) - this stuff is not what I do for a living, but it's tangentially-related enough that I need to cover my ass.

The first commit for this project was on August 22, 2021. It's been a bit over two and a half years I've been working on this, and while I didn't write anything up during that process, there are a fair number of videos in my YouTube FuryGpu playlist (https://www.youtube.com/playlist?list=PL4FPA1MeZF440A9CFfMJ7...) that can kind of give you an idea of how things progressed.

The next set of blog posts that are in the works concern the PCIe interface. It'll probably be a multi-part series starting at the PCB schematic/layout and moving through the FPGA design and ending with the Windows drivers. No timeline on when that'll be done, though. After having written just that post on how the Texture Units work, I've got even more respect for those that can write up technical stuff like that with any sort of timing consistency.

I'll answer the remaining questions in the threads where they were asked.

Thanks for the interest!

PfhorSlayer | a month ago

It's incredible how influential Ben Eater's breadboard computer series has been in hobby electronics. I've been similarly inspired to try to design my own "retro" CPU.

I desperately want something as easy to plug into things as the 6502, but with jussst a little more capability - few more registers, hardware division, that sort of thing. It's a really daunting task.

I always end up coming back to just use an MCU and be done with it, and then I hit the How To Generate Graphics problem.

MalphasWats | a month ago

Cool! I found the hello blog post illuminating for understanding the creator's intentions: https://www.furygpu.com/blog/hello

As I read it, it's just a fun hobby project for them first and foremost and looks like they're intending to write a whole bunch more about how they built it.

It's certainly an impressive piece of work, in particular as they've got the full stack working: a Windows driver implementing a custom graphics API, and then Quake running on top of that. A shame they've not got some DX/GL support, but I can certainly understand why they went the custom API route.

I wonder if they'll open source the design?

gchadwick | a month ago

This is my dream!

For the last year I've been working on a 2D-focused GPU for I/O-constrained microcontrollers (https://github.com/KallDrexx/microgpu). I've been able to use it to get user interfaces on slow SPI machines to render on large displays, and it's been fascinating to work on.

But seeing the limitation of processor pipelines I've had the thought for a while that FPGAs could make this faster. I've recently gotten some low end FPGAs to start learning to try and turn my microgpu from an ESP32 based one to an FPGA one.

I don't know if I'll ever get to this level due to kids and free time constraints, but man, I would love to get even a hundredth of this level.

KallDrexx | a month ago

Pipeline seems retro, but far better than nothing.

There's no open hardware GPU to speak of. Depending on license (can't find information?), this could be the first, and a starting point for more.

snvzz | a month ago

I can't believe that this is the closest we have to a compact, stand-alone GPU option. There's nothing like an M.2 format GPU out there. All I want is a stand-alone M.2 GPU with modest performance, something on the level of embedded GPUs like Intel UHD Graphics, AMD Radeon, or Qualcomm's Adreno.

I have an idea for a small embedded product which needs a lot of compute and networking, but only very modest graphical capabilities. The NXP Layerscape LX2160A [1] would be perfect, but I have to pass on it because it doesn't come with an embedded GPU. I just want a small GPU!

[1]: https://www.nxp.com/products/processors-and-microcontrollers...

detuur | a month ago

Very cool project, and I love to see more work in this space.

Something else to look at is the Vortex project from Georgia Tech[1]. Rather than recapitulating the fixed-function past of GPU design, I think it looks toward the future, as it's at heart a highly parallel computer, based on RISC-V with some extensions to handle GPU workloads better. The boards it runs on are a few thousand dollars, so it's not exactly hobbyist-friendly, but it certainly is more accessible than closed, proprietary development. There's a 2.0 release that just landed a few months ago.

[1]: https://vortex.cc.gatech.edu/

raphlinus | a month ago

This looks like an incredible achievement. I'd love to see some photos of the physical device. I'm also slightly confused about which FPGA module is being used. The blog mentions the Xilinx Kria SoMs, but if you follow the links to the specs of those modules, you see they have ARM SoCs rather than Xilinx FPGAs. The whole world of FPGAs is pretty unfamiliar to me, so maybe I'm missing something.

https://www.amd.com/en/products/system-on-modules/kria/k26/k...

spuz | a month ago

> Supporting hardware features equivalent to a high-end graphics card of the mid 1990s

I see no one else has asked this question yet, so I will: How VGA-compatible is it? Would I be able to e.g. plug it into any PC with a PCIe slot, boot to DOS and play DOOM with it?

userbinator | a month ago

I hope the author goes into some detail about how he implements the PCIe interface! I doubt I'll ever do hardware work at that level of sophistication, but for general cultural awareness I think it's worth looking under the hood of PCIe.

nxobject | a month ago

Hopefully their hardware programming model is going full hardware circular command/interrupt buffers (even for GPU register programming).

That is how it is done on AMD GPUs; that said, I have no idea what the Nvidia hardware programming model is.
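
For anyone unfamiliar with the idea, here is a minimal software sketch of a circular command buffer. It is an illustration only, not FuryGpu's or AMD's actual layout: the class name `CommandRing`, its methods, and the list-backed storage are all hypothetical, and real hardware would use memory-mapped pointers plus interrupts or doorbells rather than Python method calls.

```python
class CommandRing:
    """Hypothetical host/device command ring (illustrative names only)."""

    def __init__(self, size):
        self.slots = [None] * size
        self.size = size
        self.write_ptr = 0  # advanced by the host (producer)
        self.read_ptr = 0   # advanced by the device (consumer)

    def is_empty(self):
        return self.read_ptr == self.write_ptr

    def is_full(self):
        # One slot stays unused so "full" and "empty" can be told apart.
        return (self.write_ptr + 1) % self.size == self.read_ptr

    def submit(self, command):
        """Host side: queue a command; return False if the ring is full."""
        if self.is_full():
            return False
        self.slots[self.write_ptr] = command
        self.write_ptr = (self.write_ptr + 1) % self.size
        return True

    def consume(self):
        """Device side: fetch the next command, or None if the ring is empty."""
        if self.is_empty():
            return None
        command = self.slots[self.read_ptr]
        self.read_ptr = (self.read_ptr + 1) % self.size
        return command
```

The point of the design is that the host only ever moves the write pointer and the device only ever moves the read pointer, so neither side needs a lock to make progress.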

sylware | a month ago

Similarly there is this: https://github.com/ToNi3141/Rasterix

Would be neat if someone made an FPGA GPU which had a shader pipeline honestly.

jamesu | a month ago

Excellent job. Would be amazing if this became an open source hardware project.

wpwpwpw | a month ago

beyond amazing. i've dreamt of this. so inspiring. it reminds me of a lot of time i spent thinking about this: https://rcl.ece.iastate.edu/sites/default/files/papers/SteJo... i actually wrote one of the professors asking for more info. didn't get a reply. my dream EE class I never got to take.

bobharris | a month ago

This is insane! As a hobby hardware designer myself, I can imagine how much work must have gone into reaching this stage. Well done!

bloatfish | a month ago

Does "UltraScale" in the name imply an ultra price? FPGAs seem to be an expensive toy.

codedokode | a month ago

What an inspiring passion project! Very ambitious first Verilog project.

allanrbo | a month ago

FPGAs for native FP4 will change the entire landscape

iAkashPaul | a month ago

can you run valorant on it?

anon115 | a month ago

It needs to be very fancy to write text in light gray on white.

I am not sure your product will be a success.

I am sure your web design skills need a good overhaul.

notorandit | a month ago