
[Hardware] RV64X: A Free, Open Source GPU for RISC-V


rlex
A group of enthusiasts is proposing a new set of instructions designed for 3D graphics and media processing. The new instructions are built on the RISC-V base vector instruction set. They will add support for graphics-specific data types as layered extensions, in the spirit of the core RISC-V instruction set architecture (ISA). Vector, transcendental math, pixel, texture and Z/frame buffer operations are supported. It can be a fused CPU-GPU ISA. The group is calling it RV64X, as instructions will be 64 bits long (32 bits will not be enough to support a robust ISA).

Why now?

The world has plenty of GPUs to choose from, so why this one? Because, says the group, commercial GPUs are less effective at meeting unusual needs such as dual-phase 3D frustum clipping, adaptable HPC (arbitrary bit depth FFTs) and hardware SLAM. They believe collaboration provides flexible standards, reduces the 10 to 20 man-year effort otherwise needed, and will help with cross-verification to avoid mistakes.

The team says their motivation and goals are driven by the desire to create a small, area-efficient design with custom programmability and extensibility. It should offer low-cost IP ownership and development, and not compete with commercial offerings. It can be implemented in FPGA and ASIC targets and will be free and open source. The initial design will be targeted to low-power microcontrollers. It will be Khronos Vulkan-compliant, and over time support other APIs (OpenGL, DirectX and others).

The final hardware will be a RISC-V core with a GPU functional unit. To the programmer it will look like a single piece of hardware with 64-bit-long instructions coded as scalar instructions. The programming model is an apparent SIMD; that is, the compiler generates SIMD from prefixed scalar opcodes. It will include a variable-issue, predicated SIMD backend, a vector front-end, precise exceptions, branch shadowing and much more. There won't be any need for an RPC/IPC calling mechanism to send 3D API calls to/from unused CPU memory space to GPU memory space and vice versa, says the team. It will be available as 16-bit fixed point (ideal for FPGAs), as well as 32-bit floating point (ASICs or FPGAs).
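
To give a feel for what the 16-bit fixed-point variant implies for arithmetic, here is a minimal Python sketch. The article does not say how the 16 bits are split, so the Q8.8 layout (8 integer bits, 8 fractional bits), the saturation behavior and the function names are all assumptions made for illustration, not part of the RV64X proposal.

```python
# Assumed layout: signed Q8.8 (not specified by the RV64X proposal).
FRAC_BITS = 8
Q_MIN, Q_MAX = -32768, 32767  # range of a signed 16-bit word


def saturate(raw):
    """Clamp an integer to the signed 16-bit range."""
    return max(Q_MIN, min(Q_MAX, raw))


def to_fixed(x):
    """Convert a float to the assumed Q8.8 representation."""
    return saturate(round(x * (1 << FRAC_BITS)))


def from_fixed(q):
    """Convert a Q8.8 integer back to a float."""
    return q / (1 << FRAC_BITS)


def fixed_mul(a, b):
    """Multiply two Q8.8 values; the product has 16 fractional bits,
    so shift right by FRAC_BITS to return to Q8.8."""
    return saturate((a * b) >> FRAC_BITS)


a = to_fixed(1.5)
b = to_fixed(2.25)
print(from_fixed(fixed_mul(a, b)))
```

The multiply needs only an integer multiplier and a shift, which is one reason fixed point tends to map well onto FPGA fabric.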

The design will employ the Vblock format (from the Libre GPU effort):

It is a bit like VLIW (only not really)
A block of instructions is prefixed with register tags, which give extra context to scalar instructions within the block
Sub-blocks include: vector length, swizzling, vector/width overrides and predication.
All this is added to scalar opcodes
There are no vector opcodes (and no need for any)
In the vector context, it goes like this: if a register is used by a scalar opcode, and the register is listed in the vector context, vector mode is activated
Activation results in a hardware-level for-loop issuing multiple contiguous scalar operations (instead of just one).
Implementers are free to implement the loop in any fashion they desire: SIMD, multi-issue, single-execution.
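
The activation rule described in the list above can be sketched as a small Python model. This is only an illustration under stated assumptions: the function name execute_add, the flat register list, and the rule that tagged registers step to the next contiguous register while untagged ones stay fixed are hypothetical details, not taken from the RV64X or Vblock specifications.

```python
def execute_add(regs, rd, rs1, rs2, vector_ctx=frozenset(), vl=1):
    """Model a scalar 'add rd, rs1, rs2' under a vector context.

    regs       -- register file as a mutable list of integers
    vector_ctx -- set of register numbers tagged as vectors (assumed form)
    vl         -- vector length taken from the block prefix (assumed form)
    Returns the number of scalar operations issued.
    """
    # Vector mode activates if any register used by the opcode is tagged.
    vector_mode = any(r in vector_ctx for r in (rd, rs1, rs2))
    iterations = vl if vector_mode else 1

    issued = 0
    for i in range(iterations):
        # Assumption: tagged registers advance contiguously, untagged
        # registers act as scalars reused on every iteration.
        d = rd + i if rd in vector_ctx else rd
        a = rs1 + i if rs1 in vector_ctx else rs1
        b = rs2 + i if rs2 in vector_ctx else rs2
        regs[d] = regs[a] + regs[b]
        issued += 1
    return issued


regs = [0] * 32
regs[8:12] = [1, 2, 3, 4]
regs[16:20] = [10, 20, 30, 40]

# Same scalar opcode, but x8, x16 and x24 are tagged and VL is 4.
count = execute_add(regs, 24, 8, 16, vector_ctx={8, 16, 24}, vl=4)
print(count, regs[24:28])
```

The model issues the operations one at a time, but since each iteration is independent, hardware could equally run them as SIMD lanes or multi-issue, which is the freedom the last list item gives implementers.
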

The design will employ scalars (8-, 16-, 24- and 32-bit fixed and floats), as well as transcendentals (sincos, atan, pow, exp, log, rcp, rsq, sqrt, etc.). The vectors (RV32-V) will support 2- to 4-element vector operations (8, 16 or 32 bits per element), along with specialized instructions for a general 3D graphics rendering pipeline for points, pixels and texels (essentially special vectors):

XYZW points (64- and 128-bit fixed and floats)
RGBA pixels (8-, 16-, 24- and 32-bit pixels)
UVW texels (8 or 16 bits per component)
Lights and materials (Ia, ka, Id, kd, Is, ks…)
Matrices: 2 × 2, 3 × 3 and 4 × 4 matrices will be supported as a native data type, along with memory structures to support them for attribute vectors. All of them will essentially be represented as a 4 × 4 matrix.
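
One way to picture smaller matrices being "represented as a 4 × 4 matrix" is to place them in the top-left corner of a 4 × 4 identity, so that the unused components of an XYZW point pass through unchanged. The sketch below assumes that padding rule; the article does not describe the actual storage layout, and the function names are hypothetical.

```python
def embed_in_4x4(m):
    """Place a square 2x2, 3x3 or 4x4 matrix in the top-left corner of a
    4x4 identity matrix (assumed padding rule, for illustration only)."""
    n = len(m)
    if n not in (2, 3, 4) or any(len(row) != n for row in m):
        raise ValueError("expected a square 2x2, 3x3 or 4x4 matrix")
    out = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
    for r in range(n):
        for c in range(n):
            out[r][c] = float(m[r][c])
    return out


def transform(m4, point):
    """Multiply a 4x4 matrix by an XYZW point given as a 4-element list."""
    return [sum(m4[r][c] * point[c] for c in range(4)) for r in range(4)]


# A 2x2 rotation by 90 degrees acts on X and Y; Z and W pass through.
rot90 = [[0, -1],
         [1, 0]]
print(transform(embed_in_4x4(rot90), [1.0, 0.0, 5.0, 1.0]))
```

With a single 4 × 4 representation, one matrix-vector path can serve all three matrix sizes.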

Among the advantages of a fused CPU-GPU ISA is the ability to implement a standard graphics pipeline in microcode, provide support for custom shaders and implement ray-tracing extensions. It also supports vectors for numerical simulations, with 8-bit integer data types for AI and machine learning.

Custom rasterizers can be implemented for primitives such as splines, SubDiv surfaces and patches.

The design will be flexible enough that it can implement custom pipeline stages, custom geometry/pixel/frame buffer stages, custom tessellators and custom instancing operations.
