SiFive’s brand-new P550 is one of the world’s fastest RISC-V CPUs
Today, RISC-V CPU design company SiFive launched a new processor family with two core designs: P270 (a Linux-capable CPU with full support for RISC-V’s vector extension 1.0 release candidate) and P550 (the highest-performing RISC-V CPU to date).
A quick RISC-V overview
For those not immediately familiar with RISC-V, it is a relatively new CPU architecture built on Reduced Instruction Set Computer (RISC) principles. RISC-V is an open standard specifically designed to be forward-looking and to avoid as much legacy cruft as possible. One example of this design is RISC-V’s dynamic-width vector instruction set, which allows developers to execute vector instructions on data of arbitrary size with maximum efficiency.
In traditional processor designs, a vector instruction has a fixed width tied to the hardware register size of the processor. For example, SSE and SSE2 operate on the 128-bit registers introduced with the Pentium III, while making full use of an i7-4770’s 256-bit registers requires a completely separate instruction set (AVX2) for the same mathematical operations. Moving up to an i7-1065G7’s 512-bit registers requires yet another instruction set, AVX-512, again for the same underlying mathematical operations.
In sharp contrast, RISC-V vector math allows a single set of CPU instructions to perform the same set of mathematical operations as efficiently as possible, using whatever size registers the current CPU design has available. This means a developer can simply write a single routine that will process vector operations as efficiently as possible on a phone with 64-bit registers or on a supercomputer with 1,024-bit registers.
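To make the idea concrete, here is a minimal Python sketch of the "strip-mining" loop pattern that a vector-length-agnostic routine relies on. It is a simulation, not real RISC-V code: the helper `vsetvl` is a hypothetical stand-in for the hardware's "how many elements can I handle this pass?" query, and the model ignores details of the real V extension such as register grouping. The register widths passed in are illustrative assumptions.

```python
def vsetvl(remaining: int, vlen_bits: int, elem_bits: int) -> int:
    """Hypothetical stand-in for the hardware vector-length query.

    Returns how many elements this pass may process: the number of
    elements that fit in one vector register, capped by what is left.
    """
    vlmax = vlen_bits // elem_bits
    return min(remaining, vlmax)


def vector_add(a, b, vlen_bits, elem_bits=32):
    """Add two equal-length lists in register-sized chunks.

    The loop body never mentions a specific register width; only the
    simulated hardware parameter vlen_bits changes between machines.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    out = []
    passes = 0
    i = 0
    n = len(a)
    while i < n:
        vl = vsetvl(n - i, vlen_bits, elem_bits)
        # One simulated vector instruction: add vl elements at once.
        out.extend(x + y for x, y in zip(a[i:i + vl], b[i:i + vl]))
        i += vl
        passes += 1
    return out, passes


a = list(range(100))
b = list(range(100, 200))

# Same routine, two simulated machines with different register widths.
narrow, narrow_passes = vector_add(a, b, vlen_bits=128)
wide, wide_passes = vector_add(a, b, vlen_bits=1024)

print(narrow == wide)   # True: identical results
print(narrow_passes)    # 25 passes of 4 elements each
print(wide_passes)      # 4 passes (32 + 32 + 32 + 4 elements)
```

The point of the sketch is that the wider machine finishes in fewer passes without the routine being rewritten, which is the property the fixed-width SSE, AVX2, and AVX-512 families lack.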
In addition to forward-looking features built into the RISC-V spec, the architecture is designed to provide flexibility that its designers did not or could not think of ahead of time. Generic RISC-V designs feature reserved opcodes, which designers of specific RISC-V CPUs may then take over to provide additional, arbitrary functionality.
The ability to “take over” reserved opcodes allows for greatly streamlined ASIC design, since both specialized instructions and general controller functionality can be provided on a single die—and without CPU architects needing to reinvent any wheels to provide the generic controller functionality.
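As a rough illustration of how a decoder might tell vendor-defined instructions apart from standard ones, the following Python sketch checks the low seven bits (the major opcode) of a 32-bit instruction word. The four opcode values are the "custom" slots as we understand the base RISC-V opcode map; the function name and labels are ours, and a real core's decoder is considerably more involved.

```python
# Major opcodes (low 7 bits) set aside for vendor extensions, per our
# reading of the base RISC-V opcode map. Labels are illustrative.
CUSTOM_OPCODES = {
    0b0001011: "custom-0",
    0b0101011: "custom-1",
    0b1011011: "custom-2",
    0b1111011: "custom-3",
}


def classify(instr: int) -> str:
    """Classify a 32-bit instruction word by its major opcode.

    Hypothetical helper: returns the custom slot name if the opcode is
    in vendor-extension space, otherwise a generic label.
    """
    opcode = instr & 0x7F  # bits [6:0] hold the major opcode
    return CUSTOM_OPCODES.get(opcode, "standard or reserved")


# 0x00500093 encodes "addi x1, x0, 5", an ordinary base instruction.
print(classify(0x00500093))  # standard or reserved

# A word whose low seven bits are 0001011 lands in custom-0 space,
# where a chip designer is free to define the behavior.
print(classify(0x0000000B))  # custom-0
```

In this picture, an ASIC designer attaches specialized hardware behind one of the custom slots while the rest of the decoder, and all existing standard software, stays untouched.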
For the moment, RISC-V is not a serious competitor to either Arm or x86 in the general-purpose processor space, but it’s heavily used in the microcontroller space, due in part to its extensibility and inexpensive licensing. We do broadly expect RISC-V to become a third major player when it comes to general-purpose CPUs—the sort that provide the “main brain” for phones, tablets, and traditional computers—but that is still some years away.
What’s new in the SiFive Performance family?
The two new designs announced today are P270 and P550. P270 is SiFive’s first CPU to fully support the optional RISC-V vector extension 1.0 release candidate, and P550 is SiFive’s highest-performing RISC-V processor to date—also making it, as far as we know, the highest-performing RISC-V processor available.
P270 and “V” 1.0-rc1
As you’d expect from the “release candidate” rider, RISC-V’s “V” optional instruction set is not yet a frozen standard. When the V spec reaches 1.0—without the “release candidate” rider—it will be considered stable enough to freeze the feature set. This will allow developers to begin work on long-term projects using it for toolchains, functional simulators, and so forth, with some degree of certainty that the code the developers have written will “just work” on future CPU designs.
It’s worth noting that even once the release candidate tag is removed, the 1.0 version of the V instructions will still only be considered ready for public ratification. The first true production version of V will be 2.0, a version number awarded after public ratification is considered complete, with no major functionality changes necessary.
SiFive also offers a translation utility called Recode, which automatically converts legacy SIMD code to V-spec vector assembly.
P550 high performance
Both P270 and P550 are Linux-capable designs, but the P270 is limited to a dual-issue, in-order pipeline with only eight stages. While the P270’s full V extension support should make it a formidable processor for heavily vector-math-dependent applications, the P550 should prove far more powerful for applications closer to those currently handled by general-purpose CPUs.
SiFive’s new Performance P550 core features a 13-stage, triple-issue, out-of-order pipeline. SiFive claims that a four-core P550-based CPU takes up roughly the same on-die area as a single Arm Cortex-A75, with a significant performance advantage over that competing Arm design. SiFive says the P550 delivers 8.65 SPECint2006 per GHz, based on internal engineering test results. That is a laudable result when compared to Cortex-A75, and not too far behind an i9-10900K’s 11.08/GHz, but it’s well behind an Apple A14’s 21.1/GHz.
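For a sense of scale, the short Python sketch below turns the per-GHz figures quoted above into ratios. The numbers are the ones cited in this article (SiFive's is from its own internal testing); the dictionary keys and the helper name are ours, and per-GHz scores say nothing about the clock speeds each chip actually reaches.

```python
# SPECint2006 per GHz, as quoted in the article.
per_ghz = {
    "SiFive P550": 8.65,
    "Intel i9-10900K": 11.08,
    "Apple A14": 21.1,
}


def fraction_of(name: str, other: str) -> float:
    """Per-GHz score of `name` as a fraction of `other`, to 2 places."""
    return round(per_ghz[name] / per_ghz[other], 2)


print(fraction_of("SiFive P550", "Intel i9-10900K"))  # 0.78
print(fraction_of("SiFive P550", "Apple A14"))        # 0.41
```

By this measure the P550 does roughly three-quarters of the i9-10900K's work per clock cycle and a little over 40 percent of the A14's.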
Intel adopts P550 for use in its Horse Creek platform
First and foremost, we need to make one thing clear—we are almost certainly not talking about Intel ditching the x86_64 architecture for RISC-V! Modern x86_64 CPUs from Intel and AMD include management and supervisory cores, which are not directly accessible to end users. These are typically Arm CPU cores; for example, AMD’s first APUs used Cortex-A5 for their platform security processor.
The joint announcement from Intel and SiFive is unclear on just what Horse Creek will be. Intel generally reserves the “Creek” names for socketed platforms rather than all-in-one system on chip (SoC) boards. This suggests that, in all likelihood, the P550 will be limited to supervisory or management duties within x86_64 Horse Creek CPUs rather than directly processing instructions from software running on that platform.
AnandTech’s Ian Cutress points out that building the P550 directly into Horse Creek, which will be built on Intel’s newest 7nm process node, might provide Intel with simpler testing and more rapid development of the new 7nm process itself.
https://arstechnica.com/?p=1775200