The new Speedcore eFPGA IP with Gen4 architecture dramatically improves upon Achronix’s original and highly successful Speedcore offering. Optimized for high-performance AI/ML and hardware-acceleration applications, Speedcore IP with Gen4 architecture delivers 60% faster performance (300% faster for AI/ML applications) while drawing 50% less power and consuming 65% less die area compared to the previous Speedcore eFPGA generation.

High-Performance FPGA Technology

AI/ML applications place heavy processing demands on systems, requiring billions or trillions of operations per second. Meanwhile, cloud and enterprise data-center computing resources and communications infrastructure can no longer keep pace with exponential growth in data bandwidth requirements, rapidly changing security protocols, or emerging networking standards. Today’s multi-core CPUs and SoCs cannot meet these needs unaided. Programmable hardware accelerators are now required to increase system performance by offloading these computations from overburdened server CPUs. Achronix designed the Gen4 architecture specifically to address the needs of these applications.

Architectural Innovations

Figure: Speedcore 7t (Gen4 Architecture) versus Speedcore 16t (Original Speedcore Architecture)

Speedcore eFPGA IP with Gen4 architecture incorporates many architectural enhancements that dramatically increase performance, reduce power consumption, and shrink die area:

Reconfigurable Logic Blocks (RLBs)

  • Logic – 6-input look-up tables (LUTs) that implement all functions of up to 7 inputs, and some 8-input functions, in a single level of logic. Reducing the need for multiple logic levels improves performance.
  • 8:1 Muxes – New, dedicated 8-to-1 multiplexers dramatically increase logic performance.
  • Shift chain – Double the number of registers compared to the original Speedcore architecture plus optimized routing for shift chains.
  • ALU – A larger ALU now supports 8-bit operations for addition, counting, comparison, and maximum functions.
  • LUT-based multiplication – Efficient, LUT-based multipliers require half the on-chip resources compared to other leading FPGA products: A 6 × 6 multiply requires only 11 LUTs and runs at 1 GHz. An 8 × 8 multiply requires only 18 LUTs and runs at 500 MHz.
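
As a rough illustration of the logic and mux bullets above, the following Python sketch models a generic 6-input LUT as a 64-entry truth table and an 8-to-1 multiplexer as an indexed selection. The names make_lut, eval_lut, and mux8 are hypothetical, and the model says nothing about how the Gen4 RLB is actually built. It only shows why an 8-to-1 mux (8 data inputs plus 3 select inputs, 11 in total) does not fit in a single 6-input LUT and so benefits from dedicated hardware.

```python
def make_lut(func, n_inputs=6):
    """Build a truth table (stored as an int bitmask) for an n-input Boolean function.

    Generic LUT model for illustration only; not Achronix's implementation.
    """
    table = 0
    for index in range(1 << n_inputs):
        # Bit i of the table index is the value of input i.
        bits = [(index >> i) & 1 for i in range(n_inputs)]
        if func(*bits):
            table |= 1 << index
    return table


def eval_lut(table, inputs):
    """Look up the output bit for the given input bits (LSB first)."""
    index = 0
    for i, bit in enumerate(inputs):
        index |= (bit & 1) << i
    return (table >> index) & 1


def mux8(data, select):
    """8-to-1 multiplexer: 8 data bits, 3 select bits (LSB first)."""
    index = select[0] | (select[1] << 1) | (select[2] << 2)
    return data[index]


# A 6-input parity function fits in one 64-entry table.
parity6 = make_lut(lambda a, b, c, d, e, f: a ^ b ^ c ^ d ^ e ^ f)

# An 8:1 mux needs 8 + 3 = 11 inputs, more than a 6-input LUT provides.
MUX8_INPUTS = 8 + 3
```

In this model, eval_lut(parity6, [1, 0, 0, 0, 0, 0]) returns 1, and mux8 with select bits [1, 0, 1] picks data input 5.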

Routing

  • Dedicated buses – A first in the FPGA industry! High-performance, bus-grouped routing channels, separate from the standard eFPGA routing channels, ensure that there is no congestion between bus-oriented data traffic — common with memories — and other types of data traffic routed over the eFPGA’s standard, bit-oriented channels.
  • Bus muxes – Another first in the FPGA industry. Bus muxes let users create bus mux functions efficiently without consuming any LUTs or standard routing. This capability effectively creates a giant, distributed, run-time-configurable switching network that is separate from the eFPGA’s bit-oriented routing network.

New Machine Learning Processor (MLP) Block for AI/ML

The new MLP in Speedcore eFPGA IP with Gen4 architecture is a complete AI/ML compute engine. Each MLP includes a cyclical register file that leverages temporal locality to reuse stored/cached weights or data, boosting performance by significantly reducing data movement for a variety of calculations. Each MLP is tightly coupled with its neighboring MLPs and with larger memory blocks to maximize processing performance and to deliver the highest number of operations per second with the lowest power profile. The MLPs support fixed-point formats as well as floating-point formats (Bfloat16, 16-bit half precision, and block floating point). Users can trade off precision against performance by selecting the optimal data precision on the fly, as required by each application.
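
To make the idea of weight reuse concrete, here is a minimal Python sketch of a register file that is read in a repeating, wrap-around order. The class CyclicRegisterFile and the function dot_products are hypothetical names, and the real organization of the MLP register file is not described in this text. The sketch only assumes that weights are loaded once and then reused for every input vector, so the number of memory fetches for weights stays constant as more vectors are processed.

```python
class CyclicRegisterFile:
    """Toy model of a register file read in a repeating (cyclic) order.

    Illustrative assumption only; not the actual MLP design.
    """

    def __init__(self, weights):
        self.weights = list(weights)    # loaded once from external memory
        self.loads = len(self.weights)  # words fetched from memory so far
        self.pos = 0

    def next(self):
        value = self.weights[self.pos]
        self.pos = (self.pos + 1) % len(self.weights)  # wrap around
        return value


def dot_products(weights, input_vectors):
    """Multiply-accumulate each input vector against the same cached weights.

    Each vector must have the same length as the weight list so that the
    cyclic read pointer stays aligned from one vector to the next.
    """
    regfile = CyclicRegisterFile(weights)
    results = []
    for vec in input_vectors:
        acc = 0
        for x in vec:
            acc += x * regfile.next()
        results.append(acc)
    return results, regfile.loads


results, weight_loads = dot_products([1, 2, 3], [[1, 1, 1], [2, 0, 1]])
```

Here two input vectors are processed while the three weights are fetched only once; without the cached copy, the weights would be fetched again for each vector.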

| Feature | Benefit |
|---|---|
| Configurable multiply precision and count | Trade off performance/power vs. precision – Increasing multiplier count for lower-precision functions. |
| Cyclical register file | Double compute performance – Similar to a cache function in that data is saved for efficient reuse by the MLP. Optimized for AI/ML functions. |
| Column bonding and MLP cascade paths | Higher performance – Hard paths between memory and other MLP blocks enable high-performance functionality while freeing up general-purpose routing. |
| Multiple number formats | Flexibility – Supports mainstream fixed- and floating-point formats and frameworks. |
| Rounding and saturation | System performance – Support for multiple rounding formats and saturation that would otherwise need to be implemented in LUTs. |
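
The number formats, rounding, and saturation mentioned in this section can be illustrated with a short Python sketch. It uses the standard definitions (bfloat16 as the upper 16 bits of an IEEE 754 single-precision value, and block floating point as integer mantissas sharing one exponent); the function names, the 8-bit mantissa width, and the round-to-nearest-even choice are assumptions for illustration, not a description of the MLP hardware.

```python
import math
import struct


def float_to_bfloat16_bits(x):
    """Convert a float to bfloat16 bits using round-to-nearest-even.

    x must be representable as float32 (struct.pack raises otherwise).
    """
    if x != x:  # NaN: return a canonical quiet NaN
        return 0x7FC0
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Add a bias so that truncating the low 16 bits rounds to nearest even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bfloat16_bits_to_float(b):
    """Expand bfloat16 bits back to a Python float."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]


def saturate(value, bits):
    """Clamp an integer to the signed two's-complement range of `bits` bits."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def to_block_floating_point(values, mantissa_bits=8):
    """Quantize a block to signed integer mantissas that share one exponent."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return [0] * len(values), 0
    # Pick the exponent so the largest magnitude fills the mantissa range.
    shared_exp = math.frexp(max_abs)[1] - (mantissa_bits - 1)
    scale = 2.0 ** shared_exp
    # Python's round() rounds halves to even; saturate handles overflow.
    mantissas = [saturate(round(v / scale), mantissa_bits) for v in values]
    return mantissas, shared_exp


def from_block_floating_point(mantissas, shared_exp):
    """Reconstruct approximate values from mantissas and the shared exponent."""
    return [m * 2.0 ** shared_exp for m in mantissas]


mantissas, shared_exp = to_block_floating_point([1.0, 0.5, -0.25])
```

With the default 8-bit mantissa, the block [1.0, 0.5, -0.25] is stored as mantissas [64, 32, -16] with a shared exponent of -6. Narrower mantissas reduce storage and multiplier width at the cost of precision, which is the kind of trade-off the table describes.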

Production-Proven Design Process

Speedcore eFPGA IP with Gen4 architecture follows the same production-proven design process as Achronix’s first-generation Speedcore IP. Designers specify their custom mix of logic, memory, DSP, and MLP blocks to create a unique Speedcore instance that meets die-size, power-consumption, and resource-configuration requirements for their target application(s). Additionally, designers can define the IP block aspect ratio and I/O port connections for a Speedcore eFPGA with Gen4 architecture to meet their specific SoC’s design requirements while balancing power against performance for specific applications. Achronix then generates and delivers a GDSII file and all the supporting files required to complete integration and timing closure of the customized Speedcore eFPGA instance.

The Speedcore IP block instance can then be integrated directly into an ASIC or SoC using standard EDA design tools and methodology. Along with Speedcore eFPGA IP and supporting files, Achronix also provides a customized, full-featured version of the ACE design tools that designers use to design, verify, and program the functionality of their custom Speedcore eFPGA IP with Gen4 architecture.

Availability

Speedcore eFPGA IP with Gen4 architecture is available immediately, supported by the latest version of Achronix’s ACE design tool. This tool includes preconfigured example instances of Speedcore eFPGAs with Gen4 architecture. Users can evaluate performance, resource usage, and compile times for Speedcore eFPGA IP with Gen4 architecture using these example instances, even before developing their own designs.

To receive complete details of Speedcore eFPGA IP with Gen4 architecture, including die size and power consumption, contact Achronix.
