Speedcore eFPGAs Offer Unbeatable Bandwidth and Latency Performance
Shipping to end customers since the middle of 2016, Speedcore embedded FPGA (eFPGA) IP has brought the power and flexibility of programmable logic to ASICs and SoCs. Customers can integrate a Speedcore eFPGA into an SoC for high-performance, compute-intensive and real-time processing applications such as AI, machine learning, 5G wireless, networking and automotive.
Proven on the TSMC 16FF+ process node, Speedcore eFPGA IP enables customers to create a customized programmable fabric. Users specify their logic, memory and DSP resource needs, then Achronix configures the Speedcore IP to meet their individual requirements. Speedcore look-up-tables (LUTs), RAM blocks and DSP64 blocks can be assembled like building blocks to create the optimal programmable fabric for any given application. Achronix delivers the customized Speedcore eFPGA instance as a hard macro in GDSII format along with a customized version of ACE design tools.
- 10× higher bandwidth
- 100× lower latency
- 10× lower cost
- 50% lower power
Instead of taking a one-size-fits-all approach to building an embeddable FPGA fabric, the Speedcore solution uses a compiled architecture. Rather than picking from a library of pre-built fabrics, a system architect defines a mix of LUTs, LRAMs, BRAMs and DSP blocks for a cluster, then specifies overall resource goals for the fabric along with an aspect ratio (expressed in clusters). The resulting fabric is an X by Y array of these custom clusters.
During this specification phase, Achronix returns a detailed proposal on the custom fabric, plus a software model for evaluation using ACE design tools.
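The cluster-based specification described above can be illustrated with a short Python sketch. It is not the Achronix configuration flow: the `Resources` type, the `smallest_fabric` function and the per-cluster counts are hypothetical, chosen only to show how a cluster mix, a set of resource goals and an aspect ratio in clusters lead to an X by Y array.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class Resources:
    """Resource counts for one cluster, or for a whole fabric (illustrative)."""
    luts: int
    lrams: int
    brams: int
    dsps: int


def fabric_totals(cluster: Resources, x: int, y: int) -> Resources:
    """Total resources of an X by Y array of identical clusters."""
    count = x * y
    return Resources(*(r * count for r in astuple(cluster)))


def smallest_fabric(cluster: Resources, goals: Resources, aspect=(1, 1)):
    """Scale the aspect ratio (in clusters) until every resource goal is met."""
    # A goal can never be met if the cluster contains none of that resource.
    for have, want in zip(astuple(cluster), astuple(goals)):
        if want > 0 and have == 0:
            raise ValueError("cluster lacks a resource that the goals require")
    ax, ay = aspect
    k = 1
    while True:
        x, y = k * ax, k * ay
        totals = fabric_totals(cluster, x, y)
        if all(t >= g for t, g in zip(astuple(totals), astuple(goals))):
            return x, y, totals
        k += 1


# Hypothetical cluster mix and fabric goals, with a 2:1 aspect ratio.
cluster = Resources(luts=1000, lrams=8, brams=4, dsps=4)
goals = Resources(luts=50_000, lrams=300, brams=200, dsps=100)
x, y, totals = smallest_fabric(cluster, goals, aspect=(2, 1))
print(f"{x} x {y} clusters -> {totals}")
```

With these assumed numbers the search settles on a 10 by 5 array. The most demanding resource goal sets the array size, so the other resources end up over-provisioned.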
- Logic – look-up-tables (LUTs) plus integrated wide MUX functions and fast adders
- Logic RAM – 4 kb per memory block
- Block RAM – 20 kb per memory block
- DSP64 – each block has an 18 × 27 multiplier, a 64-bit accumulator and a 27-bit pre-adder
- Custom blocks – customer/application-specific functions
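To make the DSP64 widths listed above concrete, here is a minimal behavioral sketch of a multiply-accumulate step in Python. Only the widths (27-bit pre-adder, 18 × 27 multiplier, 64-bit accumulator) come from the list. The datapath wiring, the signed two's-complement arithmetic, the wrap-on-overflow behavior and the names `to_signed` and `Dsp64Model` are all assumptions made for illustration, not a description of the actual block.

```python
def to_signed(value: int, bits: int) -> int:
    """Wrap an integer into the two's-complement range of `bits` bits."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


class Dsp64Model:
    """Assumed datapath: acc += (a + d) * b, using the widths from the list."""

    def __init__(self) -> None:
        self.acc = 0  # 64-bit accumulator

    def mac(self, a: int, d: int, b: int) -> int:
        # 27-bit pre-adder (assumed to wrap on overflow)
        pre = to_signed(to_signed(a, 27) + to_signed(d, 27), 27)
        # 18 x 27 multiplier: 18-bit operand times the 27-bit pre-adder result
        product = pre * to_signed(b, 18)
        # 64-bit accumulator (assumed to wrap on overflow)
        self.acc = to_signed(self.acc + product, 64)
        return self.acc


dsp = Dsp64Model()
dsp.mac(3, 4, 5)    # (3 + 4) * 5 added to the accumulator
dsp.mac(-1, 0, 2)   # (-1 + 0) * 2 added to the accumulator
print(dsp.acc)
```

A model like this is useful only for reasoning about operand widths and overflow. Cycle-accurate behavior would have to come from the vendor's simulation models.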
Speedcore logic density – from 5K to 2M LUTs
Speedcore performance – max 750 MHz
Speedcore power – 12 mW static power per 1,000 LUTs at 105°C (on the TSMC 16FF+ process)
Speedcore die size – 0.20 mm² per 1,000 LUTs (on the TSMC 16FF+ process)
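The per-1,000-LUT figures above lend themselves to a rough first-order estimate. The Python sketch below assumes that static power and die size scale linearly with LUT count across the quoted 5K to 2M range. That linear scaling is an assumption for illustration, as is the function name `estimate_fabric`, and the result says nothing about how added RAM, DSP or custom blocks change the numbers.

```python
STATIC_MW_PER_KLUT = 12.0   # mW per 1,000 LUTs at 105 degC, TSMC 16FF+
AREA_MM2_PER_KLUT = 0.20    # mm^2 per 1,000 LUTs, TSMC 16FF+
MIN_LUTS, MAX_LUTS = 5_000, 2_000_000  # quoted logic density range


def estimate_fabric(luts: int) -> dict:
    """First-order static power and area estimate (linear scaling assumed)."""
    if not MIN_LUTS <= luts <= MAX_LUTS:
        raise ValueError(f"LUT count must be between {MIN_LUTS} and {MAX_LUTS}")
    kluts = luts / 1000
    return {
        "static_power_mw": kluts * STATIC_MW_PER_KLUT,
        "area_mm2": kluts * AREA_MM2_PER_KLUT,
    }


est = estimate_fabric(100_000)
print(f"{est['static_power_mw']:.0f} mW static, {est['area_mm2']:.1f} mm^2")
```

For a 100K-LUT fabric this gives roughly 1.2 W of static power at 105°C and about 20 mm² of die area, under the stated linear-scaling assumption.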
Speedcore Custom Blocks
Speedcore custom blocks greatly increase the capabilities of Speedcore eFPGAs by allowing designers to define custom functions that can be added as additional blocks in the eFPGA fabric, alongside the traditional building blocks of LUTs, RAMs, and DSPs.
Speedcore look-up-tables (LUTs), RAM blocks, DSP64 blocks and custom blocks can then be assembled in flexible columns to create the optimal programmable function for any given application. Candidates for custom blocks range from custom memory configurations and TCAMs to highly specialized blocks such as CNN-optimized DSP blocks targeting object recognition applications. Speedcore custom blocks are defined collaboratively with Achronix through a detailed architecture analysis of acceleration workloads in the customer’s target application.
Speedcore custom blocks massively improve performance, power, and area, enabling functionality that has never before been possible in standalone FPGAs. With Speedcore custom blocks, customers gain ASIC efficiency while retaining FPGA flexibility, resulting in a highly efficient implementation that minimizes power and area while maximizing data throughput with ASIC-level performance.
Design Tool Support
Achronix ACE design tools fully support Speedcore eFPGAs from design capture to bitstream generation and system debug. Customers can use the floorplanner tool for design optimization and to make regional or site assignments for all block instances before moving on to timing-driven place and route. ACE also includes a critical path analysis tool for checking that a design meets its performance specifications. Customers also have access to ACE’s Snapshot embedded logic analyzer to create complex triggers and show run-time signals within a Speedcore instance.