- Speedcore Gen4 embedded FPGAs deliver 60% faster performance, 50% lower power, 65% smaller die size compared to previous Speedcore eFPGA generation
- New Machine Learning Processor blocks deliver 300% higher performance for AI/ML applications
Santa Clara, Calif., December 4, 2018 – Achronix Semiconductor Corporation, a leader in FPGA-based hardware accelerator devices and high-performance eFPGA IP, today announced immediate availability of its Speedcore™ Gen4 embedded FPGA (eFPGA) IP for integration into users’ SoCs. Speedcore Gen4 increases performance 60%, reduces power by 50% and die area by 65% while retaining the original Speedcore eFPGA IP’s abilities to bring programmable hardware-acceleration capabilities to a broad range of compute, networking and storage systems for interface protocol bridging/switching, algorithmic acceleration and packet processing applications.
With the Speedcore Gen4 architecture, Achronix adds the new Machine Learning Processor (MLP) to the library of available blocks and delivers 300% higher system performance for artificial intelligence and machine learning (AI/ML) applications. MLP blocks are highly flexible compute engines, tightly coupled with embedded memories, that give the highest performance per watt and the lowest-cost solution for AI/ML applications.
“Achronix Speedcore eFPGA with Gen4 architecture provides an optimal balance of hardware acceleration previously found only in ASIC implementations,” said Robert Blake, president and CEO of Achronix Semiconductor. “Our new architecture adds the flexibility and reprogrammability of our proven FPGA technology to support exploding demand for new AI/ML and high data bandwidth applications.”
The dramatic increase in fixed and wireless network bandwidth, coupled with the redistribution of processing and the emergence of billions of IoT devices, will stress traditional network and compute infrastructure. Classic Cloud and Enterprise Data Center computing resources and communications infrastructure can no longer keep pace with exponential growth in data rates, rapidly changing security protocols, or the many new networking and connectivity requirements. Traditional multicore CPUs and SoCs cannot meet these requirements unaided. They need hardware accelerators, often reprogrammable, to pre-process and offload computations and so increase the systems’ overall compute performance.
Speedcore Gen4 is the Optimal AI/ML Accelerator
In addition to the general requirements of compute and networking infrastructure, AI/ML demands a significant increase in high-density, targeted computing. The new Achronix MLP exploits the specific attributes of AI/ML processing and increases performance for these applications by 300% compared to previous Achronix FPGA products. This is done through multiple architectural innovations that increase operating performance and the number of operations per clock cycle.
The MLP is a complete AI/ML compute engine. Each MLP includes a local cyclical register file that leverages temporal locality for optimal reuse of stored weights or data. The MLPs are tightly coupled with neighboring MLP blocks and larger embedded memory blocks to deliver the highest processing performance, the highest operations per second and the lowest power profile. The MLPs support multiple fixed-point and floating-point formats and precisions, including Bfloat16, 16-bit half-precision floating point, 24-bit floating point and block floating point (BFP). Users can select the precision that best balances performance, power and area for their application.
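As a rough illustration of two of the formats named above, the following Python sketch shows how a value can be reduced to Bfloat16 and how a block of values can share one exponent in block floating point. It is a software model only and says nothing about how the MLP implements these formats. The function names, the truncating conversion and the 8-bit mantissa width are assumptions chosen for the example.

```python
import math
import struct


def to_bfloat16_bits(x: float) -> int:
    """Return a 16-bit bfloat16 pattern for x by truncating a float32.

    bfloat16 keeps the sign bit, the 8 exponent bits and the top 7 mantissa
    bits of an IEEE 754 single-precision value. Truncation is an assumption
    made for simplicity; hardware commonly rounds to nearest even instead.
    """
    (bits32,) = struct.unpack(">I", struct.pack(">f", x))
    return bits32 >> 16


def from_bfloat16_bits(bits16: int) -> float:
    """Expand a 16-bit bfloat16 pattern back to a Python float."""
    (value,) = struct.unpack(">f", struct.pack(">I", (bits16 & 0xFFFF) << 16))
    return value


def bfp_quantize(values, mantissa_bits=8):
    """Quantize a block of floats to one shared exponent plus integer mantissas.

    Each value is approximated by mantissa * 2**shared_exponent.
    mantissa_bits (hypothetical default of 8) includes the sign bit.
    """
    largest = max(abs(v) for v in values)
    if largest == 0.0:
        return 0, [0] * len(values)
    # math.frexp returns (m, e) with largest == m * 2**e and 0.5 <= m < 1.
    _, e = math.frexp(largest)
    shared_exponent = e - (mantissa_bits - 1)
    limit = (1 << (mantissa_bits - 1)) - 1
    mantissas = []
    for v in values:
        q = round(v / 2.0 ** shared_exponent)
        # Clamp so that rounding up at the top of the range cannot overflow.
        mantissas.append(max(-limit, min(limit, q)))
    return shared_exponent, mantissas


def bfp_dequantize(shared_exponent, mantissas):
    """Reconstruct approximate floats from a block floating point block."""
    return [m * 2.0 ** shared_exponent for m in mantissas]


if __name__ == "__main__":
    print(hex(to_bfloat16_bits(1.0)))
    exponent, block = bfp_quantize([1.0, 0.5, -0.25, 0.125])
    print(exponent, block, bfp_dequantize(exponent, block))
```

The trade-off the sketch makes visible is that block floating point stores one exponent per block rather than per value, so the multiply-accumulate work inside a block reduces to integer arithmetic, at the cost of precision for values much smaller than the largest one in the block.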
To complement the MLP and increase AI/ML compute density, Speedcore Gen4 look-up tables (LUTs) can implement multipliers that are 2x more efficient than those in any other standalone or embedded FPGA product in the industry. Leading FPGAs implement 6×6 multipliers in 21 LUTs, whereas Speedcore Gen4 implements 6×6 multipliers in 11 LUTs and can operate at 1 GHz.
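The LUT counts quoted above can be turned into a quick density estimate. The sketch below uses only the 11-LUT and 21-LUT figures and the 1 GHz clock from this release. The LUT budget of 10,000 and the assumption that every multiplier produces one result per clock cycle are hypothetical and are not Achronix figures.

```python
GEN4_LUTS_PER_6X6_MULT = 11      # from this release
LEADING_LUTS_PER_6X6_MULT = 21   # from this release
CLOCK_HZ = 1_000_000_000         # 1 GHz, from this release


def multipliers_that_fit(lut_budget: int, luts_per_multiplier: int) -> int:
    """Number of whole 6x6 multipliers that fit in a given LUT budget."""
    return lut_budget // luts_per_multiplier


def peak_multiplies_per_second(multiplier_count: int, clock_hz: int = CLOCK_HZ) -> int:
    """Upper bound, assuming one result per multiplier per clock (an assumption)."""
    return multiplier_count * clock_hz


if __name__ == "__main__":
    budget = 10_000  # hypothetical LUT budget for illustration only
    gen4 = multipliers_that_fit(budget, GEN4_LUTS_PER_6X6_MULT)
    other = multipliers_that_fit(budget, LEADING_LUTS_PER_6X6_MULT)
    ratio = LEADING_LUTS_PER_6X6_MULT / GEN4_LUTS_PER_6X6_MULT
    print(f"Gen4: {gen4} multipliers, other: {other} multipliers")
    print(f"LUT efficiency ratio: {ratio:.2f}x")
    print(f"Gen4 peak: {peak_multiplies_per_second(gen4):.3e} multiplies/s")
```

With these inputs the ratio of LUTs per multiplier is 21/11, or about 1.9, which is the basis for the 2x efficiency figure.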
“The AI/ML wave continues to gain momentum with more IP geared toward AI applications. Achronix’s announcement about its new IP architecture is encouraging and offers substantial improvements in performance, power and area: all issues that are important to silicon designers,” said Rich Wawrzyniak, senior analyst, ASIC Services. “The introduction of eFPGA IP aimed at AI/ML applications that can be cost effective in both Cloud servers for training and in end-point devices for inference applications reinforces the view that AI functionality will become a ‘check-list’ item in most silicon solutions going forward.”
Architectural Innovations Increase System Performance
The new Speedcore Gen4 architecture has many architectural innovations that increase overall operating performance by 60% compared to the previous generation of Speedcore products. All aspects of the LUTs have been enhanced to increase area efficiency and reduce resource usage, which in turn reduces power and die size and increases performance. Changes include doubling the size of the ALUs, doubling the registers per LUT, support for 7-bit functions and some 8-bit functions in a single level-of-logic delay, and dedicated high-speed connections for shift registers.
The routing architecture has also been enhanced with an independent, dedicated bus routing structure that includes dynamically selectable bus muxing, which effectively creates a distributed, run-time-configurable switching network. This is the first time that run-time logic functionality is available in the routing structure, and it provides an optimal solution for high-bandwidth and low-latency applications.
How to Evaluate Speedcore Gen4
Achronix’s ACE design tools include pre-configured Speedcore Gen4 eFPGA example instances that users can use to evaluate Speedcore Gen4 quality of results for performance, resource usage, and compile times. The ACE design tools with support for Speedcore Gen4 are available today.
Speedcore is a modular architecture easily sized to users’ requirements. Achronix uses its Speedcore Builder tool to instantly create new Speedcore instances to match user requirements for quick evaluation. Users interested in receiving die size and power information can contact Achronix for details on their specific Speedcore Gen4 eFPGA size and process requirements.
Availability
Achronix is using its proven methodology to deliver Speedcore Gen4 eFPGA technology to users who want to combine the benefits and flexibility of eFPGA IP with enhanced AI/ML capabilities. Speedcore Gen4 IP is available for licensing on the most advanced FinFET processes today. Contact Achronix for details about supported process technologies.
