Compute Acceleration

Today's workloads in compute acceleration are as diverse as the end applications, spanning everything from financial trading and genomics to machine learning inference and training. However, the workloads share some common characteristics, including the types of arithmetic functions, number formats (integer and floating point), and aggressive performance targets. Furthermore, as processing naturally migrates closer to the edge, power, thermal behavior and performance per watt become key metrics. It is in these areas that FPGAs in general, and the Speedster7t family in particular, excel.

The Speedster7t FPGA family is optimized for high-bandwidth workloads and eliminates the performance bottlenecks associated with traditional FPGAs. Built on the TSMC 7nm FinFET process, Speedster7t FPGAs feature a revolutionary new 2D network-on-chip (2D NoC), an array of new machine learning processors (MLPs) optimized for high-bandwidth and artificial intelligence/machine learning (AI/ML) workloads, high-bandwidth GDDR6 interfaces, 400G Ethernet and PCI Express Gen5 ports. The 2D NoC connects all of the interfaces to over 80 access points in the FPGA fabric to deliver ASIC-level performance while retaining the full programmability of FPGAs. Get started today with the VectorPath accelerator card, featuring the Speedster7t FPGA.

Speedster7t Solution

Speedster7t FPGAs provide a high-performance, power-efficient computational acceleration solution for defense, financial, medical, scientific, oil and gas, and life science applications:
- Machine learning (ML) inference and edge training
- Financial analysis and high-frequency trading
- Genomic analysis
- Video and image processing

The inherent parallelism and flexibility of the FPGA architecture are well suited to these high-throughput applications.
High-speed interfacing is provided by PCIe Gen5 connectivity and high-performance Ethernet, along with a dedicated 2D network-on-chip (2D NoC) for high-bandwidth data movement.
Large data sets can be stored using DDR4/5 for bulk storage and GDDR6 interfaces for high-bandwidth access to external memory.
Data processing supports a wide variety of number formats, from low-bit-width integer math to high-performance floating-point operations, including native support for matrix multiplication and complex arithmetic (for example, to support beamforming applications).
Speedster7t FPGAs are particularly well suited to ML inference and edge analytics operations.

Application Requirements	Speedster Value
Need for high bandwidth external connectivity	Multiple ports of 400G Ethernet and PCIe Gen5
Highest memory bandwidth for buffering, >1 Tbps	Up to 16 independent GDDR6 channels at 16 Gbps offering up to 4 Tbps of total bandwidth
Wide and high-performance datapath	Dataflow optimized for compute acceleration matrix-vector mathematics; up to 20 Tbps of NoC bandwidth for high-speed, wide-data transfers; optimized bus routing quantized at one byte; fully flexible bit-wise routing; dedicated routing paths to support data reuse between multiply-accumulator and memory; cascade path to enable, for example, systolic array implementation; integrated register file to enable time-multiplexing of calculations
Significant computational requirement for integer arithmetic	MLPs deliver up to 61 TOps for int8; modified Booth algorithm allows double density of integer multiplies in LUTs
Neural network inferencing requires a large number of matrix multiplications, high-performance computation and significant amounts of data movement	Optimized multiply-accumulate core for integer and floating-point arithmetic; truly fracturable integer width: 4x int16 to 16x int8 to 32x int4; FP16, bfloat16 and custom floating-point support; native support for block floating point
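
Block floating point, listed in the table above, generally means that a block of values shares a single exponent while each value keeps its own integer mantissa, so the multiply-accumulate work can run on integer hardware. The Python sketch below illustrates that general idea only. The function names `bfp_quantize` and `bfp_dot`, the 8-bit mantissa width and the rounding scheme are illustrative assumptions; they do not describe the actual MLP number format or any Achronix API.

```python
import math


def bfp_quantize(block, mantissa_bits=8):
    """Quantize floats to signed integer mantissas that share one exponent.

    Hypothetical helper for illustration; not a vendor-defined format.
    Returns (mantissas, exponent) such that value ~= mantissa * 2**exponent.
    """
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return [0] * len(block), 0
    # Largest signed mantissa magnitude, e.g. 127 for 8 bits.
    max_mant = (1 << (mantissa_bits - 1)) - 1
    # Smallest power-of-two scale that keeps the largest value in range.
    exponent = math.ceil(math.log2(max_abs / max_mant))
    scale = 2.0 ** exponent
    # Round to nearest and clamp to guard against floating-point edge cases.
    mantissas = [max(-max_mant, min(max_mant, round(x / scale))) for x in block]
    return mantissas, exponent


def bfp_dot(a_mant, a_exp, b_mant, b_exp):
    """Dot product of two blocks: integer multiply-accumulate, one final scale."""
    acc = sum(x * y for x, y in zip(a_mant, b_mant))  # pure integer arithmetic
    return acc * 2.0 ** (a_exp + b_exp)
```

In this sketch the exponents are applied only once per block pair, after the integer accumulation, which is the property that makes the format attractive for dense matrix multiplication.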

	Machine Learning / Deep Learning	High-Performance Compute	Genomics	Video & Image Processing
Highest Performance SerDes
112G multi-standard SR/MR/LR PHY	Yes	Yes	Yes	Yes
Most Advanced Interface IP
PCIe Gen5	Yes	Yes	Yes	Yes
GDDR6 - 4 Tbps of memory bandwidth	Yes	Yes	Yes	Yes
DDR4 - up to 3,200 MHz, 3DS stacked memory	Yes	Yes	Yes	Yes
DDR5 - up to 4,400 MHz	Yes	Yes	Yes
Application specific interface			Yes	Yes
Terabit Speed Routing
NoC	Yes	Yes	Yes	Yes
Bus routing	Yes			Yes
Fully flexible bit-wise routing	Yes
High-Throughput Processing
Datapath crypto	Yes			Yes
MLP	Yes	Yes	Yes	Yes
Fine grain hardware reprogrammability (examples listed)	Format conversion, activation function	Monte Carlo analysis	PairHMM algorithm	Custom codecs
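
As a sanity check on the GDDR6 figure quoted in the tables above (up to 16 channels at 16 Gbps for up to 4 Tbps in total), the aggregate bandwidth can be reproduced with simple arithmetic. The sketch below assumes that 16 Gbps is the per-pin data rate and that each GDDR6 channel is 16 data bits wide; neither detail is stated on this page, and the function name is hypothetical.

```python
def aggregate_bandwidth_tbps(channels, pins_per_channel, gbps_per_pin):
    """Peak aggregate bandwidth in Tbps (decimal: 1 Tbps = 1000 Gbps)."""
    return channels * pins_per_channel * gbps_per_pin / 1000.0


GDDR6_CHANNELS = 16    # from the table: up to 16 independent channels
GBPS_PER_PIN = 16      # from the table: 16 Gbps (assumed to be per data pin)
PINS_PER_CHANNEL = 16  # assumption: 16 data bits per GDDR6 channel

total_tbps = aggregate_bandwidth_tbps(GDDR6_CHANNELS, PINS_PER_CHANNEL, GBPS_PER_PIN)
print(f"Peak GDDR6 bandwidth: {total_tbps:.3f} Tbps")
```

Under these assumptions the result is consistent with the quoted "up to 4 Tbps". It is a theoretical peak; sustained bandwidth depends on access patterns and controller efficiency.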
