Today’s compute acceleration workloads are as diverse as their end applications, spanning everything from financial trading and genomics to machine learning inference and training. These workloads nonetheless share common characteristics, including the types of arithmetic functions, number formats (integer and floating point), and aggressive performance targets. Furthermore, as processing migrates closer to the edge, power consumption, thermal behavior, and performance per watt become key metrics. It is in these areas that FPGAs in general, and the Speedster7t family in particular, excel.
- Speedster7t FPGAs provide a high-performance, power-efficient computational acceleration solution for defense, financial, medical, scientific, oil and gas, and life science applications:
  - Machine learning (ML) inference and edge training
  - Financial analysis and high-frequency trading
  - Genomic analysis
  - Video and image processing
- The inherent parallelism and flexibility of the FPGA architecture are well suited to these high-throughput applications.
- High-speed interfacing is simplified with PCIe Gen5 connectivity and high-performance Ethernet up to 400G, as well as a dedicated 2D network-on-chip (NoC) for high-bandwidth data movement.
- Storage of large data sets is possible with DDR4/5 bulk storage and GDDR6 interfaces for high-bandwidth access to external memory.
- Data processing supports a wide variety of number formats, from low bit-width integer math to high-performance floating-point operations, including native support for matrix multiplications and complex arithmetic (for example, to support beamforming applications).
- Speedster7t FPGAs are particularly well suited to ML inference and edge analytics operations.
|Application Requirements||Speedster Value|
|Need for high-bandwidth external connectivity||Multiple ports of 400G Ethernet and PCIe Gen5|
|Highest memory bandwidth for buffering, >1 Tbps||Up to 16 independent GDDR6 channels at 16 Gbps offering up to 4 Tbps of total bandwidth|
|Wide and high-performance datapath||Dataflow optimized for compute acceleration matrix vector mathematics|
|Significant computational requirement for integer arithmetic||
|Neural network inference requires a large number of matrix multiplications, high-performance computation and significant amounts of data movement||Optimized multiply-accumulate core for integer and floating-point arithmetic|
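
The 4 Tbps figure in the table above can be sanity-checked with simple arithmetic. The sketch below is a rough illustration only: the channel count and the 16 Gbps rate come from the table, while the 16-bit channel width is an assumption based on the usual GDDR6 channel organization and is not stated on this page. The function and constant names are hypothetical.

```python
# Back-of-the-envelope check of the aggregate GDDR6 bandwidth figure.
# CHANNELS and PIN_RATE_GBPS are taken from the table above.
# DATA_PINS_PER_CHANNEL is an ASSUMPTION (typical 16-bit GDDR6 channel),
# not a value given in this document.
CHANNELS = 16
PIN_RATE_GBPS = 16
DATA_PINS_PER_CHANNEL = 16


def aggregate_bandwidth_gbps(channels: int, pin_rate_gbps: int, pins_per_channel: int) -> int:
    """Return the peak aggregate bandwidth in gigabits per second."""
    return channels * pin_rate_gbps * pins_per_channel


total_gbps = aggregate_bandwidth_gbps(CHANNELS, PIN_RATE_GBPS, DATA_PINS_PER_CHANNEL)
total_tbps = total_gbps / 1000      # decimal terabits per second
total_gbytes_per_s = total_gbps / 8  # gigabytes per second

print(f"{total_gbps} Gbps ~= {total_tbps:.1f} Tbps ~= {total_gbytes_per_s:.0f} GB/s")
```

Under that assumption the peak works out to roughly 4 Tbps, which matches the table. This is a theoretical peak; sustained bandwidth in a real design depends on access patterns and controller efficiency.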
|High Performance Compute||Genomics||Video & Image Processing|
|Highest Performance SerDes|
|112G multi-standard SR/MR/LR PHY||Yes||Yes||Yes||Yes|
|Most Advanced Interface IP|
|GDDR6 – 4 Tbps of memory bandwidth||Yes||Yes||Yes||Yes|
|DDR4 – up to 3,200 MHz, 3DS stacked memory||Yes||Yes||Yes||Yes|
|DDR5 – up to 4,400 MHz||Yes||Yes||Yes|
|Application specific interface||Yes||Yes|
|Terabit Speed Routing|
|Fully flexible bit-wise routing||Yes|
|Fine-grained hardware reprogrammability (examples listed)||Format conversion, activation function||Monte Carlo analysis||PairHMM algorithm||Custom codecs|