





## **VP815**

**VectorPath**™ Accelerator Card

Deliver next-gen AI with scalable FPGA acceleration

### High-Performance AI/ML **Compute**

VectorPath 815 leverages the Speedster®7t FPGA architecture, optimized for AI/ML and high-performance computing workloads. Its Machine Learning Processor (MLP) blocks, 2D Network-on-Chip (NoC), GDDR6 memory, and 112G SerDes deliver exceptional AI inferencing performance.

### **Key benefits:**

- **Accelerated AI Performance:** Tensor-optimized MLP blocks outperform traditional solutions.
- **High Throughput, Low Latency:** NoC architecture provides up to 20 Tbps bandwidth for real-time data delivery.
- **Scalable Architecture:** Predictable latency and scalability for edge and data center deployments.

# Key Features

Packed with performance, the VP815 gives you fast GDDR6 memory, 400G networking interfaces (two QSFP-DDs), and PCIe Gen5 x16. Of course, it's all driven by the Speedster7t FPGA with advanced 2D NoC, machine learning processors (MLPs), and programmable logic.



## 400GbE Networking

# Speedster7t FPGA

Revolutionary Chip Design by Achronix

The Speedster7t FPGA is highly optimized for AI/ML and high-bandwidth data acceleration and is at the heart of every VectorPath accelerator card.

## 2D NoC

**Two-Dimensional Network-on-Chip** 

### **Data Highway Unclogs FPGA Fabric**

The 2D Network-on-Chip (NoC) seamlessly interconnects high-speed interfaces such as GDDR6, DDR4/DDR5, Ethernet, and PCIe with the core of the FPGA fabric without consuming any fabric resources. This eliminates the need for complex routing through the programmable logic, decouples data movement from fabric resources, and enables true partial reconfiguration. With over 20 Tbps of aggregate bandwidth, the 2D NoC delivers not only speed, but also unmatched design flexibility and modularity.





Figure: memory bandwidth comparison. GDDR6 (16 channels): 500 GB/s; DDR4 (4 banks): 100 GB/s; a 5x difference.

## **GDDR6 Memory**

### **5x Faster Large Memory**

Using high-bandwidth GDDR6 memory, the VP815 gives your application a large memory resource of 32 Gigabytes, at more than 5 times the bandwidth of DDR4.

Plus, with the 2D NoC, the GDDR6 is available for read/write from the host over PCIe without using FPGA fabric resources.
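The bandwidth figures above can be cross-checked with a little arithmetic. The sketch below uses the channel count and per-channel rate quoted elsewhere in this datasheet; the DDR4 side assumes 64 data bits per transfer at 3200 MT/s (the remaining 8 bits of a x72 interface being ECC), which is an assumption and not a figure from the datasheet. The function names are illustrative only.

```python
# Figures taken from this datasheet.
GDDR6_CHANNELS = 16            # 8 interfaces x 2 independent channels
GDDR6_GBPS_PER_CHANNEL = 256   # per-channel rate quoted in the Networking section

# Assumption (not stated in the datasheet): a DDR4-3200 bank moves 64 data
# bits per transfer at 3200 MT/s; the extra 8 bits of a x72 bank carry ECC.
DDR4_DATA_BITS = 64
DDR4_MEGATRANSFERS_PER_S = 3200


def gddr6_gbytes_per_s() -> float:
    """Peak aggregate GDDR6 bandwidth in GB/s."""
    return GDDR6_CHANNELS * GDDR6_GBPS_PER_CHANNEL / 8


def ddr4_gbytes_per_s(banks: int) -> float:
    """Peak aggregate DDR4-3200 bandwidth in GB/s for `banks` banks."""
    per_bank_gbps = DDR4_DATA_BITS * DDR4_MEGATRANSFERS_PER_S / 1000
    return banks * per_bank_gbps / 8


if __name__ == "__main__":
    gddr6 = gddr6_gbytes_per_s()
    ddr4 = ddr4_gbytes_per_s(banks=4)   # the comparison point in the figure
    print(f"GDDR6: {gddr6:.1f} GB/s, 4x DDR4: {ddr4:.1f} GB/s, ratio {gddr6 / ddr4:.2f}x")
```

Under these assumptions the GDDR6 total comes to 512 GB/s (about 4 Tbps) and four DDR4-3200 banks to roughly 102 GB/s, in line with the rounded 500 GB/s and 100 GB/s figures. These are peak rates and ignore protocol overhead.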

The VP815 card offers a range of network interfaces connected to the Speedster7t FPGA fabric. The card supports 112G PAM4, with hard IP MAC and FEC support. On-board jitter cleaners are available for Synchronous Ethernet (SyncE).

#### **QSFP-DD Interfaces**

Two QSFP-DD interfaces with up to 400 Gbps per port. A range of other options, including 16x 100G (total), is available using breakout cables.



# **VP815** FPGA Card

Enterprise-Class Design by BittWare



The VP815 FPGA card delivers a wide range of advanced I/O, including 400G networking and multiple PCIe interfaces, along with high-bandwidth GDDR6 memory.

Customers can get started quickly with the Achronix SDK, including an example project for Linux.

# **FPGA Fabric**

Up to 61 TOPS (INT8) Performance

### **RLB**

Reconfigurable Logic Block

The Speedster7t features RLBs: a new reconfigurable logic architecture with 6-input LUTs, 8-bit ALUs, and 2 flip-flops per LUT, plus a reformulated multiplier LUT (MLUT) mode based on a modified Booth algorithm, which doubles the performance of LUT-based multiplication.

The Speedster 7t1500 FPGA has 692K LUTs.
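For readers unfamiliar with the modified Booth algorithm mentioned above, here is a minimal sketch of textbook radix-4 Booth multiplication. It is not Achronix's MLUT implementation, and the function names are illustrative. It shows only the general idea: recoding the multiplier into digits from {-2, -1, 0, 1, 2} halves the number of partial products that must be summed.

```python
def booth_radix4_digits(multiplier: int, bits: int) -> list[int]:
    """Recode a signed two's-complement multiplier of width `bits` into
    radix-4 Booth digits in {-2, -1, 0, 1, 2}, least significant first."""
    if bits % 2:
        bits += 1                        # sign-extend to an even width
    y = multiplier & ((1 << bits) - 1)   # two's-complement bit pattern
    digits = []
    prev = 0                             # implicit bit to the right of bit 0
    for i in range(0, bits, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def booth_multiply(a: int, b: int, bits: int = 8) -> int:
    """Multiply a by b, where b fits in `bits` signed bits, by summing one
    shifted partial product per Booth digit (bits/2 of them, not bits)."""
    digits = booth_radix4_digits(b, bits)
    return sum((d * a) << (2 * i) for i, d in enumerate(digits))


if __name__ == "__main__":
    print(booth_radix4_digits(-3, 8))   # four digits for an 8-bit multiplier
    print(booth_multiply(25, -3))
```

An 8-bit multiplier yields four Booth digits, so only four partial products are added where a plain shift-and-add scheme would use eight.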

### MLP

**Machine Learning Processor** 

MLP blocks are large-scale matrix-vector and matrix-matrix multiplication engines supporting fixed- and floating-point computations. For integer multiplication, the MLP offers 4x int16, 16x int8 or 32x int4 modes. For floating point and block floating point (OCP MXINT8) operations, the MLP supports FP16, FP24 or BF16.
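Block floating point stores one shared exponent per block of values and a small integer mantissa per value, so the multiplications themselves stay in integer arithmetic. The sketch below is a simplified illustration of that idea, loosely modelled on a shared power-of-two scale with int8 elements. The block layout, the 6 fractional bits, and the function names are assumptions for illustration; they do not describe the MLP's internal data path.

```python
import math

# Assumption for illustration: int8 mantissas with 6 fractional bits,
# sharing one power-of-two exponent per block.
MANTISSA_FRAC_BITS = 6


def bfp_quantize(block: list[float]) -> tuple[int, list[int]]:
    """Return (shared_exponent, int8 mantissas) for one block of floats."""
    max_abs = max(abs(v) for v in block)
    if max_abs == 0.0:
        return 0, [0] * len(block)
    # frexp gives max_abs = m * 2**e with 0.5 <= m < 1, so floor(log2) = e - 1.
    shared_exp = math.frexp(max_abs)[1] - 1
    step = 2.0 ** (shared_exp - MANTISSA_FRAC_BITS)
    # Round to the nearest step and clamp to the symmetric int8 range.
    mantissas = [max(-127, min(127, round(v / step))) for v in block]
    return shared_exp, mantissas


def bfp_dequantize(shared_exp: int, mantissas: list[int]) -> list[float]:
    """Reconstruct approximate floats from a shared exponent and mantissas."""
    step = 2.0 ** (shared_exp - MANTISSA_FRAC_BITS)
    return [m * step for m in mantissas]


if __name__ == "__main__":
    exp, mants = bfp_quantize([0.5, -1.0, 0.25, 1.5])
    print(exp, mants, bfp_dequantize(exp, mants))
```

Values that are much smaller than the largest value in a block lose precision, which is the usual trade-off of sharing one exponent across the block.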

MLP blocks include two memory blocks usable individually or with multipliers. Total embedded memory is 195 Mb.

Total MLP blocks: 2,560, capable of 72 MXINT8 TOPS.
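As a rough sanity check on the integer throughput figures, peak operations per second can be estimated as blocks x multipliers per block x 2 operations per multiply-accumulate x clock rate. The block count and multiplier counts below come from this datasheet. The 750 MHz clock is a hypothetical value chosen for illustration and is not stated here. The sketch does not attempt to model the MXINT8 figure.

```python
MLP_BLOCKS = 2560                                   # from the datasheet
MULTIPLIERS_PER_MLP = {"int16": 4, "int8": 16, "int4": 32}  # from the datasheet
OPS_PER_MAC = 2                                     # one multiply plus one add

# Hypothetical clock rate, used only to illustrate the calculation.
ASSUMED_CLOCK_HZ = 750e6


def peak_tops(mode: str, clock_hz: float = ASSUMED_CLOCK_HZ) -> float:
    """Estimated peak tera-operations per second for an integer mode."""
    ops_per_cycle = MLP_BLOCKS * MULTIPLIERS_PER_MLP[mode] * OPS_PER_MAC
    return ops_per_cycle * clock_hz / 1e12


if __name__ == "__main__":
    for mode in MULTIPLIERS_PER_MLP:
        print(f"{mode}: {peak_tops(mode):.2f} TOPS at {ASSUMED_CLOCK_HZ / 1e6:.0f} MHz")
```

With the assumed clock, the int8 estimate lands at about 61 TOPS, which is consistent with the INT8 figure quoted for the fabric. Real throughput depends on the achieved clock and on how fully a design keeps the multipliers busy.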



# **Applications**

### AI/ML

Achieve industry-leading tokens-per-second output with open-source LLMs, accelerating text generation, chatbot responses, and AI-powered content creation. Compared with traditional GPU and CPU solutions, the card delivers significantly lower latency for AI workloads that use foundation, fine-tuned, or task-specific LLMs of various sizes.

### **Automated Speech Recognition**

Transcribe up to 2,400 simultaneous audio streams with multi-language support, latency under 25 ms, and an industry-leading word error rate (WER) of less than 3%. This enables low-latency speech-to-text conversion for real-time agent assist in contact centers, voice assistants, transcription services, and conversational AI applications.

### Networking

Wire-rate 400 GbE capture, real-time traffic analysis, packet steering and more are all possible. Sixteen channels of GDDR6, each operating at 256 Gbps, combined with 16x 100 GbE, create a platform uniquely suited to enabling a new generation of 400 GbE appliances.
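The capture claim can be checked with a simple bandwidth budget. The sketch below compares peak Ethernet ingress with peak GDDR6 bandwidth (figures from this datasheet) and with the raw PCIe Gen5 x16 link rate (32 GT/s per lane with 128b/130b encoding, per the PCIe 5.0 specification). All numbers are peak rates that ignore protocol overhead, and the function names are illustrative.

```python
GDDR6_CHANNELS = 16             # from the datasheet
GDDR6_GBPS_PER_CHANNEL = 256    # from the datasheet

PCIE_GEN5_GT_PER_S = 32         # per lane, PCIe 5.0
PCIE_LANES = 16
PCIE_ENCODING = 128 / 130       # 128b/130b line encoding


def gddr6_gbps() -> int:
    """Peak aggregate GDDR6 bandwidth in Gbps."""
    return GDDR6_CHANNELS * GDDR6_GBPS_PER_CHANNEL


def ethernet_gbps(ports: int, gbps_per_port: int) -> int:
    """Peak one-direction Ethernet ingress in Gbps."""
    return ports * gbps_per_port


def pcie_gen5_x16_gbps() -> float:
    """Raw one-direction PCIe Gen5 x16 bandwidth in Gbps after encoding."""
    return PCIE_GEN5_GT_PER_S * PCIE_LANES * PCIE_ENCODING


if __name__ == "__main__":
    ingress = ethernet_gbps(ports=16, gbps_per_port=100)
    # Worst case for a capture buffer: every bit is written once and read once.
    memory_traffic = 2 * ingress
    print(f"Ethernet ingress: {ingress} Gbps")
    print(f"GDDR6 peak: {gddr6_gbps()} Gbps, needed for write+read: {memory_traffic} Gbps")
    print(f"PCIe Gen5 x16 raw: {pcie_gen5_x16_gbps():.1f} Gbps per direction")
```

On these peak numbers, GDDR6 can absorb the full 16x 100 GbE ingress even when each bit is both written and read back. The host link can carry a single 400 GbE stream but not the full ingress, which suggests that filtering or analysis would need to happen on the card before data is forwarded to the host.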

# **ACE** FPGA Development Software

The ACE software from Achronix is the development environment for the Speedster7t FPGA. ACE handles the hardware design workflow, supporting RTL (VHDL and Verilog) input together with industry-standard simulation. It also gives access to advanced chip features such as the NoC, and ships with an Achronix-optimized version of Synplify Pro from Synopsys.





#### **Additional Card Features**

- Jitter cleaner for SyncE
- 1 PPS & 100 MHz ext. ref. clock
- BMC with health monitoring
- 8x GPIO pins
- Drivers for Linux and Windows

### **Software Development Kit: Powerful Tools for Development**

The Software Development Kit (SDK) provides drivers, libraries, utilities and an example project for accessing, integrating and developing applications for the VectorPath card.

### **Card Specifications**

| FPGA | <ul> <li>Achronix Speedster 7t1500</li> <li>52.5 x 52.5 mm² package</li> <li>692K 6-input lookup tables (LUTs)</li> <li>195 Mb embedded RAM</li> <li>2,560 MLPs</li> </ul> |
|------|------|
| On-board memory | <ul> <li>32 GBytes GDDR6: 8 interfaces, 2 independent 16-bit channels per interface (4 Tbps aggregate bandwidth)</li> <li>16 GB (x72) DDR4-3200</li> <li>Flash memory for booting FPGA</li> </ul> |
| QSFP-DD cages | <ul> <li>2x QSFP-DD cages on front panel</li> <li>112G PAM4 transceivers</li> <li>2x 400GbE, 16x 100GbE, and more</li> <li>Hard MAC and FEC for every speed</li> </ul> |
| Host interface | PCI-SIG certified to support PCIe Gen5 x16 host interface |
| External clocking | <ul><li>1 PPS inputs</li><li>100 MHz Ref Clk</li></ul> |
| USB | <ul> <li>Front and back USB ports for access to BMC, USB-JTAG, USB-UART</li> <li>Additional USB port for daisy chain</li> </ul> |
| GPIO | 8 GPIO pins, 3.3V, single ended, direction (Tx, Rx) independently settable by FPGA per GPIO, buffers rated to 200 Mbps |
| Board Management Controller 2.0 | <ul> <li>Voltage, current, temperature monitoring</li> <li>Power sequencing and reset</li> <li>Field upgrades</li> <li>FPGA configuration and control</li> <li>Clock configuration</li> <li>I²C bus access</li> <li>USB 2.0</li> </ul> |
| Cooling | Dual-width passive heatsink |
| Electrical | <ul> <li>On-board power from a 12VHPWR (16-pin) connector</li> <li>Power dissipation is application dependent</li> </ul> |
| Form factor | <ul> <li>Standard-height PCIe dual-width board</li> <li>Size: 111.15mm x 266.70mm (4.376in x 10.500in)</li> <li>Full-length PCIe extender included</li> </ul> |

### **Development Tools**

| System development | Software development toolkit including libraries and board monitoring utilities |
|--------------------|---------------------------------------------------------------------------------|
| FPGA development   | <ul><li>Achronix tools: ACE Design Tools</li><li>FPGA example projects</li></ul> |

For more information, visit Achronix.com



