CertusPro-NX

CertusPro-NX Reinvigorates General-Purpose FPGAs

By Aakash Jani, Senior Analyst

June 2021

http://www.linleygroup.com

CertusPro-NX, the fourth product developed using Lattice Semiconductor’s Nexus platform in the last 18 months, delivers class-leading power, performance, and size for diverse applications. These general-purpose FPGAs offer low power, small packages, and high-bandwidth I/Os, such as PCIe Gen3 and Gigabit Ethernet. They’re well suited to edge AI, industrial IoT, 5G control planes, and other tasks. Lattice sponsored this white paper, but the opinions and analysis are those of the author.

Lattice Semiconductor has modernized an established segment of the FPGA market by releasing its fourth Nexus product, CertusPro-NX. Manufactured in 28nm FD-SOI technology, the new FPGAs have low power and a small form factor to target the low-density market. Compared with the company’s earlier Certus-NX, they jump from 17,000 to 96,000 logic cells. The general-purpose-FPGA segment is diverse and grew by approximately 10% in 2020; it serves a broad range of functions, including 5G cellular, AI, and IoT. Because these market segments change continuously, FPGAs let designers circumvent the rigidity of ASICs.

For its new FPGA family, Lattice offers two variants: the CPNX-50K, which has 52,000 logic cells, and the CPNX-100K, which has 96,000 logic cells and is the lead vehicle for alpha and engineering samples. As Figure 1 shows, the latter model has a programmable I/O that supports LPDDR4 DRAM (a first for an FPGA in this class). The company also boosted the internal memory by 3x, allowing CertusPro-NX to save power on memory-intensive operations.

Figure 1. CertusPro-NX block diagram. The new FPGA comprises 7.3Mb of on-die memory, 156x 18×18 multiplication-DSP blocks, programmable logic, and eight flexible 10Gbps serdes lanes configurable for DisplayPort or CoaXPress connections.

When designing the CertusPro-NX family, Lattice chose a 28nm FD-SOI process. Despite early skepticism about that choice, the company achieved leading power and soft-error-rate (SER) metrics, which are crucial to winning designs in multiple applications.

Beyond the programmable logic, CertusPro-NX has hard blocks that reduce power consumption. They include a 10G Ethernet port and a four-lane PCIe Gen3 controller. Lattice paid special attention to the bit-stream-configuration block, enabling impressive boot speeds: a fully utilized device is configurable in less than 30 milliseconds.

Relative to its predecessor, CertusPro-NX offers considerable improvements, letting customers implement advanced features in their FPGA-based designs. Lattice more than doubled the number of logic cells, increased on-die memory, and upgraded the PCIe controller as well as the programmable-I/O interface. As a newcomer, the design provides best-in-class performance compared with the Intel Cyclone V GT and Xilinx Artix-7.

Machine Vision and Edge AI

In addition to scaling its programmable logic fabric for CertusPro-NX, Lattice also broadened the platform’s AI capabilities. Using the 7.3Mb of internal memory, customers can load compact neural networks that recognize objects, listen for key phrases, or detect unusual behavior. But hardware is only half the story. The company’s SensAI software stack works with Caffe, TensorFlow, TensorFlow Lite, and Keras, and it’s backed by Lattice AI compilers. This proven offering delivers power- and resource-efficient AI to the plethora of Lattice customers. The software platform is compatible with many of the company’s FPGAs (CertusPro-NX compatibility is scheduled for later this year).

Machine vision at the network edge requires much more than just hardware to implement the neural network: it also requires sensor compatibility, sensor aggregation, and image preprocessing. Lattice gives CertusPro-NX customers flexibility in this segment through its programmable I/Os and serdes block. For instance, many high-resolution image sensors employ the SLVS-EC interface, which edge-AI accelerators often lack.

The programmable serdes also supports a variety of standards for moving data from the edge and deeper into the system. Examples include CoaXPress and 10G Ethernet.

CertusPro-NX dramatically outpaces its competitors in on-chip memory size. Because DRAM operations increase power consumption and decrease throughput, neural networks run best if all weights remain on the chip, minimizing DRAM accesses. This situation makes customers crave more on-die memory. The new Lattice FPGA can store up to one million 8-bit weights—nearly twice the capacity of Cyclone V GT or Artix-7. By holding more weights internally, CertusPro-NX can run larger AI models without accessing DRAM, thereby conserving power.
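The weight-capacity figure above can be checked with simple arithmetic. The following is a minimal sketch, assuming that all 7.3Mb of on-die memory holds weights and that 1Mb means 1,000,000 bits; neither assumption is stated in the text, and the function name is purely illustrative.

```python
# Rough upper bound on how many fixed-width weights fit in on-die memory.
# Assumptions (not from the article): every bit stores weights, and
# 1 Mb = 1,000,000 bits.

ON_DIE_MEMORY_MBITS = 7.3  # on-die memory quoted for CertusPro-NX


def max_weights(memory_mbits: float, bits_per_weight: int) -> int:
    """Return the largest whole number of weights that fit in the memory."""
    total_bits = int(round(memory_mbits * 1_000_000))
    return total_bits // bits_per_weight


if __name__ == "__main__":
    for bits in (8, 4, 1):
        count = max_weights(ON_DIE_MEMORY_MBITS, bits)
        print(f"{bits}-bit weights: {count:,}")
```

Under these assumptions the 8-bit case comes to roughly 0.9 million weights, in line with the "up to one million" figure. A real design would also reserve memory for activations and buffers, so usable capacity would be lower.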

When Lattice’s FPGA does require DRAM access, it uses a programmable-I/O block that supports both LPDDR4 and DDR3 memory at 1,066Mbps. CertusPro-NX is the first of its category to add LPDDR4, one generation beyond its competitors, which only offer DDR3 and below. On average, however, this newer technology increases chip and system power consumption. Thanks to its larger on-die memory and improved memory controller, CertusPro-NX achieves new power-efficiency heights through both its on- and off-chip memory by decreasing energy consumption and memory-access times. Long-term availability is also a concern in many markets, including embedded vision, and LPDDR4 alleviates such concerns.

A critical factor in building a smart city or even a smart home is visibility. Most end users prefer inconspicuous IoT sensors, and an area-efficient microprocessor is the core of such designs. At 81mm², CertusPro-NX has the smallest serdes-capable package in its class, consuming 33% less area than the Cyclone V GT and 84% less than Artix-7. The low-profile FPGA further increases volumetric headroom, allowing OEMs to add more functions or slim the form factor.
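To make the percentages above concrete, the short sketch below back-calculates the competitor package areas they imply. The results are derived only from the figures quoted in the text; they are not datasheet values, and the helper name is hypothetical.

```python
# Back-calculate the baseline implied by an "X% less than" claim.
# If value = baseline * (1 - X/100), then baseline = value / (1 - X/100).

CERTUSPRO_AREA_MM2 = 81.0  # package area quoted in the text


def implied_baseline(value: float, percent_less: float) -> float:
    """Return the baseline that `value` is `percent_less` percent below."""
    return value / (1.0 - percent_less / 100.0)


if __name__ == "__main__":
    cyclone = implied_baseline(CERTUSPRO_AREA_MM2, 33.0)
    artix = implied_baseline(CERTUSPRO_AREA_MM2, 84.0)
    print(f"Implied Cyclone V GT package area: {cyclone:.1f} mm^2")
    print(f"Implied Artix-7 package area:      {artix:.1f} mm^2")
```

The quoted percentages are rounded, so the implied areas are approximate.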

Industrial IoT

The latest industrial-IoT generation is characterized by large-scale automation, driven by improvements in connectivity and data analysis. To automate tasks such as sorting and packing, smart factories require thousands of IoT devices that together generate and process terabytes per day. Silicon that powers these devices must be compact, power efficient, and reliable. To prepare its customers for Industry 4.0, Lattice adopted and applied these principles in its latest FPGA generation.

The company uses FD-SOI to reduce power in CertusPro-NX relative to competing CMOS-based FPGAs. One way to quantify this advantage is to consult each vendor’s power estimator, assuming a design that requires 65,000 logic cells, 75% DSP and memory utilization, and two serdes lanes operating at 5Gbps. For this design operating at a junction temperature of 85°C and a fabric frequency of 125MHz, CertusPro-NX consumes 75% less total power (dynamic+static) than Artix-7 and 65% less than the Cyclone V GT, as Figure 2 shows.

These figures demonstrate its sizable power lead, which is driven by FD-SOI. That manufacturing technology uses an insulation layer in the substrate to reduce leakage current by up to 75% compared with other 28nm bulk-CMOS products; current leakage is the primary driver of static and standby power consumption.

As OEMs increase power to improve their products’ performance, the Intel and Xilinx FPGAs will cross their junction-temperature thresholds well before the Lattice FPGA. With its leading efficiency, CertusPro-NX frees up power and thermal headroom, allowing OEMs to cut system size or thermal-management cost. Systems operating below the junction temperature won’t need a fan, which is another mechanical point of failure.

Thermal budget is even more crucial for industrial-motor control. Motors tend to be airtight to prevent dust particles from reducing their longevity. During operation, however, heat accumulates in a motor and increases the ambient temperature around the FPGA. Lattice’s low-power solution, relative to its competitors, allows the FPGA to control higher-torque motors without overheating.
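The link between power and thermal headroom can be illustrated with the standard steady-state relation T_j = T_a + P × θ_JA. The sketch below is illustrative only: the 2.0W baseline power, the 20°C/W junction-to-ambient thermal resistance, and the 100°C junction limit are hypothetical placeholders, not values from Lattice or its competitors. Only the 75% reduction comes from the text.

```python
# Steady-state thermal model: T_junction = T_ambient + P * theta_JA.
# All numeric inputs in the demo are hypothetical, chosen only to show
# how a power reduction raises the tolerable ambient temperature.


def junction_temp_c(ambient_c: float, power_w: float, theta_ja: float) -> float:
    """Junction temperature for a given ambient, power, and theta_JA (C/W)."""
    return ambient_c + power_w * theta_ja


def max_ambient_c(tj_max_c: float, power_w: float, theta_ja: float) -> float:
    """Highest ambient temperature that keeps the junction at or below tj_max_c."""
    return tj_max_c - power_w * theta_ja


if __name__ == "__main__":
    THETA_JA = 20.0      # C/W, hypothetical package and board
    TJ_MAX = 100.0       # C, hypothetical junction limit
    BASELINE_W = 2.0     # W, hypothetical competitor power
    REDUCED_W = BASELINE_W * (1 - 0.75)  # 75% less power, per the text

    print("Baseline max ambient:", max_ambient_c(TJ_MAX, BASELINE_W, THETA_JA))
    print("Reduced max ambient: ", max_ambient_c(TJ_MAX, REDUCED_W, THETA_JA))
```

With these placeholder inputs, cutting power by 75% raises the tolerable ambient temperature from 60°C to 90°C, which is the kind of headroom a sealed motor enclosure would need.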

Figure 2. FPGA power comparison. LC = logic cells. Compared with similar FPGAs from Intel and Xilinx, the Lattice product reduces power by 65–75%. Power estimates are for 5Gbps two-lane serdes applications at 75% utilization with an 85°C junction temperature at 125MHz. (Data source: Lattice)

FD-SOI has the added benefit of nearly eliminating single-event upsets (SEUs), which occur when radiative particles travel through a device and interact with memory or register cells. This interaction causes the cells to flip incorrectly, corrupting memory or data paths. Compared with Artix-7, CertusPro-NX reduces the number of soft errors by 99%, eliminating the need for soft-error-detection logic and error-correction codes. Lattice’s approach both improves system reliability and simplifies customer designs.

CertusPro-NX delivers 110x longer mean time between failures (MTBF) than Artix-7. Lattice’s high MTBF satisfies the reliability needs of automotive and health-care systems; it also reduces maintenance costs by requiring fewer field adjustments, keeping critical factory operations up and running. This MTBF performance can additionally improve the safety of industrial robotics, where a control FPGA falling into an unknown state could cause a malfunction that leads to personal injury or property damage.
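A fleet-level view shows what a 110x MTBF improvement could mean in practice. The sketch below assumes a constant failure rate, so that expected failures equal device-hours divided by MTBF. The fleet size and the baseline MTBF are hypothetical values chosen for illustration; only the 110x ratio comes from the text.

```python
# Expected failures under a constant-failure-rate assumption:
#   failures = devices * operating_hours / MTBF
# Fleet size and baseline MTBF are hypothetical placeholders.

HOURS_PER_YEAR = 8_760


def expected_failures(devices: int, hours: float, mtbf_hours: float) -> float:
    """Expected number of failures across a fleet over the given hours."""
    return devices * hours / mtbf_hours


if __name__ == "__main__":
    FLEET = 10_000                   # hypothetical number of deployed devices
    BASELINE_MTBF = 1_000_000.0      # hours, hypothetical baseline
    IMPROVED_MTBF = BASELINE_MTBF * 110  # 110x longer, per the text

    base = expected_failures(FLEET, HOURS_PER_YEAR, BASELINE_MTBF)
    improved = expected_failures(FLEET, HOURS_PER_YEAR, IMPROVED_MTBF)
    print(f"Baseline: {base:.1f} expected failures per year")
    print(f"Improved: {improved:.2f} expected failures per year")
```

Whatever baseline is chosen, the expected failure count scales down by the same factor of 110 under this model.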

Usually, OEMs pair FPGAs with other system components, requiring high-bandwidth chip-to-chip interfaces to prevent data-flow bottlenecks. As a newer device, CertusPro-NX has a four-lane PCIe Gen3 controller that enables such connections. Its more senior competitors merely have PCIe Gen2, which is about 50% slower per lane. The combination of higher serdes bandwidth and newer PCIe technology helps CertusPro-NX customers sidestep chip-to-chip bottlenecks, whereas other solutions may stumble.
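The per-lane gap between PCIe generations follows from their raw transfer rates and line encodings: Gen2 runs at 5GT/s with 8b/10b encoding, and Gen3 runs at 8GT/s with 128b/130b encoding. The sketch below works through that arithmetic; it ignores packet and protocol overhead, so real throughput would be lower for both. The class and function names are illustrative.

```python
# Per-lane PCIe bandwidth after line encoding (protocol overhead ignored).
from dataclasses import dataclass


@dataclass(frozen=True)
class PcieGen:
    name: str
    gigatransfers_per_s: float  # raw rate per lane, GT/s
    payload_bits: int           # data bits per encoded symbol group
    encoded_bits: int           # bits on the wire per symbol group

    def lane_gbps(self) -> float:
        """Effective data rate per lane in Gbps after encoding."""
        return self.gigatransfers_per_s * self.payload_bits / self.encoded_bits


GEN2 = PcieGen("Gen2", 5.0, 8, 10)      # 8b/10b encoding
GEN3 = PcieGen("Gen3", 8.0, 128, 130)   # 128b/130b encoding


def percent_slower(slow: float, fast: float) -> float:
    """How much slower `slow` is than `fast`, as a percentage of `fast`."""
    return 100.0 * (1.0 - slow / fast)


if __name__ == "__main__":
    for gen in (GEN2, GEN3):
        print(f"{gen.name}: {gen.lane_gbps():.2f} Gbps/lane, "
              f"{4 * gen.lane_gbps():.1f} Gbps for four lanes")
    gap = percent_slower(GEN2.lane_gbps(), GEN3.lane_gbps())
    print(f"Gen2 is {gap:.1f}% slower per lane than Gen3")
```

The result, a little over 49%, supports the roughly 50% per-lane difference cited above.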

Application for 5G

To serve wireless networks, base-station OEMs separate the control and user planes, allowing each to scale independently, a crucial feature for 5G networks because both planes change annually as the 3GPP releases new specifications. The control plane is modular, so the wireless provider can split its functions across multiple chips or consolidate them on a single chip. It handles various tasks, including authentication, client (UE) session management, and unified data management.

Although a CPU could perform each of these functions, it would be less efficient than an FPGA. OEMs require efficient hardware because each 5G base station consumes 70% more power than its 4G counterpart, according to industry estimates. Given the flexibility and power constraints, base-station OEMs often need FPGAs to augment processors or ASICs. The new Lattice product consumes less power than Artix-7 and the Cyclone V GT, easing base-station thermal management.

For 5G small cells, room is tight and data flows plentifully. As the smallest chip in its class with serdes capabilities, CertusPro-NX fits well in small form factors without stinting on data rates. Its leading 75Gbps serdes bandwidth surpasses that of Artix-7 by 36% and the Cyclone V GT by more than 2x, as Figure 3 shows. For high-bandwidth functions such as packet management, the Lattice FPGA delivers more throughput with its greater serdes bandwidth while leading in area efficiency.

Figure 3. Total serdes bandwidth. CertusPro-NX outpaces its competitors by up to 2x, gaining an edge for data-intensive operations such as unified data management in 5G base stations. (Source: Lattice)

Conclusion

Lattice designed CertusPro-NX to directly address machine vision, industrial IoT, 5G cellular, and other growing markets. Its improved internal memory and LPDDR4 support minimize energy consumption for memory-intensive operations such as neural networks.

Its FD-SOI technology reduces power consumption and failure rates, making next-generation devices more reliable and cheaper to operate. The new FPGA’s 10Gbps serdes and class-leading package size make it well suited to small systems that aid in data processing, such as 5G small cells. Although CertusPro-NX excels in these categories, OEMs can also use it for many other tasks, including defense, automotive, and frame grabbing.

The three competing FPGAs contain roughly the same number of logic cells, but the Lattice product sets itself apart by offering LPDDR4 support. By contrast, the others are stuck at DDR3. CertusPro-NX also provides more internal memory and best-in-class serdes bandwidth. Not only can customers move more data with the Lattice FPGA, but they can do so while consuming up to 75% less power and 84% less board area.

By offering CertusPro-NX, Lattice is injecting new energy into an important segment that has seen little investment for many years. Its primary competitors haven’t released a new low-cost architecture in the past decade, giving Lattice the opportunity to augment its latest family with new technologies, such as PCIe Gen3 and LPDDR4. This strategy has catapulted Lattice to the top of the leaderboard in power and area for low-power FPGAs. CertusPro-NX capitalizes on the unique technologies of its predecessor and expands its memory, serdes, and logic capabilities to serve burgeoning markets such as 5G base stations, industrial IoT, and machine vision.

Aakash Jani is a senior analyst at The Linley Group and a senior editor of Microprocessor Report. The Linley Group offers the most-comprehensive analysis of microprocessors and SoC design, going beyond business strategy to examine internal technology. Our in-depth articles cover topics that include embedded processors, mobile processors, server processors, AI accelerators, IoT processors, processor-IP cores, and Ethernet chips. For more information, access our website at http://www.linleygroup.com.