FPGA+x86 can con...

  • 2022-09-23 10:07:24

FPGA+x86 can control the accuracy of delay to the order of 2.5ns


Ethernet is now everywhere. Enterprises, campuses, large data centers, and homes all depend on the network, and daily life would be seriously disrupted without it.

Ethernet interface rates are also developing rapidly: 10M, 100M, GE, 10GE, 40GE, 100GE, 2.5GE, 5GE, 25GE, 50GE, and even 400GE are gradually maturing.

Many data centers and operators are also preparing to upgrade their networks from 100GE to 400GE to provide the bandwidth and response time required by emerging technologies such as fifth-generation wireless (5G), artificial intelligence (AI), virtual reality (VR), the Internet of Things (IoT), and autonomous vehicles.


However, every new interface rate, every step from developing network equipment to shipping a final product, and every deployment of a new network technology depends on testing and verification at each stage, which poses great challenges for testing. At the same time, the development of high-performance, stable, high-speed network testers has not kept pace with the R&D needs of network equipment.

Developing high-speed, stable network testers is therefore urgent, especially domestic tester products built on core technology mastered in-house.

Test scenarios for large-scale data center switches and core routers run by domestic telecom operators place extremely high requirements on the tester.

1. Stability

The tester must stream traffic, collect statistics, and emulate protocols stably over long periods, for example in continuous 7x24-hour operation;

2. Repeatability

Under the same physical environment and network conditions, repeated tests must produce consistent results;

3. Accuracy

The test results must accurately reflect the real performance of the device or system under test, including throughput, delay and jitter, traffic scheduling, and traffic statistics;

4. High performance

Wire-speed traffic generation and statistics at all packet lengths (such as 64-16000 bytes or IMIX mixed packet lengths), very high-capacity routing and switching protocol emulation (such as BGP/OSPF/ISIS/PPPoE/IPoE/EVPN), and simulation of multi-port (such as hundreds of 100GE/10GE ports) and multi-service (such as IPv4/IPv6/MPLS/Multicast) traffic scenarios;

5. Standards compliance

Compliance with international test standards such as RFC 2544, RFC 2889, RFC 3511, and RFC 3918;

6. Rich interface types

Supports 1GE/2.5GE/5GE/10GE/25GE/40GE/50GE/100GE/400GE and other interface types, and supports cascading of multiple chassis to build large-scale test scenarios.

There are two main types of network testers currently on the market.

1. Tester based on x86+DPDK+network card

x86 is relatively easy to program, offers richer debugging methods, and has a cost advantage. It is a good choice for functional testing with modest requirements.

2. Tester based on FPGA+x86 hybrid

A test system that integrates software and hardware, such as FPGA+x86, suits high-performance, full-coverage, large-scale, and complex test scenarios.

The FPGA+x86 hybrid architecture exploits the increasingly powerful hardware parallelism of the FPGA in the data plane and the processing flexibility of the CPU in the control plane. Because both the FPGA and the CPU are programmable, the boundary between work done in FPGA hardware and work done in CPU software can be moved as the workload requires, so that the whole processing flow can be optimized end to end.

Against the complex test requirements of telecom operators described above, the two tester architectures compare as follows:

1. Wire-speed streaming and statistics at 64-16000 byte packet lengths

x86+DPDK+NIC: Take the 100G network shown in the figure as an example. At a 64-byte packet length, about 150M packets must be sent and received per second, which exceeds the current computing and memory-access capabilities of the CPU. FPGA-based testers do not have this limitation. According to the latest data (2019.10.9) publicly released on the DPDK official website [Data source, DPDK official website], obtained with the system configuration in Figure 1, 100G line rate clearly cannot be reached when sending and receiving small packets of 64, 128, or 256 bytes;

FPGA+x86: Packets of all lengths can be sent and counted at wire speed.


The achievable wire-speed performance at small packet lengths is shown in Figure 2.
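The figure of about 150M packets per second can be checked with simple arithmetic. The sketch below is illustrative only: it assumes standard Ethernet framing, where each frame carries 20 extra bytes on the wire (7-byte preamble, 1-byte start-of-frame delimiter, 12-byte minimum inter-frame gap), and the function names are hypothetical.

```python
# Per-frame overhead on the wire (assumption: standard Ethernet framing):
# 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap.
ETHERNET_OVERHEAD_BYTES = 20


def line_rate_pps(link_bps: float, frame_bytes: int) -> float:
    """Maximum frames per second at full line rate for one frame size."""
    bits_on_wire = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


def per_packet_budget_ns(link_bps: float, frame_bytes: int) -> float:
    """Time available to handle one frame if the line rate is to be sustained."""
    return 1e9 / line_rate_pps(link_bps, frame_bytes)


if __name__ == "__main__":
    for size in (64, 128, 256, 512, 1518):
        mpps = line_rate_pps(100e9, size) / 1e6
        budget = per_packet_budget_ns(100e9, size)
        print(f"{size:>5} B: {mpps:8.2f} Mpps, {budget:7.2f} ns per packet")
```

At 64 bytes on a 100G link this gives roughly 148.8 Mpps, or under 7 ns per packet, which indicates how little time a software data path has for each frame.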


2. Accuracy of delay and jitter

The accuracy of the delay test is a very important indicator in network testing.

x86+DPDK+NIC: The x86 system is a general-purpose computing system. Its reference clock is not very accurate, and OS scheduling introduces errors of at least microseconds. If the NIC does not support inserting timestamps at the physical layer, network delay has to be measured in software, which adds further error. The delay accuracy on the order of 10 ns that network testing usually requires is therefore difficult to achieve on a general-purpose x86 platform;

FPGA+x86: On the FPGA platform, a crystal oscillator with an accuracy of 0.1 to 0.001 ppm drives a clock of up to 400 MHz. Its 2.5 ns period allows delay to be measured with an accuracy on the order of 2.5 ns.
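The relationship between the clock frequency, the timestamp resolution, and the oscillator tolerance can be sketched as follows. This is a simplified model, not the tester's actual implementation: it assumes timestamps come from a free-running counter that advances once per clock cycle, and all function names are hypothetical.

```python
def clock_period_ns(freq_hz: float) -> float:
    """Duration of one clock cycle, i.e. the timestamp resolution."""
    return 1e9 / freq_hz


def quantize_ns(t_ns: float, period_ns: float) -> float:
    """Timestamp as a free-running counter records it: whole ticks only."""
    return (t_ns // period_ns) * period_ns


def drift_ns(interval_s: float, ppm: float) -> float:
    """Worst-case error accumulated over an interval by an oscillator
    with the given frequency tolerance in parts per million."""
    return interval_s * ppm * 1e-6 * 1e9


if __name__ == "__main__":
    period = clock_period_ns(400e6)
    print(f"resolution at 400 MHz: {period} ns")
    # A 1 ms delay measurement: drift stays far below one clock period.
    for ppm in (0.1, 0.001):
        print(f"{ppm} ppm over 1 ms: {drift_ns(1e-3, ppm):.4f} ns")
```

Under these assumptions the resolution is set by the 2.5 ns clock period, while the oscillator tolerance only matters over long intervals: at 0.1 ppm the accumulated error over a 1 ms measurement is about 0.1 ns.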

3. Storage system flexibility

x86+DPDK+NIC: The x86 system is oriented to general-purpose computing. Its mainstream memory is currently DDR4, which offers high bandwidth but also high access latency, and depending on the read/write access pattern the latency may jitter;

FPGA+x86: The FPGA can combine several memory technologies as required, such as on-chip RAM (which can act as a cache), DDR, QDR, and RLDRAM, to serve both bandwidth-sensitive and latency-sensitive accesses.

4. Protocol acceleration

x86+DPDK+NIC: There are no additional resources for protocol acceleration functions such as TCP offloading;

FPGA+x86: The FPGA is a hardware-programmable system. Depending on the available hardware resources and the processing requirements, the boundary between the FPGA and the x86 system in protocol processing can be drawn flexibly. Protocol acceleration functions such as TCP offloading can be implemented in the FPGA, and the computationally intensive, stateless tasks in protocol processing can be parallelized in hardware, which greatly increases the processing power of the whole system.

5. Accuracy of Layer 2-3 traffic scheduling

x86+DPDK+NIC: The x86 system cannot stream small packets at line rate on high-speed ports, let alone schedule traffic accurately;

FPGA+x86: To handle increasingly complex and large-scale test traffic, switches, and routers, the FPGA architecture supports generating thousands of streams (high-end testers typically support 64K streams). The bandwidth ratio between streams and the transmit scheduling mode can be controlled precisely, down to 5 decimal places.
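One way to picture ratio-based stream scheduling is a credit scheduler with integer weights. The sketch below uses smooth weighted round-robin as a stand-in; it is an assumption for illustration, not the scheduling algorithm of any particular tester, and it treats all frames as the same size so that packet share equals bandwidth share. The names `to_weights` and `schedule` are hypothetical.

```python
from collections import Counter


def to_weights(percentages):
    """Turn percentage shares with up to 5 decimal places into integers,
    so the scheduler works without floating-point rounding."""
    return [round(p * 100_000) for p in percentages]


def schedule(weights, n_slots):
    """Smooth weighted round-robin: yield the stream index for each slot."""
    credit = [0] * len(weights)
    total = sum(weights)
    for _ in range(n_slots):
        # Every stream earns credit in proportion to its weight.
        for i, w in enumerate(weights):
            credit[i] += w
        # The stream with the most credit sends and pays for the slot.
        pick = max(range(len(weights)), key=credit.__getitem__)
        credit[pick] -= total
        yield pick


if __name__ == "__main__":
    order = list(schedule([5, 3, 2], 10))
    print(order)
    print(Counter(order))
```

Over one full cycle of `sum(weights)` slots, each stream is selected exactly as many times as its weight, and its packets are interleaved with the others rather than sent in a burst.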

6. Real-time performance and accuracy of statistics

x86+DPDK+NIC: Test functions implemented in software run on the CPU as an essentially serial instruction stream. Technologies such as multi-core and hyper-threading provide some instruction-level parallelism, but some statistics, such as the real-time number of frames sent and received per second, are defined by at least two parameters: a time interval (delta) and the number of packets sent and received within that interval. If both parameters are read on one CPU core, the serial execution of instructions inevitably introduces error; if they are read on two cores, current CPUs can hardly synchronize cores at the ns level, which also makes the statistics imprecise;

FPGA+x86: Inside the FPGA, a statistics snapshot function is easy to implement in hardware logic, so the two parameters above are captured at the same instant and are strictly guaranteed to be consistent.
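The effect of reading the two parameters at different instants can be shown with a small simulation. It assumes constant-rate traffic, and the traffic rate, the 2 µs read skew, and all function names are hypothetical values chosen for illustration.

```python
RATE_PPS = 148_809_524  # hypothetical constant traffic rate


def packets_at(t_ns: int) -> int:
    """Packets counted up to time t_ns under constant-rate traffic."""
    return RATE_PPS * t_ns // 1_000_000_000


def snapshot(t_ns: int):
    """Hardware-style snapshot: time and counter latched at the same instant."""
    return t_ns, packets_at(t_ns)


def serial_read(t_ns: int, skew_ns: int):
    """Software-style read: the counter is read skew_ns after the timestamp."""
    return t_ns, packets_at(t_ns + skew_ns)


def rate_pps(first, second) -> float:
    """Packet rate computed from two (time, count) samples."""
    (t0, c0), (t1, c1) = first, second
    return (c1 - c0) * 1_000_000_000 / (t1 - t0)


if __name__ == "__main__":
    interval_ns = 1_000_000  # 1 ms reporting interval
    exact = rate_pps(snapshot(0), snapshot(interval_ns))
    skewed = rate_pps(serial_read(0, 0), serial_read(interval_ns, 2_000))
    for name, value in (("snapshot", exact), ("serial read", skewed)):
        error = abs(value - RATE_PPS) / RATE_PPS
        print(f"{name:>12}: {value:,.0f} pps, relative error {error:.2e}")
```

In this model a 2 µs skew on a 1 ms interval distorts the reported rate by about 0.2%, while the snapshot is limited only by counting whole packets.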

7. System scalability

x86+DPDK+NIC: For a large-scale system under test, a single machine cannot complete the test task, whether it is a pure x86 software implementation or an FPGA+x86 hybrid system. Cascading multiple systems and synchronizing them at the 10 ns level is unavoidable. The x86 system is oriented to general-purpose computing and can synchronize multiple machines by running NTP, but the synchronization accuracy of NTP cannot meet the requirements of delay testing;

FPGA+x86: In an FPGA+x86 hybrid system, high-precision synchronization technologies such as local cable cascading, GPS, and 1588v2 can be implemented in the FPGA to guarantee the accuracy of time measurements.

In addition, in an FPGA+x86 hybrid system, Layer 2-3 traffic processing is implemented in the FPGA and does not pass through the CPU's protocol stack or upper-layer applications. The CPU only handles lightweight tasks such as delivering configuration and presenting the user interface, which avoids the CPU's inherent weakness in sending and receiving traffic at wire speed. DPDK can still be deployed flexibly on the x86 side, where the accelerated, purely protocol-related processing runs, so the system combines the strengths of the FPGA and x86 to process workloads efficiently.

An FPGA+x86 hybrid system is therefore the best choice for building a high-performance network tester.

In recent years, Ethernet testing technology abroad has developed rapidly, with new products emerging one after another. Building on years of accumulated expertise in high-speed, high-performance test software and hardware platforms, the American companies Spirent and Keysight have long held the leading global position in Ethernet testing.

Domestic research on Ethernet test technology began in the early 21st century. After more than a decade of hard work, the capability to independently design and develop such test products has also made great progress.