My previous article mentioned that I had returned to China. While I was still in hotel quarantine, I received a small toy from Xilinx. The unboxing video is already up on Bilibili; those who have watched it will know that it is Xilinx's Kria KV260 Vision AI Starter Kit. As the name suggests, this kit is purpose-built for vision applications. In this article I will briefly introduce the product and, more importantly, share some of my thoughts on this little toy. In one sentence: I have never before played with an FPGA board that can be developed without writing RTL.

Let's start with the development board. I actually received two boxes: one contains the KV260 board itself, and the other holds the necessary accessories, such as an HDMI cable, an SD card, a power adapter, a network cable, and a camera module. To get the board running, however, you will ideally also want a monitor (with an HDMI or DP input), plus a USB keyboard, a USB camera, and so on.

I’m in quarantine hotel, “made” an AI vision accelerator

Neither of those, however, is strictly necessary. I have seen someone stream the video output over the network cable to a laptop using the RTSP protocol and open it in a player that supports RTSP. I don't have a USB keyboard either, so I used the serial UART for simple command input. There are a few pitfalls here, which I will summarize later. If you do have a keyboard, just plug it into the board and type commands directly.
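For the RTSP route, opening the stream on a laptop is essentially a one-liner. The address below is purely illustrative; the actual IP, port, and mount point depend on your board and on how the demo application publishes the stream.

```shell
# Play the board's RTSP stream with ffplay (ships with FFmpeg).
# 192.168.1.100 and /test are placeholders for your own setup.
ffplay rtsp://192.168.1.100:554/test

# VLC works just as well:
# vlc rtsp://192.168.1.100:554/test
```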

System on Module (SoM): A New Idea for Board Design

As for the Kria KV260 board itself, it actually consists of two parts. One is the FPGA card, the part covered by the red fan; the other is the base board, also called the carrier board. This differs a bit from the development boards we usually use. Of course, many professional FPGA development boards also take expansion cards through an FMC connector, but an FMC card mainly handles relatively simple functions such as IO expansion. On the KV260, by contrast, the board on top is the FPGA main body, while all the interface circuitry sits on the large carrier board underneath.


This design approach is called SoM, i.e. System on Module. It is essentially a modular approach: the core board and the carrier board can be designed separately to meet the needs of different application scenarios.

For example, a development board naturally needs more interfaces and more debugging functions, so the carrier board can expose more IO to make development easier.

For actual deployment, on the other hand, all those interfaces and debug functions are unnecessary, so a minimal carrier board that keeps only the essential functions can be used, while the FPGA module on top stays unchanged.

Conversely, the same carrier board can host different FPGA devices simply by swapping the module on top. This is especially convenient once you are familiar with the underlying resources: when a new FPGA device comes out, you don't need to buy a whole new board and re-familiarize yourself with it; just swap the FPGA module.

Specifically for the Kria KV260, the FPGA module on top is called the K26 SoM, and its heart is a Zynq UltraScale+ MPSoC. This is a 16 nm device containing a quad-core Arm Cortex-A53 processor and a series of SoC subsystems built around it, including an embedded GPU, memory controllers, and various IO and bus control units. The programmable logic (PL) part contains 256K programmable logic cells, more than 1,000 DSP slices, and a hardened video codec that supports 4K60 video encoding and decoding.


In addition, the K26 SoM has 245 IO pins, can support up to 15 cameras, 4 USB ports, and 40G Ethernet, and delivers 1.4 TOPS of AI processing performance.

It is clear from these specifications that this SoM is designed specifically for vision applications. I will post all the technical documents about this board to Knowledge Planet; interested readers can take a look there.

The carrier board offers many interfaces: Ethernet, 4x USB 3.0, HDMI, DP, JTAG, UART, and so on, which should make hands-on development and learning very convenient.

However, I think the biggest highlight of this development kit is not the modular hardware design, but the development flow it enables.

FPGA development without writing RTL

Anyone who has played with FPGAs knows that FPGA development is troublesome, especially compared with CPU- or GPU-based software development. For example, to play with a Raspberry Pi, you just connect power and peripherals and start writing Python.

FPGA development, by contrast, is a completely different story. The traditional flow uses a dedicated hardware description language, Verilog, VHDL, or SystemVerilog, which takes real effort to learn; it also requires dedicated tools, such as Xilinx's Vivado or Vitis, which carry a significant learning cost of their own.

On top of that, FPGA compilation and debugging times are very long. For a normal-sized industrial-grade FPGA design, a compile run usually takes several hours, which discourages many developers and application vendors. Developers also have to master the corresponding simulation and testing methods. In a previous article I summarized an FPGA learning roadmap, split into beginner and advanced parts; interested readers can take a look.

All in all, FPGAs offer real benefits; for example, Microsoft's Project Brainwave uses FPGAs to effectively accelerate real-time AI inference. On the other hand, FPGA learning and development is complicated and cumbersome, which is the most important factor holding back FPGA adoption at scale.

The development flow of the KV260, however, is very different: you can get a vision application running quickly without Vitis or any RTL. In fact, Xilinx claims the first full bring-up can be done within an hour, and in my experience, pitfalls included, a complete system setup can very likely indeed be finished in about an hour.

Running the KV260 Demo: A Pitfall Summary

Xilinx has a dedicated page that walks you through configuring the KV260 all the way to running a smart camera application. Happily, this configuration process supports macOS, which is relatively rare in FPGA development. For the step-by-step bring-up, see my video. There are, however, quite a few small pitfalls in the process, briefly summarized below.
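For reference, the bring-up on the board roughly follows the commands below, run over the serial console. The command and package names here are taken from Xilinx's smart camera tutorial of the time and may differ across software releases, so treat this as a sketch rather than the canonical sequence.

```shell
# List the accelerated application packages available for this board
sudo xmutil getpkgs

# Install the smart camera application (package name per the tutorial)
sudo dnf install packagegroup-kv260-smartcam.noarch

# Unload whatever app is currently loaded, then load the smartcam firmware
sudo xmutil unloadapp
sudo xmutil loadapp kv260-smartcam

# Run the demo: capture from the MIPI camera, output over DisplayPort
sudo smartcam --mipi -W 1920 -H 1080 --target dp
```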

1. Step 4: Typo


There is a typo on the page; the command should be:

$ ls /dev/tty.*

2. Step 4: COM port selection and baud rate

I’m in quarantine hotel, “made” an AI vision accelerator

Xilinx's page mentions the ports that will appear. In my practice, I saw 4 COM ports; the UART console is on the second-lowest-numbered one.

You also need to set the baud rate correctly, otherwise you will see garbled characters. The correct command is as follows:

$ screen /dev/tty.usbserial 115200

where 115200 is the correct baud rate.

3. Step 5: Crash when using Mac Terminal


When running this command with the Mac's default Terminal app, it crashes and the installation cannot complete normally. I later switched to an app called Serial, which is similar to PuTTY, and everything proceeded normally. Note that once the installation has completed successfully, you can go back to the Mac Terminal to run the subsequent commands.

Summary

As this small experiment shows, with the Kria KV260 development kit a vision-acceleration application can be brought up very quickly. From there, we can add our own applications on top, or use it as a reference for developing our own acceleration designs.

Throughout the process there is no need to touch the FPGA's underlying hardware at all. A software developer can use this platform to directly develop and accelerate upper-layer software and algorithms, which greatly lowers the barrier to using FPGAs. The process is also a lot of fun, and along the way you gradually pick up the details of hardware-software co-design and build your skills in that area.


I have uploaded all my Kria KV260 learning materials to Knowledge Planet; those who want to learn can start there. Xilinx also offers many official training courses, both online and offline, which are worth keeping an eye on. You are also welcome to check out the FPGA learning roadmap I wrote earlier; it should be helpful too.
