This is a reprint of a Xilinx white paper also found here (PDF).
Xilinx and its Alliance partner Sensor to Image have created hardware, software, IP, and whole turn-key system solutions for the growing high-performance Machine Vision market.
By Mark Timmons
and Werner Feith
Sensor to Image, GmbH
The Machine Vision market is experiencing unparalleled growth of pixel rates in high-end vision systems. This accelerated growth, already exceeding the Moore's Law trend to which semiconductor manufacturers and markets have become accustomed over the years, presents an immediate and increasing demand for high-performance connectivity that provides, with a minimal number of cables, seamless support for 10G+ link speeds over distances in the range of 100m.
This white paper examines the system requirements of such high-performance, leading-edge technology, and highlights the universe of available standards-compliant turnkey machine vision systems designed around the Xilinx® 7 series FPGAs. Such Xilinx-based turnkey designs are available today from Xilinx's machine vision partner, Sensor to Image GmbH (S2I).
This white paper describes multiple Xilinx solutions for customers to consider, fulfilling critical application requirements like reduced power consumption, lower overall system cost, and easy scalability as system load requirements eventually exceed their current trends.
The results of this creative collaboration of Xilinx and S2I provide industry-leading, end-to-end solutions for prospective customers, including camera- and frame-grabber hardware and solid, standards-compliant software that address the exciting machine vision challenges emerging in today’s market.
Many new applications—for example, the quality measurement of large flat-panel displays—are driving machine vision applications to require higher camera resolutions and higher frame rates to meet volume production needs. Other applications with similar demands for high-resolution, high-speed image capture include semiconductor wafer inspection, PCB inspection, postal and parcel identification, and many others. The expansion of all such applications results in greatly increased bandwidth demand.
As the need for machine vision is very wide — covering low-, mid-, and high-end systems — the focus of this white paper is on applications that require link speeds of 10 Gb/s aggregate bandwidth or greater. Designers of such systems must face highly challenging technological obstacles, and the Xilinx 7 series (including the Zynq® All Programmable SoC platform) addresses these challenges with powerful new technology. The scalable solutions discussed in this white paper give the designer a robust system platform whose performance has been proven to handle very high pixel rates over distances of up to 100m.
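To make the 10 Gb/s threshold concrete, the following minimal sketch estimates the raw video payload rate from resolution, bit depth, and frame rate. The sensor figures are hypothetical and chosen only for illustration; protocol overhead is ignored.

```python
def required_link_gbps(width_px, height_px, bits_per_pixel, frames_per_s):
    """Raw video payload rate in Gb/s, ignoring any protocol overhead."""
    return width_px * height_px * bits_per_pixel * frames_per_s / 1e9


# Hypothetical sensor (an assumption, not taken from the white paper):
# 4096 x 3072 pixels, 8-bit monochrome, 100 frames/s.
rate = required_link_gbps(4096, 3072, 8, 100)
print(f"{rate:.2f} Gb/s")  # prints "10.07 Gb/s"
```

A 12-megapixel sensor at 100 frames/s therefore already sits at the 10 Gb/s boundary that this paper targets.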
The three main aspects of overall system design were considered when creating the next generation of technology for machine vision connectivity: cost, performance, and power consumption (including remote powering and thermal considerations). The solutions detailed in this white paper meet these needs, and also give the designer a high level of confidence in the longevity of the solutions. Xilinx device families are known to have a long lifetime in the market: Xilinx products are typically available for more than ten years from initial production to end of life. Additionally, all the machine vision solutions presented here are implemented in compliance with well-accepted communications standards.
With the introduction of the Zynq-7000 All Programmable SoC family, designers can now support 10G+ connectivity technologies in an intelligent, programmable device that can run extensive, high-performance machine vision software, such as HALCON from MVTec. A system can be configured with the machine vision software in place, using the programmable logic (PL) of the Zynq device to accelerate advanced image processing, and coupling this capability to the high-performance ARM® dual Cortex™-A9 processing system (PS) within the Zynq device. This combination of technology can be used in the following compact vision system applications with Zynq devices:
- Embedded receiver with high-performance 10G+ connectivity
- Low-power, customizable, embedded platform with no need for a PC
- Long lifetime
High efficiency programmable platform:
- Accelerated vision processing in the Zynq device’s PL
- High-performance serial processing up to 1 GHz in the Zynq device's PS
Established Connectivity Solutions for Machine Vision Systems
In standard machine vision systems, images are captured by a machine vision camera, which includes the CMOS/CCD imager and pre-processing of the image; these images are then transferred in real time (with low latency) to a PC-based frame grabber or embedded frame grabber (compact video system). See Figure 1.
Figure 1: Basic Machine Vision System
Such systems rely heavily on well-established interfacing technologies such as GigE Vision, Camera Link, FireWire, and USB 2.0, and on newer standards such as USB3 Vision. These standards meet the needs of the majority of low- and mid-end applications quite well, providing enough usable link bandwidth to transfer images reliably to the frame grabber.
One other important consideration is the need in some applications to position the camera at considerable distance from the frame grabber/vision processing setup. For example, standards such as Camera Link and USB3 Vision offer very high transfer rates of 5+ Gb/s, but distance is limited to a few meters (in the absence of special, expensive cabling). This is where new machine vision standards, such as CoaXPress (CXP) and GigE Vision V2, supporting 10G bandwidth on Ethernet, are providing a solution that can support cable distances in the range of 100m and beyond, offering greater flexibility for the system integrator of the vision system.
Another alternative for high-speed communication over copper cable is Camera Link HS, which can provide 10G connectivity over multiple pairs (four); the distance capability of this technology, however, is limited to 15m. The cabling solution for this standard is based on CX-4 Infiniband, which is expensive compared to other copper solutions, and is best suited to a stationary application; otherwise, reliability can be compromised. As with GigE Vision, it is also possible to use fiber optic cables and extend the supported distance to beyond 300m.
10G GigE Vision, a part of GEV2.x, is now supporting any bit rate and cabling/bundling technique supported by Ethernet standards. For example, 10G Ethernet technology with four-lane link aggregation (LAG) can use a QSFP+ connector today for distances greater than one kilometer. Of course, a trade-off between cost and performance must be made.
The requirements and capabilities of these and other connectivity solutions are shown in Table 1.
Table 1: Comparison of Machine Vision Connectivity Standards
The remainder of this white paper focuses on these new, higher-bandwidth technologies.
Xilinx already possesses technology that provides in excess of 28G per transmit/receive wire pair with Virtex®-7 devices. These can be combined to provide super-high bandwidth links of 100 Gb/s. Such technology is currently aimed at the very high end of applications, but it is likely to become more mainstream as the technology matures and new generations of Xilinx all-programmable devices are introduced into the market.
10G+ Technologies: High-Performance Vision Systems
Technologies such as CoaXPress and GigE Vision 2.x at 10 Gb/s (abbreviated to 10G GigE Vision) have been described already as available options for addressing high-bandwidth needs. Alternative solutions are considered next, along with the possible concerns of trying to scale up to higher bandwidth from what these technologies easily allow. For example, it is possible to use link aggregation (LAG) within the GigE Vision standard to combine 1G links together to produce a 2G link, but scaling above that level would prove to be difficult for designers.
Without high-speed communication links, such as CXP and 10GE Vision, system designers are forced to transfer the image over many more links to get the required throughput, thus creating further challenges and considerations, as summarized in Table 2.
Table 2: Using Multiple Links to Attain High Throughput
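As a rough illustration of how the link count grows when only slower links are available, the sketch below divides a target throughput by a nominal per-link payload rate and rounds up. The per-link figures are nominal assumptions (the CXP6 value is the 6.25 Gb/s line rate after 8b/10b encoding), and protocol overhead is ignored.

```python
import math


def links_needed(required_gbps, usable_gbps_per_link):
    """Smallest whole number of links that can carry the required rate."""
    return math.ceil(required_gbps / usable_gbps_per_link)


required = 10.0  # Gb/s target throughput, matching the focus of this paper

# Nominal per-link payload rates in Gb/s (assumptions for illustration only).
link_types = [
    ("1G Ethernet", 1.0),
    ("CXP6 lane", 5.0),      # 6.25 Gb/s line rate x 8/10 encoding efficiency
    ("10G Ethernet", 10.0),
]

counts = {name: links_needed(required, rate) for name, rate in link_types}
for name, n in counts.items():
    print(f"{name}: {n} link(s)")
# prints:
# 1G Ethernet: 10 link(s)
# CXP6 lane: 2 link(s)
# 10G Ethernet: 1 link(s)
```

Every additional link adds a cable, a connector, and a transceiver, which is where the cost and complexity concerns arise.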
Designers requiring high-throughput rates in their machine vision equipment find that systems such as CXP and 10GE Vision meet the needs of high-bandwidth video transfer without the complexity and cost of multiple cables and connectors, and can support a platform that is easily adapted and scaled to meet rapidly changing market requirements. Due to the continuing rise in pixel rate demands, it is important to design such systems with an element of “future proofing.” For example, CXP offers support for adding additional links to support 25 Gb/s (even in a single hybrid cable), and 10 GigE Vision allows for dual-port LAG.
Furthermore, to simplify the integration (and lower the cost) of 10 GigE Vision, the Xilinx Kintex®-7 and Virtex-7 families support direct connection to SFP+/CFP fiber optic modules using Xilinx's 10GBase-R IP block.
These technologies have reduced pre-processing requirements at the camera, and images can be transmitted without complex image processing needed to reduce bandwidth. All image processing can be handled in the frame grabber/PC, thus reducing the footprint of the overall system.
Example Systems with 10G+ Technology
The figures in this section illustrate how example systems benefit from the available high-speed technologies that are described in this white paper. Such systems are realized with Xilinx 7 series All Programmable technologies; they are readily available from S2I for technology evaluation and include licensed IP building blocks or full product implementation through Design Services.
Xilinx 7 Series Families
Xilinx 7 series All Programmable technologies address the scalable needs of high-performance systems with a range of devices that can be used to produce a flexible platform with IP and function portability between different camera types. Xilinx addresses the need for low-cost systems with the Artix®-7 family of devices, scalable to the high-performance Kintex-7 family, and even up to the Virtex-7 family for the very highest level of performance and integration — for example, 100G.
The Artix-7 Family
The Artix-7 family is the first to provide low-end FPGAs with high-speed serial connectivity as a standard feature on all the family’s devices, from the smallest to the largest. With transceiver support for 6.6 Gb/s, the devices are well suited for low-cost cameras, including CoaXPress and low-cost frame grabbers supporting up to 12 channels, where 4 transceiver channels can be used in conjunction with the hard PCI Express® block to provide PCIe Gen2 connectivity.
Offering the lowest power of the 7 series family with integrated temperature- and power-rail monitoring and an in-built System Monitor function, the Artix-7 family is an ideal choice for power-challenged designers of machine vision cameras.
With high-performance logic that can be easily clocked in excess of 30% faster than Spartan®-6 devices, the Artix-7 family offers a wide range of logic resources for most designs, ranging from 35K logic cells (LCs) to 200K LCs.
The Kintex-7 Family
The Kintex-7 family is ideally suited for high-performance (Virtex-6 device class) frame-grabber solutions for CoaXPress and Camera Link implementations. Kintex-7 devices deliver the best cost-to-performance ratio, supporting up to 478K LCs. A peak DSP performance of 2,845 GMAC/s provides the designer with extensive vision-processing performance, and on-chip memory of up to 34 Mb assists with intensive DSP processing and low-latency buffering.
The low-power 28 nm HPL process used to manufacture Kintex-7 devices makes them well suited as a platform for the highest-performance, high-bandwidth cameras; their serial connectivity of up to 32 transceivers at 12.5 Gb/s provides solid, high-end connectivity support for camera interfacing and frame-grabber receiver links, as well as PCIe support to Gen2 x8 with hardened, power-optimized layer 2 functionality.
The Virtex-7 Family
Virtex-7 FPGAs contain up to 2 million logic cells and provide more than 5 TMAC/s of DSP throughput, providing the highest vision-processing level for Smart Video applications. These resources enable massively parallel data processing architectures that can perform more work with each clock cycle. With up to 88 advanced serial transceivers, Virtex-7 FPGAs can be utilized to provide more than 4 Tb/s of serial bandwidth.
With the highest performance transceivers in FPGA technology at the 28 nm node, up to 28 Gb/s, this family can support advanced 100G Ethernet solutions with minimal footprint and board space with advanced optical transceivers.
Up to 16 lanes of integrated PCIe Gen3 provides the components needed to create high-performance frame grabbers.
CoaXPress (CXP) and 10G+ Systems
CoaXPress (CXP) is an asymmetric high-speed, point-to-point serial communication standard for transmitting video and still images over RG59-type 75Ω coaxial cable (with a lower-speed control channel on the uplink). Originally specified by a consortium of camera and frame-grabber vendors, it was adopted by, and is now maintained by, the Japan Industrial Imaging Association (JIIA). It has been approved as an international standard through joint approval by the Automated Imaging Association (AIA), European Machine Vision Association (EMVA), and JIIA.
CXP delivers advantages (e.g., greater bandwidth) over competitive technologies, while addressing other vital requirements such as reach, determinism, robustness, ease of upgrade, low complexity, and low cost.
Figure 2 illustrates the main building blocks of a CXP system.
Figure 2: CXP Single-Connection Detail
Previously, Camera Link has provided the most bandwidth (850 MB/s, 10 tap mode). As sensor technology continues to evolve, however, this data bandwidth is unlikely to be sufficient for the new generation of larger and faster image sensors.
Camera Link is often criticized for having a maximum cable length (without the added burden of repeaters) of only 10m. Even this distance cannot be supported at the maximum operating speed of 85 MHz. CXP, on the other hand, uses coaxial cabling and new transceiver technology to cover distances over 100 meters without the need for repeaters.
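The 850 MB/s Camera Link figure quoted above can be reproduced from the tap count and pixel clock. This is a minimal sketch that assumes 8 bits per tap in the 10-tap mode; the function name is illustrative only.

```python
def camera_link_mb_per_s(taps, bits_per_tap, pixel_clock_mhz):
    """Peak Camera Link payload rate in MB/s for a given tap configuration."""
    bits_per_s = taps * bits_per_tap * pixel_clock_mhz * 1e6
    return bits_per_s / 8 / 1e6


# 10 taps at the 85 MHz maximum clock; 8 bits per tap is an assumption.
peak = camera_link_mb_per_s(taps=10, bits_per_tap=8, pixel_clock_mhz=85)
print(f"{peak:.0f} MB/s = {peak * 8 / 1000:.1f} Gb/s")  # prints "850 MB/s = 6.8 Gb/s"
```

At 6.8 Gb/s, the fastest Camera Link configuration falls short of the 10 Gb/s class of links discussed in this paper.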
Gigabit Ethernet (used by GigE Vision) has a cable range similar to that of CXP, but it lacks the low-latency and low-jitter trigger characteristics of CXP. These characteristics are required for applications that demand deterministic image capture and camera control.
The real achievable link distance depends on:
- The desired link rate
- The cable size (diameter)
- The manufacturer of the cable
EqcoLogic, supplier of the PHY (EQCO62T20/R20), provides an extensive overview of the different combinations at http://www.eqcologic.com/products2.asp?title=EQCO62T/R20&PID=8.
Such systems are available using Xilinx 7 series FPGA technology. Figure 3 shows an example of what can be integrated into an Artix-7T device. The diagram to the right depicts what goes in the FPGA in terms of building blocks (HDL).
Figure 3: 4-Lane CXP Camera Using EqcoLogic PHY Implemented on an Artix-7T FPGA and IP Package from S2I
When designing high-performance cameras, engineers can be faced with challenges concerning issues of:
- Memory interface performance
Key features in the low-cost Xilinx Artix-7T FPGAs address these challenges:
Low-power, highly integrated system
- Implementation of 32-bit RISC soft controller (Xilinx MicroBlaze™ embedded processor), image signal processing blocks, and imager interfacing in one device
- Maximum power provided over CXP cable is 13W (but camera manufacturers are trying to stay below 3W). Xilinx Artix-7T devices help achieve this by leveraging the latest low-power 28 nm technology
- XADC (AMS) System Monitor tracks PCB voltages and temperature for high reliability
- Due to the extensive I/O capabilities of the Artix-7 family, it is possible to interface to a wide variety of imager types (CCD, CMOS) with a variety of interface standards (LVDS, MIPI-CSI-2)
- Multiple memory interface support (DDR2/3 up to 1,066 Mb/s), as needed, for buffering and image processing
Integrated high-speed transceivers (GTPs), up to 6.6 Gb/s per lane
- Highest-performance transceivers in a low-end FPGA family
- Interface to the CXP EqcoLogic PHY
- CXP Tx core optimized for FPGA resources (see Table 3) available from S2I
Table 3: Artix-7 Device Resource Table, S2I Implementation: CXP Camera Supporting Two Lanes at 6.25 Gb/s (CXP6)
1. This design includes a MicroBlaze processor.
CXP IP for the transmit (Tx) function, normally a camera, is available today from S2I, supporting the functional blocks called for by the machine vision standard. This implementation is optimized for Xilinx technology, providing high performance while using minimal programmable logic resources. The main functional blocks are shown in Figure 4.
Figure 4: Block Diagram of CXP Tx IP
Such a two-channel design can provide some additional advantages in terms of effective, high-speed camera design. The effective throughput for a two-channel CXP6 solution is about 9.5 Gb/s while operating at around 200 mW of interface power.
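The throughput figure above can be related to the CXP6 line rate with a short calculation. The sketch assumes the 8b/10b line encoding used by CoaXPress at this speed grade; the protocol overhead and energy per bit are derived here from the quoted numbers and are not stated in the original paper.

```python
LINE_RATE_GBPS = 6.25          # CXP6 line rate per lane
ENCODING_EFFICIENCY = 8 / 10   # 8b/10b encoding (assumed for CXP6)
LANES = 2

# Upper bound on payload once encoding overhead is removed.
payload_ceiling_gbps = LANES * LINE_RATE_GBPS * ENCODING_EFFICIENCY

quoted_effective_gbps = 9.5    # effective throughput quoted in the text
quoted_interface_power_w = 0.2 # about 200 mW, as quoted in the text

# Share of the payload ceiling consumed by packet and protocol overhead.
protocol_overhead = 1 - quoted_effective_gbps / payload_ceiling_gbps

# Interface energy spent per transferred bit, in picojoules.
energy_pj_per_bit = quoted_interface_power_w / (quoted_effective_gbps * 1e9) * 1e12

print(f"payload ceiling: {payload_ceiling_gbps:.1f} Gb/s")
print(f"implied protocol overhead: {protocol_overhead:.0%}")
print(f"interface energy: {energy_pj_per_bit:.1f} pJ/bit")
```

Under these assumptions, the two lanes offer a 10 Gb/s ceiling, so the quoted 9.5 Gb/s corresponds to roughly 5% protocol overhead.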
The CXP connectivity footprint is highly optimized for board space and cost. The physical interface can be supported in a PCB space envelope of around 2 cm x 1 cm; in comparison, an optical Ethernet solution using an SFP+ module cage requires a PCB space envelope of around 3 cm x 1 cm.
To get a system view from the hardware perspective, the CXP link’s frame grabber (Rx) must be considered. Typical frame grabbers support multiple camera connections and require very high bandwidths; therefore, they are likely to require an FPGA family offering higher levels of performance. Xilinx Kintex-7 FPGAs offer designers a highly integrated, low-power solution with support for:
- Up to 32 transceiver inputs
- Multiple very-high-bandwidth memory controllers at 1,866 Mb/s
- PCIe Gen2 x8 integrated block for high-speed PC connectivity
See Figure 5.
Figure 5: CXP System Block Diagram with 7 Series
As with the Tx IP, S2I supports a CXP receiver (Rx) IP block, which can be implemented in a single- or multi-channel configuration, with the latter being more typical in a frame-grabber implementation. Figure 6 details the main functional blocks in the Rx IP. See Table 4 for device resource information.
Figure 6: CXP Rx IP Block Diagram
Table 4: Kintex-7 Device Resource Table, S2I Implementation: CXP Receiver Supporting Four Lanes at 6.25 Gb/s (CXP6)
1. Resource estimation is only for the interface IP.
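A quick check shows why a PCIe Gen2 x8 host interface is a reasonable match for a four-lane CXP6 receiver. The sketch uses the standard PCIe Gen2 rate of 5 GT/s per lane with 8b/10b encoding and ignores packet (TLP) overhead, so the headroom it reports is an upper bound.

```python
ENCODING_EFFICIENCY = 8 / 10  # 8b/10b, used by both PCIe Gen2 and CXP6 (assumed)


def pcie_gen2_gbps(lanes):
    """PCIe Gen2 payload ceiling in Gb/s, before TLP/packet overhead."""
    return lanes * 5.0 * ENCODING_EFFICIENCY


def cxp6_gbps(lanes):
    """CXP6 payload ceiling in Gb/s, before CXP protocol overhead."""
    return lanes * 6.25 * ENCODING_EFFICIENCY


host_gbps = pcie_gen2_gbps(8)     # 32 Gb/s, i.e. 4 GB/s
camera_gbps = cxp6_gbps(4)        # 20 Gb/s
headroom_gbps = host_gbps - camera_gbps

print(f"PCIe Gen2 x8: {host_gbps:.0f} Gb/s, 4x CXP6: {camera_gbps:.0f} Gb/s, "
      f"headroom: {headroom_gbps:.0f} Gb/s")
```

The remaining margin absorbs PCIe packet overhead and leaves room for control traffic.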
Given the functionality provided by the hardware blocks, the design must be supported with the necessary software components. CoaXPress allows easy interfacing between cameras and frame grabbers because the protocol is well defined and supports the GenICam software standard. The GenICam software standard is already well established for connectivity standards such as GigE Vision, Camera Link, etc. It is independent of the physical connectivity layer, and therefore makes it much easier for camera designers to swap between different connectivity standards.
Figure 7 shows an example of the software components required to provide full system support (Tx and Rx) for CXP. The component diagram is based on a PC frame grabber (Rx); the blocks shown are available from S2I as part of their portfolio of CXP offerings for Xilinx 7 series technology.
Figure 7: CXP System Software Components from S2I
GigE Vision and 10G+ Systems
Ethernet-based systems are becoming commonplace in many market areas due to their high scalability (10 Mb/s to 100 Gb/s) and economies of scale, which help drive down cost. Machine vision is another industry that has been quick to leverage the advantages of Ethernet-based technology.
10G GigE Vision, a part of GEV2.x, is now supporting any bit rate and cabling/bundling technique supported by Ethernet standards. It thereby presents an excellent opportunity for the industry to piggyback on both established and emerging communication technologies, while maintaining strict conformance to the GenICam software standard across required camera ranges.
One distinct additional advantage of Ethernet-based systems: they typically do not demand a frame grabber card at the PC; a standard Ethernet NIC and cabling can be used in most cases. This means that the Rx part of the vision link provides a significant cost advantage. The Tx section, of course, must still be designed as a necessarily costlier module, because optical systems for 10G and above still command a premium over legacy Gigabit Ethernet PHY solutions.
High-performance 10 GigE Vision cameras can be implemented on the Xilinx Kintex-7 family. This FPGA family provides the ideal platform for designs requiring both very-high-performance FPGA fabric (in excess of 400 MHz) and high-speed transceiver support that allow direct connectivity to fiber-optic SFP+ modules.
Key features of the high-performance, cost-optimized FPGAs in the Kintex-7 family address the designer's challenges:
Low-power, highly integrated system
- Implementation of one or more 32-bit RISC soft controllers (MicroBlaze CPUs), image signal processing blocks, and imager interfacing in one device
- Camera manufacturers get a solution that stays within Power over Ethernet (PoE) budgets. The Kintex-7 family achieves this by leveraging the latest 28 nm low-power technology while providing high-end FPGA performance
- XADC (AMS) System Monitor tracks PCB voltages and temperature for high reliability
- The 10 GigE Vision IP core (available from S2I) is optimized in terms of FPGA resources. See Table 5 for further details
- Due to the extensive I/O capabilities of the Kintex-7 family, it is possible to interface to a wide variety of imager types (CCD, CMOS) with a variety of interface standards (LVDS, MIPI-CSI-2)
- Multiple memory interface support (DDR2/3 up to 1,866 Mb/s), as needed, for buffering and image processing
Integrated high speed transceivers (GTXs), up to 12.5 Gb/s per lane
- Highest performance transceivers in a mid-range FPGA family
- Direct connection to SFP+ fiber module, removing the need for external XAUI to SFP+ PHY
- 10 GigE Vision core optimized for FPGA resources (see Table 5) available from S2I
Table 5: Kintex-7 Device Resource Table, S2I Implementation: 10G GigE Vision Camera
Note: This design may appear larger than an equivalent CXP design. However, the current design contains a soft RISC CPU (MicroBlaze embedded processor and ARM AXI bus infrastructure) implemented to run the basic Ethernet protocols like ARP, DHCP, and the GEV Control Protocol. With no additional FPGA resources, this soft RISC CPU can be easily extended with FTP and WWW servers under Xilinx PetaLogix Linux. Such an implementation leverages the desirable benefits of an Ethernet-based standard system, including the hardware and software components necessary for seamless integration of video streaming with standard Ethernet functionality at bandwidths greater than 1 GB/s.
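The claim of video bandwidths above 1 GB/s can be sanity-checked by estimating how much of a 10 Gb/s Ethernet link remains for image data after per-packet overhead. This sketch uses standard Ethernet, IPv4, and UDP overheads; the 8-byte streaming-protocol header is an assumption, and the function name is illustrative only.

```python
def stream_payload_gbytes_per_s(mtu_bytes, line_rate_gbps=10.0, stream_header_bytes=8):
    """Approximate image payload rate in GB/s for UDP streaming over Ethernet.

    stream_header_bytes is the per-packet streaming header (assumed 8 bytes).
    """
    ip_udp_bytes = 20 + 8           # IPv4 header + UDP header
    eth_overhead = 8 + 14 + 4 + 12  # preamble/SFD + MAC header + FCS + inter-frame gap
    payload = mtu_bytes - ip_udp_bytes - stream_header_bytes
    on_wire = mtu_bytes + eth_overhead
    return line_rate_gbps / 8 * payload / on_wire


standard = stream_payload_gbytes_per_s(1500)  # standard frames
jumbo = stream_payload_gbytes_per_s(9000)     # jumbo frames

print(f"MTU 1500: {standard:.2f} GB/s")  # prints "MTU 1500: 1.19 GB/s"
print(f"MTU 9000: {jumbo:.2f} GB/s")     # prints "MTU 9000: 1.24 GB/s"
```

Under these assumptions, both frame sizes leave more than 1 GB/s for image data, with jumbo frames recovering most of the remaining overhead.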
When looking at this example for a 10 GE Vision camera, it can be seen that a high-performance solution can be easily implemented, leveraging standard Ethernet physical-layer components over a single fiber. Such components are commonplace in the world of communication networking, and as such can be purchased at a reasonably low cost (e.g., 10G SFP fiber module and pluggable cage).
The main advantages of adopting Ethernet as the communication medium are:
- Off-the-shelf 10G NIC can be used, so no frame grabber is required
- Multi-mode fiber is both robust in high-noise environments and cost-effective over long distances
- Kintex-7T FPGA's embedded transceivers connect directly to SFP+ modules with no additional PHY
These are important considerations when weighing the pros and cons of the adopted technology for high-speed machine vision systems.
The Xilinx and S2I complete hardware-and-software platform, which meets the industry standard for machine vision systems, can be seen in Figure 8.
Figure 8: 10G GigE Vision System Software and Hardware Components from S2I
High-Performance 10G+ System Using Zynq-7000 SoC
In non-PC applications requiring high-bandwidth image processing support — e.g., Embedded Compact Vision Systems (CVS) — the overall system power can be dramatically reduced by removing the PC and replacing it with dedicated embedded modules highly customized to the application. The Xilinx Zynq-7000 All Programmable SoC family can be used in such applications.
A block diagram of the Zynq-7000 All Programmable SoC is shown in Figure 9.
Figure 9: Xilinx Zynq-7000 All Programmable SoC Block Diagram
For embedded applications, Xilinx has introduced the new Zynq-7000 family, a high-performance processing/analytical platform that supports:
- Up to 1 GHz ARM Dual Cortex-A9 processor system (PS) with hardened peripherals very tightly coupled to the 28 nm class of FPGA programmable logic (PL) fabric. Both Artix-7 and Kintex-7 performance are supported in the FPGA, as well as pin compatibility and scalability between the two
- A Linux-based symmetrical (SMP) or asymmetrical (AMP) multiprocessing system, with support for MVTec's powerful machine vision library HALCON, is available. This platform can also be combined with Silicon Software's Visual Applets tool flow to allow software engineers to target their vision-based algorithms into the FPGA fabric at levels of performance (hardware acceleration) beyond that of conventional DSPs and microprocessors
- High-performance transceivers for CoaXPress or 10GE Vision embedded frame grabbing
High-speed DDR2/3 memory support in the PS and PL:
- Up to 32-bit 1,333 Mb/s DDR3 in PS
- Up to 128-bit, 1,866 Mb/s in PL
- Additionally, industrial Ethernet support may be needed for systems integrating high-performance vision systems and industrial networking in automation systems. Xilinx SoC technology can be used to support multi-standard industrial networking protocols in a single-chip integrated solution: e.g., EtherCAT, PROFINET, Ethernet Powerlink.
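The DDR3 figures listed above translate into peak memory bandwidth as follows. This is a minimal sketch of theoretical peak rates; sustained bandwidth is lower in practice, and the frame-buffer traffic estimate (one write plus one read per pixel) is an assumption for illustration.

```python
def ddr_peak_gbytes_per_s(bus_width_bits, mbps_per_pin):
    """Theoretical peak DDR bandwidth in GB/s (bus width x per-pin data rate)."""
    return bus_width_bits * mbps_per_pin / 8 / 1000


ps_peak = ddr_peak_gbytes_per_s(32, 1333)    # PS: 32-bit at 1,333 Mb/s
pl_peak = ddr_peak_gbytes_per_s(128, 1866)   # PL: 128-bit at 1,866 Mb/s

# A 10 Gb/s video stream written to and read back from a frame buffer
# (assumed access pattern) needs twice the stream rate.
stream_gbytes_per_s = 10.0 / 8
frame_buffer_traffic = 2 * stream_gbytes_per_s

print(f"PS DDR3 peak: {ps_peak:.2f} GB/s")             # prints "PS DDR3 peak: 5.33 GB/s"
print(f"PL DDR3 peak: {pl_peak:.2f} GB/s")             # prints "PL DDR3 peak: 29.86 GB/s"
print(f"frame buffer: {frame_buffer_traffic:.2f} GB/s")  # prints "frame buffer: 2.50 GB/s"
```

On these numbers, a single buffered 10 Gb/s stream uses under half of the PS peak and a small fraction of the PL peak.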
The availability of the MVTec HALCON machine vision library on the Zynq-7000 platform means that one of the market’s most powerful and established machine vision software libraries can be integrated into a powerful and flexible embedded system, as shown in Figure 10. With performance being critical in such systems, the ability to accelerate machine vision preprocessing in the PL of the Zynq-7000 SoC using Silicon Software's Visual Applets tool means that a new level of embedded performance can be realized.
Figure 10: Zynq-7000 Platform CVS incorporating HALCON and Visual Applets
Xilinx has demonstrated high-performance systems of this kind at industry shows, and a video can be seen on this YouTube page (http://www.youtube.com/watch?v=vyBfKvis2lY) that showcases accelerated vision processing with Visual Applets and HALCON beyond 90 frames/s in a machine vision part inspection application. This demonstration proves the power of combining closely coupled programmable logic with the dual ARM Cortex-A9 processing system, accelerating image processing by approximately 20x over processor-only approaches.
The rapidly growing demands for high-performance machine vision systems are pushing design engineers of cameras and frame grabbers to look for new, efficient, cost-effective ways to realize high-bandwidth connectivity and processing.
In partnership with S2I, the Xilinx 7 series All Programmable device families offer 10G+ solutions that are ready for production implementation and satisfy these market needs.
The solutions highlighted in this paper offer options to machine vision designers for high-bandwidth systems, allowing them to weigh up the choice of technology in terms of cost, space, and power consumption.
Such system solutions can be offered in IP block format or as a turn-key solution from Xilinx Alliance partner S2I.
Please contact Xilinx for further information and queries.