
Image processing

In document Highly Automated Vehicle Systems (Pages 45-48)

Chapter 3. Environment Sensing (Perception) Layer

4. Image processing

After image acquisition with the image sensor, image processing begins. Computer vision systems share a set of typical processing steps that generally cover the operation of automotive cameras.

The following listing, based on [34], details these steps, complemented with automotive-specific notes.

Pre-processing – Before a computer vision method can be applied to image data in order to extract some specific piece of information, it is usually necessary to process the data in order to assure that it satisfies certain assumptions implied by the method. Examples are

• Re-sampling in order to assure that the image coordinate system is correct.

• Noise reduction in order to assure that sensor noise does not introduce false information.

• Contrast enhancement to assure that relevant information can be detected.

• Scale space representation to enhance image structures at locally appropriate scales.
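The noise-reduction and contrast-enhancement steps above can be sketched in a few lines of NumPy. The 3×3 mean filter and the linear contrast stretch below are minimal illustrative stand-ins for the production-grade filters an automotive camera pipeline would actually use; the test image is synthetic.

```python
import numpy as np

def mean_filter3x3(img):
    """Simple noise reduction: replace each pixel with the mean of its
    3x3 neighbourhood (borders handled by edge padding)."""
    h, w = img.shape
    padded = np.pad(img.astype(float), 1, mode="edge")
    out = np.zeros((h, w))
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out += padded[1 + dy : 1 + dy + h, 1 + dx : 1 + dx + w]
    return out / 9.0

def stretch_contrast(img, lo=0.0, hi=255.0):
    """Linear contrast stretch: darkest pixel maps to lo, brightest to hi."""
    img = img.astype(float)
    mn, mx = img.min(), img.max()
    if mx == mn:                       # flat image: nothing to stretch
        return np.full_like(img, lo)
    return (img - mn) / (mx - mn) * (hi - lo) + lo

# Tiny synthetic grey image with one "sensor noise" spike
frame = np.full((5, 5), 100.0)
frame[2, 2] = 200.0
smoothed = mean_filter3x3(frame)       # spike averaged down towards 100
enhanced = stretch_contrast(smoothed)  # full 0..255 range restored
```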

Feature extraction – Image features at various levels of complexity are extracted from the image data.

Typical examples of such features are

• Lines, edges. Primarily used in lane, road sign and object detection functions.

• Localized interest points such as corners.

• More complex features may be related to shape or motion.
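As a simple illustration of line/edge extraction, the sketch below applies the classic 3×3 Sobel filters with plain NumPy on a synthetic image of a bright lane marking; a real camera would run an optimized implementation in dedicated hardware.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def filter3x3(img, kernel):
    """Valid-mode 3x3 filtering (output is 2 px smaller per axis)."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * img[dy : dy + h - 2, dx : dx + w - 2]
    return out

def edge_magnitude(img):
    """Gradient magnitude from horizontal and vertical Sobel responses."""
    gx = filter3x3(img.astype(float), SOBEL_X)
    gy = filter3x3(img.astype(float), SOBEL_Y)
    return np.hypot(gx, gy)

# Synthetic frame: dark road (0) on the left, bright lane marking (255) right
img = np.zeros((6, 6))
img[:, 3:] = 255.0
mag = edge_magnitude(img)   # strong response only along the vertical edge
```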

Detection/segmentation – At some point in the processing a decision is made about which image points or regions of the image are relevant for further processing. Examples are

• Segmentation of one or multiple image regions which contain a specific object of interest. Typically used in night vision to separate the ambient light sources from the vehicle lights.
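A crude version of the night-vision segmentation mentioned above is a plain brightness threshold. The sketch below (threshold and image values are made up for illustration) produces a mask of bright pixels and the bounding box of the detected region:

```python
import numpy as np

def segment_bright_spots(img, threshold=200.0):
    """Boolean mask of pixels brighter than the threshold — a crude
    stand-in for separating vehicle lights from ambient light."""
    return img > threshold

def bounding_box(mask):
    """(top, left, bottom, right) of all marked pixels, or None if empty."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return (ys.min(), xs.min(), ys.max(), xs.max())

night = np.full((8, 8), 30.0)   # dark ambient scene
night[2:4, 5:7] = 250.0         # a 2x2 patch of bright headlamp pixels
mask = segment_bright_spots(night)
box = bounding_box(mask)
```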

High-level processing – At this step the input is typically a small set of data, for example a set of points or an image region which is assumed to contain a specific object. The remaining processing deals with, for example:

• Verification that the data satisfy model-based and application specific assumptions.

• Estimation of application specific parameters, such as object size, relative speed and distance.

• Image recognition – classifying a detected object into different categories. (E.g. passenger vehicle, commercial vehicle, motorcycle.)

• Image registration – comparing and combining two different views of the same object in stereo cameras.
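The object-size and distance estimation mentioned above can, for a mono camera, be approximated with the pinhole camera model: an object of known real width W that appears w pixels wide at focal length f (in pixels) is at distance Z = f · W / w. The numbers below (focal length, car width, pixel width) are hypothetical:

```python
def estimate_distance(focal_px, real_width_m, pixel_width):
    """Pinhole model: Z = f * W / w (focal length f given in pixels)."""
    return focal_px * real_width_m / pixel_width

# Hypothetical values: 1200 px focal length, a 1.8 m wide passenger car
# appearing 54 px wide in the image
z = estimate_distance(1200, 1.8, 54)
```

This only works when the object class (and hence its real width) has already been recognized, which is why it sits in the high-level processing stage.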

Decision making – Making the final decision required for the application. These decisions can be classified as mentioned in Section 1.3, for example:

• Inform about road sign

• Warn of the lane departure

• Support ACC with lanes, objects' speed and distance (adding complementary information to the radar)

• Intervene with steering or braking in case of lane departure
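A decision stage such as the lane departure example can be sketched as a simple threshold ladder; the limits below are invented for illustration and are not taken from any real system:

```python
def lane_departure_action(lateral_offset_m,
                          warn_limit=0.3, intervene_limit=0.5):
    """Illustrative decision ladder over the measured lateral offset:
    inform below warn_limit, warn between the limits, intervene above.
    All thresholds are hypothetical."""
    off = abs(lateral_offset_m)
    if off >= intervene_limit:
        return "intervene"   # e.g. corrective steering or braking
    if off >= warn_limit:
        return "warn"        # e.g. acoustic/haptic lane departure warning
    return "inform"
```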

5. Applications

Automotive cameras can be classified in several ways. The most important aspects are the location (front, rear), the colouring (monochrome, monochrome + 1 colour, RGB) and the spatiality (mono, stereo).

The rear cameras are usually used for parking assistant functions, whilst the front cameras can provide several functions such as:

• Object detection

• Road sign recognition

• Lane detection

• Vehicle detection and environment sensing in the dark for headlight control

• Road surface quality measurement

• Supporting and improving radar-based functions (Adaptive Cruise Control, Predictive Emergency Braking System, Forward Collision Warning)

• Traffic jam assist, construction zone assist

The colouring capabilities influence the reliability and precision of some camera functions. Most of the main functions can be implemented with a mono camera, which detects only the intensity of light at each pixel. Even a single additional colour channel, however, can significantly improve performance: with red-sensitive pixels, for example, road sign recognition becomes more reliable.

The mono or stereo design of the camera has a strong influence on 3D vision, which is important for measuring the distances of objects and for detecting imperfections in the road surface.
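The reason stereo design matters for 3D vision is the disparity-to-depth relation Z = f · B / d: depth follows from the pixel disparity d between the two views, the focal length f (in pixels) and the baseline B between the imagers. A minimal sketch with hypothetical values:

```python
def stereo_depth(focal_px, baseline_m, disparity_px):
    """Depth from stereo disparity: Z = f * B / d."""
    return focal_px * baseline_m / disparity_px

# Hypothetical parameters: 1200 px focal length, 0.12 m baseline
z_near = stereo_depth(1200, 0.12, 12.0)   # large disparity -> near object
z_far  = stereo_depth(1200, 0.12, 2.0)    # small disparity -> far object
```

Because Z is inversely proportional to d, the depth error grows roughly quadratically with distance, which is why stereo cameras quote a bounded 3D measurement range.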

Figure 3.13. Bosch Multi Purpose Camera (Source:

http://www.bosch-automotivetechnology.com/)

Figure 3.14. Bosch Stereo Video Camera (Source: http://www.bosch-automotivetechnology.com/)

To deal with the aforementioned tasks, the output of the imager is processed by a high-performance CPU-based control unit. The CPU is often supported by an FPGA, which is fast enough to perform the pre-processing and feature extraction tasks.

The output of the camera can be divided into two classes, which also means two different architectures:

• Low-level (image) interfaces: analogue, LVDS (low-voltage differential signalling)

• High-level (identified objects description): CAN, FlexRay

In the first case the imager with the lens and the control unit are installed in different places inside the vehicle, whilst in the second case all of the necessary parts are integrated into a common housing and placed behind the windshield. This integrated approach doesn't require a separate control unit.

As an example of a modern automotive camera the Stereo Video Camera of Bosch has the following main properties. Its two CMOS (complementary metal oxide semiconductor) colour imagers have a resolution of 1280 x 960 pixels. Using a powerful lens system, the camera records a horizontal field of view of 45 degrees and offers a 3D measurement range of more than 50 meters. The image sensors, which are highly sensitive in terms of lighting technology, can process very large contrasts and cover the wavelength range that is visible to humans.
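From the quoted specification (1280 pixels of horizontal resolution over a 45-degree field of view), a back-of-the-envelope angular resolution and an equivalent pinhole focal length can be derived. These derived figures are our own estimates, not Bosch data:

```python
import math

h_fov_deg = 45.0     # horizontal field of view from the quoted spec
h_pixels = 1280      # horizontal resolution from the quoted spec

# Average angular resolution: roughly 0.035 degrees per pixel
deg_per_px = h_fov_deg / h_pixels

# Equivalent focal length in pixels for a simple pinhole model:
# half the sensor width subtends half the field of view
f_px = (h_pixels / 2) / math.tan(math.radians(h_fov_deg / 2))
```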

This stereo camera enables a variety of functions that have already been mentioned before. The complete three-dimensional recording of the vehicle's surroundings also provides the basis for the automated driving functions of the future. (Source: [35])

Compared to the aforementioned Bosch Stereo Camera the following table shows the main properties of the Continental CSF200 mono camera.

Maximum range: 40 m … 60 m

Digital imaging and computer vision have become increasingly important on the road towards the fully autonomous car. However, the imagers in these cameras face a hard constraint: the absence of light. To overcome this shortcoming, several night vision systems have been developed over the past decades; they are summarized below, based on [36].

Nowadays two main technologies are available to vehicle manufacturers:

• Far Infrared (FIR) technology

• Near Infrared (NIR) technology

An FIR system is passive, detecting thermal radiation (wavelengths of around 8–12 µm). Warm objects emit more radiation in this region and thus have high visibility in the image. NIR systems use a near-infrared source to shine light with a wavelength of around 800 nm at the object and then detect the reflected illumination. The main advantage of NIR systems is their lower cost, because sensor technology at these wavelengths is already well developed for other imaging applications such as video cameras. NIR hardware can also potentially be combined with other useful functions such as lane departure warning.

In contrast, FIR systems offer superior range and pedestrian-detection capability, but their sensors cannot be mounted behind the windscreen or other glass surfaces. University of Michigan Transportation Research Institute studies comparing the ability of drivers to spot pedestrians using both NIR and FIR devices showed that under identical conditions, the range of an FIR system was over three times that obtained with NIR: 119 m compared with 35 m.

Table 3.2. Comparison of FIR and NIR systems (Source: Jan-Erik Källhammer)

Pros of NIR:

• Lower sensor cost

• Higher image resolution

• Potential for integration into other systems

Pros of FIR:

• Superior detection range

• Emphasizes objects of particular risk (for example, pedestrians and animals)

• Images with less visual clutter (unwanted features that may distract the driver)
