Due to its all-weather data acquisition capability, high-resolution spaceborne Synthetic Aperture Radar (SAR) plays an important role in remote sensing applications such as change detection. However, because of the complex geometric mapping of buildings in urban areas, SAR images are often hard to interpret. SAR simulation techniques ease the visual interpretation of SAR images, while fully automatic interpretation is still a challenge. This paper presents a method for supporting the interpretation of high-resolution SAR images with simulated radar images using a LiDAR digital surface model (DSM). Line features are extracted from the simulated and real SAR images and used for matching. A single building model is generated from the DSM and used for building recognition in the SAR image. An application of the concept is presented for the city centre of Munich, where the comparison of the simulation to the TerraSAR-X data shows good similarity. Based on the results of simulation and matching, special features (e.g. double-bounce lines, shadow areas) can be automatically indicated in the SAR image.
Related works reported in the literature mostly compare two SAR images with the same imaging geometry for change detection. In comparison to LiDAR-SAR change detection, SAR-SAR change detection requires neither modeling of the real world nor assumptions about reflection parameters, and thus avoids the modeling errors which may affect change detection results. However, traditional SAR-SAR change detection also has limitations. On the one hand, the interpretation of the detected changes is difficult, especially for very high resolution data: the methods may only provide pixels with decreased or increased signals, and interpreting these detected changed pixels is still a challenging task. Recently, some articles (Ferro et al. 2013; Marin et al. 2015) have presented methods to address this problem, but the proposed methods work only for specific buildings (e.g., isolated rectangular buildings). The second limitation is the requirement of SAR images with the same imaging geometry. This requirement might not be fulfilled if we want to use the first available SAR data in a crisis situation. For example, the TerraSAR-X satellite has a repeat cycle of 11 days (which can fulfil the same imaging geometry) and a global access time of at most three days (which cannot guarantee the same imaging geometry).
A variety of algorithms have been published in the literature that aim at the detection and reconstruction of buildings from SAR images. Typically, most of the developed approaches rely on auxiliary information, e.g., multi-sensor data provided by optical and LiDAR sensors, Geographic Information System (GIS) data such as 2-D building footprints, multi-dimensional data such as polarimetric SAR (PolSAR), or multi-view/multi-aspect data such as interferometric SAR (InSAR). These approaches improved the feature extraction process by providing complementary information. To our knowledge, the literature using only a single SAR image in the context of building detection is rather sparse. Among the few existing approaches, Quartulli and Datcu employed an automatic stochastic algorithm to reconstruct buildings from a single SAR intensity image by modeling the strong signals originating from dihedral scattering at the bottom and the layover at the roof edges of the building. Zhao et al. proposed a building detection method based on a marker-controlled watershed algorithm. A similar approach that exploited layover and double-bounce echoes to detect and determine the number of buildings from a single high-resolution image has also been reported. Ferro et al. also developed a method that was primarily based on extracting a set of low-level bright (lines) and dark (shadows) primitives. Chen et al. introduced a more recent 1-D range detector to determine 2-D building footprints. The method can potentially reconstruct simple symmetrical building footprints but might fail for scenes containing more complex non-symmetrical building shapes.
With the launch of the German TerraSAR-X system in June 2007, a new generation of high-resolution spaceborne SAR data became available. This opens new perspectives and challenges for the automatic interpretation of urban environments. A rich information content, previously hidden or not clearly distinguishable in low-resolution images, such as urban structures (small buildings, vehicles, etc.), is now disclosed. However, only suitable approaches can automatically retrieve this new detailed information. In fact, inside urban areas, the electromagnetic scattering is characterized by a variety of single or multiple scattering mechanisms with a wide range of scattering amplitudes. This makes interpretation and information extraction over such areas from SAR images quite complex.
Modern spaceborne synthetic aperture radar (SAR) sensors, such as TerraSAR-X/TanDEM-X and COSMO-SkyMed, can deliver very high resolution (VHR) data beyond the inherent spatial scales (on the order of 1 m) of buildings, constituting an invaluable data source for large-scale urban mapping. Processing these VHR data with advanced interferometric techniques, such as SAR tomography (TomoSAR), enables the generation of 3-D (or even 4-D) TomoSAR point clouds from space. In this paper, we present a novel and generic workflow that exploits these TomoSAR point clouds to automatically produce benchmark annotated (buildings/non-buildings) SAR datasets. These annotated datasets (building masks) have been utilized to construct and train state-of-the-art deep fully convolutional neural networks with an additional conditional random field represented as a recurrent neural network to detect building regions in a single VHR SAR image. The building detection results are illustrated and validated over a TerraSAR-X VHR spotlight SAR image covering approximately 39 km² – almost the whole city of Berlin – with a mean pixel accuracy of around 93.84%.
This dissertation contributes a pixel-based algorithm to detect increased backscattering in SAR images by analyzing the SAR pixel values according to simulated layers. To detect demolished buildings, simulated images are generated using LiDAR data. Two comparison operators (normalized mutual information and joint histogram slope) are used to compare image patches related to the same buildings. An experiment using Munich data has shown that both of them provide an overall accuracy of more than 90%. A combination of these two comparison operators using decision trees improves the result. The fourth objective is to detect changes between SAR images acquired with different incidence angles. For this purpose, three algorithms are presented in this dissertation. The first algorithm is a building-level algorithm based on layer fill. Image patches related to the same buildings in the two SAR images are extracted using simulation methods. For each extracted image patch pair, a change ratio based on the fill ratio of building layers is estimated. The change ratio values of all buildings are then classified into two classes using the EM algorithm. This algorithm works well for buildings of different sizes and shapes in complex urban scenarios. Since the whole building is analyzed as one object, buildings with partly demolished walls may not be detected. Following the same idea, a wall-level change detection algorithm was developed. Image patches related to the same walls in the two SAR images were extracted and converted to the same geometry. These converted patch pairs are then compared using change ratios based on fill ratio or fill position. Lastly, the wall change results are fused to provide a building change result. Compared to the building-level change detection algorithm, this method is more time consuming, but yields better results for partly demolished buildings.
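The normalized mutual information operator mentioned above can be sketched in a few lines of numpy. This is a minimal illustration, not the dissertation's exact implementation: it assumes 8-bit intensity patches and uses the common definition NMI = (H(A) + H(B)) / H(A, B), where the entropies come from a joint histogram of the two patches.

```python
import numpy as np

def normalized_mutual_information(a, b, bins=32):
    """Normalized mutual information NMI = (H(A) + H(B)) / H(A, B)
    between two equally sized image patches with values in [0, 255]."""
    hist, _, _ = np.histogram2d(a.ravel(), b.ravel(),
                                bins=bins, range=[[0, 256], [0, 256]])
    pxy = hist / hist.sum()          # joint probability table
    px = pxy.sum(axis=1)             # marginal of patch A
    py = pxy.sum(axis=0)             # marginal of patch B

    def entropy(p):
        p = p[p > 0]                 # ignore empty bins
        return -(p * np.log(p)).sum()

    return (entropy(px) + entropy(py)) / entropy(pxy.ravel())

rng = np.random.default_rng(0)
patch = rng.integers(0, 256, (64, 64))
# A patch compared with itself yields the maximum NMI of 2.0;
# unrelated patches yield values close to 1.0.
self_nmi = normalized_mutual_information(patch, patch)
```

Identical patches produce a diagonal joint histogram, so the joint entropy equals the marginal entropy and the ratio is exactly 2; dissimilar patches spread the joint histogram and push the value toward 1.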
A combination of these two algorithms is therefore suggested, whereby the building-level method is used for all buildings and the wall-level method additionally for selected large buildings. The third developed algorithm is a wall-level change detection algorithm based on point-feature location. To this end, local maximum points in the two SAR images corresponding to the same building façade are compared. This method provides promising results for the present data. It may work even better for future data with increased resolution, to detect changes of detailed façade structures.
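The EM-based two-class step described for the building-level algorithm can be sketched as follows. This is a hedged illustration with synthetic data, assuming scalar change ratios per building and a standard two-component 1-D Gaussian mixture; the dissertation's exact formulation may differ, and all names here are illustrative.

```python
import numpy as np

def em_two_class(x, n_iter=100):
    """Fit a two-component 1-D Gaussian mixture to the change ratios x
    via EM and return a boolean 'changed' label per building."""
    x = np.asarray(x, dtype=float)
    mu = np.array([x.min(), x.max()])          # spread the initial means
    var = np.array([x.var(), x.var()]) + 1e-9
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        d = (x[:, None] - mu) ** 2
        p = pi * np.exp(-0.5 * d / var) / np.sqrt(2 * np.pi * var)
        r = p / p.sum(axis=1, keepdims=True)
        # M-step: update mixture weights, means and variances.
        nk = r.sum(axis=0)
        pi = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * d).sum(axis=0) / nk + 1e-9
    # The component with the larger mean is interpreted as 'changed'.
    return r.argmax(axis=1) == mu.argmax()

# Toy data: most buildings have low change ratios, a few have high ones.
rng = np.random.default_rng(0)
ratios = np.concatenate([rng.normal(0.1, 0.03, 50),
                         rng.normal(0.8, 0.05, 5)])
changed = em_two_class(ratios)
```

On this toy data the mixture separates cleanly, flagging only the five high-ratio buildings as changed.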
This might be due to signal diffraction at the tower crossbeams, which is not considered in the simulation process. Compared to the real SAR case, objects in the neighborhood of the tower are neglected, since only the ground underneath the tower has been modelled by means of a flat plane. Hence, overlay effects with other objects are not obtained in the simulation but are clearly visible in the real SAR image, where features C to G are overlaid with diffuse backscattering contributions caused by vegetation in the park surrounding the tower. Further double-bounce effects caused by the interaction between the tower and the surrounding ground show up at the bottom (I) and at both sides of the tower (J, K). While effect (I) is also apparent in the SAR image, effects (J) and (K) likely disappear because buildings adjacent to the tower prevent any backscattering to the sensor. Geometric and radiometric interpretation is supported by using the bounce level information to display different bounce levels in separate image layers (Figure 14b and 14c). While single bounce and double bounce cause strong intensities and hence clear structures in the reflectivity map, contributions of higher bounce levels are weakened by the decreasing weight factor of the ray followed through the modelled scene (see Section III-B). For this reason, threefold, fourfold and fivefold bounces are not displayed, since their contributions show too little contrast to the background.
The visual perception of layover varies significantly, while the intensity distributions tend to follow log-normal distributions (even for local maxima, as a rough approximation). The polarization mode HH dominates VV in intensity (mean, median) but also tends to show stronger intensity variation (standard deviation). In contrast, no tendency with respect to the proportion between HH and VV is observed in the number of prominent point signatures. As a conclusion, distribution-based methods for layover analysis seem to be favored, which may optionally be supported by feature-based concepts, e.g., focused on lines or point-like signatures. The prominent appearance of facade layover motivates the identification of the related SAR image parts in order to extract building-related information. In this context, a concept has been proposed for identifying building layover based on simulation methods including CityGML data. To this end, a simulation processing chain has been extended to fuse TerraSAR-X images and prior information provided by CityGML data sets. In this context, the data transformation from CityGML to the POV-Ray data structure used by the simulator is fully automated using the spatial ETL (extract, transform, and load) software FME. As a first example, a case study of the Munich city center has been shown where the extent of simulated building layover is directly superposed on a geocoded TerraSAR-X image. The simulation concept based on CityGML data indicates that the representation of real-world entities by semantic objects has a number of advantages over the purely geometric representation in a DSM. In particular, change detection applications may benefit from CityGML data sets if they are kept up to date and assigned with meta information.
As a result of the imaging concepts, the off-nadir perspectives of the sensors lead to different and contrary distortion effects in the image data, especially for 3-D objects like buildings or trees. Considering this fundamental difference in imaging, it is clear that a straightforward overlay of high-resolution optical data and SAR data is only possible for scenes without elevated objects. However, typical scenes of interest are often composed of man-made structures, vegetation, and a varying ground level. On the one hand, this prohibits the application of classical, pixel-based image fusion methods, which are well established for remote sensing imagery with similar characteristics. On the other hand, the matching of corresponding image parts, which is often needed for image registration or stereogrammetry, becomes a non-trivial task. By proposing the multi-sensor simulation framework SimGeoI (simulator of geo-referenced interpretation layers), this paper provides the basis for the joint exploitation of SAR and optical data for objects of interest. In more detail, SimGeoI considers the geometric projection effects pertinent to both sensors, which allows for the extraction of corresponding image parts in optical and SAR images despite the difference in sensor type and perspective. Based on that, the image parts aligned through SimGeoI can be used in subsequent data fusion steps, which focus on an object-based (where objects refer to, e.g., individual building facades) rather than a pixel-based analysis. This actual fusion step, however, is not within the scope of this paper.
providing artificial images and interpretation layers for optical data. The name SimGeoI (Simulator of Geo-referenced Interpretation Layers) refers to the integrated framework, where the type of simulation and scene interpretation is triggered by the input information. The first input is the image meta file, corresponding to a geo-referenced optical image or SAR image with UTM coordinates. The file contains the necessary information for defining the simulation set-up (sensor perspective, image properties, scene average height) and is interpreted automatically. The second input is a scene DSM in UTM coordinates, which is required to generate the scene model for ray tracing.
The staggered SAR concept overcomes this drawback by continuously varying the PRF along the track, thus shifting the blind ranges across the swath (lower right of Fig. 2). HRWS imaging is thereby achieved without the need for a long antenna with multiple apertures. This concept has already been analysed, showing that a 350 km wide swath can be imaged with high resolution, although the proposed PRF variation schemes lead to a degradation of the azimuth impulse response, which in some cases even depends on the azimuth position of the target. In this paper, a new PRF variation scheme is proposed which overcomes these drawbacks. Moreover, the proposed scheme is able to ensure an overlap between the transmitted pulses and the echoes coming back from nadir, thus avoiding the image quality degradation associated with the incomplete suppression of the latter. Systems with continuously varied PRF require that a uniformly sampled signal be recovered from a non-uniformly sampled one. This is usually done by interpolation. In this paper, multi-channel reconstruction is exploited as well. The performance is finally evaluated by comparing different processing approaches. Finally, the effects associated with the use of a continuously varied PRI are artificially introduced into real TerraSAR-X data acquired over Borneo, Indonesia, showing the resulting image degradation.
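The interpolation step mentioned above can be illustrated with a toy example. This is only a sketch: the PRI values and the 50 Hz azimuth tone are invented for illustration, and simple linear interpolation (`np.interp`) stands in for the more sophisticated reconstruction approaches (e.g., multi-channel reconstruction) compared in the paper.

```python
import numpy as np

# Simulated staggered-PRI azimuth sampling: the pulse repetition interval
# varies cyclically, so the sample times are non-uniform (toy values).
pri_cycle = np.array([1.00, 1.05, 0.95, 1.10, 0.90]) * 1e-3  # seconds
t_nonuniform = np.cumsum(np.tile(pri_cycle, 40))
signal = np.sin(2 * np.pi * 50.0 * t_nonuniform)  # 50 Hz azimuth tone

# Recover a uniformly sampled signal on a regular grid at the mean PRF.
t_uniform = np.arange(t_nonuniform[0], t_nonuniform[-1], pri_cycle.mean())
resampled = np.interp(t_uniform, t_nonuniform, signal)
reference = np.sin(2 * np.pi * 50.0 * t_uniform)
```

For a signal that is slowly varying relative to the sample spacing, the linear interpolant stays close to the true uniformly sampled signal; the paper's point is precisely that better interpolation/reconstruction schemes reduce the residual error further.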
Such methods are likely to fail, as a high-level semantic description should be considered for better discrimination. Therefore, we propose to use an image-patch-based analysis method for SAR image interpretation. Inspired by the ratio edge detector, in this letter a new feature extraction method, represented by the mean ratios in different directions, is proposed for VHR SAR image content characterization. Based on the mean ratio, two simple yet powerful and robust features are proposed for SAR image patch indexing. The first is a bag-of-words model using not only the basic statistics, i.e., local mean and variance, but also the mean ratios in different directions. The second is an adaptation of the Weber local descriptor to SAR images, substituting the gradient with the ratio of mean differences in the vertical and horizontal directions. To evaluate the proposed features, image patch indexing based on active learning is performed using a SAR image database consisting of high-resolution TerraSAR-X patches. Comparison with state-of-the-art features, particularly texture features, has shown improved performance for SAR image categorization.
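The mean-ratio idea behind these features can be sketched as follows. This is a hedged illustration of the general ratio-of-local-means principle used by SAR ratio edge detectors, not the letter's exact feature definition: here the "directions" are simply the two halves of a patch split vertically or horizontally, and the ratio is normalized to [0, 1] by taking min/max.

```python
import numpy as np

def mean_ratio(patch, direction="vertical"):
    """Ratio of the local means of the two halves of a patch,
    normalized to [0, 1] as in ratio-based SAR edge detection."""
    h, w = patch.shape
    if direction == "vertical":      # compare upper vs lower half
        m1, m2 = patch[:h // 2].mean(), patch[h // 2:].mean()
    else:                            # compare left vs right half
        m1, m2 = patch[:, :w // 2].mean(), patch[:, w // 2:].mean()
    return min(m1, m2) / max(m1, m2)

# A patch with a horizontal edge: bright upper half, dark lower half.
patch = np.vstack([np.full((8, 16), 200.0), np.full((8, 16), 50.0)])
v = mean_ratio(patch, "vertical")     # low ratio: an edge is present
hz = mean_ratio(patch, "horizontal")  # ratio 1.0: halves are identical
```

Because it is a ratio of means rather than a difference, the response is invariant to multiplicative scaling of the intensities, which is why ratio detectors suit the multiplicative speckle of SAR data.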
At the beginning, a coarse-to-fine annotation scheme was proposed, which was initially implemented according to urban operation functionalities and then hierarchically divided into more detailed categories (see Fig. 2). The dataset, comprising 33,358 SAR image patches (i.e., image chips) with a size of 100 × 100 pixels each, supports 10 different functional urban types. The design of the OpenSARUrban dataset follows the idea of annotation transition from the optical domain to the SAR domain. Owing to five essential properties, namely large scale, diversity, specificity, reliability, and sustainability, OpenSARUrban achieves several goals. The first is to support urban target characterization analysis. The second is to foster applicable and advanced classification algorithms for Sentinel-1 urban targets. The third is to explore content-based image retrieval of this kind of data. The visualization of this dataset is performed from the perspective of manifolds via a combination of fast compression distances (FCDs) and t-distributed stochastic neighbor embedding (t-SNE), which offers an intuitive way to understand the structure of the given dataset. For image classification on this dataset, some representative benchmarking algorithms are provided.
The organization of this paper is as follows: Section II presents the basic TerraSAR-X products and, based on their characteristics, proposes a configuration of the TerraSAR-X product to be used for evaluation. Using this configuration, two test data sets are built and described separately in the subsequent sections for each use case. Section III explains in detail: 1) the preprocessing of the data before being used for evaluation; 2) the feature extraction methods to be applied, such as statistical methods (GLCM) and spectral methods (e.g., GAFs, QMFs, and NSFT); and, finally, 3) the methodology to be used for the evaluation of the two use cases. The first use case (see Section IV) investigates the dependence on resolution, pixel spacing, and PF (for all four methods) for a multiresolution pyramid and proposes a configuration of the data that can be used for further evaluation. This configuration refers to the type, mode, and geometric resolution of the product and the features to be used to describe the data. The second use case (see Section V) gives the optimal patch size, orbit direction, and incidence angle for a higher classification accuracy. Using the knowledge from the first use case (e.g., standard product, RE, and GAF or QMF features), we built a new data set in order to find: 1) the optimal patch size for products in SM mode or HS mode; 2) the orbit direction, considering the same product (e.g., SM mode), the same area, and the optimal patch size previously defined; and 3) the incidence angle, for the same orbit direction (e.g., ascending/descending), mode (e.g., HS mode), and optimal patch size. At the end of this section, we will have an idea of which product configuration to use and which parameters (orbit direction and incidence angle) to select. The paper ends with conclusions and future work.
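The GLCM features mentioned as one of the statistical methods can be sketched by hand. This is a minimal numpy illustration for a single pixel offset (libraries such as scikit-image provide a full `graycomatrix` implementation with multiple distances and angles); only the Haralick contrast statistic is shown, and the toy images are invented for illustration.

```python
import numpy as np

def glcm(img, dx=1, dy=0, levels=8):
    """Gray-level co-occurrence matrix for a single pixel offset (dx, dy),
    normalized to a joint probability table."""
    g = np.zeros((levels, levels))
    h, w = img.shape
    for y in range(h - dy):
        for x in range(w - dx):
            g[img[y, x], img[y + dy, x + dx]] += 1
    return g / g.sum()

def glcm_contrast(g):
    """Haralick contrast: sum over (i, j) of p(i, j) * (i - j)^2."""
    i, j = np.indices(g.shape)
    return (g * (i - j) ** 2).sum()

# A constant image has zero contrast; a stripe pattern does not.
flat = np.zeros((16, 16), dtype=int)
stripes = np.tile([0, 7], (16, 8))   # alternating 0/7 columns
c_flat = glcm_contrast(glcm(flat))
c_stripes = glcm_contrast(glcm(stripes))
```

Other Haralick statistics (homogeneity, energy, correlation) are computed from the same normalized matrix, which is why GLCM extraction is a common first step for SAR texture description.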
SAR images and optical images are usually very different in their geometric behavior, which results in different positions of objects even after orthorectification. The reason is that the images are normally orthorectified using a DEM of low accuracy (e.g. derived from the Shuttle Radar Topography Mission (SRTM)) in which high buildings are not represented. However, for some special acquisition geometries, the SAR and optical images are very similar. An optical image from WorldView-1 and a SAR image from TSX have been shown with a similar displacement of objects. The prerequisite is that the two sensors fly in the same direction and that the sum of the incidence angles is approximately 90 degrees.
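The 90-degree condition can be made plausible with elementary geometry. As a hedged sketch (illustrative angle and height values, not taken from the source): the ground-range displacement of an elevated point in a SAR image scales with h / tan(θ_SAR), while the relief displacement in an oblique optical image scales with h · tan(θ_opt). Since tan(90° − θ) = 1 / tan(θ), the two magnitudes coincide when θ_SAR + θ_opt = 90°.

```python
import math

def sar_layover_shift(h, theta_sar_deg):
    """Ground-range displacement of a point at height h in a SAR image
    (displaced toward the sensor): h / tan(incidence angle)."""
    return h / math.tan(math.radians(theta_sar_deg))

def optical_relief_shift(h, theta_opt_deg):
    """Relief displacement of a point at height h in an oblique optical
    image (displaced away from nadir): h * tan(view angle)."""
    return h * math.tan(math.radians(theta_opt_deg))

h = 30.0  # building height in metres (illustrative)
d_sar = sar_layover_shift(h, 35.0)
d_opt = optical_relief_shift(h, 55.0)  # 35 + 55 = 90 degrees
```

With matching flight directions the displacements also point consistently, which is why such image pairs can be overlaid despite elevated objects.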
When we attach text labels to elements of images that explain their content or meaning, we call this step “labeling” or “semantic annotation.” If this annotation can be automated, we have a tool for automated image content understanding, and many researchers have tried to make progress in this field. Traditionally, the conception and performance of semantic annotation have been verified on publicly available collections of typical image data sets. Most of them have been built by individual groups with the intention of solving specific problems. Thus, for the different domains of multimedia (e.g., press and television archives) and remote sensing applications, the available data sets contain a large variety of retrievable objects and categories. While most multimedia applications aim at the recognition of single objects in front of a mostly irrelevant background (a cow on a meadow), typical remote sensing applications call for the identification of land cover or sea surface details covering the full image area (icebergs
The registration of SAR images with different viewing geometries or from different sensors is a difficult topic and an area of ongoing research. An approach quite similar to ours is suggested by Auquière et al., 1998. They perform a simple transformation, i.e. a (2D) shift, applied to the PRI images. Läbe and Förstner, 2006 used the SIFT operator (Lowe, 2004) for the automatic orientation of aerial images, which is still a topic of research even for optical images. The focus of this paper is the development of a mostly automatic procedure for the precise geocoding of near real-time satellite SAR images. In the first part of the paper we explain an automatic approach for the generation of ground control points (GCPs) for precise orbit determination. It is based on image-to-image matching with a newly developed feature extraction algorithm from computer vision (Lowe, 2004). After GCPs are found, a least squares adjustment of the orbit parameters is performed. In the second part of the paper we present results for this registration method and show the found matches in detail.
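The least squares adjustment step can be illustrated with a deliberately simplified model. The paper adjusts orbit parameters, which is considerably more involved; the sketch below assumes a pure 2-D translation between matched GCP positions and image positions, for which the least squares solution reduces to the mean residual. All data here are synthetic.

```python
import numpy as np

def estimate_shift(pts_ref, pts_img):
    """Least squares estimate of a global 2-D shift mapping matched
    image points onto their reference (GCP) positions. For a pure
    translation model the LS solution is the mean of the residuals."""
    return (np.asarray(pts_ref) - np.asarray(pts_img)).mean(axis=0)

rng = np.random.default_rng(1)
ref = rng.uniform(0, 1000, (20, 2))        # reference GCP positions
img = ref - np.array([12.5, -3.0])         # image shifted by (12.5, -3)
img += rng.normal(0, 0.1, img.shape)       # small matching noise
shift = estimate_shift(ref, img)
```

Averaging over many matches suppresses the per-match noise; in the real procedure, outlier matches from the feature extractor would additionally be screened, e.g. by robust estimation, before the adjustment.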
It should be noted that the pulse extension effect itself is not visible in the NESZ calculation. This motivates the definition of the pulse extension loss as a separate quantity, of special interest for narrow-beam systems such as those using SCORE. In this case, the degradation was found to be non-critical in terms of the fulfilment of the requirements. A compensation of this effect is, however, possible if necessary, by separating the signal into RF sub-bands and applying sub-band-dependent beamforming weights. This technique is referred to as dispersive frequency beamforming and exploits the variation of the RF frequency of the chirp in time, which leads to a correspondence between the instantaneous frequency and the delay of the locally illuminated area on the ground. The potential implications for the hardware are the need to implement additional filters and an increase in the number of digital data streams to be handled by the on-board electronics.
Maryam Soheili Majd, Elisabeth Simonetto & Laurent Polidori, Le Mans, France
Keywords: PolSAR, urban area, supervised classification, maximum likelihood

Summary: In this work, our aim is to assess the potential of a single polarimetric radar image of high spatial resolution for the classification of urban areas. For that purpose, we concentrate on a fine definition of urban land cover types, including ground classes corresponding to different roof types, and we test several supervised classification algorithms. In particular, we deal with maximum likelihood classification using several polarimetric and textural indices. At first, we propose a state-of-the-art statistical analysis of polarimetric synthetic aperture radar (SAR) data to study the statistical behaviour of these indices. We consider the Gauss, log-normal, Beta 1, Weibull, Gamma, K, and Fisher statistical models and estimate their parameters using two methods: maximum likelihood estimation (MLE) and the method of log-moments (MoLM). The Fisher probability density function (pdf) is able to properly model all the descriptors. Then, we propose to introduce this information into an adapted supervised classification scheme based on maximum likelihood and the Fisher pdf. We compare the classification results with the Wishart-based maximum likelihood algorithm, a Gaussian-based one, and an SVM (support vector machine). Our experiments are based on an image of a suburban area acquired by the airborne RAMSES SAR sensor of ONERA, the French Aerospace Lab. The results highlight the potential of such data to discriminate urban land cover types, and the overall accuracy reaches 84%. However, the results from the tested classification methods show a problematic confusion between roofs and trees. Some possible solutions are discussed at the end of this paper.
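The method of log-moments mentioned above estimates distribution parameters from moments of ln(x) rather than of x. As a hedged sketch for the simplest case only: for the log-normal model, the log-moment estimates are just the mean and standard deviation of ln(x) (and coincide with the MLE); the paper applies MoLM to several further models, including the Fisher pdf, where the log-moment equations are more involved.

```python
import numpy as np

def molm_lognormal(x):
    """Method-of-log-moments estimate of the log-normal parameters:
    mu and sigma are the mean and standard deviation of ln(x)."""
    lx = np.log(x)
    return lx.mean(), lx.std()

rng = np.random.default_rng(2)
samples = rng.lognormal(mean=1.5, sigma=0.4, size=100_000)
mu_hat, sigma_hat = molm_lognormal(samples)
```

Working with ln(x) is natural for SAR intensities because multiplicative speckle becomes additive in the log domain, which is also why log-moment estimators tend to be more robust than ordinary moment estimators for heavy-tailed SAR statistics.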
V. FREQUENCY DEPENDENCE OF COHERENCE

The assumption that there is no variation of the coherence across the spectrum may not hold in some cases, for example in the azimuth direction because of the antenna pattern. In this case an additional degradation is expected, since the signal quality is worst where it is most useful, i.e. at the edges of the spectrum. In this scenario the optimum estimator has to weigh each frequency component according to its position in the spectrum and the noise level. Introducing a dependence of the coherence matrix on the angular frequency (Γ = Γ(ω)), it is possible to obtain an expression for the Fisher information: