Automatic building reconstruction from Digital Surface Models (DSMs), with or without additional data sources, is still an active research area in photogrammetry and GIS institutions. In this context, providing a 3D CAD model that represents the overall shape of the building and contains its most significant parts has boosted many applications in the GIS area, such as urban planning. In the past few years, several algorithms have been proposed for automated 3D building reconstruction. These comprise methods that employ only elevation data, such as high-resolution airborne LIDAR, for model generation, while other methods use additional sources of data. An additional data source on top of the DSM is usually employed when the quality or resolution of the elevation data is not sufficient for model generation. Segmentation-based approaches for 3D building model generation from grid data are proposed by [1, 2] to find planar regions that determine a polyhedral model. Gorte [3] employed another segmentation approach using Triangulated Irregular Network (TIN) structures for the data, where segments are generated by iteratively merging triangles based on similarity measurements. Verma et al. [4] also proposed a method based on the integration of model-driven and data-driven algorithms to detect and construct 3D building models from LIDAR data. The method consists of three main steps, namely, segmentation of building roofs, a region adjacency graph for roof shape topology, and parametric roof reconstruction.
Unfortunately, building models derived from ALS data are restricted by the point spacing of the datasets. Hence, it is difficult to achieve high planimetric accuracy for a reconstructed scene. While ridge lines inherit good precision from the redundancy in the extracted plane intersection process, the precision of the outer perimeter of reconstructed roofs suffers from the limited point spacing. An alternative 3D modeling approach is to make use of aerial imagery [10–14]. Compared to laser scanning, optical imagery with its much higher spatial resolution allows for a more accurate extraction of building edges. On the other hand, a major shortcoming of image-based modeling relates to its 2D nature, requiring reliable image matching techniques to achieve automation. Moreover, common problems encountered in image processing (such as shadows, occlusions, and poor contrast) can hinder effective reconstruction. For the two aforementioned groups of methods that are based on one type of data, it is hard to obtain planimetric and height accuracy at the same level. All of these facts are the motivation for the third type of building reconstruction approach, which benefits from the synergetic properties of LiDAR and image data and uses both of these sources. Employing multiple data types enables the combination of modeling cues and covers shortcomings inherited from the acquisition technique. It can be expected that the limitations of one sensor (such as data gaps, occlusions, shadows, and resolution issues) will be compensated for by the information provided by the second sensor.
Reconstruction: The coarse outline of an individual roof segment is then reconstructed using the alpha shapes algorithm. Due to the varying and lower point density of TomoSAR points, alpha shapes only define a coarse outline of an individual building, which is usually rough and therefore needs to be refined/smoothed (or generalized). To this end, taking into account the average roof polygon complexity (APC), a regularization scheme based on either model fitting (i.e., minimum bounding ellipse/rectangle) or a quadtree is adopted to simplify the roof polygons obtained around each segmented (or distinct) roof segment. The simplified roof polygons are then tested for zig-zag line removal using the Visvalingam-Whyatt algorithm. Finally, a height is associated with each regularized roof segment to obtain the 3-D prismatic model of individual buildings. The proposed approach is illustrated and validated over scenes containing two large buildings in the city of Las Vegas using TomoSAR point clouds generated from a stack of 25 images using the Tomo-GENESIS software developed at the German Aerospace Center (DLR).
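The Visvalingam-Whyatt zig-zag removal mentioned above repeatedly drops the vertex whose "effective area" (the area of the triangle it forms with its two neighbours) is smallest, until all remaining vertices exceed an area threshold. A minimal pure-Python sketch, assuming an open polyline and an illustrative area threshold (the function names and threshold are not from the paper):

```python
def triangle_area(a, b, c):
    # area of the triangle spanned by three 2-D points (shoelace formula)
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0

def visvalingam_whyatt(points, min_area):
    """Remove interior vertices whose effective area is below min_area."""
    pts = list(points)
    while len(pts) > 3:
        areas = [triangle_area(pts[i - 1], pts[i], pts[i + 1])
                 for i in range(1, len(pts) - 1)]
        smallest = min(range(len(areas)), key=areas.__getitem__)
        if areas[smallest] >= min_area:
            break  # all remaining vertices are significant
        del pts[smallest + 1]  # drop the least significant vertex
    return pts
```

With a suitable `min_area`, near-collinear zig-zag vertices are removed while the dominant corners of the roof polygon survive.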
For the evaluation of our approach, we use WorldView-2 imagery of Munich, Germany. An overview of the area of interest (AOI) is illustrated in Figure 3. We subdivided the AOI into training, validation, and evaluation parts, keeping a margin of 320 px in between to ensure the independence of the datasets. The training data consist of the stereo DSM and PAN images, both at 0.5 m GSD, and 2 m GSD MS images with 8 channels, tiled into a collection of 32,500 patches with a size of 320 × 320 px and an overlap of 160 × 160 px, where 20% are kept back for validation and the rest used for training. A 1280 × 2560 px area, which does not overlap with the training data, is used for the testing phase (see Figure 4). The satellite images are orthorectified, because we want to obtain building footprints that appear as if viewed from nadir. In order to show the generalization capability of our model, we include small parts of WorldView-2 imagery of urban areas of Tunis, Tunisia (see Figure 5). To compensate for the missing ground truth in this area, we use building footprints from OSM. However, there are only a few areas which are densely covered by OSM building footprint data. The test regions are acquired by selecting areas where high-quality DSM data as well as OSM building footprints are available.
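The tiling scheme above (320 × 320 px patches with 160 × 160 px overlap) corresponds to sliding a window with a stride of patch size minus overlap. A small sketch of how such tile origins can be enumerated (the function name and signature are our own illustration, not the authors' code):

```python
def tile_offsets(width, height, patch=320, stride=160):
    """Top-left corners of patch x patch tiles over a width x height image,
    with stride = patch - overlap (here 320 - 160 = 160 px)."""
    return [(x, y)
            for y in range(0, height - patch + 1, stride)
            for x in range(0, width - patch + 1, stride)]
```

For a 640 × 640 px area this yields a 3 × 3 grid of overlapping patches.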
monitoring method is presented. This method combines multispectral imagery and DSMs from stereo matching to obtain three kinds of changes. The proposed method contains three basic steps. The first step is to segment the panchromatic images to obtain the smallest possible homogeneous regions. In the second step, based on a rule-based classification using change information from Iteratively Reweighted Multivariate Alteration Detection (IR-MAD) and height, the changes are classified into ruined buildings, new buildings, and changes without height change (mainly temporary residential areas, e.g., tents). In the last step, a region-based grey-level co-occurrence matrix texture measurement is used to refine the third change class. The method is applied to building change detection after the Haiti earthquake.
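The rule-based second step can be pictured as a small decision function over the two cues: a spectral change flag (from IR-MAD) and a height difference (from the DSMs). The sketch below is a toy illustration; the 2 m height threshold and the function name are our own assumptions, not values from the paper:

```python
def classify_change(spectral_change, height_diff, h_thresh=2.0):
    """Toy rule set mirroring the three change classes described above.
    spectral_change: bool from IR-MAD; height_diff: post - pre height in m.
    The h_thresh value is an illustrative assumption."""
    if not spectral_change:
        return "no change"
    if height_diff <= -h_thresh:
        return "ruined building"      # spectral change + height loss
    if height_diff >= h_thresh:
        return "new building"         # spectral change + height gain
    return "change without height change"  # e.g., tents; refined by texture
```

In the actual method, the third class is subsequently refined with the region-based GLCM texture measurement.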
The development of automatic methods for the reconstruction of buildings and other urban objects from synthetic aperture radar (SAR) images is of great practical interest for many remote sensing applications due to their independence from solar illumination and their all-weather capability. In addition, very high resolution (VHR) SAR images acquired from spaceborne sensors are capable of monitoring larger spatial areas at significantly reduced costs. These benefits have motivated many researchers, and therefore several methods have been developed that use SAR imagery for the detection and reconstruction of man-made objects, in particular buildings. For instance, (Quartulli, 2004) and (Ferro, 2009) present approaches for building reconstruction based on single-aspect SAR images. However, the use of single SAR images poses greater challenges, especially in dense urban areas where buildings are located close together, resulting in the occlusion of smaller buildings by higher ones (Wegner, 2009). To resolve this, interferometric SAR (InSAR) acquisitions are used, which implies imaging the area of interest more than once with different viewing configurations. (Gamba, 2000) proposed an approach that uses such an InSAR configuration to detect and extract buildings based on a modified machine vision approach. (Thiele, 2007) also presented a model-based approach that employs orthogonal InSAR images to detect and reconstruct building footprints. An automatic approach based on modeling building objects as cuboids using multi-aspect polarimetric SAR images is presented in (Xu, 2007). (Sportouche, 2011) and (Wegner,
In order to validate our approach, we run the algorithm on TomoSAR point clouds generated from TerraSAR-X high-resolution spotlight images. Fig. 1(b) shows the result of applying the SD estimation procedure. The two parameters r (radius of the neighborhood cylinder) and d are empirically set to 5 m and 2 m, respectively, according to the point density of the data set. One can observe that the TH value influences the number of extracted façade points. A lower TH value results in higher completeness but lower correctness. To extract lower façades and to automate the procedure, the threshold TH is set to the maximum of the SD histogram. This includes not only the façade points but also some non-façade points with relatively high SD, e.g., roof points. To reject these points from the set of extracted points after SD thresholding, surface normal information is utilized. Fig. 1(c) shows the extracted façade points obtained by retaining only those points having normals within ±15 degrees of the horizontal axis.
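The final normal-based rejection step keeps only points on near-vertical surfaces: the surface normal of a façade point is roughly horizontal, so its elevation angle above the horizontal plane must stay within the ±15° band. A minimal sketch of that filter (function name and data layout are our own assumptions):

```python
import math

def facade_points(points, normals, max_dev_deg=15.0):
    """Keep points whose surface normal lies within +/- max_dev_deg of the
    horizontal plane, i.e. points on near-vertical surfaces (facades)."""
    kept = []
    for p, (nx, ny, nz) in zip(points, normals):
        length = math.sqrt(nx * nx + ny * ny + nz * nz)
        # elevation angle of the normal above the horizontal plane
        elevation = math.degrees(math.asin(abs(nz) / length))
        if elevation <= max_dev_deg:
            kept.append(p)
    return kept
```

A roof point with a near-vertical normal (large |nz|) is rejected, while a façade point with a horizontal normal passes.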
This paper presents an approach that automatically (but parametrically) reconstructs 2-D/3-D building footprints using 3-D synthetic aperture radar (SAR) tomography (TomoSAR) point clouds. These point clouds are generated by processing SAR image stacks via SAR tomographic inversion. The proposed approach reconstructs the building outline by exploiting both the roof and façade points. Initial building footprints are derived by applying the alpha shapes method to pre-segmented point clusters of individual buildings. A recursive angular-deviation-based refinement is then carried out to obtain refined/smoothed 2-D polygonal boundaries. A robust fusion framework then fuses the information pertaining to building façades into the smoothed polygons. Afterwards, a rectilinear building identification procedure is adopted and constraints are added to yield geometrically correct and visually aesthetic building shapes. The proposed approach is illustrated and validated using TomoSAR point clouds generated from a stack of TerraSAR-X high-resolution spotlight images from an ascending orbit covering an approximately 1.5 km² area in the city of Berlin, Germany.
Both algorithms, line detection and building outline detection, result in sets of lines in both data sets, HSI and height data. Owing to the imperfect geo-referencing of the data and inaccurate line detection, a typical mismatch between the line segments exists. The range of this mismatch depends mostly on the accuracy of the geo-referencing. We propose a method which can also be applied to data where the relative geo-referencing error between the data sets amounts to 20-30 image pixels or even more. We implemented an automatic approach using a 3D accumulator for line matching. We use one of the data sets as a reference and move line segments from the second data set over the line segments from the first data set. For each position we check how many lines correspond to each other and fill the accumulator with this number. At the same time we store the line correspondences for this position. We repeat this procedure, rotating the line segments from the second data set by small angles in a range of a few degrees. This algorithm results in a 3D accumulator in which each cell holds the number of fitting lines together with the line correspondences assigned to it. We then search for the cell with the maximal number of correspondences in the accumulator and use the line correspondences assigned to this cell for the transformation parameter calculation. The correspondence between the lines for every position and angle of the accumulator is determined using statistical tests. We use homogeneous coordinates to formulate the problem. As shown in (Heuel, 2002), we calculate the distance vector d and test the hypothesis H₀
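The exhaustive accumulator search described above can be sketched as a brute-force scan over (dx, dy, angle) cells. The toy version below counts correspondences by midpoint proximity instead of the statistical tests of (Heuel, 2002); all names, the midpoint criterion, and the tolerance are our own simplifying assumptions:

```python
import math

def _mid(seg):
    (x1, y1), (x2, y2) = seg
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def match_lines(ref_segs, mov_segs, shifts, angles_deg, tol=0.5):
    """Brute-force 3D accumulator over (dx, dy, angle): each cell counts
    moving segments whose rotated+shifted midpoint falls within tol of a
    reference-segment midpoint. Returns the best cell and its count."""
    best_count, best_cell = -1, None
    ref_mids = [_mid(r) for r in ref_segs]
    for ang in angles_deg:
        t = math.radians(ang)
        c, s = math.cos(t), math.sin(t)
        mids = [_mid(((c * a[0] - s * a[1], s * a[0] + c * a[1]),
                      (c * b[0] - s * b[1], s * b[0] + c * b[1])))
                for a, b in mov_segs]
        for dx in shifts:
            for dy in shifts:
                count = sum(1 for mx, my in mids
                            if any(math.hypot(mx + dx - rx, my + dy - ry) <= tol
                                   for rx, ry in ref_mids))
                if count > best_count:  # keep the fullest accumulator cell
                    best_count, best_cell = count, (dx, dy, ang)
    return best_cell, best_count
```

The winning cell then provides the correspondences from which the transformation parameters are computed.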
In the present article, we propose an original method for object-based coregistration refinement of DSM and HSI images and compare it to line-based coregistration. Focusing on urban areas, we base the image-to-image coregistration on spatial features of building objects, which are described by outlines. For line-based coregistration, line segments are extracted independently of each other, whereas for object-based coregistration they are extracted using knowledge about the topology and regularities within approaches dealing with (entire) building outlines. Thus, building outlines represent an object and extend line-based approaches by incorporating topology knowledge (Avbelj, 2012). Furthermore, the geometric accuracy of the detected line segments of building outlines is taken into account. A stochastic model is introduced to compute optimal transformation parameters between the HSI and the DSM. The coregistration problem is formulated using homogeneous coordinates, as points and lines then have the same representation. The method enables registration of HSI and DSM of different spectral and spatial resolution.
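The homogeneous-coordinate formulation is convenient precisely because points and lines share one representation: the line through two homogeneous points is their cross product, and a point lies on a line exactly when their dot product vanishes (and dually for line intersections). A minimal sketch of these two standard operations (function names are ours):

```python
def join(p, q):
    """Line through two homogeneous 2-D points (or, dually, the intersection
    point of two homogeneous lines): the cross product p x q."""
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def incident(x, l, eps=1e-9):
    """A homogeneous point x lies on line l iff their dot product vanishes."""
    return abs(x[0] * l[0] + x[1] * l[1] + x[2] * l[2]) < eps
```

This symmetry is what lets a single stochastic model handle both point and line observations in the coregistration.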
direct measurements (e.g. in contrast to the UltraCam camera with its integrated INS/GPS system in the form of the UltraNAV system). To solve this problem we developed a software package that is tailored to UAV imagery, the MAVMAP software. The key steps of this software are automatic wide-baseline image matching, outlier-robust camera pose estimation, and 3D feature point triangulation. The last step is a bundle adjustment robust to the outliers still remaining from mis-matches. Automatic feature matching, in particular the matching of wide-baseline images, almost always produces outliers from mis-matches. All pose estimation algorithms need to be robust against such outliers. In MAVMAP this is achieved using robust estimators, e.g. RANSAC, and robust cost functions. In addition to robust algorithms, constraints from the data of an inertial measurement unit (IMU) are also used. Basically all UAVs are equipped with accelerometers and gyroscopes that form an IMU for flight control. Due to scale and weight limits, these systems cannot be used to track the full pose of the UAV, but it is possible to measure the orientation of the UAV for each image shot. This orientation can be used as an additional constraint in pose computation and makes the optimization more robust against outliers. These constraints can be used for pose optimization by minimizing the cost function in Eq. 1.
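Since Eq. 1 itself is not reproduced here, the following is only an illustrative stand-in for the kind of objective described: a robust (Huber) data term over reprojection residuals plus a soft prior pulling the estimated orientation toward the IMU measurement. All names, the Huber choice, and the weighting are our own assumptions, not MAVMAP's actual implementation:

```python
def robust_pose_cost(reproj_residuals, est_orientation, imu_orientation,
                     w_imu=1.0, delta=1.0):
    """Sketch: Huber-robustified reprojection term plus a quadratic IMU
    orientation prior (per-axis angle differences). Illustrative only."""
    def huber(r):
        # quadratic near zero, linear in the tails -> outliers are damped
        a = abs(r)
        return 0.5 * a * a if a <= delta else delta * (a - 0.5 * delta)

    data_term = sum(huber(r) for r in reproj_residuals)
    prior = sum((e - m) ** 2 for e, m in zip(est_orientation, imu_orientation))
    return data_term + w_imu * prior
```

The key property is that a gross reprojection outlier contributes only linearly to the cost, while the IMU prior anchors the orientation.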
In this paper a hybrid method is proposed based on top-down and bottom-up strategies. This method finds the roofs that optimally fit the DSM derived from stereo satellite images. The DSM derived from satellite data has low quality in comparison with LiDAR data and DSMs from aerial images. In a bottom-up approach, building roof types are first classified into pitched and flat roofs. Ridge lines are extracted, and the roof primitives' parameters are initialized according to the extracted ridge lines. In the proposed scheme, the results of the bottom-up approach help to speed up the following top-down step and improve efficiency and robustness. In the top-down approach, first a primitive library is defined, which contains the five most popular roof types (flat, shed, gable, hipped, and mansard roofs). MCMC with simulated annealing is applied to sample the parameters until a stop criterion is fulfilled. For each iteration, the average of the Euclidean distances between the model and the point cloud, together with the number of mask points inside the model, is computed as the fitness function. Fitness values are compared between two sequential iterations. If the fitness values of further iterations no longer change, the algorithm is stopped. The experimental results on two datasets of the city of Munich prove the potential of the proposed algorithm in dealing with noisy and low-quality DSMs from satellite data.
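The MCMC sampling with simulated annealing can be sketched as the generic loop below: propose new parameters, always accept improvements, and accept worse samples with a temperature-dependent probability that decays over time. For brevity this sketch uses a fixed iteration budget rather than the paper's stagnation-based stop criterion, and the schedule, names, and defaults are all illustrative assumptions:

```python
import math
import random

def simulated_annealing(fitness, init_params, propose,
                        t0=1.0, cooling=0.95, n_iter=500, seed=0):
    """Generic MCMC/simulated-annealing sketch: lower fitness is better.
    `propose(params, rng)` returns a perturbed candidate parameter vector."""
    rng = random.Random(seed)
    cur, f_cur = list(init_params), fitness(init_params)
    best, f_best = list(cur), f_cur
    t = t0
    for _ in range(n_iter):
        cand = propose(cur, rng)
        f_cand = fitness(cand)
        # accept improvements always; accept worse samples with
        # Boltzmann probability exp(-delta / temperature)
        if f_cand < f_cur or rng.random() < math.exp(-(f_cand - f_cur) / max(t, 1e-12)):
            cur, f_cur = cand, f_cand
            if f_cur < f_best:
                best, f_best = list(cur), f_cur
        t *= cooling  # cool down
    return best, f_best
```

In the paper's setting, `fitness` would combine the average model-to-point-cloud Euclidean distance with the count of mask points inside the roof primitive.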
A considerable amount of studies addresses DSM-assisted building footprint extraction. Some of them fuse data from different sensors, chiefly multispectral images and LiDAR-derived DSMs (Matikainen et al., 2010; Hermosilla et al., 2011; Grigillo and Kanjir, 2012). Those studies, however, rely on different data sources, which may imply difficulties concerning the availability and temporal coincidence of the data. Exploiting the potential of a single platform could yield a solution. Hence, a number of authors directly extracted footprint shapes from LiDAR point clouds (Wang et al., 2006; Zhang et al., 2006; Arefi et al., 2008) or nadir RGB imagery (Shorter and Kasparis, 2009). Stereo images also provide possibilities for detecting building geometries using optical and height information derived from the same data source (Arefi and Reinartz, 2013; Tian et al., 2014) or solely height information (Weidner, 1997). Photogrammetric techniques have been used to extract three-dimensional line segments for 3D model generation (Zebedin et al., 2008). These line segments, however, require a good perspective coverage of the scene in order to be useful for building extraction. We therefore studied the potential and limitations of extracting line segments from the individual images used for DSM generation and subsequently verifying them on the basis of the DSM.
Random forest is a robust and powerful machine learning classifier and is capable of processing large datasets. Here this classifier is used to extract the building class based on the generated training data. All of the channels from the multispectral images, together with the panchromatic image and the DSM, are used as input features. Instead of using the hard classification result, we use the probability of each pixel belonging to the building class. These per-pixel probability values are averaged over the 2D segments obtained from Section 2B. In the end, the final building mask is generated after thresholding.
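The per-segment averaging and thresholding step can be sketched as follows, assuming flattened arrays of per-pixel building probabilities and segment labels (names and the 0.5 threshold are illustrative assumptions, not the authors' values):

```python
def segment_building_mask(prob, segments, threshold=0.5):
    """Average per-pixel building probabilities over 2-D segments and
    threshold the segment means to obtain a per-pixel building mask."""
    sums, counts = {}, {}
    for p, s in zip(prob, segments):
        sums[s] = sums.get(s, 0.0) + p
        counts[s] = counts.get(s, 0) + 1
    seg_mean = {s: sums[s] / counts[s] for s in sums}
    # every pixel inherits the decision of its segment
    return [seg_mean[s] >= threshold for s in segments]
```

Averaging over segments suppresses pixel-level classifier noise, so the mask follows segment boundaries rather than speckle.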
used to derive off-terrain objects with height thresholds, and can be generated either by subtracting a given digital terrain model (DTM) from the derived DSM or from the DSM alone. For most unsupervised methods, spectral information is usually used for detecting vegetation and sharpening the boundaries of buildings. Chen et al. (2012) proposed a stepwise method combining multispectral ortho-imagery and the nDSM: the initial building segments were obtained by truncating the nDSM and the normalized difference vegetation index (NDVI) with given thresholds, and the final building masks were generated with rule-based consideration of the region size and relational constraints between buildings and trees. Grigillo, Kosmatin Fras, and Petrovič (2011) generated the initial building masks in the same manner, but eliminated tree masks using the homogeneity feature (Zhang 1999) and NDVI. Qin and Fang (2014) proposed a hierarchical method to derive building segments using morphological operations on the DSM and NDVI, and they adopted graph-cut optimization to refine building boundaries using multispectral images. Lu, Trinder, and Kubik (2006) proposed using the Dempster–Shafer (Shafer 1976) algorithm for fusing building probability values extracted from multispectral imagery and the DSM. Tian and Reinartz (2013) computed a building probability map based on random forest (RF) classification, and they adopted panchromatic images to get sharper building boundaries.
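The nDSM/NDVI truncation used by several of these methods reduces to a per-pixel rule: a building candidate must be tall (nDSM = DSM − DTM above a height threshold) and non-vegetated (NDVI below a threshold). A minimal sketch with purely illustrative threshold values:

```python
def building_candidates(dsm, dtm, ndvi, h_min=2.5, ndvi_max=0.3):
    """Initial building mask in the spirit of the nDSM/NDVI rules above:
    a pixel is a candidate if (DSM - DTM) > h_min and NDVI < ndvi_max.
    Both thresholds are illustrative assumptions."""
    return [[(d - t) > h_min and v < ndvi_max
             for d, t, v in zip(dsm_row, dtm_row, ndvi_row)]
            for dsm_row, dtm_row, ndvi_row in zip(dsm, dtm, ndvi)]
```

Tall vegetation (high nDSM but high NDVI) is rejected, which is exactly the role NDVI plays in the cited methods.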
This paper demonstrates for the first time the potential of explicitly modelling the individual roof surfaces to reconstruct 3-D prismatic building models using spaceborne tomographic synthetic aperture radar (TomoSAR) point clouds. The proposed approach is modular and works as follows: it first extracts the buildings via DSM generation and cutting off the ground terrain. The DSM is smoothed using the BM3D denoising method proposed in (Dabov et al., 2007), and a gradient map of the smoothed DSM is generated based on height jumps. Watershed segmentation is then adopted to oversegment the DSM into different regions. Subsequently, height- and polygon-complexity-constrained merging is employed to refine (i.e., to reduce) the retrieved number of roof segments. The coarse outline of each roof segment is then reconstructed and later refined using a quadtree-based regularization plus a zig-zag line simplification scheme. Finally, a height is associated with each refined roof segment to obtain the 3-D prismatic model of the building. The proposed approach is illustrated and validated over a large building (convention center) in the city of Las Vegas using TomoSAR point clouds generated from a stack of 25 images using the Tomo-GENESIS software developed at DLR.
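The height-jump gradient map that feeds the watershed segmentation can be sketched as a simple local test: mark any DSM cell whose height differs from a 4-neighbour by more than a jump threshold. The function name and the 1 m threshold are illustrative assumptions:

```python
def height_jump_map(dsm, jump=1.0):
    """Binary gradient map over a 2-D DSM grid: a cell is marked 1 if its
    height differs from any 4-neighbour by more than `jump` metres."""
    rows, cols = len(dsm), len(dsm[0])
    grad = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for di, dj in ((0, 1), (1, 0), (0, -1), (-1, 0)):
                ni, nj = i + di, j + dj
                if (0 <= ni < rows and 0 <= nj < cols
                        and abs(dsm[i][j] - dsm[ni][nj]) > jump):
                    grad[i][j] = 1  # height discontinuity -> segment border
                    break
    return grad
```

The marked discontinuities act as the ridges along which watershed oversegments the DSM into roof-segment candidates.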
During the last decades, several approaches for the reconstruction of 3D building models have been developed. Starting in the 1980s with manual and semi-automatic reconstruction methods of 3D building models from aerial images, the degree of automation has increased in recent years so that they have become applicable to various areas. Some typical applications and examples are shown in section 1.1. Especially since the 1990s, when airborne light detection and ranging (LiDAR) technology became widely available, approaches for (semi-)automatic building reconstruction of large urban areas turned out to be of particular interest. Only in recent years have some large cities built detailed 3D city models. Although much effort has been put into the development of a fully automatic reconstruction strategy in order to overcome the high costs of semi-automatic reconstructions, no solution proposed so far meets all requirements (e.g., in terms of completeness, correctness, and accuracy). The reasons for this are manifold, as discussed in section 1.2. Some of them are manageable, for example, either by using modern sensors which provide denser and more accurate point clouds than before or by incorporating additional data sources such as high-resolution images. However, there is quite a big demand for 3D building models in areas where such modern sensors or additional data sources are not available. Therefore, in this thesis a new fully automatic reconstruction approach of semantic 3D building models for low- and high-density airborne laser scanning (ALS) data of large urban areas is presented and discussed. Additionally, it is shown how automatically derived building knowledge can be used to enhance existing building reconstruction approaches. The specific research objectives are outlined in section 1.3, which includes an overview of the proposed reconstruction workflows and the contribution of this thesis.
In order to have lean workflows with good performance, some general assumptions on the buildings to be reconstructed are imposed and explained in section 1.4. The introduction ends with an outline of this thesis in section 1.5.
In this paper, a method for supporting the visual interpretation of SAR images with simulated optical and SAR images using a LiDAR DEM has been presented. Since the location and shape of the objects are similarly represented in the simulated images, acquiring the semantics of a SAR image is eased. The simulated optical image can be used for direct and quick identification of objects in the SAR image. The simulated SAR image has a signal reflectivity similar to that of the TerraSAR-X image, and it can also separately present single or multiple scattering in the SAR images, which is very useful for building recognition and reconstruction. Both the simulated optical and SAR images are automatically geocoded and enable a direct comparison with the SAR images. Future work will concentrate on learning the semantic relationship between objects in optical and SAR images in order to improve, for instance, methods for change detection.
A digital surface model (DSM) is an important and valuable data source for many remote sensing applications, like building detection and reconstruction, cartographic analysis, urban planning, environmental investigations, and disaster assessment tasks. The use of DSMs for those remote sensing applications is motivated by the fact that they already provide geometric descriptions of the topographic surface. With recent advances in sensor technologies, it has become possible to generate DSMs with a ground sampling distance (GSD) smaller than 1 m not only from land surveying, aerial images, laser ranging data, or interferometric synthetic aperture radar (InSAR), but also using satellite stereo images. The main advantages of satellite photogrammetric DSMs are the large land coverage and the possibility of accessing remote areas. However, DSMs generated with image-based matching approaches miss objects like steep walls in urban areas or feature unwanted outliers and noise due to temporal changes, matching errors, or occlusions. To overcome these problems, algorithms from computer vision have been analyzed and adapted to satellite imagery. For example, filtering techniques such as a geostatistical filter integrated with a hierarchical surface fitting technique, a threshold slope-based
Building detection from aerial and satellite images has been a main research issue for decades and is of great interest since it plays a key role in building model generation, map updating, urban planning, and reconstruction (Davydova et al., 2016). Various methods have been developed, and different data sources such as aerial images, digital surface/elevation models, LIDAR data, multi-spectral images, and synthetic aperture radar images have been used for building detection. In this section we briefly review relevant methods in the building detection literature. Decades ago, the initial endeavors in building detection relied on the grouping of low-level image features such as edge/line segments and/or corners to form building hypotheses (Ok, 2013). For instance, a generic model of the shapes of buildings was adopted in (Huertas and Nevatia, 1988), and shadows cast by buildings were used to confirm building hypotheses and to estimate their height. A computational technique for utilizing the relationship between shadows and man-made structures to aid the automatic extraction of man-made structures from aerial imagery is described in (Irvin and McKeown, 1989). An approach to perceptual grouping for detecting and describing 3-D objects in complex images was proposed in (Mohan and Nevatia, 1989) and was illustrated by applying it to the task of detecting and describing complex buildings in aerial images. The vertical and horizontal lines identified using image orientation information and vanishing point calculation were used in (McGlone and Shufelt, 1994) to constrain the set of possible building hypotheses, and vertical lines are extracted at corners to estimate structure height and permit the generation of three-dimensional building models from monocular views.
Owing to the neglected performance evaluation in building detection, a comprehensive comparative analysis of four building extraction systems was presented in (Shufelt, 1999), which concluded that none of the developed systems were capable of handling all of