LiDAR points of buildings are extracted from the scene using previously available 2D building boundary polygons. Nearby points from terrain and vegetation are removed by filtering procedures. For roof segmentation, a robust region growing technique is developed. A unique feature of the segmentation method is that it grows triangles of a Triangulated Irregular Network (TIN) instead of LiDAR points. This minimizes the gaps between segments, because LiDAR points at segment intersections can be assigned multiple segment labels. Additionally, robust adaptive thresholds are introduced as region growing criteria. These enable the region growing procedure to stop at weak edges while still segmenting non-planar roof segments. Results show that the proposed segmentation outperforms other methods with respect to undersegmentation and that it recognizes even weak edges. Evaluation and an extensive analysis of the effects of the input parameters on the results have shown that the segmentation is very robust against LiDAR point cloud characteristics and segment shape. Segment boundaries are created by collapsing the convex hull of the segment points. Point density variations in the across-track and along-track directions are considered in the collapsing procedure. For building modeling, the 2.5D dual contouring approach of Zhou and Neumann is adapted to model complex roofs. After overlaying a 2D grid on the segmented point cloud, the vertices of the 3D building model are estimated for each grid cell by minimizing a Quadratic Error Function (QEF). Each QEF minimization results in a hyperpoint, which consists of one or more vertices of the building model at the same x-y coordinates. This 2.5D characteristic enables the connection of building vertices at step edges with vertical walls. In contrast to Zhou and Neumann, the proposed method
Automatic generation of 3D building models is an essential prerequisite in a wide variety of applications such as tourism, urban planning and automatic navigation. Although many approaches to building detection and reconstruction from 3D point clouds and high-resolution aerial images have been reported over the last decades, fully 3D building reconstruction is still a challenging issue due to the complexity of urban scenes. There are basically two strategies for building roof reconstruction: bottom-up/data-driven and top-down/model-driven methods. The bottom-up methods (e.g. region growing (Rottensteiner and Briese, 2003), Hough transform (Vosselman and Dijkman, 2001), RANSAC (Tarsha-Kurdi et al., 2008)) extract roof planes and other geometrical information from the point clouds. For roof reconstruction, the corresponding planes are assembled and vertices, ridges and eaves are determined (Sohn and Huang, 2008). Sampath and Shan (2010) used a bottom-up approach to segment the LiDAR points into planar and non-planar points using the eigenvalues of the covariance matrix in a small neighborhood. Then, the normal vectors of the planar points are clustered by fuzzy k-means clustering. Afterwards, an adjacency matrix is considered to obtain the breaklines and roof vertices of the corresponding planes. This method is used for the reconstruction of moderately complex buildings. Rottensteiner et al. (2005) present an algorithm to delineate building roof boundaries from LiDAR data with a high level of detail. In this method, roof planes are initially extracted
In contrast to model-driven approaches, data-driven approaches with high-density point clouds perform well in complex scenarios by recognizing adjacent planar faces and their relations (e.g. ridges and step edges) to achieve topologically and geometrically correct 3D building models, as explained in Section 2.3. The main drawback of data-driven approaches, however, is their sensitivity to the incompleteness of data arising from occlusion, data gaps or vegetation clutter. Similarly, the intersection of best-fitted planes, for instance four planes, may create more than one intersection point, which yields extra short edges and leads to an erroneous topology. A thorough discussion of the problems relevant to building reconstruction schemes, including effects due to data and scene complexities, is given in Oude Elberink (2008). In the reconstruction, defining hypotheses about the building shape is the major difficulty, as it has to be automated. Assumptions that can be defined based on data or general knowledge about the scene, of course, help in this regard. Sometimes, constraints may conflict with the actual scene, for instance in the regularization of roof outlines with orthogonality and parallelism constraints, given that the main direction of the building may not always follow the actual direction. With regard to topology representation, some approaches based on roof topology graphs (RTG) have been introduced (Sampath and Shan, 2010). They may be capable of handling more complicated scenes with a higher level of topology preservation. Some methods relying on RTG also use predefined primitive shapes and, consequently, library graphs or target graphs (Verma et al., 2006). Although this would be a solution for data gaps, it would be questionable when the targets are unable to represent a scene. Accordingly, the data-driven methods based on segmented roof planes can be categorized into two groups: RTG-based and non-RTG-based methods.
Space-borne meter-resolution SAR data, together with multi-pass InSAR techniques including persistent scatterer interferometry (PSI) and tomographic SAR inversion (TomoSAR), allow us to reconstruct the shape and undergoing motion of individual buildings and urban infrastructures. TomoSAR in particular offers tremendous improvement in the detailed reconstruction and monitoring of urban areas, especially man-made infrastructures. The rich scatterer information retrieved from multiple incidence angles by TomoSAR enables us to generate 4D point clouds of the illuminated area with a point density comparable to LiDAR. These point clouds can potentially be used for building façade reconstruction in urban environments from space, with a few considerations: 1) The side-looking SAR geometry enables TomoSAR point clouds to possess rich façade information; 2) Temporally incoherent objects, e.g. trees, cannot be reconstructed from multi-pass space-borne SAR image stacks; 3) The TomoSAR point clouds have a moderate 3D positioning accuracy on the order of 1 m, while (airborne) LiDAR provides accuracy typically on the order of 0.1 m.
Although a few approaches aiming at information extraction exist, e.g., (Guillaso et al., 2015, Guillaso et al., 2013, D’Hondt et al., 2012), 3-D object modeling/reconstruction from TomoSAR data is still a new field and has not been explored much. Preliminary investigations towards object modeling/reconstruction using spaceborne TomoSAR point clouds have been demonstrated in (Zhu and Shahzad, 2014, Shahzad and Zhu, 2015, Shahzad and Zhu, 2016), while TomoSAR point clouds generated over urban and vegetation areas using airborne SAR datasets have been explored in (D’Hondt et al., 2012, Schmitt et al., 2015), respectively. Taking into consideration the special characteristics associated with these point clouds, e.g., low positioning accuracy (on the order of 1 m), a high number of outliers, gaps in the data and rich façade information (due to the side-looking geometry), this paper demonstrates for the first time the potential of explicitly modeling the individual roof surfaces to reconstruct 3-D prismatic building models from TomoSAR point clouds.
In recent years, with the introduction of LiDAR systems and the use of 3D geo-referenced data, several methods have been proposed for 3D building modeling and reconstruction. The algorithms comprise methods that only employ the LiDAR point cloud for model generation, while others use additional data sources such as aerial or satellite imagery. Ma (2006) proposed a methodology for 3D building model reconstruction based on the integration of aerial photographs and LiDAR data. The methodology comprises two elements. The first is to reconstruct 3D building models from LiDAR data; rough building models are the outcome of this step. The second is to refine the rough models with information derived from aerial photographs. Cheng et al. proposed an approach integrating aerial imagery and LiDAR data to reconstruct 3D building models (Cheng, Gong, Li, & Liu, 2011). In this approach, an algorithm for the determination of the principal orientations of a building was introduced, 3D boundary segments were then determined by incorporating LiDAR data and the 2D segments extracted from images, and a strategy including the automatic recovery of lost boundaries was finally used for 3D building model reconstruction. The focus of that study is to improve the quality of building boundaries, not building roofs. Satari (2012) proposed a multi-resolution hybrid approach for the reconstruction of building models from LiDAR data. The detection of the main roof planes is obtained
Tomographic synthetic aperture radar inversion (TomoSAR) using multiple data stacks from different viewing angles enables us to generate 4-D (space-time) point clouds of the illuminated area from space with a point density comparable to LiDAR. They can potentially be used for facade reconstruction and deformation monitoring in urban environments. In this paper, we present the first attempt to reconstruct facades from this class of data: First, the facade region is extracted using density estimates of the points projected to the ground plane; the extracted facade points are then clustered into individual facades by means of orientation analysis; surface (flat or curved) model parameters of the segmented building facades are further estimated; and geometric primitives such as the intersection points of adjacent facades are determined to complete the reconstruction process. The proposed approach is illustrated and validated by examples using TomoSAR point clouds generated from stacks of TerraSAR-X high-resolution spotlight images from two viewing angles, i.e., both ascending and descending orbits. The performance of the proposed approach is systematically analyzed. To explore possible applications, we refine the elevation estimate of each raw TomoSAR point by using its more accurate azimuth and range coordinates and the corresponding reconstructed building facade model. Compared to the raw TomoSAR point clouds, significantly improved elevation positioning accuracy is achieved. Finally, a first example of the reconstructed 4-D city model is presented.
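As a rough illustration of the first step, projecting the points to the ground plane and thresholding the 2-D density highlights facade candidates, since the side-looking geometry stacks many scatterers vertically on facades. The following sketch is not the authors' implementation; the cell size and density threshold are assumed values for a toy example:

```python
import numpy as np

def facade_candidate_cells(points_xy, cell=1.0, density_thresh=5):
    """Project points to the ground plane and flag high-density cells.

    points_xy: (N, 2) array of x-y coordinates of the point cloud.
    cell: ground-plane grid spacing in metres (assumed value).
    density_thresh: minimum point count per cell (assumed value).
    """
    xy = np.asarray(points_xy)
    x0, y0 = xy.min(axis=0)
    # assign each point to a grid cell and count points per cell
    ij = np.floor((xy - (x0, y0)) / cell).astype(int)
    counts = np.zeros(ij.max(axis=0) + 1, dtype=int)
    np.add.at(counts, (ij[:, 0], ij[:, 1]), 1)
    # cells with many vertically stacked scatterers are facade candidates
    return counts > density_thresh, counts
```

In practice such a density map would be smoothed (e.g. with a kernel density estimate) before thresholding; the hard per-cell count above only conveys the idea.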
algorithms, 3D objects are reconstructed by surface fitting in the segmented building regions. Numerous methods are employed for building roof segmentation and reconstruction, such as unsupervised clustering approaches, region growing algorithms and graph-based matching techniques. These techniques, however, cannot be directly applied to TomoSAR point clouds due to the different object contents illuminated by the side-looking SAR. In this paper, we propose an approach for building façade detection and reconstruction from TomoSAR point clouds. The proposed approach is illustrated and validated by examples using TomoSAR point clouds generated from a stack of 25 TerraSAR-X high-resolution spotlight images. Our test building is the Bellagio hotel in Las Vegas. Fig. 1 (b) shows the TerraSAR-X mean intensity map of the area of interest, while Fig. 1 (a) is the corresponding optical image. Fig. 1 (c) gives an overview of the input TomoSAR point clouds in UTM coordinates.
In the past few years, several algorithms have been proposed for automated 3D building reconstruction. The algorithms comprise methods that only employ elevation data such as high-resolution airborne LiDAR for model generation, while some methods use additional sources of data. An additional data source plus the DEM is usually employed when the quality or resolution of the elevation data is not appropriate for model generation. Segmentation-based approaches for 3D building model generation from grid data are proposed by Geibel and Stilla (2000) and Rottensteiner and Jansa (2002) to find planar regions which determine a polyhedral model. Gorte (2002) employed another segmentation approach using a TIN structure for the data, in which segments are generated by iteratively merging triangles based on similarity measurements. Rottensteiner (2006) described a model for the consistent estimation of building parameters, which is part of the 3D building reconstruction. Geometric regularities were included as soft constraints in the adjustment of the model. Robust estimation can then be used to eliminate false hypotheses about geometric regularities. A comparison between data- and model-driven approaches for building reconstruction has been made, which states that the model-driven approach is faster and does not visually deform the building model. In contrast, the data-driven approach tends to model each building detail to obtain the nearest polyhedral model, but it usually visually deforms the real shape of the building (Tarsha Kurdi et al., 2007).
Modern spaceborne SAR sensors such as TerraSAR-X and COSMO-SkyMed can deliver meter-resolution data that fits well to the inherent spatial scales of buildings. This very high resolution (VHR) data is therefore particularly suited for detailed urban mapping. In particular, using stacked VHR SAR images, advanced multi-pass interferometric techniques such as tomographic SAR inversion (TomoSAR) make it possible to retrieve not only the 3D geometrical shape but also the undergoing temporal motion of individual buildings and urban infrastructures. The resulting 4D point clouds have a point (scatterer) density that is comparable to LiDAR. For example, experiments using TerraSAR-X high-resolution spotlight data stacks show that the scatterer density retrieved using TomoSAR is on the order of 1 million points/km². Object reconstruction from these high-quality TomoSAR point clouds can greatly support the reconstruction of dynamic city models that could potentially be used to monitor and visualize the dynamics of urban infrastructure in a very high level of detail. Motivated by this, we presented the very first results of façade reconstruction from single-view (ascending stack) and multi-view (fused ascending and
Different platforms and sensors are used to derive 3D models of urban scenes. 3D reconstructions from satellite and aerial images are used to derive sparse models mainly showing the ground and roof surfaces of entire cities. In contrast to such sparse models, 3D reconstructions from UAV or ground images are much denser and show building facades and street furniture such as traffic signs and garbage bins. Furthermore, point clouds may also be acquired with LiDAR sensors. Point clouds do not only differ in their viewpoints, but also in their scales and point densities. Consequently, the fusion of such heterogeneous point clouds is highly challenging. Regarding urban scenes, another challenge is the occurrence of only a few parallel planes, which makes it difficult to find the correct rotation parameters. We discuss the limitations of the general fusion methodology based on an initial alignment step followed by a local coregistration using ICP and present strategies to overcome them.
With data provided by modern meter-resolution SAR sensors and advanced multi-pass interferometric techniques such as tomographic SAR inversion (TomoSAR), it is now possible to generate 4-D (space-time) point clouds of the illuminated area with a point density of approximately 1 million points/km². Due to the side-looking geometry, these point clouds exhibit a much higher density of points on building façades, in contrast to the nadir-looking LiDAR geometry typically used for object reconstruction. Moreover, temporally incoherent objects such as trees cannot be reconstructed from multi-pass spaceborne SAR image stacks, and the point clouds provide only moderate 3-D positioning accuracy on the order of 1 m, compared to airborne LiDAR systems (around 0.1 m). Despite these special considerations, object reconstruction from these high-quality point clouds can greatly support the reconstruction of dynamic city models that could potentially be used to monitor and visualize the dynamics of urban infrastructure in a very high level of detail. Motivated by these opportunities, earlier approaches have been proposed to reconstruct building façades from this class of data. E.g., experimental results provided in (Zhu, 2014) and (Shahzad, 2014) over smaller and larger areas demonstrate that façade reconstruction is an appropriate first step to detect and reconstruct building shape when dense points on the façade are available. In particular, when data from multiple views, e.g., from both ascending and descending orbits, are available, the full shape of buildings can be reconstructed using the extracted façade points. However, there are cases when no or only few
(or use an already existing) digital terrain model (DTM) by filtering techniques, e.g., morphological filtering (Sithole, 2004), gradient analysis (Vosselman, 2000), or iterative densification of a triangulated irregular network structure (Sohn, 2002), and then use the DTM to extract non-ground points (Rottensteiner, 2002) from the rasterized point cloud data. Nadir-looking LiDAR points essentially give a digital surface model (DSM). Subtracting the DTM from the DSM provides a normalized DSM (nDSM), which represents the height variation of non-ground points. Building points are then extracted by exploiting geometrical features such as deviations from the surface model, local height measures, roughness and slope variations. Methods based on building boundary tracing from the nDSM (Gross, 2005) or directly from point clouds (Sampath, 2007) (Rottensteiner, 2002) have also been employed for building detection. With them, finer building boundaries are determined by regularization of the coarsely traced boundaries. All points that lie inside the boundary regions are considered as building points. Building points can also be extracted by explicitly labeling every point in the data set. For labeling purposes, features in a local neighborhood such as height, eigenvalue and plane features can be determined and used in conjunction with supervised (Mallet, 2011), semi-supervised (Sampath, 2010) and unsupervised (Dorninger, 2008) classification techniques.
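The DSM/DTM differencing step can be sketched in a few lines. This is a generic illustration, not any cited method; the 2.5 m height threshold is an assumed value:

```python
import numpy as np

def normalized_dsm(dsm, dtm, height_threshold=2.5):
    """Compute an nDSM and flag likely non-ground (e.g. building) cells.

    dsm, dtm: 2-D arrays of the same shape (rasterized surface and terrain models).
    height_threshold: minimum height above terrain in metres (assumed value).
    """
    ndsm = dsm - dtm                    # height of objects above the terrain
    candidate_mask = ndsm > height_threshold
    return ndsm, candidate_mask

# toy 3x3 example: a 10 m "building" cell on flat terrain at 100 m
dsm = np.array([[100., 100., 100.],
                [100., 110., 100.],
                [100., 100., 100.]])
dtm = np.full((3, 3), 100.)
ndsm, mask = normalized_dsm(dsm, dtm)
# mask is True only at the centre cell
```

The candidate mask would then be refined with the roughness and slope criteria mentioned above before accepting cells as building points.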
Efficient extraction of features, particularly man-made features, from airborne data has been a topic of intense research for many years. In this research, aerial photography has been the preferred way to obtain 3D information on buildings. Although very well understood and delivering accurate results, its major difficulty is that automating the measurement process, which is closely related to image understanding, is a hard task. This research suggests that LiDAR has advantages with regard to interpretation, as the geometry of objects is provided directly. However, the problem shifts from image understanding to geometry recognition, and this is confirmed by the results of many other researchers. To substantiate this, published key approaches to automatic building extraction and reconstruction will be introduced. From the many articles available on the whole procedure of building object extraction and building model reconstruction, the articles deemed most relevant to this research have been selected for critical evaluation. To date, a large number of 3D building object reconstruction methods using aerial imagery have been developed, but only a few come close to automatic building model reconstruction. These methods are based on different ideas; some of them have been adapted to work with airborne laser scanner data and vice versa. Traditionally, breakline detection in a stereo pair is the primary task, which is performed in order to obtain the basic information necessary for 3D building model reconstruction. These breaklines are then grouped into planes using probabilistic [Heuel2000], semantic [Baillard1999] or model-based [Gerke2001] approaches or combinations [Brunn1998] of them. The planes are then grouped and intersected to form a reconstructed building model.
Airborne laser scanning (ALS), also referred to as airborne LiDAR (Light Detection And Ranging), is a very convenient source of information for extracting Digital Surface Models (DSM). ALS is an efficient system which can deliver very dense and accurate point clouds from the ground surface and the objects located on it. Providing high-quality height information of the landscape by means of LiDAR systems opens up an extensive range of applications in different fields of photogrammetry and remote sensing. Moreover, laser scanning data is useful for an increasing number of mapping and GIS data acquisition purposes, including the detection and modeling of 3D objects. Different types of returns from the target provide valuable information about the object and the structures around it. Laser pulses have one important advantage: they partially penetrate the vegetation in gaps between leaves and thus make available data reflected from points underneath the vegetation. This property of the laser ray is at the heart of the difference between first- and last-pulse data: while in first-pulse data the vegetation’s surface is represented well, this is not the case in last-pulse data.
1. Region-based methods: Region-growing methods are widely used in segmentation. They start from a seed; based on some predefined criteria, each unclassified neighbor of the region’s points is added to the region [Rottensteiner and Briese, 2002] [Elberink and Vosselman, 2009] [Verma et al., 2006] [Dorninger and Pfeifer, 2008]. Region growing criteria decide whether a point should be added to a region or not using predefined thresholds on similarities. The most commonly used criterion compares a point’s local plane parameters to the region’s average plane parameters [Verma et al., 2006]. Region growing seeds can be determined by choosing randomly from the point clouds. [Alharthy and Bethel, 2002] choose a seed located centrally within a roof segment. [Awrangjeb and Fraser, 2014] select the midpoints of the building boundary’s edges as seeds. Others estimate flatness using a covariance-based local curvature measure, or according to the local neighborhood’s RMSE from the local plane [Alharthy and Bethel, 2002].
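A minimal region-growing sketch in the spirit of the cited methods might look as follows. The point-to-plane distance criterion and its threshold are assumptions for illustration, not the exact criteria of any referenced paper:

```python
import numpy as np
from collections import deque

def grow_region(points, neighbors, seed, dist_thresh=0.1):
    """Grow a region from a seed using a point-to-plane distance criterion.

    points: (N, 3) array of coordinates.
    neighbors: per-point lists of neighbor indices (e.g. from a k-d tree).
    seed: index of the seed point.
    dist_thresh: maximum point-to-plane distance in metres (assumed value).
    """
    region = {seed}
    queue = deque([seed])
    while queue:
        i = queue.popleft()
        pts = points[list(region)]
        centroid = pts.mean(axis=0)
        if len(region) >= 3:
            # plane normal = right singular vector of the smallest singular value
            _, _, vt = np.linalg.svd(pts - centroid)
            normal = vt[-1]
        else:
            normal = np.array([0.0, 0.0, 1.0])  # bootstrap with a vertical normal
        for j in neighbors[i]:
            if j not in region and abs(np.dot(points[j] - centroid, normal)) < dist_thresh:
                region.add(j)
                queue.append(j)
    return region
```

Run on five coplanar points plus one outlier above the plane, the outlier is rejected while the coplanar points form one region. Real implementations refit the plane incrementally and pick seeds at locally flat points, as described above.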
The number of hyperpoints for a building model should be minimized. Therefore, the grid cells are treated as leaf cells of a quadtree, which is iteratively collapsed. For this purpose, the grid is designed to have 2^n cells, so that an iterative collapsing of groups of 4 adjacent cells into one larger quadtree cell is possible. To decide whether to collapse a group of four quadtree cells, a combined QEF is constructed from the LSP and LPL of these cells. If the non-weighted residual error R_QEF = ||AX − b|| is smaller than a threshold R_max, the four quadtree cells are collapsed to a larger one. The procedure iterates until there is no group of four quadtree cells that can be collapsed further.

2.3.5 Building polygon creation

Each hyperpoint vertex carries labels for the segments of the local height layer it is estimated from. Therefore, within each pair of adjacent hyperpoints, there is a pair of hyperpoint vertices sharing at least one segment label. Those hyperpoint vertices are connected to form a 3D edge. After creating a 3D edge between all adjacent hyperpoints, the building roof is represented by two types of connections, i.e. 3D triangles and 3D quads of 3D edges (Fig. 5). In order to represent the roof by a triangulation, each quad is separated into two triangles. If two separations are possible (Fig. 5 a), the separation resulting in the best fit to the input point cloud is chosen.
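The QEF minimization and the quadtree collapse test above can be sketched as follows. This is a hedged illustration assuming each cell's QEF is stored as plane constraints (rows of A with offsets b), and it uses the usual simplification criterion of merging only when the combined fit is good; the threshold value is illustrative:

```python
import numpy as np

def qef_residual(A, b):
    """Solve the QEF min ||Ax - b|| by least squares; return vertex and residual.

    A: (m, 3) array of plane normals, b: (m,) plane offsets -- one row per
    plane constraint contributing to the cell's QEF.
    """
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x, np.linalg.norm(A @ x - b)

def try_collapse(cell_qefs, r_max=0.05):
    """Merge four child QEFs; collapse only if the combined fit is good.

    cell_qefs: list of (A_i, b_i) pairs for the four child cells.
    r_max: residual threshold in metres (assumed value).
    """
    A = np.vstack([A_i for A_i, _ in cell_qefs])
    b = np.concatenate([b_i for _, b_i in cell_qefs])
    x, residual = qef_residual(A, b)
    return (x, True) if residual < r_max else (None, False)
```

Four cells lying on one roof plane merge into a single larger cell with one shared vertex, while cells straddling a step edge keep a large residual and stay separate.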
The Hough transform is a feature extraction technique used in image analysis, computer vision, and digital image processing (Shapiro and Stockman). It estimates the parameters of a shape from its points. The purpose of the technique is to find imperfect instances of objects within a certain class of shapes by a voting procedure. This voting procedure is carried out in a parameter space, from which object candidates are obtained as local maxima in a so-called accumulator space that is explicitly constructed by the algorithm for computing the Hough transform. It can be used to detect lines, circles and other primitive shapes if their parametric equation is known. In principle, it works by mapping every point in the data to a manifold in the parameter space. This manifold describes all possible variants of the parametrized primitive. Simplifying the parametrization or limiting the parameter space speeds up the algorithm. This is especially true for 3D shape detection: for example, detecting a plane using the plane equation ax + by + cz + d = 0 requires a 3D Hough space, which quickly consumes a large amount of memory and computation time, since all possible planes through every point of the point cloud need to be examined. A plane can also be fitted based on normalized normal vectors using only two of the Euler angles and the distance from the origin, i.e. α, β and d. There is no need for the third Euler angle, since the rotation around the normal axis is redundant (Hulik et al.).
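A toy 3-D Hough transform over the (α, β, d) parametrization described above might look as follows. The bin counts and the distance range are assumptions chosen for a small example, not tuned values:

```python
import numpy as np

def hough_planes(points, n_angle=18, n_dist=20, d_max=5.0):
    """Detect the dominant plane via a Hough accumulator over (alpha, beta, d).

    The unit normal is parametrized by two angles; each point votes, for every
    discretized normal, for the distance d = n . p. The discretization
    (n_angle, n_dist, d_max) is an assumption for this toy sketch.
    """
    alphas = np.linspace(0, np.pi, n_angle, endpoint=False)
    betas = np.linspace(0, np.pi, n_angle, endpoint=False)
    acc = np.zeros((n_angle, n_angle, n_dist), dtype=int)
    for p in points:
        for ia, a in enumerate(alphas):
            for ib, b in enumerate(betas):
                # unit normal from the two angles (spherical parametrization)
                n = np.array([np.sin(b) * np.cos(a),
                              np.sin(b) * np.sin(a),
                              np.cos(b)])
                d = np.dot(n, p)
                i_d = int(round((d + d_max) / (2 * d_max) * (n_dist - 1)))
                if 0 <= i_d < n_dist:
                    acc[ia, ib, i_d] += 1
    # the accumulator maximum gives the dominant plane's parameters
    ia, ib, i_d = np.unravel_index(acc.argmax(), acc.shape)
    d = i_d / (n_dist - 1) * 2 * d_max - d_max
    return alphas[ia], betas[ib], d
```

The nested loops make the memory and runtime cost discussed above obvious: the accumulator grows with the cube of the resolution, which is exactly why coarse parametrizations or randomized variants are used in practice.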
Abstract—Modern spaceborne synthetic aperture radar (SAR) sensors, such as TerraSAR-X/TanDEM-X and COSMO-SkyMed, can deliver very high resolution (VHR) data beyond the inherent spatial scales of buildings. Processing these VHR data with advanced interferometric techniques, such as SAR tomography (TomoSAR), allows for the generation of four-dimensional point clouds, containing not only the 3-D positions of the scatterer locations but also estimates of seasonal/temporal deformation on the scale of centimeters or even millimeters, making them very attractive for generating dynamic city models from space. Motivated by these opportunities, the authors have earlier proposed approaches that demonstrated first attempts toward the reconstruction of building facades from this class of data. The approaches work well when a high density of facade points exists, and the full shape of the building can be reconstructed if data are available from multiple views, e.g., from both ascending and descending orbits. However, there are cases when no or only few facade points are available. This usually happens for lower buildings and renders the detection of facade points/regions very challenging. Moreover, problems related to the visibility of facades mainly facing toward the azimuth direction (i.e., facades orthogonally oriented to the flight direction) can also cause difficulties in deriving the complete structure of individual buildings. These problems motivated us to reconstruct the full 2-D/3-D shapes of buildings via exploitation of roof points. In this paper, we present a novel and complete data-driven framework for the automatic (parametric) reconstruction of 2-D/3-D building shapes (or footprints) using unstructured TomoSAR point clouds, particularly generated from one viewing angle only.
The proposed approach is illustrated and validated by examples using TomoSAR point clouds generated from TerraSAR-X high-resolution spotlight data stacks acquired from the ascending orbit, covering two different test areas: one containing simple moderate-sized buildings in Las Vegas, USA, and the other containing relatively complex building structures in Berlin, Germany.
A general problem of procedural modeling frameworks is that production rules are frequently assumed to already be available from the outset (van Gool et al., 2013). In order to provide a variety of complex buildings, a large number of sophisticated production rules is needed, and these are often set up manually. The manual setup of rules is, however, not only laborious and time-consuming but also requires expert knowledge (Becker et al., 2013). To overcome this issue, techniques for inverse procedural modeling have been invented which automatically derive rules from observations. In (Aliaga et al., 2007), style grammars, an extension of the grammars presented in the Build-by-Number system (Bekins and Aliaga, 2005), are presented that facilitate the interactive creation of new buildings in a style comparable to already generated buildings. For the generation of the grammar, images are manually mapped to simple geometric building models, which in turn are decomposed into smaller facade features (e.g., floors, doors, windows, etc.). Based on these decompositions, repetitive patterns are detected and a style grammar is formulated. Applied to a composition of building blocks, new buildings can be created in the style of existing buildings. Methods for the automatic derivation of shape grammar rules for building facades from single facade images are shown in (Müller et al., 2007) and (van Gool et al., 2007). In (Becker, 2009), first a data-driven reconstruction procedure is performed on terrestrial LiDAR data. The result is then used as a knowledge base from which production rules are automatically derived for the further processing of partially or completely occluded facades. Another inverse procedural method is proposed in (Bokeloh et al., 2010) that automatically extracts general rewriting systems, context-free hierarchical