
Schematic Surface Reconstruction

Abhilash Chandran

Abstract: In this paper we discuss an algorithm that
specializes in reconstructing an architectural scene from a
sparse 3D point cloud generated by the structure from motion
(SfM) technique. The algorithm takes advantage of structures
that are common in most architectural scenes, namely profile
curves and transport curves. Combining several mathematical
models with the information obtained from SfM point clouds,
it extracts the salient features of a building, reconstructs a
schematic model from basic elements such as transport curves
and profile curves, and generates a floor plan of the building
without any prior knowledge of the layout of the structure
or of the scene itself. The evaluation then proposes some
ideas that might add value to the current methodology.

I. INTRODUCTION
With the advent of various technical innovations in 3D
technology, capturing and extracting valuable information
has gained acute importance in the consumer domain as well
as in information domains. A clear shift of focus from a 2D
to a 3D perspective of information has been noticeable in
recent years. This shift is necessary because, in an
environment where automated agents interact, the actual
location of objects is important. For example, in an
industrial scenario the actual location of objects is needed
for object grasping or, at times, for evaluating the build
quality of the end products. In common robotics scenarios, a
robot must know the path and depth of its location in order
to navigate and perform certain tasks.
Following this line of innovation, the authors of [1]
introduced a technique to automate schematic model
extraction from a sparse 3D point cloud. Their primary focus
is to make use of the input from SfM, taking into account
that it is sparser than the output of other scanning devices
such as Kinect or laser scanners [1]. According to the
authors, this sparsity makes fitting a swept surface to SfM
data a challenging task.
In this paper we give a brief introduction to 3D point
clouds to acquaint readers with the background needed to
understand the schematic surface reconstruction algorithm
explained later. We also describe basic terminology, such as
transport curves and profile curves, which forms the basis
for understanding the algorithm.
A schematic representation of point clouds is desirable
for several reasons, namely compactness and easy, intuitive
rendering of structures from sparse data. Depicting a point
cloud in terms of shapes and curves is necessary for
domain-specific applications such as constructing floor plans
or validating building structures. Another way of looking at
it is that we need architectural scenes not as point clouds,
which carry no meaning by themselves, but in a simplified
form as floors, rooms, domes and cones. This method of
geometric parsing can also be extended to representations of
architecture as rooms, living areas and number of floors.
This makes sense intuitively, because it also makes it easy
to add 3D models, for example adding furniture to rooms or
changing the structure of a room.
II. RELATED WORK
A. Structure From Motion
A 3D point cloud, as the name suggests, is a sparse
collection of points in three-dimensional space. It is
generated with 3D scanning techniques such as Structure
from Motion (SfM), Kinect, Time of Flight (TOF) cameras and
laser scanners, or from images. Because images are
abundantly available on the internet, for example on Flickr
or Street View, it is easy and economical to use these
images to construct 3D models of the most prominent
architectural scenes, as done in [2] and shown in Fig. 1 for
the Colosseum.
Of these techniques, let us briefly discuss how SfM is
implemented. First, the images of a monument are collected
in one place for the algorithm to scan through. Next,
feature detection is performed on these images using the
SIFT algorithm [3]. Feature matching is then performed to
find correspondences among this set of images. These
correspondences coincide with the actual artifacts of the
structure or monument in the 3D world. Using the
correspondences, the algorithm sorts the features into
tracks. The tracks form a connectivity graph represented as
a matrix. A row of the track matrix represents the features
detected in one image of the original set. A column
represents a particular feature as it occurs across the
cameras.
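The track matrix can be illustrated with a minimal sketch. It assumes a binary visibility matrix and a toy input format (one set of track ids per image); the function names and the data are hypothetical and not taken from [2]. The helper that counts the tracks shared by two images is the quantity one would inspect when looking for a well-connected initial camera pair.

```python
from itertools import combinations


def build_track_matrix(observations, num_tracks):
    """Binary matrix: one row per image, one column per feature track."""
    return [[1 if track in seen else 0 for track in range(num_tracks)]
            for seen in observations]


def shared_features(track_matrix, a, b):
    """Number of tracks observed in both image a and image b."""
    return sum(x & y for x, y in zip(track_matrix[a], track_matrix[b]))


def best_initial_pair(track_matrix):
    """Pair of images sharing the most tracks (first such pair on ties)."""
    pairs = combinations(range(len(track_matrix)), 2)
    return max(pairs, key=lambda pair: shared_features(track_matrix, *pair))


# Hypothetical toy input: the track ids detected in each of four images.
observations = [{0, 1, 2}, {1, 2, 3}, {0, 1, 2, 3}, {3}]
track_matrix = build_track_matrix(observations, num_tracks=4)
initial_pair = best_initial_pair(track_matrix)
```

A real pipeline would store this matrix sparsely, since most features are visible in only a few images.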
Once the track matrix is constructed, the pair of cameras
with the largest number of matching features is selected
first for calculating the camera parameters and
reconstructing the 3D points. Cameras are then added one by
one to the SfM pipeline according to specific criteria.
These criteria require that the next image added has a
certain minimum number of features corresponding to the
previously added set of cameras, and that it does not
completely overlap one of the previously added images. The
advantage of these criteria is that repetitive and
non-informative images are excluded from consideration. In
addition, we can infer that they help avoid local minima in
the optimization as much as possible. This optimization is
called sparse bundle adjustment, which minimizes the
re-projection error of the reconstructed 3D points.

Fig. 1: A point cloud of the Colosseum constructed from
images collected over the internet [1].

B. Schematic Representation
A schematic representation is a compact representation of a
complete three-dimensional structure such as a building.
From an architect's perspective it is often referred to as a
blueprint. A basic example is the floor plan of a building
with some rooms and doors, as shown in Fig. 2. A schematic
representation reduces the complexity of a 3D structure with
the help of simple lines and curves. These representations
are easily grasped by humans. Despite the simplicity of its
layout, it is capable of expressing the details of a complex
structure.

Fig. 2: A simple floor plan of a home.

III. ELEMENTS OF SCHEMATIC SURFACE MODEL
The schematic surface model discussed in this paper is
composed of two types of planes and two types of curves, as
listed below.
Transport plane
Transport curve
Profile plane
Profile curve
It is also important to note that multiple profile curves
can share the same transport plane yet follow different
transport curves. This is discussed further in the upcoming
sections.
A. Transport Planes and Curves
Transport planes are similar to the ground plane and usually
parallel to it. A transport plane holds the transport
curves, along which the profile curves are driven to form a
structure. A transport curve $t(u)$, as shown in Fig. 3,
lies on the transport plane. $b_t$ is the transport plane
normal, and it is the common normal for all the profile
curves in an architectural scene.

Fig. 3: Transport plane with a transport curve $t(u)$, where
$b_t$ is the ground plane normal.

B. Profile Planes and Curves
Profile curves define the shape of an architectural
structure. For example, imagine an arch inside a church: the
shape of the arch is the profile curve. Profile curves are
orthogonal to the transport curves and thereby to the
transport plane itself. This is a common feature of many
architectural structures and is exploited by the authors
[1]. The height of the profile curve along the $Z_p$ axis
thus defines the vertical extrusion of the surface during
reconstruction. As shown in Fig. 4, the circle, the square
and the polygon can each be identified as a simple profile
curve. The plane that contains the profile curve is referred
to as the profile plane, along $Y_p$.

Fig. 4: Transport plane with various profile curves such as
a circle, a square and a polygon. $X_p$ is the transport
direction axis.

C. Swept Surface
A swept surface is generated by sweeping a profile curve
$p(v)$ along a transport curve $t(u)$ on the transport
plane. While the profile curve is swept, $p(v)$ and $t(u)$
remain orthogonal. The swept surface is formulated as

$$S(u, v) = t(u) + R(u)\,p(v) \qquad (1)$$

$$R(u) = [\,t'(u),\; b_t \times t'(u),\; b_t\,] \qquad (2)$$

where
$R(u)$ is the rotation applied to the profile curve.
$t'(u)$ is the transport direction.

Fig. 5: A circular profile curve $p(v)$ swept along a
transport curve $t(u)$ to form a torus [1].

Fig. 6: A point cloud structure with a spanned cavity/hole
(left) and the corresponding partial profile curves
generated [1].
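Eq. (1) and Eq. (2) can be checked numerically with a small sketch that reproduces the torus of Fig. 5. It assumes that the ground plane normal is the z axis, that the transport direction is normalized to unit length, and that the profile curve lies in the plane spanned by the second and third columns of the rotation; the function names and the radii are illustrative choices, not values from [1].

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(v):
    length = math.sqrt(sum(x * x for x in v))
    return tuple(x / length for x in v)


def swept_point(t, t_prime, p, b_t=(0.0, 0.0, 1.0)):
    """Evaluate S = t + R p, where the columns of R are t', b_t x t', b_t."""
    x_axis = normalize(t_prime)   # unit transport direction
    y_axis = cross(b_t, x_axis)   # in-plane axis orthogonal to the curve
    return tuple(t[k] + p[0] * x_axis[k] + p[1] * y_axis[k] + p[2] * b_t[k]
                 for k in range(3))


def torus_point(u, v, major=2.0, minor=0.5):
    """Circular profile (radius minor) swept along a circle (radius major)."""
    t = (major * math.cos(u), major * math.sin(u), 0.0)
    t_prime = (-major * math.sin(u), major * math.cos(u), 0.0)
    p = (0.0, minor * math.cos(v), minor * math.sin(v))
    return swept_point(t, t_prime, p)
```

Every point returned by `torus_point` satisfies the implicit torus equation, which is a quick way to confirm that the rotation keeps the profile plane orthogonal to the transport curve.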

IV. SCHEMATIC SURFACE RECONSTRUCTION


A. Ground Plane Normal
The authors [1] begin by pre-processing the point cloud
obtained from SfM with a Principal Component Analysis. For
each point $x_i$ in the point cloud, the approach identifies
two principal directions $c_i^1$ and $c_i^2$ from the
neighbors within a distance threshold $T_R$, together with
the point normal $n_i$ [2]. $T_R$ is chosen statistically
such that the first quartile of the number of neighboring
points, taken over all input points, is 100. The point
normal $n_i$ is oriented towards the camera [2].
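The statistical choice of the distance threshold can be sketched as follows. This is only one possible reading of the rule: it assumes a brute-force neighbor search, a simple index-based first quartile, and a hypothetical list of candidate radii to try, none of which is specified in [1].

```python
import math


def neighbor_counts(points, radius):
    """For each point, count the other points within the given radius."""
    counts = []
    for i, p in enumerate(points):
        count = sum(1 for j, q in enumerate(points)
                    if i != j and math.dist(p, q) <= radius)
        counts.append(count)
    return counts


def first_quartile(values):
    """Index-based first quartile (an assumed, simple definition)."""
    ordered = sorted(values)
    return ordered[len(ordered) // 4]


def choose_radius(points, candidate_radii, target):
    """Smallest candidate radius whose first-quartile count reaches target."""
    for radius in sorted(candidate_radii):
        if first_quartile(neighbor_counts(points, radius)) >= target:
            return radius
    return None
```

For a real cloud the target would be 100 as stated above, and a spatial index such as a k-d tree would replace the quadratic search.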
After this initial processing, the authors apply a technique
to identify the ground plane normal. Within a given point
cloud, the majority of the points have one of their
principal directions perpendicular to the ground plane
normal $b_t$. For a given swept surface $S(u, v)$, the two
principal directions are along $R(u)\,p'(v)$ and $t'(u)$
[4]. Based on this property, the ground plane normal can be
deduced as the direction $b$ to which the largest number of
points has a perpendicular principal direction:

$$b_t = \arg\max_b \sum_i \big[(c_i^1 \perp b) \lor (c_i^2 \perp b)\big]$$

The authors refine this rule further by taking the point
normal $n_i$ into account, as shown below.

$$b_t = \arg\max_b \sum_i \big[\big((c_i^1 \perp b) \lor (c_i^2 \perp b) \lor (n_i \perp b)\big) \land (n_i \nparallel b)\big] \qquad (3)$$
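The voting idea behind Eq. (3) can be sketched under one reading of the rule: a point supports a candidate direction when one of its principal directions or its normal is perpendicular to the candidate, unless its normal is parallel to the candidate. The angular tolerance `eps`, the finite candidate set and the synthetic cone samples are assumptions made for this illustration and are not taken from [1]. All vectors are assumed to be unit length.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def is_perpendicular(a, b, eps):
    return abs(dot(a, b)) < eps


def is_parallel(a, b, eps):
    return abs(dot(a, b)) > 1.0 - eps


def vote(b, samples, eps=0.05):
    """Count the samples (c1, c2, n) that support candidate normal b."""
    count = 0
    for c1, c2, n in samples:
        supports = (is_perpendicular(c1, b, eps)
                    or is_perpendicular(c2, b, eps)
                    or is_perpendicular(n, b, eps))
        if supports and not is_parallel(n, b, eps):
            count += 1
    return count


def estimate_ground_normal(samples, candidates, eps=0.05):
    """Candidate direction with the most votes."""
    return max(candidates, key=lambda b: vote(b, samples, eps))


def cone_samples(count=12):
    """Synthetic samples on a 45-degree cone whose axis is the z axis."""
    s = math.sqrt(0.5)
    samples = []
    for k in range(count):
        a = 2.0 * math.pi * k / count
        c1 = (-math.sin(a), math.cos(a), 0.0)          # along the circle
        c2 = (s * math.cos(a), s * math.sin(a), -s)    # along the slope
        n = (s * math.cos(a), s * math.sin(a), s)      # surface normal
        samples.append((c1, c2, n))
    return samples
```

On the cone every sample votes for the z axis, while the x and y axes only collect the few samples whose directions happen to be perpendicular to them.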
Additionally, the transport direction of each point is given
by the unit direction vector

$$t'_i = \begin{cases} \dfrac{b_t \times n_i}{|b_t \times n_i|} & \text{if } n_i \nparallel b_t \\ \text{undefined} & \text{otherwise} \end{cases}$$
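The per-point transport direction can be computed with a normalized cross product. In this minimal sketch the undefined case is represented by `None`, and the tolerance `eps` used to detect a parallel normal is an assumed value.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def transport_direction(b_t, n, eps=1e-6):
    """Unit vector along b_t x n, or None when n is parallel to b_t."""
    c = cross(b_t, n)
    length = math.sqrt(sum(x * x for x in c))
    if length < eps:
        return None
    return tuple(x / length for x in c)
```

For a vertical wall with a horizontal normal, the result is the horizontal direction running along the wall.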
B. Extracting the Transport Curve Points
After deducing the ground plane normal $b_t$, which is a
crucial element of the schematic surface reconstruction
approach, the algorithm proceeds to choose the transport
curve points. This section discusses how several transport
planes are identified from the point cloud and the
statistics applied to choose an optimal transport plane. The
authors' strategy is to take the transport curve points from
the planes with minimal curvature and minimal noise.
By the definition of the schematic surface model for
architectural scenes, the angle $\angle(b_t, n_i)$ is
constant for all points $x_i$ along a transport curve [1].
This gives the first selection criterion: given a transport
plane $\Pi_t$, the noise level of the plane is measured by
the standard deviation of this angle,

$$\sigma(\Pi_t) = \mathrm{stddev}\{\angle(b_t, n_i) \mid x_i \in \Pi_t\}$$

The second selection criterion starts by projecting the
point normals $n_i$ onto the transport plane $\Pi_t$ and
estimating the curvature $k_i$ of the transport curves. The
root mean square (RMS) is then used to approximate the
curvature of the plane $\Pi_t$ as

$$\kappa_c(\Pi_t) = \mathrm{rms}\{k_i \mid x_i \in \Pi_t\}$$

This is carried out over several iterations, producing a
list of transport planes. The cost of selecting a plane
$\Pi_t$ is then calculated as $\sigma(\Pi_t) +
\kappa_c(\Pi_t)$, and the plane with the least cost is
chosen. This transport plane, also referred to as a
transport slice, is then intersected with the point cloud to
extract the transport curve points. By analyzing the
connectivity of the points generated from this intersection,
a draft transport curve can be generated.
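The selection of the transport slice can be sketched as a cost computed per candidate slice: the standard deviation of the angle between the ground plane normal and the point normals, plus the RMS of the per-point curvatures. The sketch assumes the curvatures are already estimated and that the two terms are added without a weight; the data layout (a list of normals and a list of curvatures per slice) is hypothetical.

```python
import math
import statistics


def angle_between(a, b):
    """Angle in radians between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def slice_cost(b_t, normals, curvatures):
    """Noise term (stddev of angles) plus curvature term (RMS)."""
    angles = [angle_between(b_t, n) for n in normals]
    noise = statistics.pstdev(angles)
    curvature = math.sqrt(sum(k * k for k in curvatures) / len(curvatures))
    return noise + curvature


def best_slice(b_t, slices):
    """Index of the slice (normals, curvatures) with the lowest cost."""
    return min(range(len(slices)),
               key=lambda i: slice_cost(b_t, slices[i][0], slices[i][1]))
```

A slice through clean vertical walls has a constant angle and low curvature, so it is preferred over a slice through noisy or strongly curved regions.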
C. Profile Curve Reconstruction
Each of the transport curve points extracted earlier acts
as a seed for the next step of the algorithm. If we imagine
$m$ transport curve points for a simple square-shaped
building, these points define $m$ different profile slices
$\Pi_p^i$, where $1 \le i \le m$. Each profile slice
contains a large number of points. The authors introduce a
mechanism to filter specific points from each profile slice
by applying the following selection criteria.
Select those points whose point normal $n_i$ is orthogonal
to the transport direction. This prevents angled slices
from being considered.
Choose points based on the non-self-intersection assumption,
which is a general rule in any architectural scene [1].
Considering a structure with a hole spanning its surface, it
is easy to imagine that the above filtering creates several
partial profile slices, as shown in Fig. 6. Once the set of
these profile curves is extracted from the point cloud, they
are merged by transforming the slices onto a canonical
plane. Given a profile slice $\Pi_p^i$ with seed point
$x_i$, any point $x_j$ on another profile slice is
transformed onto the canonical profile plane coordinate
$y_j^i$ by [1]

$$y_j^i = R_i^{-1}(x_j - x_i) \qquad (4)$$

and the corresponding point normal $n_j^i$ is transformed as

$$n_j^i = R_i^{-1}\,n_j \qquad (5)$$

In this way the profile slices are merged to form an
accumulated profile curve, as shown in Fig. 7.

Fig. 7: An accumulated profile curve from the profile slice
in Fig. 6 [1].
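The change of coordinates in Eq. (4) and Eq. (5) can be sketched as follows. The sketch assumes that the rotation at the seed point has the columns of Eq. (2) (transport direction, its in-plane normal, ground plane normal), that the transport direction is a unit vector perpendicular to the ground plane normal, and that the inverse of a rotation is its transpose. The function names are illustrative.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def profile_frame(t_dir, b_t):
    """Columns of R_i: transport direction, b_t x t', ground plane normal."""
    return (t_dir, cross(b_t, t_dir), b_t)


def to_canonical(x_j, x_i, frame):
    """y = R^{-1} (x_j - x_i), using R^{-1} = R^T for a rotation."""
    d = tuple(a - b for a, b in zip(x_j, x_i))
    return tuple(dot(axis, d) for axis in frame)


def normal_to_canonical(n_j, frame):
    """n' = R^{-1} n_j (normals are rotated but not translated)."""
    return tuple(dot(axis, n_j) for axis in frame)
```

A point in the profile plane of the seed maps to a canonical coordinate whose first component is zero, so all slices can be overlaid in the same 2D plane.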
1) Profile Slice Clustering: Sometimes multiple profile
curves share the same transport curve. In addition to the
mechanism discussed above for profile curve reconstruction,
a clustering method is used to group the profile slices. A
seed slice is chosen repeatedly to cluster the curve points.
Clustering the profile slices improves the accuracy of the
reconstructed profile curve and also improves connectivity
within the swept surface.
D. Transport Curve Reconstruction
The next major step in this algorithm is to reconstruct the
transport curves. Rearranging the swept surface equation
gives

$$t(u) = S(u, v) - R(u)\,p(v) \qquad (6)$$

where
$S(u, v)$ is the position of the point.
$R(u)$ is the rotation at the point.
$p(v)$ is the reconstructed profile curve.
An interpolation technique is then used to robustly extract
the transport curve. This is achieved in a two-step process.
For a point $x_j$, its corresponding point $p_j^i$ on the
profile curve $p(v)$ is estimated by intersecting the line
$(y - y_j^i) \cdot Z_p = 0$ with the curve [1]. Once
$p_j^i$ is estimated, the next step is to extract the
corresponding transport point by applying Eq. 6. That is,
each point $x_j$ on a profile slice $\Pi_p^i$ is transformed
to [1]

$$z_j^i = x_j - R_j\,p_j^i \qquad (7)$$

Each point $x_i$ on the profile slice is transformed into

$$z_j^i = x_i - R_j\,p_j^i \qquad (8)$$

Once these points are accumulated, several other profile
slices are chosen to interpolate the points between two
consecutive profile slices and reconstruct the transport
curve, as shown in Fig. 8.

Fig. 8: An accumulated transport curve and profile curve [1].

E. Sweeping
Once the transport curves and the profile curves are
extracted from the 3D point cloud, the next step is to form
swept surfaces, which are generated by sweeping a mesh of
the profile curve along the transport curves. The final
schematic representation of a scene consists of multiple
swept surfaces, where the connectivity between these
surfaces is defined by the extracted transport curves [1].
Various situations arise during sweeping, depending on the
architectural identity and uniqueness of the building
design. For example, intersections may occur between profile
curves. The authors handle this by marking the points that
have already been swept by another profile curve. Points
marked as swept are not considered when sweeping a second
profile curve, in the scenario of multiple profile curves
that is common in most architectural scenes. The marking of
points is carried out using the distance threshold $T_R$.
Multiple transport planes are also possible in a larger
structure. In this case the transport planes are processed
in decreasing order of size and the sweeping is carried out
for each.
F. Floor Plan Generation
The transport planes extracted from the point cloud also
help in extracting another crucial aspect of the structure,
namely the floor (surface) plan of the building itself. This
information is obtained by connecting the transport curves
that have already been extracted. The transport planes that
intersect the most surfaces are identified, and the
transport curves on these planes are used to extract the
floor plans for the scene.
A sample floor plan based on the authors' experiments is
shown in Fig. 9. Additionally, the algorithm color-codes the
transport curves when they are obtained from different
transport planes, as in Fig. 9.

Fig. 9: The reconstructed floor plans of the Allen Center
and Uris Library from the experiments by the authors [1].
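Returning to the transport curve reconstruction of Section IV-D, the rearranged swept surface equation (Eq. 6) can be sketched in a few lines: a surface point is mapped back to its transport point by subtracting the rotated profile point. The sketch assumes a unit transport direction perpendicular to the ground plane normal and a profile point that has already been matched; the function name and argument order are hypothetical.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def transport_point(x, p, t_dir, b_t=(0.0, 0.0, 1.0)):
    """z = x - R p, where the columns of R are t', b_t x t', b_t."""
    y_axis = cross(b_t, t_dir)
    return tuple(x[k] - (p[0] * t_dir[k] + p[1] * y_axis[k] + p[2] * b_t[k])
                 for k in range(3))
```

For example, a surface point at (1.7, 0, 0.4) with profile point (0, 0.3, 0.4) and transport direction (0, 1, 0) maps back to the transport point (2, 0, 0), which lies on the transport plane.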

G. Optimization
Reconstructing the surfaces directly from the profile curves
extracted earlier loses many details of the structure,
because the sweeping is applied without considering any of
the depth information in the 3D point cloud.
To overcome this issue, the authors minimize an energy
function with weighting parameters $\lambda_n$ and
$\lambda_s$ as follows.

$$E_{sweep} = E_{data} + \lambda_n E_{tangent} + \lambda_s E_{smooth} \qquad (9)$$

Fig. 10: The statistics of reconstruction and the timings of
each step in minutes. The experiments were conducted on a
PC with an Intel Xeon X5680 3.33 GHz CPU and 12 GB RAM [1].

$E_{data}$ measures the distance between the reconstructed
surface points and the actual 3D points; minimizing it
reduces this distance [1].

$$E_{data} = \sum_i |(x_i - S_d(u_i, v_i)) \cdot N_s(u_i, v_i)|^2$$

Here $S_d$ is the optimized surface derived from the
originally reconstructed surface and $N_s(u, v)$ is the
normal direction used for the error estimation [1].
$E_{tangent}$ is the tangent fitting cost for both the
profile and transport curves. It is based on the expectation
that the derivatives of the curves are perpendicular to
their corresponding normal fields [1].

$$E_{tangent} = \sum \big(|p_d'(v) \cdot N_p(v)|^2 + |t_d'(u) \cdot N_t(u)|^2\big)$$

Further, the smoothness of the swept surfaces is optimized
using the second-order derivatives of the transport and
profile curves [1].

$$E_{smooth} = \sum \big(\|p_d''(v)\|^2 + \|t_d''(u)\|^2\big)$$
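The three energy terms can be evaluated on discretized curves with a short sketch. It assumes that curves are given as polylines, that second derivatives are approximated by finite differences with unit spacing, and that the weights are free parameters; the function names, the discretization and the weight values used in any call are illustrative and not taken from [1].

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def e_data(points, surface_points, normals):
    """Sum of squared point-to-surface distances along the normals."""
    total = 0.0
    for x, s, n in zip(points, surface_points, normals):
        offset = tuple(a - b for a, b in zip(x, s))
        total += dot(offset, n) ** 2
    return total


def e_tangent(tangents, normals):
    """Sum of squared projections of curve tangents onto the normals."""
    return sum(dot(t, n) ** 2 for t, n in zip(tangents, normals))


def e_smooth(curve):
    """Sum of squared second differences along a polyline."""
    total = 0.0
    for a, b, c in zip(curve, curve[1:], curve[2:]):
        second = tuple(p - 2.0 * q + r for p, q, r in zip(a, b, c))
        total += dot(second, second)
    return total


def e_sweep(data, tangent, smooth, lam_n, lam_s):
    """Weighted combination of the three terms, as in Eq. (9)."""
    return data + lam_n * tangent + lam_s * smooth
```

In a full implementation `e_tangent` and `e_smooth` would each be evaluated for both the profile and the transport curve and summed, and the total would be minimized over the curve control points by a nonlinear least-squares solver.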

To obtain additional detail about the depth of objects such
as windows and holes within the architectural structure, the
authors apply a displacement map on top of the reconstructed
swept surface by minimizing [1]

$$E_{disp} = E_{data} + \lambda_d E_{mesh} \qquad (10)$$

and

$$E_{mesh} = \sum \big(|S_u \cdot N_s|^2 + |S_v \cdot N_s|^2\big) \qquad (11)$$

Here $S_u$ and $S_v$ are the two partial derivatives of the
displaced swept surface [1]. This term penalizes large jumps
along the normal directions of the swept surfaces.

Fig. 11: Reconstruction of the Colosseum. The upper part of
the image shows the curve extraction and the lower part
shows the reconstruction with optimization [1].

V. EXPERIMENTS
The authors implemented the algorithm and conducted
experiments on the point clouds of various architectural
scenes, including prominent monuments such as the Colosseum.
The experiments were performed on point clouds of different
sizes and densities. The results are shown in Fig. 10, a
table listing details such as the number of points in the
SfM cloud and the time in minutes spent in each stage of the
algorithm, such as sweeping and optimization. The results of
these experiments are provided on the authors' website [5].
Based on the results of the experiments, the algorithm
performs well even with fewer points, sometimes with as
little as 10% of the actual 3D points obtained from SfM. For
the Colosseum in particular, the algorithm was able to
reconstruct the scene with only 1% of the actual 3D points,
as in Fig. 11. Many holes exist within the reconstructed
Colosseum due to the lack of detail in the corresponding SfM
cloud. Despite this, the algorithm succeeds in extracting
its surface plan.
As the authors noted, despite this efficiency, the use of
the threshold $T_R$ in the reconstruction can limit the
outcome for sparse clouds of complex surfaces, where such
surfaces would be broken into small pieces [1]. This
threshold is chosen automatically during pre-processing,
based on statistics of the cloud density. Since the density
varies with the input point cloud, the threshold affects the
outcome accordingly.
An example result with intermediate images for the
reconstruction of St. Peter's Basilica is shown in Fig. 12
and Fig. 13. The experimental results (and the videos on the
authors' website [5]) demonstrate the ability of the
algorithm to detect the ground normal consistently and to
extract the curves accurately without losing details of the
structure being reconstructed.

Fig. 12: Extracted profile and transport curves for St.
Peter's Basilica [1].

Fig. 13: Reconstructed 3D structure of St. Peter's Basilica.
The right side of the image shows an optimized
reconstruction [1].
VI. EVALUATION AND FUTURE WORK
The extraction of profile and transport curves from the
point cloud is relatively cumbersome and prone to missing
important details of the scene. Although the approach
introduced by the authors [1] handles the main details of
the structure, the authors note that complex structures can
lead to incomplete reconstruction, especially if the point
cloud is sparse for such a structure. Additionally, the
calibration of the cameras, the viewing angle, the
distortion of the images used and similar factors can affect
the point cloud generated by SfM, leading to missing details
such as point normals.
This could be overcome by introducing prior curves to the
algorithm. A prior curve in this context is a generic idea
of the profile or transport curves of the architectural
scene being reconstructed. It helps verify the correctness
of the extracted profile and transport curves by matching
them against the prior curves, and could thereby improve the
detail and correctness of the reconstructed structures. For
example, when reconstructing the Leaning Tower of Pisa
(Fig. 14), the inclination of the entire structure could be
used as a prior for the profile curve extraction.

Fig. 14: The leaning tower of Pisa[6].


One of the bottlenecks of this algorithm is the
pre-processing of 3D point clouds to obtain the point
normals. Since the algorithm depends on the point normals
from the SfM output, which are used to determine the ground
plane normal, the quality of the SfM output is important,
and it depends on other factors as mentioned earlier. This
part of the algorithm could be adapted to compute point
normals using Patch-based Multi-View Stereo (PMVS) [7]. PMVS
works on the point cloud and the corresponding images
simultaneously to produce denser point clouds and to extract
patches of planes, which carry more precise and accurate
information about the normals. This avoids the error-prone
calculation of point normals from the direction of the
camera, because PMVS uses a voting mechanism over multiple
points rather than a single point to calculate the normals.
Apart from this, the amount of detail available in the
reconstructed 3D structure could be enhanced and used for
indoor mapping of buildings. The experiments could be
extended to indoor rooms so that new or existing
prefabricated 3D objects such as books, chairs and couches
can be added on top of existing artifacts such as shelves
and tables, provided these are extracted accordingly.
VII. CONCLUSION
The approach introduced by the authors [1] is a well-suited
technique for representing 3D structures from sparse as well
as dense point clouds. One of the highlights of the
algorithm is the optimization used to extract specific
details, such as the depth of artifacts, from the scene. The
idea of using simple planar curves for the reconstruction is
a clear method that is applicable to generic man-made
architecture. The algorithm is also found to work
efficiently on denser point clouds such as those from MVS,
as noted by the authors [1].
Certainly there are some limitations to this algorithm
despite its efficiency and correctness of reconstruction.
For instance, the algorithm would perform poorly if the
structure has more curved layouts or details such as
multiple pillars, as in Fig. 14 [6]. The technique also
falls behind for points in poorly textured regions of the
scene. As suggested in the future work section, prior
knowledge of these planar curves should make the results
cleaner and more precise. Though this might involve
additional computation, it might prove effective. Note also
that providing prior curves is fruitful only if the
structure is wide, as in Fig. 12, and has multiple profiles.
In other cases, such as the Colosseum, the basic algorithm
would suffice.
The whole approach cannot currently be implemented as a
real-time application due to the complexity of the
calculations involved, yet it could eventually become one
given advances in processing technology such as GPUs. If
parallel processing were integrated, it could help
considerably in real-time reconstruction and representation
of scenes. In particular, the generation of the SfM point
cloud, which is a precursor to this algorithm, consumes a
lot of time when processing millions of images and is a
challenging task in a real-time scenario. In its current
form, this technique could only be used as a pre-processed
add-on to the knowledge base of robots or agents exploring
such environments.
REFERENCES
[1] C. Wu, S. Agarwal, B. Curless, and S. M. Seitz, "Schematic surface
reconstruction," in Computer Vision and Pattern Recognition (CVPR),
2012 IEEE Conference on. IEEE, Jun. 2012, pp. 1498-1505. [Online].
Available: http://dx.doi.org/10.1109/cvpr.2012.6247839
[2] N. Snavely, S. M. Seitz, and R. Szeliski, "Modeling the world from
internet photo collections," International Journal of Computer Vision,
vol. 80, no. 2, pp. 189-210, 2008.
[3] M. Brown and D. G. Lowe, "Automatic panoramic image stitching using
invariant features," International Journal of Computer Vision, vol. 74,
no. 1, pp. 59-73, 2007.
[4] H. Rom and G. Medioni, "Part decomposition and description of 3d
shapes," in Pattern Recognition, 1994. Vol. 1-Conference A: Computer
Vision & Image Processing., Proceedings of the 12th IAPR International
Conference on, vol. 1. IEEE, 1994, pp. 629-632.
[5] C. Wu, S. Agarwal, B. Curless, and S. M. Seitz. (2012) Schematic
surface reconstruction. [Online]. Available:
http://grail.cs.washington.edu/projects/schematic/
[6] A. M. g. Wikimedia Commons. (2015) Creative commons
attribution-share alike 3.0. Online; accessed 25-January-2015.
[Online]. Available:
http://en.wikipedia.org/wiki/Leaning_Tower_of_Pisa#mediaviewer/File:Leaning_tower_of_pisa_2.jpg
[7] Y. Furukawa and J. Ponce, "Accurate, dense, and robust multiview
stereopsis," Pattern Analysis and Machine Intelligence, IEEE
Transactions on, vol. 32, no. 8, pp. 1362-1376, 2010.
