To evaluate the detection of the 3-D objects queried by the VIPs, we prepared
ground-truth data for two phases. The first phase evaluates the table plane
detection: the ground truth is prepared as described in Sec. 2.2.4.2, and the 'EM1'
measure is used. The second phase evaluates the object detection: we also prepared
ground-truth data and computed T1 for evaluating 3-D cylindrical object detection
and T2 for evaluating 3-D spherical object detection. They are presented in
Sec. 4.1.4.2. To detect objects in the RGB images, we use the YOLO network
to train the object classifier. The number of classes and iterations are as given in
Sec. 4.1.4.3. All source code of the program is published at the link: 1
                
              
                                            
                                
            
 
            
narios including data sets collected in lab environments and the public
datasets. The research in this dissertation is organized into six
chapters, as follows:
- Introduction: This chapter describes the main motivations and objectives of the
study. We also present the research context, constraints, and challenges that we
meet and address in the dissertation. Additionally, the general framework and
main contributions of the dissertation are presented.
- Chapter 1: A Literature Review: This chapter surveys existing aided systems
for the VIPs. In particular, the techniques related to developing an aided system
are discussed. We also present the relevant works on estimation algorithms and
a series of techniques for 3-D object detection and recognition.
- Chapter 2: In this chapter, we describe a point cloud representation of data
collected by an MS Kinect sensor. A real-time table plane detection technique for
separating the objects of interest from a scene is described. The proposed
table plane detection technique is adapted to the contextual constraints. The
experimental results confirm the effectiveness of the proposed method on both
self-collected and public datasets.
- Chapter 3: This chapter describes a new robust estimator for estimating primitive
shapes from point cloud data. The proposed robust estimator, named GCSAC
(Geometrical Constraint SAmple Consensus), uses geometrical constraints to
choose good samples for estimating models. Furthermore, we use contextual
information to validate the estimation results. In the experiments, the proposed
GCSAC is compared with various RANSAC-based variations on both synthesized
and real datasets.
- Chapter 4: This chapter describes the complete framework for locating the
queried objects and providing their full information. In this chapter, we exploit
the advantages of recent deep learning techniques for object detection. Moreover,
to estimate the full 3-D model of the queried object, we apply GCSAC to the point
cloud data of the labeled object. Consequently, we can directly extract the object's
information (e.g., size, surface normal, grasping direction). This scheme outperforms
existing approaches such as solely using 3-D object fitting or 3-D feature
learning.
- Chapter 5: We conclude the work and discuss the limitations of the proposed
method. Research directions for future work are also described.
CHAPTER 1
LITERATURE REVIEW
In this chapter, we survey related work on aid systems for the VIPs and on
methods for detecting objects in indoor environments. First, relevant aiding
applications for the VIPs are presented in Sec. 1.1. Then, we introduce and
analyze state-of-the-art works on 3-D object detection and recognition in
Sec. 1.2. Finally, the robust estimators and their applications in robotics and
computer vision are presented in Sec. 1.3.
1.1 Aided systems supporting for visually impaired people
1.1.1 Aided systems for navigation service
1.1.2 Aided systems for obstacle detection
1.1.3 Aided systems for locating the interested objects in scenes
1.1.4 Aided systems for detecting objects in daily activities
1.1.5 Discussions
1.2 3-D object detection, recognition from a point cloud data
1.2.1 Appearance-based methods
1.2.2 Geometry-based methods
1.2.3 Discussions
1.3 Fitting primitive shapes: A brief survey
1.3.1 Linear fitting algorithms
1.3.2 Robust estimation algorithms
1.3.3 RANdom SAmple Consensus (RANSAC) and its variations
1.3.4 Discussions
CHAPTER 2
POINT CLOUD REPRESENTATION AND THE
PROPOSED METHOD FOR TABLE PLANE
DETECTION
A common situation in the daily activities of visually impaired people (VIPs)
is to query an object (a coffee cup, a water bottle, and so on) on a flat surface. We
assume that such a flat surface could be a table plane in a shared room or in a kitchen.
To build a complete aided system supporting the VIPs, the queried objects
must be separated from the table plane in the current scene. In a general framework that
also includes other steps, such as detecting the queried objects and estimating their full
models, the table plane detection can be considered a pre-processing step. This
chapter is organized as follows. First, we introduce a representation of the point
clouds built from the data collected by a Kinect sensor in Section 2.1. We then
present the proposed method for the table plane detection in Section 2.2.
2.1 Point cloud representation
2.1.1 Capturing data by a Microsoft Kinect sensor
To collect data from the environment for building an aid system that helps
the VIPs detect and grasp objects with a simple geometrical structure on a table
in an indoor environment, the color image and depth image are captured with an MS
Kinect sensor version 1.
2.1.2 Point cloud representation
The result of calibrating the images is the camera's intrinsic matrix H_m, used for
projecting pixels in 2-D space to 3-D space:
$$H_m = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}$$
where (c_x, c_y) is the principal point (usually the image center), and f_x and f_y are
the focal lengths.
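As an illustration of how the intrinsic parameters are used, the sketch below back-projects one depth pixel to a 3-D point with the standard pinhole camera model. The function name and the numeric values are hypothetical, and the depth is assumed to be already expressed in meters; the dissertation's own implementation may differ.

```python
import numpy as np

def backproject_pixel(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth Z to a 3-D point (X, Y, Z).

    Pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    All names here are illustrative assumptions, not the authors' code.
    """
    z = float(depth)
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])

# Hypothetical intrinsics for a 640x480 image (not calibrated values).
point = backproject_pixel(u=420, v=240, depth=2.0,
                          fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(point)
```

Applying this to every valid depth pixel while keeping the image grid layout yields an organized point cloud, in which neighboring pixels remain neighbors in the data structure.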
2.2 The proposed method for table plane detection
2.2.1 Introduction
Plane detection in 3-D point clouds is a critical task for many robotics and
computer vision applications. To help visually impaired or blind people find and
grasp objects of interest (e.g., coffee cup, bottle, bowl) on a table, one has to find
the table planes in the captured scenes. This work is motivated by such an adaptation,
in which the acceleration data provided by the MS Kinect sensor is used to prune the
extraction results. The proposed algorithms achieve real-time performance as well as
a high detection rate of the table planes.
2.2.2 Related Work
2.2.3 The proposed method
2.2.3.1 The proposed framework
Our research context aims to develop object-finding and grasping-aid services
for the VIPs. The proposed framework, as shown in Fig. 2.6, consists of four steps:
down-sampling, organized point cloud representation, plane segmentation, and table
plane classification. Because our work uses only the depth feature, a simple and
effective method for down-sampling and smoothing the depth data is described below.
Figure 2.6: The proposed framework for table plane detection. (Blocks in the diagram:
Microsoft Kinect, depth down-sampling, organized point cloud representation, plane
segmentation, plane classification, table plane; the acceleration vector is an
additional input.)
Given a sliding window (of size n × n pixels), the depth value of the center pixel
D(x_c, y_c) is computed from Eq. 2.2:
$$D(x_c, y_c) = \frac{\sum_{i=1}^{N} D(x_i, y_i)}{N} \qquad (2.2)$$
where D(x_i, y_i) is the depth value of the i-th neighboring pixel of the center pixel
(x_c, y_c), and N is the number of pixels in the n × n neighborhood (N = (n × n) − 1).
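A minimal sketch of Eq. 2.2 is given below. It assumes that the sliding window moves in non-overlapping steps of n pixels (so the output is n times smaller in each dimension) and that n is odd; both are assumptions of this example, since the stride is not stated here. The function name is hypothetical.

```python
import numpy as np

def downsample_depth(depth, n=3):
    """Down-sample a depth image with Eq. 2.2 on non-overlapping n x n windows.

    Each output value is the mean of the N = n*n - 1 neighbors of the window's
    center pixel (the center itself is excluded). Assumes n is odd.
    """
    h, w = depth.shape
    h, w = h - h % n, w - w % n  # crop so the image tiles exactly
    blocks = depth[:h, :w].astype(np.float64).reshape(h // n, n, w // n, n)
    total = blocks.sum(axis=(1, 3))           # sum over each n x n window
    center = blocks[:, n // 2, :, n // 2]     # center pixel of each window
    return (total - center) / (n * n - 1)

depth = np.arange(36, dtype=np.float64).reshape(6, 6)
small = downsample_depth(depth, n=3)
print(small.shape)
```

Averaging only the neighbors, as the equation specifies, smooths the depth data while reducing the number of points passed to the later steps.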
2.2.3.2 Plane segmentation
The detailed process of the plane segmentation is given in (Holz et al. RoboCup,
2011).
2.2.3.3 Table plane detection/extraction
The results of the first step are planes that are perpendicular to the acceleration
vector. The y axis is then rotated so that it is parallel to the acceleration vector.
After this rotation, the table plane is the highest plane in the scene, that is, the
table plane is the one with the minimum y-value.
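The selection rule above can be sketched as follows. This example assumes that each segmented plane is summarized by a normal vector and a centroid, that a plane counts as perpendicular to the acceleration vector when its normal lies within a small angle of that vector, and that the y-value is measured along the acceleration direction. The function name and the 10-degree threshold are hypothetical.

```python
import numpy as np

def select_table_plane(normals, centroids, accel, max_angle_deg=10.0):
    """Return the index of the table plane candidate, or None if there is none.

    Keeps planes whose normal is (anti)parallel to the acceleration vector,
    then picks the one with the minimum coordinate along that vector.
    """
    g = np.asarray(accel, dtype=float)
    g = g / np.linalg.norm(g)
    normals = np.asarray(normals, dtype=float)
    normals = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    cos_angle = np.abs(normals @ g)  # the sign of a plane normal is arbitrary
    keep = np.flatnonzero(cos_angle >= np.cos(np.radians(max_angle_deg)))
    if keep.size == 0:
        return None
    y_values = np.asarray(centroids, dtype=float)[keep] @ g
    return int(keep[np.argmin(y_values)])

# Hypothetical scene: floor, table, and wall planes.
normals = [[0, 1, 0], [0, -1, 0], [1, 0, 0]]
centroids = [[0, 1.0, 2], [0, 0.2, 1.5], [1, -0.5, 2]]
print(select_table_plane(normals, centroids, accel=[0, 9.8, 0]))
```

Projecting the centroids onto the acceleration direction is equivalent to reading the y-value after the rotation described above, so no explicit rotation matrix is needed in this sketch.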
2.2.4 Experimental results
2.2.4.1 Experimental setup and dataset collection
The first dataset is called 'MICA3D'. A Microsoft Kinect version 1 is mounted on
the person's chest, and the person then moves around one table in the room. The distance
between the Kinect and the center of the table is about 1.5 m. The height of the
Kinect above the table plane is about 0.6 m. The height of the table plane is
about 60 to 80 cm. We captured data of 10 different scenes, which include a cafeteria,
a showroom, a kitchen, and so on. These scenes cover common contexts in the daily
activities of visually impaired people. The second dataset was introduced in (Richtsfeld
et al. IROS, 2012). This dataset contains calibrated RGB-D data of 111 scenes. Each
scene has a table plane. The size of the image is 640 × 480 pixels.
2.2.4.2 Table plane detection evaluation method
Three evaluation measures are needed; they are defined below.
Evaluation measure 1 (EM1): This measure evaluates the difference between the
normal vector extracted from the detected table plane and the normal vector extracted
from the ground-truth data.
Evaluation measure 2 (EM2): EM1 uses only one point (the center point of the
ground truth) to estimate the angle. To reduce the influence of noise, more
points are used to determine the normal vector of the ground truth. For EM2,
three points (p1, p2, p3) are randomly selected from the ground-truth point cloud.
Evaluation measure 3 (EM3): The two evaluation measures presented above
do not take into account the area of the detected table plane. We therefore propose
EM3, which is inspired by the Jaccard index for object detection:
$$r = \frac{R_d \cap R_g}{R_d \cup R_g} \qquad (2.6)$$
where R_d is the region of the detected table plane and R_g is the region of the
ground-truth table plane.

Table 2.2: The average results of table plane detection on our own dataset (%).
Approach          EM1     EM2     EM3     Average   Missing rate   Frames per second
First Method      87.43   87.26   71.77   82.15     1.2            0.2
Second Method     98.29   98.25   96.02   97.52     0.63           0.83
Proposed Method   96.65   96.78   97.73   97.0      0.81           5

Table 2.3: The average results of table plane detection on the dataset [3] (%).
Approach          EM1     EM2     EM3     Average   Missing rate   Frames per second
First Method      87.39   68.47   98.19   84.68     0.0            1.19
Second Method     87.39   68.47   95.49   83.78     0.0            0.98
Proposed Method   87.39   68.47   99.09   84.99     0.0            5.43
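The quantities behind the three evaluation measures can be sketched as below. The sketch assumes that EM1 and EM2 compare normals through the angle between them (treating a normal and its negation as the same direction), that the EM2 normal is the cross product of two edges of the sampled triangle, and that the regions in EM3 are given as boolean pixel masks whose pixel counts stand in for areas. All function names are hypothetical, and any acceptance thresholds are outside this example.

```python
import numpy as np

def angle_between_normals_deg(n_detected, n_truth):
    """Angle in degrees between two plane normals, ignoring their sign."""
    a = np.asarray(n_detected, dtype=float)
    b = np.asarray(n_truth, dtype=float)
    cos = abs(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def normal_from_three_points(p1, p2, p3):
    """Unit normal of the plane through three non-collinear points (EM2)."""
    p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p1, p2, p3))
    n = np.cross(p2 - p1, p3 - p1)
    return n / np.linalg.norm(n)

def jaccard_ratio(mask_detected, mask_truth):
    """Overlap ratio of Eq. 2.6 computed on boolean masks (EM3)."""
    d = np.asarray(mask_detected, dtype=bool)
    g = np.asarray(mask_truth, dtype=bool)
    union = np.logical_or(d, g).sum()
    return float(np.logical_and(d, g).sum() / union) if union else 0.0

n_gt = normal_from_three_points([0, 0, 0], [1, 0, 0], [0, 1, 0])
print(angle_between_normals_deg([0, 0.1, 1], n_gt))
print(jaccard_ratio([1, 1, 0, 0], [0, 1, 1, 0]))
```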
2.2.4.3 Results
The comparative results of three different evaluation measures on two datasets are
shown in Tab. 2.2 and Tab. 2.3, respectively.
2.2.5 Discussions
In this work, we proposed a method for table plane detection that uses down-sampling,
accelerometer data, and the organized point cloud structure obtained from the color
and depth images of the MS Kinect sensor.
2.3 Separating the interested objects on the table plane
2.3.1 Coordinate system transformation
2.3.2 Separating table plane and interested objects
2.3.3 Discussions
CHAPTER 3
PRIMITIVE SHAPES ESTIMATION BY A NEW
ROBUST ESTIMATOR USING GEOMETRICAL
CONSTRAINTS
3.1 Fitting primitive shapes: By GCSAC
3.1.1 Introduction
The geometrical model of an object of interest can be estimated using from two to
seven geometrical parameters, as in (Schnabel et al. 2007). Random Sample Consensus
(RANSAC) and its variants attempt to extract shape parameters that are as good as
possible when the data are subject to heavy noise or the processing time is constrained.
In particular, at each hypothesis in the framework of a RANSAC-based algorithm, we
implement a search process that aims at finding good samples based on the constraints
of an estimated model. To search for good samples, we define two criteria: (1) the
selected samples must be consistent with the estimated model, as checked by a rough
inlier-ratio evaluation; (2) the samples must satisfy explicit geometrical constraints of
the objects of interest (e.g., cylindrical constraints).
3.1.2 Related work
3.1.3 The proposed new robust estimator
3.1.3.1 Overview of the proposed robust estimator (GCSAC)
To estimate the parameters of a 3-D primitive shape, the original RANSAC paradigm,
as shown in the top panel of Figure 3.2, randomly selects a Minimal Sample Set (MSS)
from a point cloud, and then the model parameters are estimated and validated. Trying
every possible sample is often computationally infeasible, and it is also unnecessary.
Our proposed method (GCSAC, in the bottom panel of Figure 3.2) is based
on the original version of RANSAC; however, it differs in three major aspects. (1)
At each iteration, the minimal sample set is drawn when the random sampling
procedure is performed, so that probing the consensus data is easily achievable. In
other words, a low pre-defined inlier threshold can be deployed as a weak condition
of consistency. Then, after only a few random sampling iterations, the candidates
of good samples can be obtained. (2) The minimal sample sets consist of qualified
samples that satisfy the geometrical constraints of the object of interest. (3) The
termination condition of the adaptive RANSAC algorithm of (Hartley et al. 2003) is
adopted, so that the algorithm terminates as soon as a minimal sample set is found for
which the number of iterations of the current estimation is less than that which has
already been achieved.

Figure 3.2: Top panel: Overview of a RANSAC-based algorithm.
Bottom panel: A diagram of GCSAC's implementation.
(Blocks in the top panel, RANSAC/MLESAC paradigm: a point cloud; random sampling of
a minimal subset; estimation of the geometrical parameters M; evaluation of the model M
via the inlier ratio or the negative log-likelihood and update of the best model; adaptive
update of the number of iterations K (Eq. 3.2); termination test; estimated model. The
proposed method (GCSAC) adds a search for good samples using geometrical constraints
and evaluates the model via the negative log-likelihood. Blocks in the bottom panel:
random sampling; model estimation and computation of the inlier ratio w; test w ≥ wt;
search for good samples (GS) based on the geometrical constraints; computation of the
negative log-likelihood L and update of the best model; test k ≤ K; estimated model.)
To determine the termination criterion for the estimation algorithm, a well-known
formula for the number of sample selections K is given in Eq. 3.2:
$$K = \frac{\log(1 - p)}{\log(1 - w^s)} \qquad (3.2)$$
where p is the probability of finding a model that describes the data, s is the minimal
number of samples needed to estimate a model, and w is the percentage of inliers in the
point cloud.
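Eq. 3.2 can be evaluated directly, as in the sketch below. Rounding K up to the next integer and the handling of the degenerate cases w^s = 0 and w^s = 1 are choices made for this example; the function name is hypothetical.

```python
import math

def ransac_iterations(p, w, s):
    """Number of sample selections K from Eq. 3.2, rounded up.

    p: desired probability of drawing at least one all-inlier sample (0 < p < 1)
    w: inlier ratio of the point cloud, as a fraction in [0, 1]
    s: minimal number of samples needed to estimate a model
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    all_inlier = w ** s              # probability that one sample is all inliers
    if all_inlier <= 0.0:
        return math.inf              # no inliers: no finite K suffices
    if all_inlier >= 1.0:
        return 1                     # every sample is good
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - all_inlier))

# With p = 0.99 and two samples per model (e.g., two oriented points):
for w in (0.2, 0.5, 0.8):
    print(w, ransac_iterations(0.99, w, 2))
```

Because K grows quickly as w decreases, an adaptive scheme re-evaluates K whenever a model with a higher inlier ratio is found, which is what allows early termination.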
Figure 3.3: Geometrical parameters of a cylindrical object. (a)-(c) Explanation of the
geometrical analysis to estimate a cylindrical object. (d)-(e) Illustration of the
geometrical constraints applied in GCSAC. (f) Result of the estimated cylinder from
a point cloud. Blue points are outliers, red points are inliers.
3.1.3.2 Geometrical analyses and constraints for qualifying good samples
In the following sections, the principles of the 3-D primitive shapes are explained.
Based on the geometrical analysis, related constraints are given to select good samples.
The normal vector of any point is computed following the approach in (Holz et al.
2011). At each point p_i, the k-nearest neighbors k_n of p_i are determined within a radius r.
Computing the normal vector of p_i is then reduced to the analysis of the eigenvectors and
eigenvalues of the covariance matrix C, as presented in Sec. 2.2.3.2.
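The eigen-analysis mentioned above is commonly realized by taking the eigenvector of the smallest eigenvalue of the neighborhood covariance matrix as the normal. The sketch below illustrates that common formulation under the assumption that the neighbors of the point have already been found; it is not necessarily identical to the procedure of (Holz et al. 2011), and the function name is hypothetical.

```python
import numpy as np

def estimate_normal(neighbors):
    """Estimate a surface normal from the neighbors of a point.

    Builds the 3x3 covariance matrix C of the neighbors and returns the
    eigenvector associated with its smallest eigenvalue (unit length,
    sign undetermined).
    """
    pts = np.asarray(neighbors, dtype=float)
    centered = pts - pts.mean(axis=0)
    cov = centered.T @ centered / len(pts)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    return eigvecs[:, 0]

# Neighbors sampled on the plane z = 0: the normal should be along the z axis.
grid = [[x, y, 0.0] for x in range(3) for y in range(3)]
print(estimate_normal(grid))
```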
a. Geometrical analysis for cylindrical objects
The geometrical relationships of the parameters are shown in Fig. 3.3 (a). A cylinder
can be estimated from two points (p1, p2) (the two blue-squared points) and their
corresponding normal vectors (n1, n2) (marked by the green and yellow lines). Let γc be
the main axis of the cylinder (red line), which is estimated by:
$$\gamma_c = n_1 \times n_2 \qquad (3.3)$$
To specify a centroid point I, we project the two parametric lines L1 = p1 + t n1 and
L2 = p2 + t n2 onto a plane specified by PlaneY (see Figure 3.3 (b)). The normal
vector of this plane is estimated by the cross product of the γc and n1 vectors (γc × n1).
The centroid point I is the intersection of L1 and L2 (see Figure 3.3 (c)). The radius Ra
is set to the distance between I and p1 in PlaneY. A result of the estimated cylinder
from a point cloud is illustrated in Figure 3.3 (f). The height of the estimated cylinder
is normalized to 1.
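The two-point estimation can be sketched as follows. The axis follows Eq. 3.3. For the centroid, this example makes its own simplifying assumption: both lines are projected onto the plane through the origin that is orthogonal to the axis γc, and their intersection is found by least squares. This choice of projection plane is an assumption of the sketch and may not match the construction of PlaneY above. The function name is hypothetical.

```python
import numpy as np

def cylinder_from_two_points(p1, n1, p2, n2):
    """Estimate (axis, center, radius) of a cylinder from two oriented points."""
    p1, n1, p2, n2 = (np.asarray(v, dtype=float) for v in (p1, n1, p2, n2))
    axis = np.cross(n1, n2)                      # Eq. 3.3
    length = np.linalg.norm(axis)
    if length < 1e-9:
        raise ValueError("normals are parallel; the axis is undefined")
    axis = axis / length

    def project(v):
        # Remove the component along the axis (assumed projection plane).
        return v - (v @ axis) * axis

    q1, d1 = project(p1), project(n1)
    q2, d2 = project(p2), project(n2)
    # Solve q1 + t1*d1 = q2 + t2*d2 for (t1, t2) in the least-squares sense.
    a = np.stack([d1, -d2], axis=1)
    t, *_ = np.linalg.lstsq(a, q2 - q1, rcond=None)
    center = q1 + t[0] * d1
    radius = float(np.linalg.norm(q1 - center))
    return axis, center, radius

# Two points on a cylinder of radius 0.5 whose axis is parallel to z through (1, 2, 0).
axis, center, radius = cylinder_from_two_points(
    [1.5, 2.0, 0.3], [1, 0, 0], [1.0, 2.5, -0.7], [0, 1, 0])
print(axis, center, radius)
```

When one of the two points is an outlier, the recovered axis and radius are generally wrong, which is the situation the geometrical constraints are meant to filter out.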
Figure 3.4: (a) Setting geometrical parameters for estimating a cylindrical object
from a point cloud as described above. (b) The estimated cylinder (green one) from
an inlier p1 and an outlier p2. As shown, it is an incorrect estimation.
(c) Normal vectors n1 and n2* on the plane π are specified.
We first build a plane π that is perpendicular to the plane PlaneY and contains
n1. Its normal vector is therefore n_π = (n_PlaneY × n1), where n_PlaneY is the normal
vector of PlaneY, as shown in Figure 3.4 (a). In other words, n1 is nearly perpendicular
to n2*, where n2* is the projection of n2 onto the plane π. This observation leads to
the criterion below:
$$c_p = \arg\min_{p_2 \in \{U_n \setminus p_1\}} \{ n_1 \cdot n_2^* \} \qquad (3.4)$$
3.1.4 Experimental results of robust estimator
3.1.4.1 Datasets for evaluation of the robust estimator
The first group consists of synthesized datasets, which contain cylinders, spheres,
and cones. In addition, we evaluate the proposed method on real datasets. For the
cylindrical objects, the dataset is collected from a public dataset [1], which contains
300 objects belonging to 51 categories. It is named 'second cylinder'. For the spherical
objects, the dataset consists of two balls collected from four real scenes. Finally, the
point cloud data of the cone objects, named 'second cone', is collected from the dataset
given in [4].
3.1.4.2 Evaluation measurements of robust estimator
To evaluate the performance of the proposed method, we use the following
measurements:
- The relative error Ew of the estimated inlier ratio, where wgt is the defined inlier
ratio of the ground truth and w is the inlier ratio of the estimated model. The
smaller Ew is, the better the algorithm is.
- The total distance error Sd, calculated as the sum of the distances from every
point pj to the estimated model Me.
- The processing time tp, measured in milliseconds (ms). The smaller tp is, the
faster the algorithm is.
- The relative error of the estimated center Ed (only for synthesized datasets), which
is the Euclidean distance between the estimated center Ee and the true one Et.

Table 3.2: The average evaluation results on the synthesized datasets.
The experiments on the synthesized datasets were repeated 50 times for statistically representative results.
Dataset            Measure    RANSAC    PROSAC    MLESAC    MSAC      LOSAC     NAPSAC     GCSAC
'first cylinder'   Ew(%)      23.59     28.62     43.13     10.92     9.95      61.27      8.49
'first cylinder'   Sd         1528.71   1562.42   1568.81   1527.93   1536.47   3168.17    1495.33
'first cylinder'   tp(ms)     89.54     52.71     70.94     90.84     536.84    52.03      41.35
'first cylinder'   Ed(cm)     0.05      0.06      0.17      0.04      0.05      0.93       0.03
'first cylinder'   EA(deg.)   3.12      4.02      5.87      2.81      2.84      7.02       2.24
'first cylinder'   Er(%)      1.54      2.33      7.54      1.02      2.40      112.06     0.69
'first sphere'     Ew(%)      23.01     31.53     85.65     33.43     23.63     57.76      19.44
'first sphere'     Sd         3801.95   3803.62   3774.77   3804.27   3558.06   3904.22    3452.88
'first sphere'     tp(ms)     10.68     23.45     1728.21   9.46      31.57     2.96       6.48
'first sphere'     Ed(cm)     0.05      0.07      1.71      0.08      0.21      0.97       0.05
'first sphere'     Er(%)      2.92      4.12      203.60    5.15      17.52     63.60      2.61
'first cone'       Ew(%)      24.89     37.86     68.32     40.74     30.11     86.15      24.40
'first cone'       Sd         2361.79   2523.68   2383.01   2388.64   2298.03   13730.53   2223.14
'first cone'       tp(ms)     495.26    242.26    52525     227.57    1258.07   206.17     188.4
'first cone'       EA(deg.)   6.48      15.64     11.67     15.64     6.79      14.54      4.77
'first cone'       Er(%)      20.47     17.65     429.44    17.31     20.22     54.44      17.21

Table 3.3: Experimental results on the 'second cylinder' dataset.
The experiments were repeated 20 times, and the errors were then averaged.
Dataset                         Method   w(%)    Sd        tp(ms)   Er(%)
'second cylinder' (coffee mug)  MLESAC   9.94    3269.77   110.28   9.93
'second cylinder' (coffee mug)  GCSAC    13.83   2807.40   33.44    7.00
'second cylinder' (food can)    MLESAC   19.05   1231.16   479.74   19.58
'second cylinder' (food can)    GCSAC    21.41   1015.38   119.46   13.48
'second cylinder' (food cup)    MLESAC   15.04   1211.91   101.61   21.89
'second cylinder' (food cup)    GCSAC    18.8    1035.19   14.43    17.87
'second cylinder' (soda can)    MLESAC   13.54   1238.96   620.62   29.63
'second cylinder' (soda can)    GCSAC    20.6    1004.27   16.25    27.7
3.1.4.3 Evaluation results of new robust estimator
The performance of each method on the synthesized datasets is reported in
Tab. 3.2. For the real datasets, the experimental results for the cylindrical objects
are reported in Tab. 3.3, and Table 3.4 reports the fitting results for the spherical and
cone datasets.
Table 3.4: The average evaluation results on the 'second sphere' and 'second cone' datasets.
The experiments on the real datasets were repeated 20 times for statistically representative results.
Dataset           Measure    RANSAC   PROSAC   MLESAC   MSAC     LOSAC    NAPSAC    GCSAC
'second sphere'   w(%)       99.77    99.98    99.83    99.80    99.78    98.20     100.00
'second sphere'   Sd         29.60    26.62    29.38    29.37    28.77    35.55     11.31
'second sphere'   tp(ms)     3.44     3.43     4.17     2.97     7.82     4.11      2.93
'second sphere'   Er(%)      30.56    26.55    30.36    30.38    31.05    33.72     14.08
'second cone'     w(%)       79.52    71.89    75.45    71.89    80.21    38.79     82.27
'second cone'     Sd         126.56   156.40   147.00   143.00   96.37    1043.34   116.09
'second cone'     tp(ms)     10.94    7.42     13.05    9.65     96.37    25.39     7.14
'second cone'     EA(deg.)   38.11    40.35    35.62    25.39    29.42    52.64     23.74
'second cone'     Er(%)      77.52    77.09    74.84    75.10    71.66    76.06     68.84
3.1.5 Discussions
In this work, we have proposed GCSAC, a new RANSAC-based robust estimator
for fitting primitive shapes to point clouds. The key idea of the proposed
GCSAC is to combine a consistency check against the estimated model, via a
rough inlier-ratio evaluation, with the geometrical constraints of the shapes of interest.
This strategy aims to select good samples for the model estimation. The proposed
method was examined with primitive shapes such as a cylinder, a sphere, and a cone. The
experimental datasets consisted of synthesized and real datasets. The results of the
GCSAC algorithm were compared with those of various RANSAC-based algorithms, and they
confirm that GCSAC works well even on point clouds with a low inlier ratio. In the future,
we will continue to validate GCSAC on other geometrical structures and evaluate the
proposed method in a real scenario for detecting multiple objects.
3.2 Fitting objects using the context and geometrical constraints
3.2.1 Finding objects using the context and geometrical constraints
Let us consider a real scenario in the common daily activities of visually impaired
people. They come to a cafeteria and then give a query, "Where is a coffee cup?", as shown
in Fig. 1.
3.2.2 The proposed method of finding objects using the context and geo-
metrical constraints
In the context of developing object-finding-aided systems for the VIPs (as shown
in Fig. 1).
3.2.2.1 Model verification using contextual constraints
3.2.3 Experimental results of finding objects using the context and geo-
metrical constraints
3.2.3.1 Descriptions of the datasets for evaluation
The first dataset is constructed from a public one used in [3].
3.2.3.2 Evaluation measurements
3.2.3.3 Results of finding objects using the context and geometrical constraints
Table 3.5 compares the performance of the proposed method GCSAC and MLESAC.

Table 3.5: Average results of the evaluation measurements using GCSAC and MLESAC
on three datasets. The fitting procedures were repeated 50 times for statistical evaluations.
                           Without the context's constraint
Dataset          Method    Ea(deg.)   Er(%)   tp(ms)
First dataset    MLESAC    46.47      92.85   18.10
First dataset    GCSAC     36.17      81.01   13.51
Second dataset   MLESAC    47.56      50.78   25.89
Second dataset   GCSAC     40.68      38.29   18.38
Third dataset    MLESAC    45.32      48.48   22.75
Third dataset    GCSAC     43.06      46.9    17.14
3.2.4 Discussions
CHAPTER 4
DETECTING AND ESTIMATING THE FULL
MODEL OF 3-D OBJECTS AND DEPLOYING THE
APPLICATION
4.1 3-D object detection
4.1.1 Introduction
The objects of interest are placed on the table plane and have a simple
geometrical structure (e.g., coffee mugs, jars, bottles, and soda cans are cylindrical;
soccer balls are spherical). Our method exploits YOLO [2], a state-of-the-art
method for object detection in RGB images, which we chose for its high
detection performance. After that, the detected objects are
projected into the point cloud data (3-D data) to generate the full object models for
grasping and describing the objects.
Table 4.1: The average results of detecting spherical objects in the two stages (first dataset).
         First stage                  Second stage                 Average processing time
Method   Recall(%)   Precision(%)     Recall(%)   Precision(%)     tp(s)/scene
PSM      62.23       48.36            60.56       46.68            1.05
CVFGS    56.24       50.38            48.27       42.34            1.2
DLGS     88.24       78.52            76.52       72.29            0.5
4.1.2 Related Work
4.1.3 Three different approaches for 3-D objects detection in a complex
scene
4.1.3.1 Geometry-based method for Primitive Shape detection Method (PSM)
This method applies the Primitive Shape detection Method (PSM) of (Schnabel et
al.) to the point cloud of the objects.
4.1.3.2 Combination of Clustering objects, Viewpoint Features Histogram, GCSAC
for estimating 3-D full object models - CVFGS
4.1.3.3 Combination of Deep Learning based, GCSAC for