3 - D object detections and recognitions: Assisting visually impaired people

DECLARATION OF AUTHORSHIP i

ACKNOWLEDGEMENT ii

CONTENTS v

SYMBOLS vi

LIST OF TABLES viii

LIST OF FIGURES xvii

1 LITERATURE REVIEW 8

1.1 Aided-systems for supporting visually impaired people . . . . . . . . . 8

1.1.1 Aided-systems for navigation services . . . . . . . . . . . . . . . 8

1.1.2 Aided-systems for obstacle detection . . . . . . . . . . . . . . . 9

1.1.3 Aided-systems for locating the interested objects in scenes . . . 11

1.1.4 Discussions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

1.2 3-D object detection, recognition from a point cloud data . . . . . . . . 13

1.2.1 Appearance-based methods . . . . . . . . . . . . . . . . . . . . 13

1.2.1.1 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . 16

1.2.2 Geometry-based methods . . . . . . . . . . . . . . . . . . . . . . 16

1.2.3 Datasets for 3-D object recognition . . . . . . . . . . . . . . . . 17

1.2.4 Discussions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

1.3 Fitting primitive shapes . . . . . . . . . . . . . . . . . . . . . . . . . . 18

1.3.1 Linear fitting algorithms . . . . . . . . . . . . . . . . . . . . . . 18

1.3.2 Robust estimation algorithms . . . . . . . . . . . . . . . . . . . 19

1.3.3 RANdom SAmple Consensus (RANSAC) and its variations . . . 20

1.3.4 Discussions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

2 POINT CLOUD REPRESENTATION AND THE PROPOSED METHOD

FOR TABLE PLANE DETECTION 24

2.1 Point cloud representations . . . . . . . . . . . . . . . . . . . . . . . . . 24

2.1.1 Capturing data by a Microsoft Kinect sensor . . . . . . . . . . . 24

2.1.2 Point cloud representation . . . . . . . . . . . . . . . . . . . . . 25

2.2 The proposed method for table plane detection . . . . . . . . . . . . . 28

2.2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

pdf159 trang | Chia sẻ: honganh20 | Ngày: 15/03/2022 | Lượt xem: 400 | Lượt tải: 1download
Bạn đang xem trước 20 trang tài liệu 3 - D object detections and recognitions: Assisting visually impaired people, để xem tài liệu hoàn chỉnh bạn click vào nút DOWNLOAD ở trên
NSAC-based estimator which aims to reduce the computational costs and improve the quality of the estimated models with contaminated data. The estimating parameters of a primitive geographic shape such as plane, sphere, cylinder, cone, from 3-D point cloud data is a fundamental research topic in the fields of computer vision and robotics. The geometrical model of an interested object can be estimated using from two to seven geometrical parameters [131]. A Random Sample Consensus (RANSAC) [46] and its paradigm attempt to extract as good as possible shape parameters which are objected either heavy noise in the data or processing time constraints. For being more accurate, faster and more robust, the RANSAC family focuses on either a better hypothesis from random samples or higher accuracy of data satisfying the estimated model. In this work, we propose to exploit geometrical constraints to obtain a qualified minimal sample set (MSS), i.e, good samples for model estimation. This sample set can be used to generate better hypotheses, and as a result, which result in a better estimated model. This study also demonstrates a real case where the proposed method is deployed for fitting cylindrical objects. Originally, a RANSAC paradigm randomly draws 3-D points from an input data set without any prior assumption on the data. Theoretically, RANSAC must run a relatively large number of iterations to find an optimal solution before a stopping criterion is met. Chum et al. (in [27]) argued that ”not all all-inlier samples are good”. They proposed the LO-RANSAC algorithm which is inspired by an idea that using an uncontaminated minimal sample is almost sufficiently or near perfect agreement 52 with theoretical performance. Therefore, the most crucial question is how to select the uncontaminated or so-called qualified or good samples from a set of data points. Motivated by this argument, we attempt to search for the qualified samples that could be better selected when geometrical constraints of the interested object(s) are used within a RANSAC-based algorithm. In particular, at each hypothesis in a frame- work of a RANSAC-based algorithm, a searching process aimed at finding good samples based on the constraints of an estimated model is implemented. To perform searches for good samples, we define two criteria: (1) The selected samples must ensure being con- sistent with the estimated model via a roughly inlier ratio evaluation; (2) The samples must satisfy explicit geometrical constraints of the interested objects (e.g., cylindrical constraints). The strategy of the proposed method first samples a minimal sample set that achieves a given inlier ratio among the samples and then adds samples that meet geometrical constraints derived from the primitive in concern. The seed generation is iterated until the convergence criterion is satisfied. In the other words, the key idea is guiding minimal sample set search using normal constraints of the geometric models (e.g., cylindrical objects). The constraints come from geometrical properties of a cylin- drical model. We tackle that at each iteration, thanks to the good samples, thus, an optimal model (with the maximal inlier ratio) is highly expected. Consequently, the number of iterations can be adaptively updated according to the certain inlier rate, results in an earlier termination. The proposed method is different from LO-RANSAC because only minimal sample set is utilized for the model estimation. It is also sepa- rated from the USAC (A Universal Framework for Random Sample Consensus) [112], where the sample check stage and/or model check are implemented after each random sampling procedure. Our search for good samples is actively burned at the start of each iteration. To evaluate a hypothesis, we utilize the maximum likelihood as MLE- SAC algorithm [143]. For a termination condition, we adopt adaptive RANSAC [55] in which the number of iterations is adjusted whenever a better model is generated. Fi- nally, the effectiveness of the proposed method is confirmed by fitting common shapes such as a sphere, cylinder and cone in both synthesized and public datasets. In these evaluations, GSSAC performances are compared with common RANSACs as original RANSAC [46], PROSAC [27], MLESAC [143], LO-SAC [29], NAPSAC [94]. The im- plementations of the proposed method and the datasets are made publicly available in their fanpage. 3.1.2 Related work For a general introduction and performances of RANSAC family, readers can refer to good surveys in [113, 26]. In the context of this research, we briefly survey related works which are categorized into two topics. First, efficient schemes on the selection 53 of minimal subset of samples for RANSAC-based robust estimators; and second, tech- niques for estimating parameters of the primitive shapes. For the first category, because the original RANSAC is very general with a straight- forward implementation, it always requires considerable computational time. Many RANSAC variants have been proposed with further optimization for a minimal sample set (MSS) selection. Progressive Sample Consensus or PROSAC [27] orders quality of samples through a similarity function of two corresponding points in the context of finding good matching features between a pair of images. In PROSAC algorithm, the most promising hypotheses are attempted earlier, therefore drawing the samples is implemented in a more meaningful order. However, PROSAC faces critical issues for defining the similarity function. LO-RANSAC [28] and its fixed version LO+-RANSAC [80] add local optimization steps within RANSAC to improve accuracy. To speed up the computation, adaptive RANSAC [55] probes the data via the consensus sets in order to adaptively determine the number of selected samples. The algorithm is imme- diately terminated when a smaller number of iterations has been obtained. With the proposed method, the good samples are expected to generate the best model as fast as possible. Therefore, the termination condition of the adaptive RANSAC [55] should be explored. Recently, USAC [112] introduces a new framework for a robust estimator. In the USAC framework, some strategies such as the sample check (Stage 1b) or the model check (Stage 2b), before and after model estimation, respectively, are similar to our ideas in this work. However, USAC does not really deploy an estimator for primitive shape(s) from a point cloud. A recent work [71] proposes to use geometric verification within a RANSAC framework. The authors deployed several check procedures such as sample relative configuration check based on the epipolar geometry. Rather than the ”check” procedures, our strategies anticipate achieving the best model as soon as possible. Therefore, the number of iterations is significantly reduced thanks to the results of the search for good sample process. The RANSAC-based algorithm used in the method of Chen et al. [25] and Aiger et al. [3] for registering of partially overlap- ping range images and partial surfaces of a 3D object. For primitive shape estimation from 3-D point clouds, readers can refer to a survey on feature-based techniques [69]. Relevant fitting techniques, for instance, multiscale super-quadric fitting in [36], Hough transform in [104], are commonly used. Marco et al. [87], Anas et al. [9] used the 3-D Hough Transform to estimate and extract sphere from point cloud data. However, the robust estimators (e.g., RANSAC family [26]) are always preferred techniques. Origi- nal RANSAC [46] demonstrates itself robust performances in estimating cylinders from range data. Recently in [144], the authors utilize normal vectors and curvature information 54 used for parameter estimation and extraction of cylinders. The cylindrical objects are also interested in the analytic geometrical techniques. The same group proposed method for extracting spheres and other primitives shaves in [145] and [146]. [136] focus on cylinder parameter estimation in three-dimensional point clouds, proposing a mathematical formulation based on angular distance to determine the cylinder ori- entation. [108] proposed a voting-based statistical cylinder detection framework. The framework then is applied to fallen tree mapping. In these works, the geometrical anal- ysis and datasets are different from the proposed method. We focus on selecting ”good samples” scheme in a RANSAC-based paradigm. Therefore, a geometrical analysis of a cylinder in [131] is adopted for defining criteria of the qualified samples as well as for estimating parameters of the interesting model from a 3-D point cloud. 3.1.3 The proposed a new robust estimator 3.1.3.1 Overview of the proposed robust estimator (GCSAC) To estimate parameters of a 3-D primitive shape, an original RANSAC paradigm, as shown in the top panel of Figure 3.2, selects randomly a MSS from a point cloud and then model parameters are estimated and validated. The algorithm is often compu- tationally infeasible and it is unnecessary to try every possible sample. Our proposed method (GCSAC - in the bottom panel of Figure 3.2) is based on an original version of RANSAC, however it is different in three major aspects: (1) At each iteration, the minimal sample set is conducted when the random sampling procedure is performed, so that probing the consensus data is easily achievable. In other words, a low pre- defined inlier threshold can be deployed as a weak condition of the consistency. Then after only (few) random sampling iterations, the candidates of good samples could be achieved. (2) The minimal sample sets consist of qualified samples which ensure geo- metrical constraints of the interested object. (3) Termination condition of the adaptive RANSAC algorithm [55] is adopted so that the algorithm terminates as soon as the minimal sample set is found for which the number of iterations of current estimation is less than that which has already achieved. For evaluating and updating the best estimated model, different from an original RANSAC-based algorithm whose inlier ratio is the criterion for updating the best model, GCSAC adopts the Negative Log-Likelihood criteria of MLESAC algorithm [143]. The best estimated model is chosen when the minimum value of the log-likelihood (denoted as −L) is found. At each iteration, −L is calculated by: −L = −log ( γr ( 1√ 2piσ )n exp (− e2 2σ2 ) + ( 1− γr ) 1 v ) (3.1) 55 A point cloud data Random or semi- random or using constraint for sampling a minimal subset Get a model M Update the number of iterations K, adaptively (Eq.1) Get the best model, update the via (inlier ratio, negative loglikelihood, cost function) best model Estimated model M*Terminate RANSAC Iteration Yes No A point cloud Compute Negative log- lihood L, update the best model Estimation model; Compute the inlier ratio w Search good sampling based on Geometrical constraint based on (GS) Random sampling w≥wt k≤K Estimated mode Good samples (GS) Yes No No k=0: MLESAC k=1:w≥ wt: Yes k=1:w≥ wt: No As MLESAC Figure 3.2 Top panel: Over view of RANSAC-based algorithms. Bottom panel: A diagram of the GCSAC’s implementations. where σ is the standard deviation of the error; v is the parameter space which outliers are expected to fall in; γr is the mixing parameter that is estimated based on Expec- tation Maximization (EM) from a set of indicator variables ηi with (i = 1, 2, ..., N). To determine the termination criterion for the estimation algorithm, a well-known calculation for determining a number of sample selection K is as Eq. 3.2. K = log(1− p) log(1− wm) (3.2) where p where p is the probability that ensures at least one of the random samples of m points is free from outliers. In other words, p is a confident score to successfully find a model describing the data, m is the minimal number of samples to estimate a model (to estimate a cylinder, a sphere, a cone is m = 2; m = 2 and m = 3, respectively), w is percentage of inliers in the point cloud. While p is usually set by a fixed value (e.g., p = 0.99 as a conservative probability), K therefore depends on w and m. The 56 algorithm terminates as soon as when the recomputed number of iterations K as Eq. 3.2 at the current estimation is less than the value of the current iteration k. A notice from Eq 3.2 is that the inlier w is usually affected by noise with an unknown bound. In the literature, some robust estimators (as RANSAC’s variations) have been proposed to avoid the need for user-defined parameters. For instance, [150] utilize a kernel density estimation as is adaptive scale sample consensus. They assumes that the inliers are located within some special structure of the density distribution. ASSC can provide a very high breakdown point, around 80%, when applying the proper bandwidth for the KDE. ASSC has subsequently been improved as ASKC (adaptive scale kernel consensus) ([149]). ASKC improves the objective function of ASSC and the robustness in the case of a high outlier rate. However, the fact that noise from a point cloud depends on the specific sensor, the sensor setting, and captured data in surrounding scenes. In this study, we do not take into account estimation of the w (inliers ratio). We adapt idea of the ”probing” the data via the consensus sets can be applied repeatedly in order to adaptively determine the number of samples required. When selecting good samples for estimating a model, the estimated model will be better and the inlier ratio w will be high. Based on Eq. 3.2 in our manuscript, the number of iterations of the estimation algorithm will decrease, and it will converge faster. Obviously, defining the geometrical criteria, which are to search for the good samples, is the most important task. Let denote U∗n storing m sample points to estimate a model. Based on the idea of the adaptive RANSAC [55] to probe initial samples, GCSAC starts from a rough selection of initial good samples. To initialize a set U∗n, we assume that the worst case of inlier ratio (or a weak-requirement of the inlier ratio, wt = 0.1 or 10% inlier) is pre-determined. As expectedly, a consensus set containing at least 10% inlier is easily found. Once a MSS is found, m− 1 samples are selected from m samples that are retained in the previous iteration in U∗n when the estimated model is satisfied (w ≥ wt). And the remaining one, mth, will be replaced by the better one so that the set U∗n is the best satisfied geometrical constraints of the interested shape. Consequently, the model estimated from good samples directly effects the estimation of the inlier ratio at current iteration. At next iteration, U∗n is reset (U ∗ n = ∅) for other estimations. If U∗n could not be initialized (or there is no operation when the condition wi ≥ wt is not satisfied), GCSAC algorithm degrades to the original RANSAC. The geometrical principles and constraints of a primitive shape are explained in Section 3.1.3.2. The number of iterations is computed from the number of inlier samples and the number of samples of point cloud N at each iteration. Each sample is an inlier sample when the Euclidean distance of it to the estimated model is less than a threshold T . 57 If the threshold is too large then many samples are far from the estimated model but they are inlier samples. While the threshold is too small, many samples are near from the estimated model but they are outlier samples. And p is often set to a fixed value (e.g., p = 0.99 as a conservative probability), K therefore depends on w and s. At each iteration k, if the estimated model is better (more accurate), then all of the w can be estimated more confidently. As result of using the good samples, the proposed method obviously tends to estimate an optimal model at each iteration, instead of randomly finding out the best one as the original RANSAC. Therefore, at each iteration k, w is confidently estimated and K is updated by Eq. 3.2. It may occur an immediate termination when at a certain iteration, w determines a K, which is smaller than that already performed. At each iteration, a sample point is specified as an inlier whose distance to the estimated model is smaller than a threshold T . In real datasets, this distance threshold T is usually chosen empirically. As explained in [55], when distribution of the data is a Gaussian with zero mean and standard deviation δ, the threshold distance T can be estimated by T 2 = 5.99δ2 for a probability of α = 0.95 as in Sec. 1.3.3. 3.1.3.2 Geometrical analyses and constraints for qualifying good samples In the following sections, the principles of geometrical analysis for 3-D primitive shapes are explained. Based on this analysis, the related constraints are defined to select good samples. The normal vector of any point is computed following the approach in Holz et al. [33]. At each point pi, k-nearest neighbors kn of pi are determined within a radius r. The normal vector of pi is therefore reduced to analysis of eigenvectors and eigenvalues of the covariance matrix C, which is presented as Sec. 2.2.3.2. a. Geometrical analysis for cylindrical objects: For geometrical analyzing a cylinder object, we utilize the method proposed by [131]. A cylinder is determined by four parameters: A center point on the cylinder axis, denoted as I(x0, y0, z0); A vector of the main direction, denoted as γc; Radius Ra of the cylinder; Height of the cylinder, set to 1 by default. The geometrical relationships of above parameters are shown in Fig. 3.3(a). A cylinder can be estimated from two points (p1, p2) (two blue-squared points) and their corresponding normal vectors (n1,n2) (marked by green and yellow line). Let γc be the main axis of the cylinder (red line) which is estimated by: γc = n1 × n2 (3.3) 58 (a) (b) p1p2 n1 n2 PlaneY L1 L2 Ic Estimated cylinder (c) (e) (f)(d) p1p2 p3 n1 n2 n3 γ2 γ1 p1 p2 γ n1 n2 γc Figure 3.3 Geometrical parameters of a cylindrical object. (a)-(c) Explanation of the geometrical analysis to estimate a cylindrical object. (d)-(e) Illustration of the geometrical constraints applied in GCSAC. (f) Result of the estimated cylinder from a point cloud. Blue points are outliers, red points are inliers. To specify a centroid point I, we project the two parametric lines L1 = p1 + tn1 and L2 = p2 + tn2 onto a plane specified by PlaneY (see Fig. 3.3(b)). As shown in Fig. 3.3(b), PlanY goes through a point p2. Its normal vector is a cross product of γc and n2 vectors (γc × n2). The PlanY therefore is defined by: XplaneY (x− xp2) + YplaneY (y − yp2) + ZplaneY (z − zp2) = 0 (3.4) where the cross product γc×n2 = (XplanY , YplanY , ZplanY ), and point p2 at (xp2 , yp2 , zp2). The normal vector of this plane is estimated by a cross product of γc and n1 vectors (γc × n1). The centroid point I is the intersection of L1 and L2 (see Fig. 3.3(c)). The radius Rc is set by the distance between Ic and p1 on that plane. The estimated cylinder from a point cloud is illustrated in Fig. 3.4(b). The height of the estimated cylinder is normalized to 1. It is noticed that the estimated cylinder in Fig. 3.4(b) is a wrong estimation because the hypothesis in this case consists of an inlier p2 and an outlier p1. The proposed framework is deployed for fitting a cylindrical object from a point cloud data. The corresponding normal vectors are already prepared in advance. The fitting procedure as shown in Figure 3.2 consists of the corresponding steps of the proposed framework. The algorithm starts by roughly selecting initial good samples. We adopt the idea of adaptive RANSAC [55] to probe initial samples. At each iteration, 59 Figure 3.4 (a) Setting geometrical parameters for estimating a cylindrical object from a point cloud as described above. (b) The estimated cylinder (green one) from an inlier p1 and an outlier p2. As shown, it is an incorrect estimation. (c) Normal vectors n1 and n∗2 on the plane pi are specified. we assume that the worst case estimate of wt determines an initial sample. In the worst case (let’s assume wt = 0.1), a consensus set containing more than 10% of the data is found, that is at least the proportion of inliers. The initial inlier ratio wt is set to 0.1 (or 10% inlier rate) is to easily find out a candidate of the estimated model. If a large value of wt is set, a larger size of the consensus set will be found. As a consequent, more computational time before searching good samples is required. We tackle that the good samples selection helps to estimate the better model even with an initial probing the data via a small consensus set. Once the stack U∗n of initial samples is conducted, we then search for good samples which satisfy the geometrical principles constraints of a 3-D cylindrical object as following. A calculating in the cost/score functions (e.g., our GCSAC adopted a Maximal Log-Likelihood function in MLESAC or counting the number of inlier points in origi- nal RANSAC algorithms) for the estimated cylindrical object (as described above) is suffered from two free parameters. While the first parameter, threshold distance T , it is a fixed and pre-determined value, the second one, an angle restricts the deviation of a points’ normal from that of the estimated shape [131]. These conditions are in- vestigated as following analyses. The first condition is illustrated in Fig. 3.4. In this example, a pair of inlier p1 and an outlier p2 is selected. This is a common case when at a certain iteration, two samples are randomly drawn. As shown in Fig. 3.4(a), γc - the main axis of the cylindrical object, is incorrectly estimated. Consequently, the estimated cylinder is a wrong estimation (as shown in Fig. 3.4(b)). This case can be eliminated using the first condition when a threshold distance T specifying inliers is applied. For instance, a standard cost function of RANSAC counts the number of inliers. As a consequent, yielding is a very low in this case. For the second parameter, let’s denote random inlier samples as p1, p2 and p3, as 60 shown in Fig. 3.3(d). In case of drawing two random points p1, p3, obviously, the first criteria is quickly satisfied because both of these samples are inliers (wi is larger than wt = 0.1). However, as shown in Fig. 3.3(d), the direction of the axis γ2 is totally different from ground-truth. Our second criterion (or search good samples) aims to update the initial samples (for example, p3 should be updated by p2). To obtain this, a centroid point I is a point on the main axis of the cylinder that the radius is the Euclidean distance from p1 to I. We first built a plane pi that is perpendicular to the plane PlaneY and consists of n1. Therefore its normal vector is npi = (nPlaneY × n1) where nPlaneY is the normal vector of PlaneY , as shown in Fig. 3.4(a). In other words, n1 is nearly perpendicular with n ∗ 2 where n ∗ 2 is the projection of n2 onto the plane pi. This observation leads to the criterion below: cp = arg min p2∈{Un\p1} {n1 · n∗2} (3.5) If cp is close to 0 then n1 and n ∗ 2 are orthogonal (perpendicular) vectors. It is noticed that in the example as shown in Fig. 3.3(e). In case, if the projection of n3 onto plane pi should be parallel with n1 then the dot product n1 · n∗3 is a large scalar value. b. Geometrical analysis for a spherical object: A sphere is determined by the following parameters: a centroid point denoted as Isp(x0, y0, z0) and its radius Rsp. To estimate sphere’s parameters, Schnabel et al. [131] propose to use two points (p1, p2) with their corresponding normal vectors (n1,n2) (see Fig. 3.5(left top panel)). The centroid Isp (a pink point Fig. 3.5(right top panel)) is the middle point of the shortest line segment that connects the two lines (n1, p1) and (n2, p2) (a green line of Fig. 3.5(middle top panel)). Identifying two lines pa = p1 + ma ∗ n1 and pb = p2 + mb ∗ n2 are shown in Fig. 3.5(middle top panel). ma,mb are the pa- rameters of two lines pa, pb formulations. The radius Rsp is determined by averaging the distance of Isp to p1 and Isp to p2. The estimated sphere is shown in Fig. 3.5(left bottom panel). The geometrical constraints for fitting spherical objects: As above denoted, a sphere is estimated from two points (p1, p2) and their normal vectors (n1, n2). In GCSAC, once set U∗n consisting of the initial good samples, say p1, is conducted, we store it and search for the other (i.e. p2) and search for p2 in the whole point cloud. We observe that to generate a sphere, the triangle (p1Ispp2) should be isosceles, as shown in Fig. 3.5(right bottom panel). Consequently, this observation forms a geometrical con

Các file đính kèm theo tài liệu này:

  • pdf3_d_object_detections_and_recognitions_assisting_visually_im.pdf
Tài liệu liên quan