|
Home |
Faculties & Divisions |
Search |
|
|
Constrained generalised principal component analysisGeneralised Principal Component Analysis (GPCA) is a recently devised technique for fitting a multi-component structure to data. Unlike other methods which intertwine the processes of estimating structure components and segmenting data points into clusters associated with putative components, GPCA estimates a multi-component structure with no recourse to data clustering. The standard GPCA algorithm searches for an estimate by minimising an appropriate misfit function. The underlying constraints on the model parameters are ignored. Here we promote a variant of GPCA that incorporates the parameter constraints and exploits constrained rather than unconstrained minimisation of the error function. The output of any GPCA algorithm hardly ever perfectly satisfies the parameter constraints. The new version of GPCA greatly facilitates the final correction of the algorithm output to satisfy perfectly the constraints, making this step less prone to error in the presence of noise. The method is applied to the problem of fitting a pair of lines to noisy image points, but has potential for use in more general multi-component structure fitting. The ProblemOne of the challenges of image analysis and computer vision is to develop effective ways to fit a multi-component structure to data. A classical example problem is fitting multiple lines to data. Several methods have been proposed for solving this particular task, including those based on the Hough transform, K-means, EM, LMedS and RANSAC algorithms. More recently, there has been interest in fitting multiple linear manifolds to data. This more general problem arose in the analysis of dynamical scenes in computer vision in connection with the recovery of multiple motion models from image data. Image: Rene Vidal, Yi Ma, Shankar Sastry To tackle this fitting problem, a new approach has been put forth under the label of the generalised principle component analysis (GPCA). The GPCA method employs a parametric model in which parameters describe a multi-component structure to which various parts of a data set adhere. The number of components is assumed to be fixed and known beforehand. The relationship between data and components is encoded in a multivariate polynomial. In the special, but representative, case of multi-line fitting, the components are lines, the order of the polynomial coincides with the number of the components, and the recovery of the components is achieved by factoring the polynomial into a product of multivariate monomials, each corresponding to a separate line. The success of the whole procedure rests upon generation of a meaningful polynomial to factor. The ApproachCGPCA is a variant of GPCA based on the use of constrained optimisation, and the approximated maximum likelihood cost function. A statistically well justified cost functionThe standard GPCA algorithm minimises the product of the algebraic residuals of each sub-component, which is a heteroskedastic error measure if you expect the noise in the data to be approximately Gaussian when measured in the image plane.
Constrained estimationThe parameter estimation process is based on the following update equation
where
is the Hessian of the approximated maximum likelihood cost function evaluated at the current estimate, and
More details The resultsData was generated on two hyperplanes in R^3 and the fitting methods applied. The data points are plotted in yellow (although they're hard to see), the GPCA estimate in red, the unconstrained approximated maximum likelihood estimate in green, and the constrained estimate in blue.
Obviously the CGPCA estimate is a significantly better fit to the data. And just to prove that it wasn't a fluke
Anton van den Hengel September, 2005
|
|||||||||||||||||