Middleware for Distributed Video Surveillance
- Henry Detmold, Anton van den Hengel, Anthony Dick, Katrina Falkner, David Munro, Ron Morrison
Video surveillance networks are a class of sensor networks that serve several purposes, including protecting major facilities
from terrorism and other threats. At the hardware level, standard IP networking devices and IP video cameras enable building
thousand-camera networks at a reasonable cost. However, monitoring surveillance networks through human inspection is expensive
and remarkably ineffective. Trained operators lose concentration and miss a high percentage of significant events after only
10 minutes. Consequently, surveillance users are turning to software for automated video surveillance. Most research in this
area concentrates on the computer vision algorithms required to detect and interpret activity in video. Such work is limited
to networks of fewer than 100 cameras. We need to address the real-world issues raised by scaling to thousands of cameras and
integrating a diverse, evolving collection of surveillance approaches into continuously operating surveillance networks.
IEEE Distributed Systems Online, vol. 9, no. 2, 2008, art. no. 0802-o2001
The paper is available as
PDF
Fast global kernel density mode seeking: Applications to localisation and tracking
- C. Shen, M.J. Brooks, A. van den Hengel
Tracking objects in video using the mean shift (MS)
technique has been the subject of considerable attention. In this
work, we aim to remedy one of its shortcomings. MS, like other
gradient ascent optimization methods, is designed to find local
modes. In many situations, however, we seek the global mode
of a density function. The standard MS tracker assumes that
the initialization point falls within the basin of attraction of the
desired mode. When tracking objects in video this assumption
may not hold, particularly when the target’s displacement between
successive frames is large. In this case, the local and global
modes do not correspond and the tracker is likely to fail. A novel
multibandwidth MS procedure is proposed which converges to the
global mode of the density function, regardless of the initialization
point. We term the procedure annealed MS, as it shares similarities
with the annealed importance sampling procedure. The
bandwidth of the procedure plays the same role as the temperature
in conventional annealing. We observe that an over-smoothed
density function with a sufficiently large bandwidth is unimodal.
Using a continuation principle, the influence of the global peak
in the density function is introduced gradually. In this way, the
global maximum is more reliably located. Since it is imperative
that the computational complexity is minimal for real-time applications,
such as visual tracking, we also propose an accelerated
version of the algorithm. This significantly decreases the number
of iterations required to achieve convergence. We show on various
data sets that the proposed algorithm offers considerable promise
in reliably and rapidly finding the true object location when
initialized from a distant point.
IEEE Transactions on Image Processing, 16(5), pp. 1457-1469, May 2007
The paper is available as
PDF 1.7M
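The multibandwidth idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the 1D Gaussian kernel, the bandwidth schedule, the sample data, and all function names are assumptions chosen for illustration, and the accelerated variant is not included.

```python
import math

def mean_shift_step(x, samples, bandwidth):
    """One Gaussian-kernel mean shift update of the 1D estimate x."""
    weights = [math.exp(-0.5 * ((s - x) / bandwidth) ** 2) for s in samples]
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, samples)) / total

def mean_shift(x, samples, bandwidth, tol=1e-6, max_iter=500):
    """Iterate mean shift at a fixed bandwidth until the update is small."""
    for _ in range(max_iter):
        x_new = mean_shift_step(x, samples, bandwidth)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

def annealed_mean_shift(x0, samples, bandwidths):
    """Run mean shift over a decreasing bandwidth schedule.

    Each stage starts from the mode found at the previous, smoother stage,
    so the bandwidth plays the role of a temperature.
    """
    x = x0
    for h in sorted(bandwidths, reverse=True):
        x = mean_shift(x, samples, h)
    return x

# Hypothetical data: a dominant cluster near 10 and a minor cluster near 0.
samples = [10 + 0.1 * k for k in range(-10, 11)] + [0.1 * k for k in range(-3, 4)]

# Fixed small bandwidth, started at the minor cluster: stays at the local mode.
local = mean_shift(0.0, samples, 0.5)

# Assumed schedule from heavily smoothed to the final bandwidth.
annealed = annealed_mean_shift(0.0, samples, [8.0, 4.0, 2.0, 1.0, 0.5])
```

In this toy data, starting both runs at 0 places the initial point outside the basin of attraction of the dominant mode; only the annealed run is expected to end near 10.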
VideoTrace: Rapid interactive scene modelling from video
- A. van den Hengel, A. Dick, T. Thormählen, B. Ward, and P. H. S. Torr
VideoTrace is a system for interactively generating realistic 3D
models of objects from video—models that might be inserted into a
video game, a simulation environment, or another video sequence.
The user interacts with VideoTrace by tracing the shape of the object
to be modelled over one or more frames of the video. By interpreting
the sketch drawn by the user in light of 3D information
obtained from computer vision techniques, a small number of simple
2D interactions can be used to generate a realistic 3D model.
Each of the sketching operations in VideoTrace provides an intuitive
and powerful means of modelling shape from video, and executes
quickly enough to be used interactively. Immediate feedback
allows the user to model rapidly those parts of the scene which are
of interest and to the level of detail required. The combination of
automated and manual reconstruction allows VideoTrace to model
parts of the scene not visible, and to succeed in cases where purely
automated approaches would fail.
ACM Transactions on Graphics, 26(3), Article No. 86, July 2007
The paper is available as
PDF 541kB
Thrift: Local 3D Structure Recognition
- Alex Flint, Anthony Dick, Anton van den Hengel
This paper presents a method for describing and recognising
local structure in 3D images. The method extends
proven techniques for 2D object recognition in images. In
particular, we propose a 3D interest point detector that is
based on SURF, and a 3D descriptor that extends SIFT.
The method is applied to the problem of detecting repeated
structure in range images, and promising results are reported.
9th Biennial Conference of the Australian Pattern Recognition Society on Digital Image Computing Techniques and Applications
(Dec. 2007 : Glenelg, Australia), DICTA '07, Adelaide, Australia, December 2007.
The paper is available as
PDF 1.3MB
Interactive 3D Model Completion
- A. van den Hengel, A. Dick, T. Thormaehlen, B. Ward, P.H.S. Torr
A common problem when using automated structure
from motion techniques is that the object to be modelled can
only be partially reconstructed from the video. This can occur
because not all of the object is visible in the video, or
because of featureless or ambiguous regions on the object’s
surface. In this paper we present an interactive method
for rapidly and intuitively generating a complete 3D model
from the output of a structure and motion algorithm. The
method combines information obtained from the video data
with the partial 3D model and user interaction. It is demonstrated
on video containing partially seen objects, including
planar and curved surfaces, and indoor and outdoor settings.
9th Biennial Conference of the Australian Pattern Recognition Society on Digital Image Computing Techniques and Applications
(Dec. 2007 : Glenelg, Australia), DICTA '07, Adelaide, Australia, December 2007.
The paper is available as
PDF 1.2MB
A Shape Hierarchy for 3D Modelling from Video
- A. van den Hengel, A. Dick, T. Thormaehlen, B. Ward, P. H. S. Torr
This paper describes an interactive method for generating a model
of a scene from image data. The method uses the camera parameters
and point cloud typically generated by structure-and-motion
estimation as a starting point for developing a higher level model,
in which the scene is represented as a set of parameterised shapes.
Classes of shapes are represented in a hierarchy which defines
their properties but also the method by which they are localised
in the scene, using a combination of user interaction, sampling
and optimisation. Relations between shapes, such as adjacency
and alignment, are also specified interactively. The method thus
provides a modelling process which requires the user to provide
only high level scene information, the remaining detail being provided
through geometric analysis of the image set. This mixture of
guided, yet automated, fitting techniques allows a non-expert user
to rapidly and intuitively create a visually convincing 3D model of
a real world scene from an image set.
5th International Conference on Computer Graphics and Interactive Techniques in Australasia and Southeast Asia (GRAPHITE’07),
December, 2007
The paper is available as
PDF 560kB
RATSAC: a method for adaptive accelerated robust estimation, and its application to video synchronisation
- D.W. Pooley, M.J. Brooks, A. van den Hengel
A new method for robust estimation is introduced. The presented algorithm seeks to unify adjustable per-iteration speedup
methods with an adaptive assumption regarding the number of inliers. This is achieved by assuming a prior distribution on
the true number of inliers, and using Bayesian inference to adjust the speedup whenever a best-so-far model estimate is found.
Convincing results are obtained for both synthetic and real cases of the robust synchronisation of video pairs generated by
independently moving cameras.
9th Biennial Conference of the Australian Pattern Recognition Society on Digital Image Computing Techniques and Applications
(Dec. 2007 : Glenelg, Australia), DICTA '07, Adelaide, Australia, December 2007.
The paper is available as
PDF
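For context on why the assumed number of inliers matters in sampling-based robust estimation, the sketch below shows the standard RANSAC iteration-count rule. It is not the RATSAC method from the entry above, which instead places a prior on the number of inliers and adjusts a speedup; the function name, parameters, and example values are assumptions.

```python
import math

def ransac_iterations(inlier_ratio, sample_size, confidence=0.99):
    """Samples needed so that, with the given confidence, at least one
    random minimal sample contains only inliers (standard RANSAC rule)."""
    if not 0.0 < inlier_ratio <= 1.0:
        raise ValueError("inlier_ratio must be in (0, 1]")
    # Probability that one minimal sample is outlier-free.
    p_good = inlier_ratio ** sample_size
    if p_good >= 1.0:
        return 1
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_good))

# A pessimistic inlier assumption requires more samples than an updated,
# more optimistic one (hypothetical values, sample size 2).
n_pessimistic = ransac_iterations(0.5, 2)
n_updated = ransac_iterations(0.8, 2)
```

Re-evaluating this count whenever a better model raises the inlier estimate is the usual adaptive stopping strategy in plain RANSAC.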
An adaptive Bayesian technique for tracking multiple objects
- P. Kumar, M.J. Brooks, A. van den Hengel
Robust tracking of objects in video is a key challenge in computer vision with applications in automated surveillance, video
indexing, human-computer-interaction, gesture recognition, traffic monitoring, etc. Many algorithms have been developed for
tracking an object in controlled environments. However, they are susceptible to failure when the challenge is to track multiple
objects that undergo appearance change due to factors such as variation in illumination and object pose. In this paper
we present a tracker based on Bayesian estimation, which is relatively robust to object appearance change, and can track multiple
targets simultaneously in real time. The object model for computing the likelihood function is incrementally updated and uses
background-foreground segmentation information to ameliorate the problem of drift associated with object model update schemes.
We demonstrate the efficacy of the proposed method by tracking objects in image sequences from the CAVIAR dataset.
Int. Conf. Patt. Recog. and Mach. Intell. (ICPRMI '07), Kolkata, India, December 2007.
The paper is available as
PDF 2.9M
Finding Camera Overlap in Large Surveillance Networks
- Anton van den Hengel, Anthony Dick, Henry Detmold, Alex Cichowski and Rhys Hill
Recent research on video surveillance across multiple cameras
has typically focused on camera networks of the order of 10 cameras.
In this paper we argue that existing systems do not scale to a network of
hundreds, or thousands, of cameras. We describe the design and deployment
of an algorithm called exclusion that is specifically aimed at finding
correspondence between regions in cameras for large camera networks.
The information recovered by exclusion can be used as the basis for other
surveillance tasks such as tracking people through the network, or as an
aid to human inspection. We have run this algorithm on a campus network
of over 100 cameras, and report on its performance and accuracy
over this network.
8th Asian Conference on Computer Vision, Tokyo, Japan, November 2007
The paper is available as
PDF 600kB
Preparing for Post-catastrophe Video Processing
- A. van den Hengel, H. Detmold, A. Dick, R. Hill
The spread of video surveillance systems and video phones means that video cameras are more ubiquitous than ever before.
In the event of a catastrophe the video captured by these cameras is an important source of information for those directing
the response. This information is currently either ignored because it is seen as inaccessible, filtered through the media,
or requires a huge commitment of human resources in order to perform the required processing. We present here an approach
towards acquiring and processing this video in order to extract as much value as possible within the shortest time period.
The approach presented allows flexible, intelligent video processing to be carried out quickly and securely.
RNSA Security Technology Conference, Melbourne, Australia, September 2007
The paper is available as
PDF 166kB
Topology estimation for thousand-camera surveillance networks
- Henry Detmold, Anton van den Hengel, Anthony Dick, Alex Cichowski, Rhys Hill, Ekim Kocadag, Katrina Falkner and David S. Munro
Surveillance camera technologies have reached the point whereby networks of a thousand cameras are not uncommon. Systems for
collecting and storing the video generated by such networks have been deployed operationally, and sophisticated methods have
been developed for interrogating individual video streams. The principal contribution of this paper is a scalable method for
processing video streams collectively, rather than on a per camera basis, which enables a coordinated approach to large-scale
video surveillance. To realise our ambition of thousand camera automated surveillance networks, we use distributed processing
on a dedicated cluster. Our focus is on determining activity topology: the paths objects may take between cameras' fields
of view. An accurate estimate of activity topology is critical to many surveillance functions, including tracking targets
through the network, and may also provide a means for partitioning of distributed surveillance processing. We present several
implementations using the exclusion algorithm to determine activity topology. Measurements reported for the key system component
demonstrate scalability to networks with a thousand cameras. Whole-system measurements are reported for actual operation on
over a hundred camera streams (this limit is based on the number of cameras and computers presently available to us, not scalability).
Finally, we explore how to scale our approach to support multi-thousand camera networks.
First ACM/IEEE International Conference on Distributed Smart Cameras (ICDSC-07), Vienna, Austria, September 2007
The paper is available as
PDF 1.14MB
Determining the Translational Speed of a Camera from Time-Varying Optical Flow
- Anton van den Hengel, Wojciech Chojnacki, and Michael J. Brooks
Under certain assumptions, a moving camera can be self-calibrated solely on the basis of instantaneous optical flow. However,
due to a fundamental indeterminacy of scale, instantaneous optical flow is insufficient to determine the magnitude of the
camera’s translational velocity. This is equivalent to the baseline length indeterminacy encountered in conventional stereo
self-calibration. In this paper we show that if the camera is calibrated in a certain weak sense, then, by using time-varying
optical flow, the velocity of the camera may be uniquely determined relative to its initial velocity. This result enables
the calculation of the camera’s trajectory through the scene over time. A closed-form solution is presented in the continuous
realm, and its discrete analogue is experimentally validated.
1st International Workshop on Complex Motion, Gunzburg, Germany,
Springer-Verlag LNCS 3417, pp. 190-197, 2007
The paper is available as
PDF 300kB
Middleware for Video Surveillance Networks
- Henry Detmold, Anthony Dick, Katrina Falkner, David Munro, Anton van den Hengel, Ron Morrison
Automated video surveillance networks are a class of sensor networks with the potential to enhance the protection of facilities
such as airports and power stations from a wide range of threats. However, current systems are limited to networks of tens
of cameras, not the thousands required to protect major facilities. Realising thousand camera automated surveillance networks
demands middleware and architectural support, replacing the ad hoc approaches used in current systems with robust and scalable
methods. This paper introduces middleware supporting both computation and communication in automated video surveillance networks.
The computational approach is based on the Blackboard architectural style, which is widely used in signal processing and AI.
Communication on the surveillance network follows the service oriented model, with publish/subscribe messaging, providing
scalability, availability and the ability to integrate separately developed surveillance services. The middleware is demonstrated
through its application to an important class of surveillance algorithms.
Middleware for Sensor networks (MidSens2006), November, 2006, Melbourne, Australia.
The paper is available as
PDF 477kB
Generalised Principal Component Analysis: Exploiting Inherent Parameter Constraints
- W. Chojnacki, A. van den Hengel, M. Brooks
Generalised Principal Component Analysis (GPCA) is a recently devised
technique for fitting a multi-component, piecewise-linear structure to data
that has found strong utility in computer vision. Unlike other methods which intertwine
the processes of estimating structure components and segmenting data
points into clusters associated with putative components, GPCA estimates a multicomponent
structure with no recourse to data clustering. The standard GPCA
algorithm searches for an estimate by minimising a simple algebraic misfit function.
The underlying constraints on the model parameters are ignored. Here we
promote a variant of GPCA that incorporates the parameter constraints and exploits
constrained rather than unconstrained minimisation of a statistically motivated
error function. The output of any GPCA algorithm hardly ever perfectly
satisfies the parameter constraints. Our new version of GPCA greatly facilitates
the final correction of the algorithm output to satisfy perfectly the constraints,
making this step less prone to error in the presence of noise. The method is applied
to the example problem of fitting a pair of lines to noisy image points, but
has potential for use in more general multi-component structure fitting in computer
vision.
Advances in Computer Graphics and Computer Vision
International Conferences VISAPP and GRAPP 2006, Setúbal, Portugal, February 25-28, 2006, Revised Selected Papers
The paper is available as
PDF 600kB
Rapid Interactive Modelling from Video with Graph Cuts
- Anton van den Hengel, Anthony Dick, Thorsten Thormählen, Ben Ward, and Philip H. S. Torr
We present a method for generating a parameterised model of a scene from a set of images. The method is novel
in that it uses information from several sources—video, sparse 3D points and user input—to fit models to a scene.
The user drives the process by providing selected high-level scene information, for instance selecting an object
in the scene, or specifying the relationship between a pair of objects. The system combines this information with
image and 3D data to dynamically update its model of the scene. In doing so it avoids common pitfalls of both
automatic structure and motion algorithms, and image-based modelling packages.
Eurographics 2006, September 2006, Vienna, Austria.
The paper is available as
PDF 1.4M
Activity Topology Estimation for Large Networks of Cameras
- Anton van den Hengel, Anthony Dick, Rhys Hill
Estimating the paths that moving objects can take through the fields of view of possibly non-overlapping cameras, also known as their activity topology, is an important step in the effective interpretation of surveillance video. Existing approaches to this problem involve tracking moving objects within cameras, and then attempting to link tracks across views. In contrast we propose an approach which begins by assuming all camera views are potentially linked, and successively eliminates camera topologies that are contradicted by observed motion. Over time, the true patterns of motion emerge as those which are not contradicted by the evidence. These patterns may then be used to initialise a finer level search using other approaches if required. This method thus represents an efficient and effective way to learn activity topology for a large network of cameras, particularly with a limited amount of data.
IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS 2006), November, 2006, Sydney, Australia.
The paper is available as
PDF
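The elimination strategy in the abstract above, which starts with every pair of views linked and removes links contradicted by observation, can be sketched as follows. The contradiction rule used here (one region occupied while the other is empty at the same instant) and all names are assumptions for illustration; the sketch covers only simultaneous observations, not transit delays between non-overlapping views.

```python
from itertools import combinations

def estimate_links(regions, occupancy_frames):
    """Start with every pair of regions linked; remove contradicted pairs.

    occupancy_frames: iterable of sets, each holding the regions that are
    occupied at one time instant.
    Assumed rule: if exactly one of two regions is occupied at the same
    instant, the pair cannot be viewing the same space.
    """
    linked = {frozenset(pair) for pair in combinations(regions, 2)}
    for occupied in occupancy_frames:
        for pair in list(linked):
            a, b = tuple(pair)
            if (a in occupied) != (b in occupied):
                linked.discard(pair)
    return linked

# Hypothetical occupancy data for three camera regions.
regions = ["cam1", "cam2", "cam3"]
frames = [{"cam1", "cam2"}, set(), {"cam3"}, {"cam1", "cam2"}]
links = estimate_links(regions, frames)
```

Because evidence only ever removes links, each frame can be processed independently, and the surviving pairs are those never contradicted by the data seen so far.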
Scalable Surveillance Software Architecture
- Henry Detmold, Anthony Dick, Katrina Falkner, David S. Munro, Anton van den Hengel, Ron Morrison
Video surveillance is a key technology for enhanced protection
of facilities such as airports and power stations from
various types of threat. Networks of thousands of IP-based
cameras are now possible, but current surveillance methodologies
become increasingly ineffective as the number of
cameras grows. Constructing software that efficiently and
reliably deals with networks of this size is a distributed information
processing problem as much as it is a video interpretation
challenge. This paper demonstrates a software
architecture approach to the construction of large scale
surveillance network software and explores the implications
for instantiating surveillance algorithms at such a scale. A
novel architecture for video surveillance is presented, and
its efficacy demonstrated through application to an important
class of surveillance algorithms.
IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS 2006), November, 2006, Sydney, Australia.
The paper is available as
PDF
Building Models of Regular Scenes from Structure-and-Motion
- Anton van den Hengel, Anthony Dick, Thorsten Thormählen, Ben Ward, Philip H. S. Torr
This paper describes a method for generating a model-based reconstruction of
a scene from image data. The method uses the camera models and point cloud
typically generated by a structure-and-motion process as a starting point for
developing a higher level model of the scene. The method relies on the user
to provide a minimal amount of structural seeding information from which
more complex geometry is extrapolated. The regularity typically present in
man-made environments is used to minimise the interaction required, but also
to improve the accuracy of fit. We demonstrate model based reconstructions
obtained using this method.
The Seventeenth British Machine Vision Conference (BMVC 2006), September 2006, Edinburgh, United Kingdom
The paper is available as
PDF 2.7M
Hierarchical model fitting to 2D and 3D data
- A. van den Hengel, A. Dick, T. Thormaehlen, B. Ward, P. H. S. Torr
We propose a method for interactively generating a model-based reconstruction of a scene from a set of images. The method
facilitates the fitting of multiple object models to the data in a manner that provides the best overall fit to the image
set. This requires that models are not fit independently, but rather collectively, each potentially impacting upon the fit
of the other.
Third International Conference on Computer Graphics, Imaging and Visualisation, IEEE Computer Society Press, July 2006, Sydney,
Australia
The paper is available as
PDF 264kB
-
Utilising the broad range of information which human observers
bring to bear when interpreting their visual environment is currently
infeasible for artificial vision systems. We propose instead a method
for modelling compound structures which intelligently divides this prior
information into that which may be applied by the system and that
which may not. Models are fitted to the input data on the basis of 2D
and 3D image-based measures, but also as directed by a prior which is
split between the human and the system. Importantly this split is carried
out in a manner which minimises the human input required.
International Workshop on the Representation and Use of Prior Knowledge in Vision (WRUPKV) held in association with ECCV’06,
May 2006, Graz, Austria. Also selected for publication in LNCS
The paper is available as
PDF 1.6M
Fast global kernel density mode seeking: applications to localisation and tracking
- C. Shen, M. J. Brooks, A. van den Hengel
We address the problem of seeking the global mode of a density function using the mean shift algorithm. Mean shift, like other
gradient ascent optimisation methods, is susceptible to local maxima, and hence often fails to find the desired global maximum.
In this work, we propose a multi-bandwidth mean shift procedure that alleviates this problem, which we term annealed mean
shift, as it shares similarities with the annealed importance sampling procedure. The bandwidth of the algorithm plays the
same role as the temperature in annealing. We observe that the over-smoothed density function with a sufficiently large bandwidth
is uni-modal. Using a continuation principle, the influence of the global peak in the density function is introduced gradually.
In this way the global maximum is more reliably located. Generally, the price of this annealing-like procedure is that more
iterations are required. Since it is imperative that the computational complexity is minimal in real-time applications such
as visual tracking, we propose an accelerated version of the mean shift algorithm. Compared with the conventional mean shift
algorithm, the accelerated mean shift can significantly decrease the number of iterations required for convergence. The proposed
algorithm is applied to the problems of visual tracking and object localisation. We empirically show on various data sets
that the proposed algorithm can reliably find the true object location when the starting position of mean shift is far away
from the global maximum, in contrast with the conventional mean shift algorithm that will usually get trapped in a spurious
local maximum.
IEEE International Conference on Computer Vision (ICCV'05), Beijing, China, Oct. 2005.
The paper is available as
PDF
Computing Surface-based Photo-Consistency on Graphics Hardware
- J. Bastian, A. van den Hengel
This paper describes a novel approach to the problem of recovering information from an image set by comparing the radiance
of hypothesised point correspondences. Our algorithm is applicable to a number of problems in computer vision, but is explained
particularly in terms of recovering geometry from an image set. It uses the idea of photo-consistency to measure the confidence
that a hypothesised scene description generated the reference images. Photo-consistency has been used in volumetric scene
reconstruction where a hypothesised surface is evolved by considering one voxel at a time. Our approach is different: it represents
the scene as a parameterised surface so decisions can be made about its photo-consistency simultaneously over the entire surface
rather than a series of independent decisions. Our approach is further characterised by its ability to execute on graphics
hardware. Experiments demonstrate that our cost function minimises at the solution and is not adversely affected by occlusion.
Proceedings of Digital Image Computing: Techniques and Applications, December 2005, Cairns, Australia.
The paper is available as
PDF
Constrained Generalised Principal Component Analysis
- W. Chojnacki, A. van den Hengel, M. Brooks
Generalised Principal Component Analysis (GPCA) is a
recently devised technique for fitting a multi-component
structure to data. Unlike other methods which intertwine the
processes of estimating structure components and segmenting
data points into clusters associated with putative components,
GPCA estimates a multi-component structure with
no recourse to data clustering. The standard GPCA algorithm
searches for an estimate by minimising an appropriate
misfit function. The underlying constraints on the model parameters
are ignored. Here we promote a variant of GPCA
that incorporates the parameter constraints and exploits constrained
rather than unconstrained minimisation of the error
function. The output of any GPCA algorithm hardly ever
perfectly satisfies the parameter constraints. The new version
of GPCA greatly facilitates the final correction of the
algorithm output to satisfy perfectly the constraints, making
this step less prone to error in the presence of noise. The
method is applied to the problem of fitting a pair of lines to
noisy image points, but has potential for use in more general
multi-component structure fitting.
Proceedings of International Conference on Computer Vision Theory and Applications, February 2006, Setúbal, Portugal
The paper is available as
PDF 164kB
Augmented particle filtering for efficient visual tracking
- Chunhua Shen, Michael J. Brooks, and Anton van den Hengel
Visual tracking is one of the key tasks in computer vision. The
particle filter algorithm has been extensively used to tackle this
problem due to its flexibility. However the conventional particle
filter uses system transition as the proposal distribution, frequently
resulting in poor priors for the filtering step. The main reason is
that it is difficult, if not impossible, to accurately model the target's
motion. Such a proposal distribution does not take into account
the current observations. It is not a trivial task to devise a satisfactory
proposal distribution for the particle filter. In this paper
we advance a general augmented particle filtering framework for
designing the optimal proposal distribution. The essential idea is
to augment a second filter's estimate into the proposal distribution
design. We then show that several existing improved particle filters
can be rationalised within this general framework. Based on this
framework we further propose variant algorithms for robust and
efficient visual tracking. Experiments indicate that the augmented
particle filters are more efficient and robust than the conventional
particle filter.
IEEE International Conference on Image Processing (ICIP'05)
The paper is available as
PDF 915K
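The conventional particle filter that the abstract above takes as its baseline uses the system transition as the proposal distribution. A minimal sketch of that baseline follows; it is not the augmented framework from the paper, and the 1D random-walk state, Gaussian noise models, parameter values, and names are all assumptions.

```python
import math
import random

def bootstrap_step(particles, observation, rng, motion_std=1.0, obs_std=1.0):
    """One step of a bootstrap particle filter for a 1D random-walk state.

    The proposal is the system transition itself, so the current
    observation influences only the weights, not where particles land.
    """
    # Propose: sample each particle from the transition model.
    proposed = [p + rng.gauss(0.0, motion_std) for p in particles]
    # Weight: Gaussian likelihood of the observation given each particle.
    # The tiny constant guards against all weights underflowing to zero.
    weights = [math.exp(-0.5 * ((observation - p) / obs_std) ** 2) + 1e-300
               for p in proposed]
    total = sum(weights)
    weights = [w / total for w in weights]
    estimate = sum(w * p for w, p in zip(weights, proposed))
    # Resample with replacement in proportion to the weights.
    resampled = rng.choices(proposed, weights=weights, k=len(proposed))
    return resampled, estimate

rng = random.Random(0)
particles = [rng.gauss(0.0, 1.0) for _ in range(500)]
for z in [0.5, 1.0, 1.5, 2.0]:  # hypothetical observations
    particles, estimate = bootstrap_step(particles, z, rng)
```

When the motion model is poor, many proposed particles fall far from the observation and receive negligible weight, which is the inefficiency that observation-aware proposals aim to reduce.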
Visual tracking via efficient kernel discriminant subspace learning
- Chunhua Shen, Anton van den Hengel, Michael J. Brooks
Robustly tracking moving objects in video sequences is one of the key
problems in computer vision. In this paper we introduce a
computationally efficient nonlinear kernel learning strategy to find a
discriminative model which distinguishes the tracked object from the
background. Principal Component Analysis and Linear Discriminant
Analysis have been applied to this problem with some success. These
techniques are limited, however, by the fact that they are capable
only of identifying linear subspaces within the data. Kernel based
methods, in contrast, are able to extract nonlinear subspaces, and
thus represent more complex characteristics of the tracked object and
background. This is a particular advantage when tracking deformable
objects and where appearance changes occur due to unstable illumination
and pose. An efficient approximation to Kernel Discriminant
Analysis using QR decomposition proposed by Xiong
et al. makes possible real-time updating of the
optimal nonlinear subspace. We present a tracking method based on
this result and show promising experimental results on real videos
undergoing large pose and illumination changes.
IEEE International Conference on Image Processing (ICIP'05)
The paper is available as
PDF 447K
Enhanced importance sampling: unscented auxiliary particle filtering for visual tracking
- Chunhua Shen, Anton van den Hengel, Anthony Dick and Michael J. Brooks
The particle filter has attracted considerable attention in visual
tracking due to its relaxation of the linear and Gaussian restrictions
in the state space model. It is thus more flexible than the Kalman filter.
However, the conventional particle filter uses system transition as the
proposal distribution, leading to poor sampling efficiency and poor
performance in visual tracking. It is not a trivial task to design satisfactory
proposal distributions for the particle filter. In this paper, we introduce
an improved particle filtering framework into visual tracking, which combines
the unscented Kalman filter and the auxiliary particle filter. The
efficient unscented auxiliary particle filter (UAPF) uses the unscented
transformation to predict one-step ahead likelihood and produces more
reasonable proposal distributions, thus reducing the number of particles
required and substantially improving the tracking performance.
Experiments on real video sequences demonstrate that the UAPF is computationally
efficient and outperforms the conventional particle filter and the
auxiliary particle filter.
17th Australian Joint Conference on Artificial Intelligence (AI'04)
The paper is available as
PDF 588K
2D articulated tracking with dynamic Bayesian networks
- Chunhua Shen, Anton van den Hengel, Anthony Dick and Michael J. Brooks
We present a novel method for tracking the motion of an
articulated structure in a video sequence. The analysis of
articulated motion is challenging because of the potentially
large number of degrees of freedom (DOFs) of an articulated
body. For particle filter based algorithms, the number
of samples required with high dimensional problems can
be computationally prohibitive. To alleviate this problem,
we represent the articulated object as an undirected graphical
model (or Markov Random Field, MRF) in which soft
constraints between adjacent subparts are captured by conditional
probability distributions. The graphical model is
extended across time frames to implement a tracker. The
tracking algorithm can be interpreted as a belief inference
procedure on a dynamic Bayesian network. The discretisation
of the state vectors makes it possible to utilise the
efficient belief propagation (BP) and mean field (MF) algorithms
to reason in this network. Experiments on real video
sequences demonstrate that the proposed method is computationally
efficient and performs well in tracking the human
body.
4th International Conference on Computer and Information Technology (CIT'04)
The paper is available as
PDF 353K
From FNS to HEIV: a link between two vision parameter estimation methods
- Wojciech Chojnacki, Michael J. Brooks, Anton van den Hengel, D. Gawley
Problems requiring accurate determination of parameters
from image-based quantities arise often in computer vision. Two
recent, independently developed frameworks for estimating such
parameters are the FNS scheme of the authors, and the HEIV scheme of
Leedan and Meer. In this paper, it is shown that the two schemes
constitute intimately related but different means of numerically
solving a common underlying equation characterising the
minimiser. The analysis is driven by the search for a non-degenerate
form of a certain generalised eigenvalue problem, and this
effectively leads to a new derivation of the HEIV algorithm. This
work may be seen as an extension of the authors' previous efforts to
rationalise and inter-relate a spectrum of estimators, including the
renormalisation method of Kanatani and the normalised eight-point
method of Hartley.
IEEE Trans. Pattern Analysis Machine Intelligence, 26, 2, pp. 264-268, Feb. 2004.
The paper is available as
PDF 111K
A new approach to constrained parameter estimation applicable to some
computer vision problems
- Wojciech Chojnacki, Michael J. Brooks, Anton van den Hengel, D. Gawley
Previous work of the authors developed a theoretically
well-founded scheme (FNS) for finding the minimiser of a class of cost
functions. Various problems in video analysis, stereo vision,
ellipse-fitting, etc, may be expressed in terms of finding such a
minimiser. However, in common with many other approaches, it is
necessary to correct the minimiser as a post-process if an ancillary
constraint is also to be satisfied. In this paper we develop the first
integrated scheme (CFNS) for simultaneously minimising the cost
function and satisfying the constraint. Preliminary experiments in the
domain of fundamental-matrix estimation show that CFNS generates
rank-2 estimates with smaller cost function values than rank-2
corrected FNS estimates. Furthermore, when compared with the Hartley-
Zisserman Gold Standard method, CFNS is seen to generate results of
comparable quality in a fraction of the time.
Image and Vision Computing Volume 22, Issue 2 , 1 February 2004, Pages 85-91
The paper is available as
PDF 277K
FNS, CFNS and HEIV: Extending Three Vision Parameter Estimation Methods
- Wojciech Chojnacki, Michael J. Brooks, Anton van den Hengel, and Darren Gawley
Estimation of parameters from image tokens is a central problem in computer vision. FNS, CFNS and HEIV are three recently
developed methods for solving special but important cases of this problem. The schemes are means for finding unconstrained
(FNS, HEIV) and constrained (CFNS) minimisers of cost functions. In earlier work of the authors, FNS, CFNS and a version of
HEIV were applied to a specific cost function. Here we outline an extension of the approach to more general cost functions.
This allows the FNS, CFNS and HEIV methods to be placed within a common framework.
Proceedings of Digital Image Computing: Techniques and Applications, December 2003, Sydney, Australia, pp 663-672
The paper is available as
PDF 148K
Computing Image-Based Reprojection Error on Graphics Hardware
- J Bastian, A. van den Hengel
This paper describes a novel approach to the problem of recovering information from an image set by comparing the radiance
of hypothesised point correspondences. This method is applicable to a number of problems in computer vision, but is explained
particularly in terms of recovering geometry and camera parameters from image sets. The algorithm employs a cost-function
to represent the probability that a hypothesised scene description and camera parameters generated the reference images and
is characterised by its ability to execute on graphics hardware. Experiments show that minimisation of the cost-function
converges to a valid solution provided there are adequate geometric constraints and projective coverage.
Proceedings of Digital Image Computing: Techniques and Applications, December 2003, Sydney, Australia, pp 663-672
The paper is available as
PDF 1335K
Probabilistic Multiple Cue Integration for Particle Filter Based Tracking
- Chunhua Shen, Anton van den Hengel, and Anthony Dick
Robust visual tracking has become an important topic in the field of
computer vision. The integration of cues such as color, edge strength and
motion has proved to be a promising approach to robust visual tracking in
situations where no single cue is suitable. In this paper, an algorithm is
presented which integrates multiple cues in a probabilistic manner.
Specifically the likelihood of each cue is calculated and weighted before
Bayes' rule is applied to obtain the resultant posterior. This posterior
is generally not well represented analytically, and is therefore
represented as a set of weighted particles, which is updated at each frame
by a particle filter. This paper demonstrates how the combination of
multiple cue integration and particle filtering results in a robust
tracking method. We also demonstrate how each cue's weight can be adapted
on-line during the tracking procedure.
Proceedings of Digital Image Computing: Techniques and Applications, December 2003, Sydney, Australia, pp 399-408
The paper is available as
PDF 177K
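The abstract above describes weighting each cue's likelihood before applying Bayes' rule to update a set of weighted particles. A minimal sketch of one such update follows. The specific fusion rule used here (a product of per-cue likelihoods, each raised to its cue weight) is an assumption made for illustration, as are all function names and numbers; the paper's own weighting scheme may differ.

```python
import math


def combine_cues(cue_likelihoods, cue_weights):
    """Fuse per-cue likelihoods for one particle as prod_k p_k ** a_k.

    Computed in the log domain; tiny likelihoods are floored to avoid log(0).
    """
    log_p = 0.0
    for p, a in zip(cue_likelihoods, cue_weights):
        log_p += a * math.log(max(p, 1e-300))
    return math.exp(log_p)


def posterior_weights(prior_weights, per_particle_cues, cue_weights):
    """Bayes update: new weight is prior weight times fused likelihood, then normalised."""
    unnormalised = [
        w * combine_cues(cues, cue_weights)
        for w, cues in zip(prior_weights, per_particle_cues)
    ]
    total = sum(unnormalised)
    if total <= 0.0:
        raise ValueError("all particles have zero posterior weight")
    return [u / total for u in unnormalised]


# Hypothetical data: two particles, two cues (say colour and edge strength).
prior = [0.5, 0.5]
cues = [[0.8, 0.6], [0.2, 0.4]]
print(posterior_weights(prior, cues, cue_weights=[1.0, 1.0]))
print(posterior_weights(prior, cues, cue_weights=[1.0, 0.0]))  # edge cue switched off
```

Setting a cue weight to zero removes that cue's influence entirely, which is one simple way to picture adapting cue weights on-line.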
Incorporating constraints into the design of locally identifiable calibration patterns
- Anton van den Hengel, Rhys Hill, Michael J. Brooks
Camera calibration requires the identification of points in an image
that correspond to known locations in the scene. These are typically
determined through the use of a calibration pattern designed to
facilitate feature localisation. We present in this paper a novel
method of generating patterns such that each subregion is
individually identifiable by its cross ratio. The method aims to
minimise the probability of misidentifying a subregion. A key
advantage of the method is the ability to place constraints on the
size of the elements constituting the pattern. This allows a
calibration object to be used in a wider variety of viewing
conditions, increasing the flexibility of the calibration process.
International Conference on Image Processing, 2003, I: 817-820
The paper is available as
PDF 292K
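The cross ratio mentioned in the abstract above is useful for identification because it is preserved by projective transformations. The sketch below checks this in the simplest setting, four points on a line under a one-dimensional projective map. It is an illustration under assumed names and values, not the pattern-generation method of the paper.

```python
def cross_ratio(a, b, c, d):
    """Cross ratio of four distinct collinear points given by 1-D coordinates."""
    return ((c - a) * (d - b)) / ((c - b) * (d - a))


def projective_map(x, p, q, r, s):
    """1-D projective transformation x -> (p*x + q) / (r*x + s), with p*s - q*r != 0."""
    return (p * x + q) / (r * x + s)


# Hypothetical point positions along one line of a calibration pattern.
points = [0.0, 1.0, 3.0, 7.0]

# Hypothetical viewing transformation (coefficients chosen arbitrarily).
viewed = [projective_map(x, 2.0, 1.0, 0.5, 3.0) for x in points]

print("cross ratio in the pattern:", cross_ratio(*points))
print("cross ratio in the image:  ", cross_ratio(*viewed))
```

The two printed values agree up to rounding even though the individual point spacings change, which is why a subregion can be recognised from its cross ratio regardless of viewpoint.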
A Voting Scheme For Estimating The Synchrony Of Moving-Camera Videos
- D.W. Pooley, M.J. Brooks, A.J. van den Hengel, W. Chojnacki
Recovery of dynamic scene properties from multiple videos usually
requires the manipulation of synchronous (simultaneously captured)
frames. This paper is concerned with the automated determination of
this synchrony when the temporal alignment of sequences is unknown. A
cost function characterising departure from synchrony is first evolved
for the case in which two videos are generated by cameras that may be
moving. A novel voting method is then presented for minimising the
cost function in the case where the ratio of the cameras' frame rates
is unknown. Experimental results indicate this relatively general
approach holds promise.
International Conference on Image Processing, 2003, I: 413-416
The paper is available as
PDF 321K
Revisiting Hartley's Normalised Eight-Point Algorithm
- Wojciech Chojnacki, Michael J. Brooks, Anton van den Hengel, D. Gawley
The paper gives a novel explanation for the improvement in
performance of the stereo-vision eight-point algorithm that results
from using normalised data. It is first established that the
normalised algorithm acts to minimise a specific cost function. It
is then shown that this cost function is statistically better
founded than the cost function associated with the non-normalised
algorithm. This supersedes the standard argument that improved
performance is due to the better conditioning of a pivotal matrix.
Experimental results are given that support the shift in argument
from numerical stability to statistical soundness as a means of
rationalising performance. This work continues a wider effort to
place a variety of estimation techniques within a coherent
framework.
IEEE Trans. Pattern Analysis Machine Intelligence, 25, 9, 2003, pp 1172-1177.
The paper is available as
PDF 136K
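The data normalisation referred to in the abstract above is Hartley's preprocessing step: image points are translated so that their centroid lies at the origin and scaled so that their mean distance from the origin is the square root of two. A minimal sketch of that step is given below; the function name and sample coordinates are assumptions, and the sketch covers only the normalisation, not the eight-point estimation or the statistical analysis in the paper.

```python
import math


def normalise_points(points):
    """Return (normalised_points, T) for a list of (x, y) image points.

    T is the 3x3 similarity transform, as nested lists, mapping homogeneous
    input points to normalised points with zero centroid and mean distance sqrt(2).
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    mean_dist = sum(math.hypot(x - cx, y - cy) for x, y in points) / n
    if mean_dist == 0.0:
        raise ValueError("points must not all coincide")
    s = math.sqrt(2.0) / mean_dist
    transform = [
        [s, 0.0, -s * cx],
        [0.0, s, -s * cy],
        [0.0, 0.0, 1.0],
    ]
    normalised = [(s * (x - cx), s * (y - cy)) for x, y in points]
    return normalised, transform


# Hypothetical pixel coordinates.
pixels = [(10.0, 20.0), (110.0, 40.0), (60.0, 220.0), (200.0, 180.0)]
scaled, T = normalise_points(pixels)
for point in scaled:
    print(point)
```

In the usual workflow the same kind of transform is computed separately for each image, the fundamental matrix is estimated from the normalised points, and the estimate is then mapped back using the two transforms.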
A new constrained parameter estimator: experiments in fundamental matrix computation
- A. van den Hengel, W. Chojnacki, M. J. Brooks, D. Gawley
In recent work the authors proposed a wide-ranging method for estimating parameters that constrain image feature locations
and satisfy a constraint not involving image data. The present work illustrates the use of the method with experiments concerning
estimation of the fundamental matrix. Results are given for both synthetic and real images. It is demonstrated that the method
gives results commensurate with, or superior to, previous approaches, with the advantage of being fast.
In Proceedings of the 13th British Machine Vision Conference, September, 2002, volume 2, pp 468-476, 2002.
The paper is available as
PDF 184K
A new approach to constrained parameter estimation applicable to some computer vision problems
- W. Chojnacki, M. J. Brooks, A. van den Hengel, D. Gawley
Previous work of the authors developed a theoretically well-founded scheme (FNS) for finding the minimiser of a class of cost
functions. Various problems in video analysis, stereo vision, ellipse-fitting, etc, may be expressed in terms of finding such
a minimiser. However, in common with many other approaches, it is necessary to correct the minimiser as a post-process if an
ancillary constraint is also to be satisfied. In this paper we develop the first integrated scheme (CFNS) for simultaneously
minimising the cost function and satisfying the constraint. Preliminary experiments in the domain of fundamental-matrix
estimation show that CFNS generates rank-2 estimates with smaller cost function values than rank-2 corrected FNS estimates.
Furthermore, when compared with the Hartley-Zisserman Gold Standard method, CFNS is seen to generate results of comparable
quality in a fraction of the time.
In D. Suter, editor, Statistical Methods in Video Processing Workshop held in conjunction with ECCV'02, Copenhagen, Denmark,
June 1-2, 2002.
The paper is available as
PDF 272K
What value covariance information in estimating vision parameters?
- M. J. Brooks, W. Chojnacki, D. Gawley, A. van den Hengel
Many parameter estimation methods used in computer vision are able to utilise covariance information describing the uncertainty
of data measurements. This paper considers the value of this information to the estimation process when applied to measured
image point locations. Covariance matrices are first described and a procedure is then outlined whereby covariances may be
associated with image features located via a measurement process. An empirical study is made of the conditions under which
covariance information enables generation of improved parameter estimates. Also explored is the extent to which the noise
should be anisotropic and inhomogeneous if improvements are to be obtained over covariance-free methods. Critical in this is
the devising of synthetic experiments under which noise conditions can be precisely controlled. Given that covariance
information is, in itself, subject to estimation error, tests are also undertaken to determine the impact of imprecise covariance
information upon the quality of parameter estimates. Finally, an experiment is carried out to assess the value of covariances
in estimating the fundamental matrix from real images.
International Conference on Computer Vision, Vancouver, July 2001.
The paper is available as
PDF 117K
Rationalising the Renormalisation Method of Kanatani
- W. Chojnacki, M. J. Brooks, A. van den Hengel
The renormalisation technique of Kanatani is intended to iteratively minimise a cost function of a certain form while avoiding
systematic bias inherent in the common method of minimisation due to Sampson. Within the vision community, the technique
has generally been perceived as somewhat controversial and impenetrable. This work presents an alternative, simpler
derivation of the technique, along with new insights that place it in the context of other approaches. We first show that the
minimiser of the cost function must satisfy a special variational equation. A Newton-like, fundamental numerical scheme is
presented with the property that its theoretical limit coincides with the minimiser. Standard statistical techniques are then
employed to derive afresh several renormalisation schemes. The fundamental scheme proves pivotal in the rationalising of the
renormalisation and other schemes, and enables us to show that the renormalisation schemes do not have as their theoretical
limit the desired minimiser. The various minimisation schemes are finally subjected to a rigorous performance analysis.
Journal Mathematical Imaging and Vision, 14, 1, 2001, 21-38.
The paper is available as
PDF 188Kb
Is covariance information useful in estimating vision parameters?
- M. J. Brooks, W. Chojnacki, A. van den Hengel, D. Gawley
This paper assesses some of the practical ramifications of recent
developments in estimating vision parameters given information
characterising the uncertainty of the data. This uncertainty
information may sometimes be estimated in association with the
observation process, and is usually represented in the form of
covariance matrices. An empirical study is made of the conditions
under which improved parameter estimates can be obtained from data
when covariance information is available. We explore, in the case
of fundamental matrix estimation and conic fitting, the extent to
which the noise should be anisotropic and inhomogeneous if
improvements over traditional methods are to be obtained. Critical
in this is the devising of synthetic experiments under which noise
conditions can be precisely controlled. Given that covariance
information is, in itself, subject to estimation error, tests are
also undertaken to determine the impact of imprecise covariance
information upon the quality of parameter estimates. We thus
investigate the consequences for parameter estimation of
inaccuracies in the characterisation of noise that inevitably arise
in practical computation.
SPIE Videometrics, San Jose, Jan. 2001, pp 195-203
The paper is available as
PDF 102K
A fast MLE-based method for estimating the fundamental matrix
- W. Chojnacki, M. J. Brooks, A. van den Hengel, D. Gawley
We present a novel method for estimating the fundamental matrix, a
key problem arising in stereo vision. The method aims to minimise a
cost function that is derived from maximum likelihood
considerations. The respective minimiser turns out to be
significantly more accurate than the familiar algebraic least
squares technique. Furthermore, the method is identical in accuracy
to a Levenberg-Marquardt minimiser, while proving simpler and
faster.
International Conference on Image Processing,
Thessaloniki, Oct. 2001
The paper is available as
PDF 311K
On the fitting of surfaces to data with covariances
- W. Chojnacki, M. J. Brooks, A. van den Hengel, D. Gawley
We consider the problem of estimating parameters of a model described by an equation of special form. Specific models arise
in the analysis of a wide class of computer vision problems, including conic fitting and estimation of the fundamental matrix.
We assume that noisy data are accompanied by (known) covariance matrices characterising the uncertainty of the measurements. A
cost function is first obtained by considering a maximum likelihood formulation, and applying certain necessary approximations
that render the problem tractable. A novel, Newton-like iterative scheme is then generated for determining a minimiser of the cost
function. Unlike alternative approaches such as Sampson's method or the renormalisation technique, the new scheme has as its
theoretical limit the minimiser of the cost function. Furthermore the scheme is simply expressed, efficient, and unsurpassed as a
general technique in our testing. An important feature of the method is that it can serve as a basis for conducting theoretical
comparison of various estimation approaches.
IEEE Trans. Pattern Analysis Machine Intelligence, 22, 11, Nov. 2000, 1294-1303.
The paper is available as
PDF 213K
Fundamental matrix from optical flow: optimal computation and reliability evaluation
- K. Kanatani, Y. Shimizu, N. Ohta, M.J. Brooks, W. Chojnacki, A. van den Hengel
The optical flow observed by a moving camera satisfies, in the absence of noise, a special equation analogous to the epipolar
constraint arising in stereo vision. Computing the "flow fundamental matrix" of this equation is an essential prerequisite to
undertaking 3-D analysis of the flow. This paper presents an optimal formulation of the problem of estimating this matrix under
an assumed noise model. This model admits independent Gaussian noise that is not necessarily isotropic or homogeneous. A
theoretical bound is derived for the accuracy of the estimate. An algorithm is then devised that employs a technique called
renormalization to deliver an estimate and then corrects the estimate so as to satisfy a particular decomposability condition.
The algorithm also provides an evaluation of the reliability of the estimate. Epipoles and their associated reliabilities are
computed in both simulated and real-image experiments. Experiments indicate that the algorithm delivers results in the vicinity
of the theoretical accuracy bound.
Journal of Electronic Imaging, 9, 2, April 2000, 194-202.
The paper is available as
PDF 487Kb
Rationalising the Renormalisation method of Kanatani
- W. Chojnacki, M. J. Brooks, A. van den Hengel
The renormalisation scheme of Kanatani is intended to
iteratively minimise a cost function of a certain form while avoiding
systematic bias inherent in Sampson's method of minimisation. This
paper is concerned with enhancing our understanding of Kanatani's
complex scheme by expressing it within a novel framework common to
several methods. This approach enables us to demonstrate that the
renormalisation scheme does not have as its theoretical limit the
desired minimiser, in contrast with an alternative, simpler approach
presented.
Second International Symposium on Advanced Concepts for
Intelligent Vision Systems (ACIVS'00), Baden-Baden (Germany), Aug. 2000, pp. 13-19
The paper is available as
PDF (Sorry, this file is temporarily unavailable)
Estimating vision parameters given data with covariances
- W. Chojnacki, M. J. Brooks, A. van den Hengel, D. Gawley
A new parameter estimation method is presented, applicable to many
computer vision problems. It operates under the assumption that the
data (typically image point locations) are accompanied by covariance
matrices characterising data uncertainty. An MLE-based cost
function is first formulated and a new minimisation scheme is then
developed. Unlike Sampson's method or the renormalisation technique
of Kanatani, the new scheme has as its theoretical limit the true
minimum of the cost function. It also has the advantages of being
simply expressed, efficient, and unsurpassed in our comparative
testing.
British Machine Vision Conference, Bristol, Sept. 2000, pp. 182-191
The paper is available as
PDF 122K
A simplified treatment of Kanatani's renormalisation method
- W. Chojnacki, M. J. Brooks, A. van den Hengel
The renormalisation method of Kanatani is intended to find the
minimisers of cost functions of a certain form. As such, it has
applicability to a wide spectrum of computer vision problems that
may be couched in these terms. However, despite its sophistication,
the method of Kanatani has been slow to gain broad acceptance,
perhaps because of the complexity of its original derivation. In
this paper we present an alternative and simpler treatment of this
important method.
International Conference on Control, Automation, Robotics and
Computer Vision (ICARCV 2000), Singapore, Dec. 2000, paper 196
The paper is available as
PDF 89K
Incorporating optical flow information into a self-calibration procedure for a moving camera
- M.J. Brooks, W. Chojnacki, A. Dick, A. van den Hengel, K. Kanatani, N. Ohta
In this paper we consider robust techniques for estimating structure
from motion in the uncalibrated case. We show how information
describing the uncertainty of the data may be incorporated into the
formulation of the problem, and we explore the situations in which
this appears to be advantageous. The structure recovery technique
is based on a method for self-calibrating a single moving camera
from instantaneous optical flow developed in previous work of some
of the authors. The method of self-calibration rests
upon an equation that we term the differential epipolar equation for
uncalibrated optical flow. This equation incorporates two matrices
(analogous to the fundamental matrix in stereo vision) which encode
information about the ego-motion and internal geometry of the
camera. Any sufficiently large, non-degenerate optical flow field
enables the ratio of the entries of the two matrices to be
estimated. Under certain assumptions, the moving camera can be
self-calibrated by means of closed-form expressions in the entries
of these matrices. Reconstruction of the scene, up to a scalar
factor, may then proceed using a straightforward
method. The critical step in this whole approach is
therefore the accurate estimation of the aforementioned ratio. To
this end, the problem is couched in a least-squares minimisation
framework whereby candidate cost functions are derived via ordinary
least squares, total least squares, and weighted least squares
techniques. Various computational schemes are adopted for
minimising the cost functions. Carefully devised synthetic
experiments reveal that when the optical flow field is contaminated
with inhomogeneous and anisotropic Gaussian noise, the best
performer is the weighted least squares approach with
renormalisation.
SPIE Electronic Imaging '99, Videometrics VI, San Jose, Jan. 1999
The paper is available as
PDF 225K
Incorporating the epipolar constraint into a multiresolution algorithm for stereo image matching
- J. Magarey, A. Dick, M.J. Brooks, G. Newsam, A. van den Hengel
We present a new algorithm for matching calibrated stereo image pairs.
Matching is achieved within a multiresolution framework that utilises
a complex wavelet transform of each image. Integrated within this
coarse-to-fine approach is a regularisation step at each resolution.
Calibration information, in the form of the epipolar constraint, is
incorporated into the regularisation step. Uncertainty in the
calibration parameters can be accommodated. This composite matching
scheme avoids the need for expensive prior resampling of one image
along epipolar lines and yields excellent results in various tests
with real images.
Applied Informatics '99, 17th IASTED International Conference, Innsbruck, Feb. 1999.
The paper is available as
PDF 666K
Fitting surfaces to data with covariance information: fundamental methods applicable to computer vision
- W. Chojnacki, M. J. Brooks, A. van den Hengel
We are concerned with solving an equation whose form is applicable to a wide class of problems arising in computer vision.
The equation typically relates image point locations to the parameters of some appropriate model. We assume that each measured
datum is accompanied by a covariance matrix that characterises the uncertainty of the measurement. Noisy data are assumed to be
in plentiful supply, implying that our problem is overdetermined. To tackle noise, the problem is transformed to one of least
squares minimisation. In this sense, we are concerned with fitting a surface to data and their covariances. Examples are given of
computer vision problems whose forms constitute instances of our general equation. The paper has two principal concerns: the
establishing of a suitable cost function for our general problem, and the deriving of effective schemes for minimising the cost
function. A weighted least squares (WLS) cost function is obtained by considering an optimal maximum likelihood formulation, and
applying certain necessary approximations that render the problem tractable. A new and fundamental Newton-like iterative scheme
is then generated for directly minimising the WLS cost function. This proves valuable in the deriving afresh of various existing
and modified schemes, and helps us to show that the renormalisation approaches of Kanatani do not theoretically act to minimise
the WLS cost function. A portion of this work serves to rationalise renormalisation, and several new variations on the theme are
proposed. Various minimisation schemes are then tested. Experiments are carried out on the benchmark conic fitting problem of
estimating ellipses from synthetic data points and their covariances. When the data exhibit noise that is anisotropic and
inhomogeneous, those methods that make use of covariance information perform markedly better than more traditional methods that
do not. None of the methods outperforms the fundamental scheme. Thus, being in addition simply expressed and constituting a
genuine minimiser of the WLS cost function, the fundamental scheme offers strong advantages over the alternatives considered.
TR99-03, Department of Computer Science, University of Adelaide, August 1999.
The paper is available as
PDF 256K
Robust determination of structure from motion in the uncalibrated case
- M.J. Brooks, W. Chojnacki, A. van den Hengel, L. Baumela
Robust techniques are developed for determining structure from motion in the uncalibrated case. The structure recovery is
based on previous work of the authors in which it was shown that a camera undergoing unknown motion and having an unknown, and
possibly varying, focal length can be self-calibrated via closed-form expressions in the entries of two matrices derivable
from an instantaneous optical flow field. Critical to the recovery process is the obtaining of accurate numerical estimates, up to
a scalar factor, of these matrices in the presence of noisy optical flow data. We present techniques for the determination of
these matrices via least-squares methods, and also a way of enforcing a dependency constraint that is imposed on these matrices.
A method for eliminating outlying flow vectors is also given. Results of experiments with real-image sequences are presented that
suggest that the approach holds promise.
Proc. Fifth European Conference on Computer Vision - ECCV'98, Freiburg, Germany, June 1998, Lecture Notes in Computer Science
(Vol. 1), 1406, Springer Verlag, pp. 281-295
The paper is available as
PDF 133K
Robust techniques for the estimation of structure from motion
in the uncalibrated case
- M. J. Brooks, W. Chojnacki, A. van den Hengel, L. Baumela
Robust techniques are developed for determining structure from
motion in the uncalibrated case. It is shown that a camera with
unknown and possibly varying focal length and ego-motion can be
self-calibrated via closed-form expressions in the elements of two
special matrices derivable from an instantaneous optical flow field.
Techniques are presented for the robust determination of the two
special matrices, up to a scale factor, via least-squares methods; a
means of eliminating outlying flow vectors is also described. A
method is then given for obtaining a scaled Euclidean
3D-reconstruction from both the optical flow and the previously
computed self-calibration parameters. Special camera motions are
detailed that preclude self-calibration. Experiments with real-image
sequences confirm that the approach holds promise.
IEICE Technical Group Meeting on Pattern
Recognition and Media Understanding, December 18-19, 1997, Niigata, Japan
The paper is available as
PDF (Sorry, temporarily unavailable)
3D reconstruction from optical flow generated by an uncalibrated camera undergoing unknown motion
- M. J. Brooks, W. Chojnacki, A. van den Hengel, L. Baumela
A procedure is described for self-calibration of a moving
camera from instantaneous optical flow. Under certain assumptions,
this procedure allows the ego-motion and some intrinsic parameters
of the camera to be determined solely from the instantaneous
positions and velocities of a set of image features. The proposed
method relies on the use of a differential epipolar equation that
relates optical flow to the ego-motion and internal geometry of the
camera. The information about the camera's ego-motion and internal
geometry enters the differential epipolar equation via two matrices.
It emerges that the optical flow determines the composite ratio of
some of the entries of the two matrices. It is shown that a camera
with unknown focal length undergoing arbitrary motion can be
self-calibrated via closed-form expressions in the composite ratio.
The corresponding formulae specify five ego-motion parameters, as
well as the focal length and its derivative. An accompanying
procedure is presented for reconstructing the viewed scene, up to a
scale factor, from the derived self-calibration parameters and the
optical flow data. Various least-squares techniques and an outlier
rejection scheme are presented to facilitate robust estimation of
the critical composite ratio. Experimental results are given that
suggest the approach holds promise.
International Workshop on Image Analysis and
Information Fusion, November 1997, Adelaide, Australia
The paper is available as
PDF 213K
Robust estimation of structure from motion in the uncalibrated case
- A. van den Hengel
A picture of a scene is a 2-dimensional representation of a
3-dimensional world. In the process of projecting the scene onto
the 2-dimensional image plane, some of the information about the
3-dimensional scene is inevitably lost. Given a series of images of
a scene, typically taken by a video camera, it is sometimes possible
to recover some of this lost 3-dimensional information. Within the
computer vision literature this process is described as that of
recovering structure from motion. If some of the information about
the internal geometry of the camera is unknown, then the problem is
described as that of recovering structure from motion in the
uncalibrated case. It is this uncalibrated version of the problem
that is the concern of this thesis.
Optical flow represents the movement of points across the image plane
over time. Previous work in the area of structure from motion has
given rise to a so-called differential epipolar equation
which describes the relationship between optical flow and the motion
and internal parameters of the camera. This equation allows the
calibration of a camera undergoing unknown motion and having an
unknown, and possibly varying, focal length. Obtaining accurate
estimates of the camera motion and internal parameters in the presence
of noisy optical flow data is critical to the structure recovery
process.
We present and compare a variety of methods for estimating the
coefficients of the differential epipolar equation. The goal of this
process is to derive a tractable total least squares estimator of
structure from motion robust to the presence of inaccuracies in the
data. Methods are also presented for rectifying optical flow to a
particular motion estimate, eliminating outliers from the data, and
calculating the relative motion of a camera over an image
sequence. The thesis thus explores the application of numerical and
statistical techniques for estimation of structure from motion in the
uncalibrated case.
Ph. D. thesis, Adelaide University, May 2000
The paper is available as
PDF 5.5Mb