<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>#deadbeef</title>
	<atom:link href="http://blog.cordiner.net/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.cordiner.net</link>
	<description>the rants of a technocrat</description>
	<lastBuildDate>Wed, 23 May 2012 06:00:03 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='blog.cordiner.net' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://1.gravatar.com/blavatar/390d853f1afba399226b015915d5e3b1?s=96&#038;d=http%3A%2F%2Fs2.wp.com%2Fi%2Fbuttonw-com.png</url>
		<title>#deadbeef</title>
		<link>http://blog.cordiner.net</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://blog.cordiner.net/osd.xml" title="#deadbeef" />
	<atom:link rel='hub' href='http://blog.cordiner.net/?pushpress=hub'/>
		<item>
		<title>Object tracking using a Kalman filter (MATLAB)</title>
		<link>http://blog.cordiner.net/2011/05/03/object-tracking-using-a-kalman-filter-matlab/</link>
		<comments>http://blog.cordiner.net/2011/05/03/object-tracking-using-a-kalman-filter-matlab/#comments</comments>
		<pubDate>Tue, 03 May 2011 02:55:41 +0000</pubDate>
		<dc:creator>alister</dc:creator>
				<category><![CDATA[Image processing]]></category>
		<category><![CDATA[MATLAB]]></category>

		<guid isPermaLink="false">http://blog.cordiner.net/?p=357</guid>
		<description><![CDATA[The Kalman filter is useful for tracking different types of moving objects. It was developed by Rudolf Kalman around 1960 and was soon adopted at NASA to track the trajectory of spacecraft. At its heart, the Kalman filter is a method of combining noisy (and possibly missing) measurements and predictions of the state of an object to achieve an estimate [...]]]></description>
			<content:encoded><![CDATA[<p>The Kalman filter is useful for tracking different types of moving objects. It was developed by Rudolf Kalman around 1960 and was soon adopted at NASA to track the trajectory of spacecraft. At its heart, the Kalman filter is a method of combining noisy (and possibly missing) measurements and predictions of the state of an object to achieve an estimate of its true current state. Kalman filters can be applied to many different types of <a href="http://en.wikipedia.org/wiki/Linear_dynamical_system">linear dynamical systems</a> and the &#8220;state&#8221; here can refer to any measurable quantity, such as an object&#8217;s location, velocity, temperature, voltage, or a combination of these.</p>
<p>In a previous article, I showed how <a href="http://blog.cordiner.net/2010/02/15/opencv-viola-jones-object-detection-in-matlab/">face detection can be performed in MATLAB</a> using OpenCV. In this article, I will combine this face detector with a Kalman filter to build a simple face tracker that can track a face in a video.</p>
<p>If you are unfamiliar with Kalman filters, I suggest you read up first on how <a href="http://en.wikipedia.org/wiki/Alpha_beta_filter">alpha beta filters</a> work. They are a simplified form of the Kalman filter that is much easier to understand, but they still apply many of its core ideas.</p>
<h2>Face tracking without a Kalman filter</h2>
<p>The OpenCV-based face detector can be applied to every frame to detect the location of the face. Because it may detect multiple faces, we need a method to find the relationship between a detected face in one frame and a face in the next frame; this is a combinatorial problem known as <a href="http://en.wikipedia.org/wiki/Data_association">data association</a>. The simplest method is the <a href="http://en.wikipedia.org/wiki/Data_association">nearest neighbour</a> approach, and some other methods can be found in <a href="http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.112.8588">this survey paper on object tracking</a>. However, to greatly simplify the problem, the tracker I have implemented is a single face tracker and it assumes there is always a face in the frame. This means that every face that is detected can be assumed to be the same person&#8217;s face. If more than one face is detected, only the first face is used. If no faces are detected, a detection error is assumed and an all-zero bounding box is recorded. The MATLAB code below will detect the face location in a sequence of images and output the bounding box coordinates to a CSV file.</p>
<p><pre class="brush: matlabkey;">
function detect_faces(imgDir, opencvPath, includePath, outputFilename)

    % Load the required libraries

    if libisloaded('highgui100'), unloadlibrary highgui100, end
    if libisloaded('cv100'), unloadlibrary cv100, end
    if libisloaded('cxcore100'), unloadlibrary cxcore100, end

    loadlibrary(...
        fullfile(opencvPath, 'bin\cxcore100.dll'), @proto_cxcore);
    loadlibrary(...
        fullfile(opencvPath, 'bin\cv100.dll'), ...
        fullfile(opencvPath, 'cv\include\cv.h'), ...
            'alias', 'cv100', 'includepath', includePath);
    loadlibrary(...
        fullfile(opencvPath, 'bin\highgui100.dll'), ...
        fullfile(opencvPath, 'otherlibs\highgui\highgui.h'), ...
            'alias', 'highgui100', 'includepath', includePath);

    % Load the cascade

    classifierFilename = 'C:/Program Files/OpenCV/data/haarcascades/haarcascade_frontalface_alt.xml';
    cvCascade = calllib('cv100', 'cvLoadHaarClassifierCascade', classifierFilename, libstruct('CvSize',struct('width',int16(100),'height',int16(100))));

    % Create memory storage

    cvStorage = calllib('cxcore100', 'cvCreateMemStorage', 0);

    % Get the list of images
    imageFiles = dir(imgDir);
    detections = struct;

    h = waitbar(0, 'Performing face detection...'); % progress bar

    % Open the output CSV file
    fid = fopen(outputFilename, 'w');
    fprintf(fid, 'filename,x1,y1,x2,y2\n'); % header row

    for i = 1:numel(imageFiles)

        if imageFiles(i).isdir; continue; end
        imageFile = fullfile(imgDir, imageFiles(i).name);

        % Load the input image
        cvImage = calllib('highgui100', ...
            'cvLoadImage', imageFile, int16(1));
        if ~cvImage.Value.nSize
            error('Image could not be loaded');
        end

        % Perform face detection
        cvSeq = calllib('cv100', 'cvHaarDetectObjects', cvImage, cvCascade, cvStorage, 1.1, 2, 0, libstruct('CvSize',struct('width',int16(40),'height',int16(40))));

        % Save the detected bounding box, if any (and if there are multiple
        % detections, just use the first one)
        detections(i).filename = imageFile;
        if cvSeq.Value.total &gt;= 1
            % cvGetSeqElem uses zero-based indices, so 0 is the first detection
            cvRect = calllib('cxcore100', ...
                'cvGetSeqElem', cvSeq, int16(0));
            fprintf(fid, '%s,%d,%d,%d,%d\n', imageFile, ...
                cvRect.Value.x, cvRect.Value.y, ...
                cvRect.Value.x + cvRect.Value.width, ...
                cvRect.Value.y + cvRect.Value.height);
        else
            % No detection: record an all-zero bounding box
            fprintf(fid, '%s,%d,%d,%d,%d\n', imageFile, 0, 0, 0, 0);
        end

        % Release image
        calllib('cxcore100', 'cvReleaseImage', cvImage);
        waitbar(i / numel(imageFiles), h);

    end

    % Release resources

    fclose(fid);
    close(h);
    calllib('cxcore100', 'cvReleaseMemStorage', cvStorage);
    calllib('cv100', 'cvReleaseHaarClassifierCascade', cvCascade);

end
</pre></p>
<p>We can then run our face detector and generate an output file, <code>faces.csv</code>, like this:</p>
<p><pre class="brush: matlabkey;">
imgDir = 'images';
opencvPath = 'C:\Program Files\OpenCV';
includePath = fullfile(opencvPath, 'cxcore\include');
detect_faces(imgDir, opencvPath, includePath, 'faces.csv');
</pre></p>
<p>In the video below, I have run this script on the <a href="http://www-prima.inrialpes.fr/FGnet/data/01-TalkingFace/talking_face.html">FGnet Talking Face database</a> (which is free to download) and displayed the bounding boxes overlaid on the image sequence. You can download a copy of the <code>faces.csv</code> file that was used to generate the video from <a href="http://dl.dropbox.com/u/6830023/blog/kalman/faces.csv">here</a>.</p>
<span style="text-align:center; display: block;"><a href="http://blog.cordiner.net/2011/05/03/object-tracking-using-a-kalman-filter-matlab/"><img src="http://img.youtube.com/vi/_8OpICC4Jig/2.jpg" alt="" /></a></span>
<p>The bounding box roughly follows the face, but its trajectory is quite noisy and the video contains numerous frames where the bounding box disappears because the face was not detected. The Kalman filter can be used to smooth this trajectory and estimate the location of the bounding box when the face detector fails.</p>
<h2>Kalman filtering: The gritty details</h2>
<p>The Kalman filter is a recursive two-stage filter. At each iteration, it performs a <em>predict</em> step and an <em>update</em> step.</p>
<p>The predict step predicts the current location of the moving object based on previous observations. For instance, if an object is moving with constant acceleration, we can predict its current location, <img src='http://s0.wp.com/latex.php?latex=%5Chat%7B%5Ctextbf%7Bx%7D%7D_%7Bt%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;hat{&#92;textbf{x}}_{t}' title='&#92;hat{&#92;textbf{x}}_{t}' class='latex' />, based on its previous location, <img src='http://s0.wp.com/latex.php?latex=%5Chat%7B%5Ctextbf%7Bx%7D%7D_%7Bt-1%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;hat{&#92;textbf{x}}_{t-1}' title='&#92;hat{&#92;textbf{x}}_{t-1}' class='latex' />, using the <a href="http://en.wikipedia.org/wiki/Equations_of_motion">equations of motion</a>.</p>
<p>The update step takes the measurement of the object&#8217;s current location (if available), <img src='http://s0.wp.com/latex.php?latex=%5Ctextbf%7Bz%7D_%7Bt%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;textbf{z}_{t}' title='&#92;textbf{z}_{t}' class='latex' />, and combines this with the predicted current location, <img src='http://s0.wp.com/latex.php?latex=%5Chat%7B%5Ctextbf%7Bx%7D%7D_%7Bt%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;hat{&#92;textbf{x}}_{t}' title='&#92;hat{&#92;textbf{x}}_{t}' class='latex' />, to obtain an <em>a posteriori</em> estimated current location of the object, <img src='http://s0.wp.com/latex.php?latex=%5Ctextbf%7Bx%7D_%7Bt%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;textbf{x}_{t}' title='&#92;textbf{x}_{t}' class='latex' />.</p>
<p>The equations that govern the Kalman filter are given below (taken from the <a href="http://en.wikipedia.org/wiki/Kalman_filter#The_Kalman_filter">Wikipedia article</a>):</p>
<ol>
<li>Predict stage:
<ol>
<li>Predicted (<em>a priori</em>) state: <img src='http://s0.wp.com/latex.php?latex=%5Chat%7B%5Ctextbf%7Bx%7D%7D_%7Bt%7Ct-1%7D+%3D+%5Ctextbf%7BF%7D_%7Bt%7D%5Chat%7B%5Ctextbf%7Bx%7D%7D_%7Bt-1%7Ct-1%7D+%2B+%5Ctextbf%7BB%7D_%7Bt%7D+%5Ctextbf%7Bu%7D_%7Bt%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;hat{&#92;textbf{x}}_{t|t-1} = &#92;textbf{F}_{t}&#92;hat{&#92;textbf{x}}_{t-1|t-1} + &#92;textbf{B}_{t} &#92;textbf{u}_{t}' title='&#92;hat{&#92;textbf{x}}_{t|t-1} = &#92;textbf{F}_{t}&#92;hat{&#92;textbf{x}}_{t-1|t-1} + &#92;textbf{B}_{t} &#92;textbf{u}_{t}' class='latex' /></li>
<li>Predicted (<em>a priori</em>) estimate covariance: <img src='http://s0.wp.com/latex.php?latex=%5Ctextbf%7BP%7D_%7Bt%7Ct-1%7D+%3D+%5Ctextbf%7BF%7D_%7Bt%7D+%5Ctextbf%7BP%7D_%7Bt-1%7Ct-1%7D+%5Ctextbf%7BF%7D_%7Bt%7D%5E%7BT%7D%2B+%5Ctextbf%7BQ%7D_%7Bt%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;textbf{P}_{t|t-1} = &#92;textbf{F}_{t} &#92;textbf{P}_{t-1|t-1} &#92;textbf{F}_{t}^{T}+ &#92;textbf{Q}_{t}' title='&#92;textbf{P}_{t|t-1} = &#92;textbf{F}_{t} &#92;textbf{P}_{t-1|t-1} &#92;textbf{F}_{t}^{T}+ &#92;textbf{Q}_{t}' class='latex' /></li>
</ol>
</li>
<li>Update stage:
<ol>
<li>Innovation or measurement residual: <img src='http://s0.wp.com/latex.php?latex=%5Ctilde%7B%5Ctextbf%7By%7D%7D_t+%3D+%5Ctextbf%7Bz%7D_t+-+%5Ctextbf%7BH%7D_t%5Chat%7B%5Ctextbf%7Bx%7D%7D_%7Bt%7Ct-1%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;tilde{&#92;textbf{y}}_t = &#92;textbf{z}_t - &#92;textbf{H}_t&#92;hat{&#92;textbf{x}}_{t|t-1}' title='&#92;tilde{&#92;textbf{y}}_t = &#92;textbf{z}_t - &#92;textbf{H}_t&#92;hat{&#92;textbf{x}}_{t|t-1}' class='latex' /></li>
<li>Innovation (or residual) covariance: <img src='http://s0.wp.com/latex.php?latex=%5Ctextbf%7BS%7D_t+%3D+%5Ctextbf%7BH%7D_t+%5Ctextbf%7BP%7D_%7Bt%7Ct-1%7D+%5Ctextbf%7BH%7D_t%5ET+%2B+%5Ctextbf%7BR%7D_t&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;textbf{S}_t = &#92;textbf{H}_t &#92;textbf{P}_{t|t-1} &#92;textbf{H}_t^T + &#92;textbf{R}_t' title='&#92;textbf{S}_t = &#92;textbf{H}_t &#92;textbf{P}_{t|t-1} &#92;textbf{H}_t^T + &#92;textbf{R}_t' class='latex' /></li>
<li>Optimal Kalman gain: <img src='http://s0.wp.com/latex.php?latex=%5Ctextbf%7BK%7D_t+%3D+%5Ctextbf%7BP%7D_%7Bt%7Ct-1%7D+%5Ctextbf%7BH%7D_t%5ET+%5Ctextbf%7BS%7D_t%5E%7B-1%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;textbf{K}_t = &#92;textbf{P}_{t|t-1} &#92;textbf{H}_t^T &#92;textbf{S}_t^{-1}' title='&#92;textbf{K}_t = &#92;textbf{P}_{t|t-1} &#92;textbf{H}_t^T &#92;textbf{S}_t^{-1}' class='latex' /></li>
<li>Updated (<em>a posteriori</em>) state estimate: <img src='http://s0.wp.com/latex.php?latex=%5Chat%7B%5Ctextbf%7Bx%7D%7D_%7Bt%7Ct%7D+%3D+%5Chat%7B%5Ctextbf%7Bx%7D%7D_%7Bt%7Ct-1%7D+%2B+%5Ctextbf%7BK%7D_t%5Ctilde%7B%5Ctextbf%7By%7D%7D_t&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;hat{&#92;textbf{x}}_{t|t} = &#92;hat{&#92;textbf{x}}_{t|t-1} + &#92;textbf{K}_t&#92;tilde{&#92;textbf{y}}_t' title='&#92;hat{&#92;textbf{x}}_{t|t} = &#92;hat{&#92;textbf{x}}_{t|t-1} + &#92;textbf{K}_t&#92;tilde{&#92;textbf{y}}_t' class='latex' /></li>
<li>Updated (<em>a posteriori</em>) estimate covariance: <img src='http://s0.wp.com/latex.php?latex=%5Ctextbf%7BP%7D_%7Bt%7Ct%7D+%3D+%28I+-+%5Ctextbf%7BK%7D_t+%5Ctextbf%7BH%7D_t%29+%5Ctextbf%7BP%7D_%7Bt%7Ct-1%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;textbf{P}_{t|t} = (I - &#92;textbf{K}_t &#92;textbf{H}_t) &#92;textbf{P}_{t|t-1}' title='&#92;textbf{P}_{t|t} = (I - &#92;textbf{K}_t &#92;textbf{H}_t) &#92;textbf{P}_{t|t-1}' class='latex' /></li>
</ol>
</li>
</ol>
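<p>To make these steps concrete, here is a minimal sketch of one predict/update cycle for a scalar state. All of the numbers are made up for illustration and the control term is left out; it is not part of the tracker itself.</p>
<p><pre class="brush: matlabkey;">
% One predict/update cycle for a scalar state (illustrative values only)
x = 10;  P = 4;      % previous estimate and its variance
F = 1;   Q = 1;      % static model with a little process noise
H = 1;   R = 2;      % direct measurement with variance 2
z = 12;              % new measurement

% Predict stage
x_pred = F*x;              % a priori state
P_pred = F*P*F' + Q;       % a priori covariance

% Update stage
y = z - H*x_pred;          % innovation
S = H*P_pred*H' + R;       % innovation covariance
K = P_pred*H'/S;           % Kalman gain
x_new = x_pred + K*y;      % a posteriori state
P_new = (1 - K*H)*P_pred;  % a posteriori covariance

fprintf('K = %.4f, x = %.4f, P = %.4f\n', K, x_new, P_new);
</pre></p>
<p>For these inputs the gain is 5/7, so the estimate moves most of the way from the prediction (10) towards the measurement (12), and the variance shrinks from 5 to 10/7.</p>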
<p>They can be difficult to understand at first, so let&#8217;s first take a look at what each of these variables is used for:</p>
<ul>
<li><img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7Bx%7D_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{x}_{t}}' title='{&#92;mathbf{x}_{t}}' class='latex' /> is the current state vector, as estimated by the Kalman filter, at time <img src='http://s0.wp.com/latex.php?latex=%7Bt%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{t}' title='{t}' class='latex' />.</li>
<li><img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7Bz%7D_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{z}_{t}}' title='{&#92;mathbf{z}_{t}}' class='latex' /> is the measurement vector taken at time <img src='http://s0.wp.com/latex.php?latex=%7Bt%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{t}' title='{t}' class='latex' />.</li>
<li><img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7BP%7D_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{P}_{t}}' title='{&#92;mathbf{P}_{t}}' class='latex' /> measures the estimated accuracy of <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7Bx%7D_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{x}_{t}}' title='{&#92;mathbf{x}_{t}}' class='latex' /> at time <img src='http://s0.wp.com/latex.php?latex=%7Bt%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{t}' title='{t}' class='latex' />.</li>
<li><img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7BF%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{F}}' title='{&#92;mathbf{F}}' class='latex' /> describes how the system moves (ideally) from one state to the next, i.e. how one state vector is projected to the next, assuming no noise (e.g. no acceleration)</li>
<li><img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7BH%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{H}}' title='{&#92;mathbf{H}}' class='latex' /> defines the mapping from the state vector, <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7Bx%7D_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{x}_{t}}' title='{&#92;mathbf{x}_{t}}' class='latex' />, to the measurement vector, <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7Bz%7D_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{z}_{t}}' title='{&#92;mathbf{z}_{t}}' class='latex' />.</li>
<li><img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7BQ%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{Q}}' title='{&#92;mathbf{Q}}' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7BR%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{R}}' title='{&#92;mathbf{R}}' class='latex' /> are the covariance matrices of the Gaussian process noise and measurement noise, respectively; they characterise the variance of the system and of the measurements.</li>
<li><img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7BB%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{B}}' title='{&#92;mathbf{B}}' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7Bu%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{u}}' title='{&#92;mathbf{u}}' class='latex' /> are control-input parameters that are only used in systems that have an input; these can be ignored in the case of an object tracker.</li>
</ul>
<p>Note that in a simple system, the current state <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7Bx%7D_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{x}_{t}}' title='{&#92;mathbf{x}_{t}}' class='latex' /> and the measurement <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7Bz%7D_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{z}_{t}}' title='{&#92;mathbf{z}_{t}}' class='latex' /> will contain the same set of state variables (only <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7Bx%7D_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{x}_{t}}' title='{&#92;mathbf{x}_{t}}' class='latex' /> will be a filtered version of <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7Bz%7D_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{z}_{t}}' title='{&#92;mathbf{z}_{t}}' class='latex' />) and <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7BH%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{H}}' title='{&#92;mathbf{H}}' class='latex' /> will be an identity matrix, but many real-world systems will include <a href="http://en.wikipedia.org/wiki/Latent_variable">latent variables</a> that are not directly measured. For example, if we are tracking the location of a car, we may be able to directly measure its location from a GPS device and its velocity from the speedometer, but not its acceleration.</p>
<p>In the predict stage, the state of the system and its error covariance are transitioned using the defined transition matrix <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7BF%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{F}}' title='{&#92;mathbf{F}}' class='latex' />, and can be implemented in MATLAB as:</p>
<p><pre class="brush: matlabkey;">
function [x,P] = kalman_predict(x,P,F,Q)
    x = F*x; %predicted state
    P = F*P*F' + Q; %predicted estimate covariance
end
</pre></p>
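<p>As a rough illustration of what happens when no measurement arrives, the sketch below repeats the predict step on its own. It inlines the same two assignments as <code>kalman_predict</code> and assumes a hypothetical one-dimensional position/velocity model with a time step of 1; the values are made up.</p>
<p><pre class="brush: matlabkey;">
% Predict-only loop for a hypothetical position/velocity model
F = [1 1; 0 1];     % position advances by velocity at each step
Q = 0.1*eye(2);     % assumed process noise covariance
x = [0; 2];         % start at position 0 with velocity 2
P = eye(2);         % initial estimate covariance

for t = 1:3
    x = F*x;            % predicted state
    P = F*P*F' + Q;     % predicted estimate covariance
    fprintf('t = %d: position = %g, trace(P) = %g\n', t, x(1), trace(P));
end
</pre></p>
<p>The position advances by the velocity at each step, while the trace of the covariance grows: without an update, the filter becomes steadily less certain about its estimate.</p>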
<p>In the update stage, we first calculate the difference between our predicted and measured states. We then calculate the Kalman gain matrix, <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7BK%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{K}}' title='{&#92;mathbf{K}}' class='latex' />, which is used to weight between our predicted and measured states and is adjusted based on the ratio of the prediction error covariance <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7BP%7D_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{P}_{t}}' title='{&#92;mathbf{P}_{t}}' class='latex' /> to the innovation covariance <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7BS%7D_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{S}_{t}}' title='{&#92;mathbf{S}_{t}}' class='latex' />, which in turn includes the measurement noise <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7BR%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{R}}' title='{&#92;mathbf{R}}' class='latex' />.</p>
<p>Finally, the state vector and its error covariance are then updated with the measured state. It can be implemented in MATLAB as:</p>
<p><pre class="brush: matlabkey;">
function [x,P] = kalman_update(x,P,z,H,R)
    y = z - H*x; %measurement error/innovation
    S = H*P*H' + R; %measurement error/innovation covariance
    K = (P*H')/S; %optimal Kalman gain (right division avoids an explicit inverse)
    x = x + K*y; %updated state estimate
    P = (eye(size(x,1)) - K*H)*P; %updated estimate covariance
end
</pre></p>
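<p>To get a feel for how the gain weights prediction against measurement, the following sketch computes it for a scalar state with a fixed prediction variance and three different measurement noise levels. The values are arbitrary and chosen only to show the trend.</p>
<p><pre class="brush: matlabkey;">
% Kalman gain for a scalar state as the measurement noise R changes
P = 1;  H = 1;             % assumed prediction variance and measurement model
Rs = [0.01 1 100];         % small, moderate and large measurement noise
Ks = zeros(size(Rs));

for i = 1:numel(Rs)
    S = H*P*H' + Rs(i);    % innovation covariance
    Ks(i) = P*H'/S;        % Kalman gain
    fprintf('R = %6.2f gives K = %.4f\n', Rs(i), Ks(i));
end
</pre></p>
<p>With a small <code>R</code> the gain is close to 1 and the update follows the measurement; with a large <code>R</code> the gain is close to 0 and the update stays near the prediction.</p>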
<p>Both stages update only two variables: <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7Bx%7D_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{x}_{t}}' title='{&#92;mathbf{x}_{t}}' class='latex' />, the state variable, and <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7BP%7D_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{P}_{t}}' title='{&#92;mathbf{P}_{t}}' class='latex' />, the prediction error covariance variable.</p>
<p>The two stages of the filter correspond to the <a href="http://en.wikipedia.org/wiki/State_space_(controls)#Linear_systems">state-space model</a> typically used to model linear dynamical systems. The first stage solves the <em>process equation</em>:</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Cmathbf%7Bx%7D_%7Bt%2B1%7D%3D%5Cmathbf%7BF%7D%5Cmathbf%7Bx%7D_%7Bt%7D%2B%5Cmathbf%7Bw%7D_%7Bt%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;displaystyle &#92;mathbf{x}_{t+1}=&#92;mathbf{F}&#92;mathbf{x}_{t}+&#92;mathbf{w}_{t}' title='&#92;displaystyle &#92;mathbf{x}_{t+1}=&#92;mathbf{F}&#92;mathbf{x}_{t}+&#92;mathbf{w}_{t}' class='latex' /></p>
<p>The process noise <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7Bw%7D_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{w}_{t}}' title='{&#92;mathbf{w}_{t}}' class='latex' /> is <a href="http://en.wikipedia.org/wiki/Additive_white_Gaussian_noise">additive white Gaussian noise (AWGN)</a> with zero mean and covariance defined by:</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+E%5Cleft%5B%5Cmathbf%7Bw%7D_%7Bt%7D%5Cmathbf%7Bw%7D_%7Bt%7D%5E%7BT%7D%5Cright%5D%3D%5Cmathbf%7BQ%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;displaystyle E&#92;left[&#92;mathbf{w}_{t}&#92;mathbf{w}_{t}^{T}&#92;right]=&#92;mathbf{Q}' title='&#92;displaystyle E&#92;left[&#92;mathbf{w}_{t}&#92;mathbf{w}_{t}^{T}&#92;right]=&#92;mathbf{Q}' class='latex' /></p>
<p>The second one is the <em>measurement equation</em>:</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Cmathbf%7Bz%7D_%7Bt%7D%3D%5Cmathbf%7BH%7D%5Cmathbf%7Bx%7D_%7Bt%7D%2B%5Cmathbf%7Bv%7D_%7Bt%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;displaystyle &#92;mathbf{z}_{t}=&#92;mathbf{H}&#92;mathbf{x}_{t}+&#92;mathbf{v}_{t}' title='&#92;displaystyle &#92;mathbf{z}_{t}=&#92;mathbf{H}&#92;mathbf{x}_{t}+&#92;mathbf{v}_{t}' class='latex' /></p>
<p>The measurement noise <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7Bv%7D_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{v}_{t}}' title='{&#92;mathbf{v}_{t}}' class='latex' /> is also AWGN with zero mean and covariance defined by:</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+E%5Cleft%5B%5Cmathbf%7Bv%7D_%7Bt%7D%5Cmathbf%7Bv%7D_%7Bt%7D%5E%7BT%7D%5Cright%5D%3D%5Cmathbf%7BR%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;displaystyle E&#92;left[&#92;mathbf{v}_{t}&#92;mathbf{v}_{t}^{T}&#92;right]=&#92;mathbf{R}' title='&#92;displaystyle E&#92;left[&#92;mathbf{v}_{t}&#92;mathbf{v}_{t}^{T}&#92;right]=&#92;mathbf{R}' class='latex' /></p>
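<p>One way to get a feel for this model is to simulate it. The sketch below generates a state trajectory and noisy measurements from the process and measurement equations, assuming a hypothetical one-dimensional position/velocity system in which only the position is measured. The matrices and noise levels are illustrative, not the ones used by the face tracker.</p>
<p><pre class="brush: matlabkey;">
% Simulate the state-space model for a hypothetical position/velocity system
F = [1 1; 0 1];         % transition matrix (time step of 1)
H = [1 0];              % only the position is measured
Q = 0.01*eye(2);        % assumed process noise covariance
R = 4;                  % assumed measurement noise variance
N = 50;                 % number of time steps

x = zeros(2, N);        % true states
z = zeros(1, N);        % noisy measurements
x(:,1) = [0; 1];
z(1) = H*x(:,1) + sqrt(R)*randn;

L = chol(Q)';           % L*L' equals Q, so L*randn(2,1) has covariance Q
for t = 1:N-1
    w = L*randn(2,1);                 % process noise sample
    x(:,t+1) = F*x(:,t) + w;          % process equation
    v = sqrt(R)*randn;                % measurement noise sample
    z(t+1) = H*x(:,t+1) + v;          % measurement equation
end
</pre></p>
<p>Plotting <code>z</code> against <code>x(1,:)</code> shows the kind of noisy observations that the filter is designed to smooth.</p>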
<h2>Defining the system</h2>
<p>In order to implement a Kalman filter, we have to define several variables that model the system. We have to choose the variables contained by <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7Bx%7D_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{x}_{t}}' title='{&#92;mathbf{x}_{t}}' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7Bz%7D_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{z}_{t}}' title='{&#92;mathbf{z}_{t}}' class='latex' />, and also choose suitable values for <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7BF%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{F}}' title='{&#92;mathbf{F}}' class='latex' />, <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7BH%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{H}}' title='{&#92;mathbf{H}}' class='latex' />, <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7BQ%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{Q}}' title='{&#92;mathbf{Q}}' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7BR%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{R}}' title='{&#92;mathbf{R}}' class='latex' />, as well as an initial value for <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7BP%7D_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{P}_{t}}' title='{&#92;mathbf{P}_{t}}' class='latex' />.</p>
<p>We will define our measurement vector as:</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Cmathbf%7Bz%7D_%7Bt%7D%3D%5Cleft%5B%5Cbegin%7Barray%7D%7Bcccc%7D+x_%7B1%2Ct%7D+%26+y_%7B1%2Ct%7D+%26+x_%7B2%2Ct%7D+%26+y_%7B2%2Ct%7D%5Cend%7Barray%7D%5Cright%5D%5E%7BT%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;displaystyle &#92;mathbf{z}_{t}=&#92;left[&#92;begin{array}{cccc} x_{1,t} &amp; y_{1,t} &amp; x_{2,t} &amp; y_{2,t}&#92;end{array}&#92;right]^{T}' title='&#92;displaystyle &#92;mathbf{z}_{t}=&#92;left[&#92;begin{array}{cccc} x_{1,t} &amp; y_{1,t} &amp; x_{2,t} &amp; y_{2,t}&#92;end{array}&#92;right]^{T}' class='latex' /></p>
<p>where <img src='http://s0.wp.com/latex.php?latex=%5Cleft%28x_%7B1%2Ct%7D%2C%5C%2C+y_%7B1%2Ct%7D%5Cright%29&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;left(x_{1,t},&#92;, y_{1,t}&#92;right)' title='&#92;left(x_{1,t},&#92;, y_{1,t}&#92;right)' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%5Cleft%28x_%7B2%2Ct%7D%2C%5C%2C+y_%7B2%2Ct%7D%5Cright%29&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;left(x_{2,t},&#92;, y_{2,t}&#92;right)' title='&#92;left(x_{2,t},&#92;, y_{2,t}&#92;right)' class='latex' /> are the upper-left and lower-right corners of the bounding box around the detected face, respectively. This is simply the output from the Viola and Jones face detector.</p>
<p>A logical choice for our state vector is:</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Cmathbf%7Bx%7D_%7Bt%7D%3D%5Cleft%5B%5Cbegin%7Barray%7D%7Bcccccc%7D+x_%7B1%2Ct%7D+%26+y_%7B1%2Ct%7D+%26+x_%7B2%2Ct%7D+%26+y_%7B2%2Ct%7D+%26+dx_%7Bt%7D+%26+dy_%7Bt%7D%5Cend%7Barray%7D%5Cright%5D%5E%7BT%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;displaystyle &#92;mathbf{x}_{t}=&#92;left[&#92;begin{array}{cccccc} x_{1,t} &amp; y_{1,t} &amp; x_{2,t} &amp; y_{2,t} &amp; dx_{t} &amp; dy_{t}&#92;end{array}&#92;right]^{T}' title='&#92;displaystyle &#92;mathbf{x}_{t}=&#92;left[&#92;begin{array}{cccccc} x_{1,t} &amp; y_{1,t} &amp; x_{2,t} &amp; y_{2,t} &amp; dx_{t} &amp; dy_{t}&#92;end{array}&#92;right]^{T}' class='latex' /></p>
<p>where <img src='http://s0.wp.com/latex.php?latex=%7Bdx_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{dx_{t}}' title='{dx_{t}}' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%7Bdy_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{dy_{t}}' title='{dy_{t}}' class='latex' /> are the first-order derivatives of the bounding box position, i.e. its velocity along each axis, shared by both corners. Other vectors are also possible; for example, <a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.6.3549">some papers</a> introduce a &#8220;scale&#8221; variable, which assumes that the bounding box maintains a fixed aspect ratio.</p>
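<p>Given these two definitions, the measurement matrix only needs to pick the four corner coordinates out of the six-element state, since the velocities are never measured directly. A minimal sketch, with a made-up state vector:</p>
<p><pre class="brush: matlabkey;">
% Measurement matrix implied by the state and measurement vectors above
H = [eye(4), zeros(4,2)];          % 4x6: keep the corners, drop the velocities

x = [10; 20; 110; 140; 3; -1];     % hypothetical state [x1 y1 x2 y2 dx dy]'
z = H*x;                           % noise-free measurement [x1 y1 x2 y2]'
disp(z');
</pre></p>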
<p>The transition matrix <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7BF%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{F}}' title='{&#92;mathbf{F}}' class='latex' /> defines the equations used to transition from one state vector <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7Bx%7D_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{x}_{t}}' title='{&#92;mathbf{x}_{t}}' class='latex' /> to the next vector <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7Bx%7D_%7Bt%2B1%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{x}_{t+1}}' title='{&#92;mathbf{x}_{t+1}}' class='latex' /> (without taking into account any measurements, <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7Bz%7D_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{z}_{t}}' title='{&#92;mathbf{z}_{t}}' class='latex' />). It is plugged into the process equation:</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Cmathbf%7Bx%7D_%7Bt%2B1%7D%3D%5Cmathbf%7BF%7D%5Cmathbf%7Bx%7D_%7Bt%7D%2B%5Cmathbf%7Bw%7D_%7Bt%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;displaystyle &#92;mathbf{x}_{t+1}=&#92;mathbf{F}&#92;mathbf{x}_{t}+&#92;mathbf{w}_{t}' title='&#92;displaystyle &#92;mathbf{x}_{t+1}=&#92;mathbf{F}&#92;mathbf{x}_{t}+&#92;mathbf{w}_{t}' class='latex' /></p>
<p>Let&#8217;s look at some basic equations describing motion:</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Cbegin%7Baligned%7Dx+%26+%3Dx_%7B0%7D%2Bdx_%7B0%7D%5Ccdot%5CDelta+T%2B%5Cfrac%7B1%7D%7B2%7Dd%5E%7B2%7Dx%5Ccdot%5CDelta+T%5E%7B2%7D%5C%5C+dx+%26+%3Ddx_%7B0%7D%2Bd%5E%7B2%7Dx%5Ccdot%5CDelta+T%5Cend%7Baligned%7D+&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;displaystyle &#92;begin{aligned}x &amp; =x_{0}+dx_{0}&#92;cdot&#92;Delta T+&#92;frac{1}{2}d^{2}x&#92;cdot&#92;Delta T^{2}&#92;&#92; dx &amp; =dx_{0}+d^{2}x&#92;cdot&#92;Delta T&#92;end{aligned} ' title='&#92;displaystyle &#92;begin{aligned}x &amp; =x_{0}+dx_{0}&#92;cdot&#92;Delta T+&#92;frac{1}{2}d^{2}x&#92;cdot&#92;Delta T^{2}&#92;&#92; dx &amp; =dx_{0}+d^{2}x&#92;cdot&#92;Delta T&#92;end{aligned} ' class='latex' /></p>
<p>We could express this system using the following recurrence:</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Cbegin%7Baligned%7Dx_%7Bt%2B1%7D+%26+%3Dx_%7Bt%7D%2Bdx_%7Bt%7D%5Ccdot%5CDelta+T%2B%5Cfrac%7B1%7D%7B2%7Dd%5E%7B2%7Dx_%7Bt%7D%5Ccdot%5CDelta+T%5E%7B2%7D%5C%5C+dx_%7Bt%2B1%7D+%26+%3Ddx_%7Bt%7D%2Bd%5E%7B2%7Dx_%7Bt%7D%5Ccdot%5CDelta+T%5Cend%7Baligned%7D+&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;displaystyle &#92;begin{aligned}x_{t+1} &amp; =x_{t}+dx_{t}&#92;cdot&#92;Delta T+&#92;frac{1}{2}d^{2}x_{t}&#92;cdot&#92;Delta T^{2}&#92;&#92; dx_{t+1} &amp; =dx_{t}+d^{2}x_{t}&#92;cdot&#92;Delta T&#92;end{aligned} ' title='&#92;displaystyle &#92;begin{aligned}x_{t+1} &amp; =x_{t}+dx_{t}&#92;cdot&#92;Delta T+&#92;frac{1}{2}d^{2}x_{t}&#92;cdot&#92;Delta T^{2}&#92;&#92; dx_{t+1} &amp; =dx_{t}+d^{2}x_{t}&#92;cdot&#92;Delta T&#92;end{aligned} ' class='latex' /></p>
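<p>As a quick sanity check, the recurrence can be iterated numerically and compared with the standard closed-form equations of motion for constant acceleration. The time step, acceleration and starting values below are arbitrary.</p>
<p><pre class="brush: matlabkey;">
% Iterate the recurrence and compare it with the closed-form solution
dT = 0.5;  a = 2;         % hypothetical time step and constant acceleration
x0 = 1;    v0 = 3;        % hypothetical initial position and velocity
N = 10;                   % number of steps

x = x0;  v = v0;
for t = 1:N
    x = x + v*dT + 0.5*a*dT^2;    % position update
    v = v + a*dT;                 % velocity update
end

T = N*dT;                                 % total elapsed time
x_closed = x0 + v0*T + 0.5*a*T^2;         % closed-form position
v_closed = v0 + a*T;                      % closed-form velocity
fprintf('recurrence:  x = %.4f, v = %.4f\n', x, v);
fprintf('closed form: x = %.4f, v = %.4f\n', x_closed, v_closed);
</pre></p>
<p>When the acceleration really is constant, the two agree up to floating-point rounding. In the tracker the acceleration is unknown, which is why it is treated as noise.</p>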
<p>These same equations can also be used to model the <img src='http://s0.wp.com/latex.php?latex=%7By_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{y_{t}}' title='{y_{t}}' class='latex' /> variables and their derivatives. Referring back to the process equation, and taking a time step of one frame so that the time-step factors in the transition matrix become ones, we can thus model this system as:</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Cleft%5B%5Cbegin%7Barray%7D%7Bc%7D+x_%7B1%2Ct%2B1%7D%5C%5C+y_%7B1%2Ct%2B1%7D%5C%5C+x_%7B2%2Ct%2B1%7D%5C%5C+y_%7B2%2Ct%2B1%7D%5C%5C+dx_%7Bt%2B1%7D%5C%5C+dy_%7Bt%2B1%7D%5Cend%7Barray%7D%5Cright%5D%3D%5Cleft%5B%5Cbegin%7Barray%7D%7Bcccccc%7D+1+%26+0+%26+0+%26+0+%26+1+%26+0%5C%5C+0+%26+1+%26+0+%26+0+%26+0+%26+1%5C%5C+0+%26+0+%26+1+%26+0+%26+1+%26+0%5C%5C+0+%26+0+%26+0+%26+1+%26+0+%26+1%5C%5C+0+%26+0+%26+0+%26+0+%26+1+%26+0%5C%5C+0+%26+0+%26+0+%26+0+%26+0+%26+1%5Cend%7Barray%7D%5Cright%5D%5Cleft%5B%5Cbegin%7Barray%7D%7Bc%7D+x_%7B1%2Ct%7D%5C%5C+y_%7B1%2Ct%7D%5C%5C+x_%7B2%2Ct%7D%5C%5C+y_%7B2%2Ct%7D%5C%5C+dx_%7Bt%7D%5C%5C+dy_%7Bt%7D%5Cend%7Barray%7D%5Cright%5D%2B%5Cleft%5B%5Cbegin%7Barray%7D%7Bc%7D+d%5E%7B2%7Dx_%7Bt%7D%2F2%5C%5C+d%5E%7B2%7Dy_%7Bt%7D%2F2%5C%5C+d%5E%7B2%7Dx_%7Bt%7D%2F2%5C%5C+d%5E%7B2%7Dy_%7Bt%7D%2F2%5C%5C+d%5E%7B2%7Dx_%7Bt%7D%5C%5C+d%5E%7B2%7Dy_%7Bt%7D%5Cend%7Barray%7D%5Cright%5D%5Ctimes%5CDelta+T&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;displaystyle &#92;left[&#92;begin{array}{c} x_{1,t+1}&#92;&#92; y_{1,t+1}&#92;&#92; x_{2,t+1}&#92;&#92; y_{2,t+1}&#92;&#92; dx_{t+1}&#92;&#92; dy_{t+1}&#92;end{array}&#92;right]=&#92;left[&#92;begin{array}{cccccc} 1 &amp; 0 &amp; 0 &amp; 0 &amp; 1 &amp; 0&#92;&#92; 0 &amp; 1 &amp; 0 &amp; 0 &amp; 0 &amp; 1&#92;&#92; 0 &amp; 0 &amp; 1 &amp; 0 &amp; 1 &amp; 0&#92;&#92; 0 &amp; 0 &amp; 0 &amp; 1 &amp; 0 &amp; 1&#92;&#92; 0 &amp; 0 &amp; 0 &amp; 0 &amp; 1 &amp; 0&#92;&#92; 0 &amp; 0 &amp; 0 &amp; 0 &amp; 0 &amp; 1&#92;end{array}&#92;right]&#92;left[&#92;begin{array}{c} x_{1,t}&#92;&#92; y_{1,t}&#92;&#92; x_{2,t}&#92;&#92; y_{2,t}&#92;&#92; dx_{t}&#92;&#92; dy_{t}&#92;end{array}&#92;right]+&#92;left[&#92;begin{array}{c} d^{2}x_{t}/2&#92;&#92; d^{2}y_{t}/2&#92;&#92; d^{2}x_{t}/2&#92;&#92; d^{2}y_{t}/2&#92;&#92; d^{2}x_{t}&#92;&#92; d^{2}y_{t}&#92;end{array}&#92;right]&#92;times&#92;Delta T' title='&#92;displaystyle &#92;left[&#92;begin{array}{c} 
x_{1,t+1}&#92;&#92; y_{1,t+1}&#92;&#92; x_{2,t+1}&#92;&#92; y_{2,t+1}&#92;&#92; dx_{t+1}&#92;&#92; dy_{t+1}&#92;end{array}&#92;right]=&#92;left[&#92;begin{array}{cccccc} 1 &amp; 0 &amp; 0 &amp; 0 &amp; 1 &amp; 0&#92;&#92; 0 &amp; 1 &amp; 0 &amp; 0 &amp; 0 &amp; 1&#92;&#92; 0 &amp; 0 &amp; 1 &amp; 0 &amp; 1 &amp; 0&#92;&#92; 0 &amp; 0 &amp; 0 &amp; 1 &amp; 0 &amp; 1&#92;&#92; 0 &amp; 0 &amp; 0 &amp; 0 &amp; 1 &amp; 0&#92;&#92; 0 &amp; 0 &amp; 0 &amp; 0 &amp; 0 &amp; 1&#92;end{array}&#92;right]&#92;left[&#92;begin{array}{c} x_{1,t}&#92;&#92; y_{1,t}&#92;&#92; x_{2,t}&#92;&#92; y_{2,t}&#92;&#92; dx_{t}&#92;&#92; dy_{t}&#92;end{array}&#92;right]+&#92;left[&#92;begin{array}{c} d^{2}x_{t}/2&#92;&#92; d^{2}y_{t}/2&#92;&#92; d^{2}x_{t}/2&#92;&#92; d^{2}y_{t}/2&#92;&#92; d^{2}x_{t}&#92;&#92; d^{2}y_{t}&#92;end{array}&#92;right]&#92;times&#92;Delta T' class='latex' /></p>
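<p>As a rough illustration of the process equation, the sketch below applies the transition matrix to a made-up state vector in plain Python. It assumes the state ordering (x1, y1, x2, y2, dx, dy) used above and a time step of 1. The helper <code>mat_vec</code> and the sample values are hypothetical, and the noise term is left out:</p>

```python
# Transition matrix F for the state [x1, y1, x2, y2, dx, dy] with a time step of 1.
F = [
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1],
    [0, 0, 1, 0, 1, 0],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1],
]

def mat_vec(matrix, vector):
    """Multiply a matrix (a list of rows) by a column vector (a list)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

# Hypothetical bounding box with velocity dx = 2 and dy = -1 pixels per step.
state = [10, 20, 50, 80, 2, -1]
next_state = mat_vec(F, state)
print(next_state)
```

<p>Both corners of the box pick up the same velocity, so the box is translated without changing its size.</p>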
<p>The process noise covariance matrix <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7BQ%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{Q}}' title='{&#92;mathbf{Q}}' class='latex' /> measures the variability of the input signal away from the &#8220;ideal&#8221; transitions defined in the transition matrix. Larger values in this matrix mean that the input signal has greater variance and the filter needs to be more adaptable. Smaller values result in a smoother output, but the filter is not as adaptable to large changes. This can be a little difficult to define, and may require some fine tuning. Based on our definition of the process noise <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7Bw%7D_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{w}_{t}}' title='{&#92;mathbf{w}_{t}}' class='latex' /> above, our process noise matrix is defined as:</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Cbegin%7Baligned%7D%5Cmathbf%7BQ%7D+%26+%3D%5Cleft%5B%5Cbegin%7Barray%7D%7Bcccccc%7D+%5CDelta+T%5E%7B4%7D%2F4+%26+0+%26+0+%26+0+%26+%5CDelta+T%5E%7B3%7D%2F2+%26+0%5C%5C+0+%26+%5CDelta+T%5E%7B4%7D%2F4+%26+0+%26+0+%26+0+%26+%5CDelta+T%5E%7B3%7D%2F2%5C%5C+0+%26+0+%26+%5CDelta+T%5E%7B4%7D%2F4+%26+0+%26+%5CDelta+T%5E%7B3%7D%2F2+%26+0%5C%5C+0+%26+0+%26+0+%26+%5CDelta+T%5E%7B4%7D%2F4+%26+0+%26+%5CDelta+T%5E%7B3%7D%2F2%5C%5C+%5CDelta+T%5E%7B3%7D%2F2+%26+0+%26+%5CDelta+T%5E%7B3%7D%2F2+%26+0+%26+%5CDelta+T%5E%7B2%7D+%26+0%5C%5C+0+%26+%5CDelta+T%5E%7B3%7D%2F2+%26+0+%26+%5CDelta+T%5E%7B3%7D%2F2+%26+0+%26+%5CDelta+T%5E%7B2%7D%5Cend%7Barray%7D%5Cright%5D%5Ctimes+a%5E%7B2%7D%5C%5C+%26+%3D%5Cleft%5B%5Cbegin%7Barray%7D%7Bcccccc%7D+1%2F4+%26+0+%26+0+%26+0+%26+1%2F2+%26+0%5C%5C+0+%26+1%2F4+%26+0+%26+0+%26+0+%26+1%2F2%5C%5C+0+%26+0+%26+1%2F4+%26+0+%26+1%2F2+%26+0%5C%5C+0+%26+0+%26+0+%26+1%2F4+%26+0+%26+1%2F2%5C%5C+1%2F2+%26+0+%26+1%2F2+%26+0+%26+1+%26+0%5C%5C+0+%26+1%2F2+%26+0+%26+1%2F2+%26+0+%26+1%5Cend%7Barray%7D%5Cright%5D%5Ctimes10%5E%7B-2%7D%5Cend%7Baligned%7D+&amp;bg=f9f7f5&amp;fg=000000&amp;s=0' alt='&#92;displaystyle &#92;begin{aligned}&#92;mathbf{Q} &amp; =&#92;left[&#92;begin{array}{cccccc} &#92;Delta T^{4}/4 &amp; 0 &amp; 0 &amp; 0 &amp; &#92;Delta T^{3}/2 &amp; 0&#92;&#92; 0 &amp; &#92;Delta T^{4}/4 &amp; 0 &amp; 0 &amp; 0 &amp; &#92;Delta T^{3}/2&#92;&#92; 0 &amp; 0 &amp; &#92;Delta T^{4}/4 &amp; 0 &amp; &#92;Delta T^{3}/2 &amp; 0&#92;&#92; 0 &amp; 0 &amp; 0 &amp; &#92;Delta T^{4}/4 &amp; 0 &amp; &#92;Delta T^{3}/2&#92;&#92; &#92;Delta T^{3}/2 &amp; 0 &amp; &#92;Delta T^{3}/2 &amp; 0 &amp; &#92;Delta T^{2} &amp; 0&#92;&#92; 0 &amp; &#92;Delta T^{3}/2 &amp; 0 &amp; &#92;Delta T^{3}/2 &amp; 0 &amp; &#92;Delta T^{2}&#92;end{array}&#92;right]&#92;times a^{2}&#92;&#92; &amp; =&#92;left[&#92;begin{array}{cccccc} 1/4 &amp; 0 &amp; 0 &amp; 0 &amp; 1/2 &amp; 0&#92;&#92; 0 &amp; 1/4 &amp; 0 &amp; 0 &amp; 
0 &amp; 1/2&#92;&#92; 0 &amp; 0 &amp; 1/4 &amp; 0 &amp; 1/2 &amp; 0&#92;&#92; 0 &amp; 0 &amp; 0 &amp; 1/4 &amp; 0 &amp; 1/2&#92;&#92; 1/2 &amp; 0 &amp; 1/2 &amp; 0 &amp; 1 &amp; 0&#92;&#92; 0 &amp; 1/2 &amp; 0 &amp; 1/2 &amp; 0 &amp; 1&#92;end{array}&#92;right]&#92;times10^{-2}&#92;end{aligned} ' title='&#92;displaystyle &#92;begin{aligned}&#92;mathbf{Q} &amp; =&#92;left[&#92;begin{array}{cccccc} &#92;Delta T^{4}/4 &amp; 0 &amp; 0 &amp; 0 &amp; &#92;Delta T^{3}/2 &amp; 0&#92;&#92; 0 &amp; &#92;Delta T^{4}/4 &amp; 0 &amp; 0 &amp; 0 &amp; &#92;Delta T^{3}/2&#92;&#92; 0 &amp; 0 &amp; &#92;Delta T^{4}/4 &amp; 0 &amp; &#92;Delta T^{3}/2 &amp; 0&#92;&#92; 0 &amp; 0 &amp; 0 &amp; &#92;Delta T^{4}/4 &amp; 0 &amp; &#92;Delta T^{3}/2&#92;&#92; &#92;Delta T^{3}/2 &amp; 0 &amp; &#92;Delta T^{3}/2 &amp; 0 &amp; &#92;Delta T^{2} &amp; 0&#92;&#92; 0 &amp; &#92;Delta T^{3}/2 &amp; 0 &amp; &#92;Delta T^{3}/2 &amp; 0 &amp; &#92;Delta T^{2}&#92;end{array}&#92;right]&#92;times a^{2}&#92;&#92; &amp; =&#92;left[&#92;begin{array}{cccccc} 1/4 &amp; 0 &amp; 0 &amp; 0 &amp; 1/2 &amp; 0&#92;&#92; 0 &amp; 1/4 &amp; 0 &amp; 0 &amp; 0 &amp; 1/2&#92;&#92; 0 &amp; 0 &amp; 1/4 &amp; 0 &amp; 1/2 &amp; 0&#92;&#92; 0 &amp; 0 &amp; 0 &amp; 1/4 &amp; 0 &amp; 1/2&#92;&#92; 1/2 &amp; 0 &amp; 1/2 &amp; 0 &amp; 1 &amp; 0&#92;&#92; 0 &amp; 1/2 &amp; 0 &amp; 1/2 &amp; 0 &amp; 1&#92;end{array}&#92;right]&#92;times10^{-2}&#92;end{aligned} ' class='latex' /></p>
<p>where <img src='http://s0.wp.com/latex.php?latex=%7B%5CDelta+T%3D1%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;Delta T=1}' title='{&#92;Delta T=1}' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%7Ba%3Dd%5E%7B2%7Dx_%7Bt%7D%3Dd%5E%7B2%7Dy_%7Bt%7D%3D0.1%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{a=d^{2}x_{t}=d^{2}y_{t}=0.1}' title='{a=d^{2}x_{t}=d^{2}y_{t}=0.1}' class='latex' />.</p>
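<p>To make the structure of this matrix easier to check, here is a small Python sketch that builds it from the time step and the acceleration magnitude. The function name <code>process_noise</code> is my own, and the layout simply mirrors the matrix above (positions in rows 1 to 4, velocities in rows 5 and 6):</p>

```python
def process_noise(dt, a):
    """Build the 6x6 process noise matrix for the state [x1, y1, x2, y2, dx, dy]."""
    q_pos = dt ** 4 / 4      # position/position terms
    q_cross = dt ** 3 / 2    # position/velocity terms
    q_vel = dt ** 2          # velocity/velocity terms
    Q = [[0.0] * 6 for _ in range(6)]
    for i in range(4):
        Q[i][i] = q_pos
    Q[4][4] = Q[5][5] = q_vel
    # x positions pair with dx (index 4), y positions pair with dy (index 5)
    for pos, vel in ((0, 4), (1, 5), (2, 4), (3, 5)):
        Q[pos][vel] = Q[vel][pos] = q_cross
    return [[value * a ** 2 for value in row] for row in Q]

Q = process_noise(dt=1.0, a=0.1)
for row in Q:
    print([round(value, 4) for value in row])
```

<p>With a time step of 1 and an acceleration of 0.1, this reproduces the numeric matrix above up to floating-point rounding.</p>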
<p>The measurement matrix <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7BH%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{H}}' title='{&#92;mathbf{H}}' class='latex' /> maps between our measurement vector <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7Bz%7D_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{z}_{t}}' title='{&#92;mathbf{z}_{t}}' class='latex' /> and state vector <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7Bx%7D_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{x}_{t}}' title='{&#92;mathbf{x}_{t}}' class='latex' />. It is plugged in to the measurement equation:</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Cmathbf%7Bz%7D_%7Bt%7D%3D%5Cmathbf%7BH%7D%5Cmathbf%7Bx%7D_%7Bt%7D%2B%5Cmathbf%7Bv%7D_%7Bt%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;displaystyle &#92;mathbf{z}_{t}=&#92;mathbf{H}&#92;mathbf{x}_{t}+&#92;mathbf{v}_{t}' title='&#92;displaystyle &#92;mathbf{z}_{t}=&#92;mathbf{H}&#92;mathbf{x}_{t}+&#92;mathbf{v}_{t}' class='latex' /></p>
<p>The variables <img src='http://s0.wp.com/latex.php?latex=%7Bx_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{x_{t}}' title='{x_{t}}' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%7By_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{y_{t}}' title='{y_{t}}' class='latex' /> are mapped directly from <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7Bz%7D_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{z}_{t}}' title='{&#92;mathbf{z}_{t}}' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7Bx%7D_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{x}_{t}}' title='{&#92;mathbf{x}_{t}}' class='latex' />, whereas the derivative variables are latent (hidden) variables and so are not directly measured and are not included in the mapping. This gives us the measurement matrix:</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Cmathbf%7BH%7D%3D%5Cleft%5B%5Cbegin%7Barray%7D%7Bcccccc%7D+1+%26+0+%26+0+%26+0+%26+0+%26+0%5C%5C+0+%26+1+%26+0+%26+0+%26+0+%26+0%5C%5C+0+%26+0+%26+1+%26+0+%26+0+%26+0%5C%5C+0+%26+0+%26+0+%26+1+%26+0+%26+0%5Cend%7Barray%7D%5Cright%5D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;displaystyle &#92;mathbf{H}=&#92;left[&#92;begin{array}{cccccc} 1 &amp; 0 &amp; 0 &amp; 0 &amp; 0 &amp; 0&#92;&#92; 0 &amp; 1 &amp; 0 &amp; 0 &amp; 0 &amp; 0&#92;&#92; 0 &amp; 0 &amp; 1 &amp; 0 &amp; 0 &amp; 0&#92;&#92; 0 &amp; 0 &amp; 0 &amp; 1 &amp; 0 &amp; 0&#92;end{array}&#92;right]' title='&#92;displaystyle &#92;mathbf{H}=&#92;left[&#92;begin{array}{cccccc} 1 &amp; 0 &amp; 0 &amp; 0 &amp; 0 &amp; 0&#92;&#92; 0 &amp; 1 &amp; 0 &amp; 0 &amp; 0 &amp; 0&#92;&#92; 0 &amp; 0 &amp; 1 &amp; 0 &amp; 0 &amp; 0&#92;&#92; 0 &amp; 0 &amp; 0 &amp; 1 &amp; 0 &amp; 0&#92;end{array}&#92;right]' class='latex' /></p>
<p>The matrix <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7BR%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{R}}' title='{&#92;mathbf{R}}' class='latex' /> defines the error of the measuring device. For a physical instrument such as a speedometer or voltmeter, the measurement accuracy may be defined by the manufacturer. In the case of a face detector, we can determine the accuracy empirically. For instance, we may find that our Viola and Jones face detector detects faces to within 10 pixels of the actual face location 95% of the time. If we assume this error is Gaussian-distributed (as the Kalman filter does), this gives us a standard deviation of 6.5 pixels for each of the coordinates, so the vector of measurement noise standard deviations is given by:</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Cmathbf%7Bv%7D%3D%5Cleft%5B%5Cbegin%7Barray%7D%7Bcccc%7D+6.5+%26+6.5+%26+6.5+%26+6.5%5Cend%7Barray%7D%5Cright%5D%5E%7BT%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;displaystyle &#92;mathbf{v}=&#92;left[&#92;begin{array}{cccc} 6.5 &amp; 6.5 &amp; 6.5 &amp; 6.5&#92;end{array}&#92;right]^{T}' title='&#92;displaystyle &#92;mathbf{v}=&#92;left[&#92;begin{array}{cccc} 6.5 &amp; 6.5 &amp; 6.5 &amp; 6.5&#92;end{array}&#92;right]^{T}' class='latex' /></p>
<p>The errors are <a href="http://en.wikipedia.org/wiki/Independent_and_identically_distributed_random_variables">independent</a>, so our <a href="http://en.wikipedia.org/wiki/Covariance_matrix">covariance matrix</a> is given by:</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Cmathbf%7BR%7D%3D%5Cleft%5B%5Cbegin%7Barray%7D%7Bcccc%7D+6.5%5E%7B2%7D+%26+0+%26+0+%26+0%5C%5C+0+%26+6.5%5E%7B2%7D+%26+0+%26+0%5C%5C+0+%26+0+%26+6.5%5E%7B2%7D+%26+0%5C%5C+0+%26+0+%26+0+%26+6.5%5E%7B2%7D%5Cend%7Barray%7D%5Cright%5D%3D%5Cleft%5B%5Cbegin%7Barray%7D%7Bcccc%7D+1+%26+0+%26+0+%26+0%5C%5C+0+%26+1+%26+0+%26+0%5C%5C+0+%26+0+%26+1+%26+0%5C%5C+0+%26+0+%26+0+%26+1%5Cend%7Barray%7D%5Cright%5D+%5Ctimes+42.25&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;displaystyle &#92;mathbf{R}=&#92;left[&#92;begin{array}{cccc} 6.5^{2} &amp; 0 &amp; 0 &amp; 0&#92;&#92; 0 &amp; 6.5^{2} &amp; 0 &amp; 0&#92;&#92; 0 &amp; 0 &amp; 6.5^{2} &amp; 0&#92;&#92; 0 &amp; 0 &amp; 0 &amp; 6.5^{2}&#92;end{array}&#92;right]=&#92;left[&#92;begin{array}{cccc} 1 &amp; 0 &amp; 0 &amp; 0&#92;&#92; 0 &amp; 1 &amp; 0 &amp; 0&#92;&#92; 0 &amp; 0 &amp; 1 &amp; 0&#92;&#92; 0 &amp; 0 &amp; 0 &amp; 1&#92;end{array}&#92;right] &#92;times 42.25' title='&#92;displaystyle &#92;mathbf{R}=&#92;left[&#92;begin{array}{cccc} 6.5^{2} &amp; 0 &amp; 0 &amp; 0&#92;&#92; 0 &amp; 6.5^{2} &amp; 0 &amp; 0&#92;&#92; 0 &amp; 0 &amp; 6.5^{2} &amp; 0&#92;&#92; 0 &amp; 0 &amp; 0 &amp; 6.5^{2}&#92;end{array}&#92;right]=&#92;left[&#92;begin{array}{cccc} 1 &amp; 0 &amp; 0 &amp; 0&#92;&#92; 0 &amp; 1 &amp; 0 &amp; 0&#92;&#92; 0 &amp; 0 &amp; 1 &amp; 0&#92;&#92; 0 &amp; 0 &amp; 0 &amp; 1&#92;end{array}&#92;right] &#92;times 42.25' class='latex' /></p>
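<p>If you have ground-truth face locations for a few frames, one way to estimate these values empirically is to take the spread of the detector errors. The sketch below uses Python's standard <code>statistics</code> module. The error values are invented purely for illustration, and using the same variance for all four coordinates is an assumption carried over from the matrix above:</p>

```python
import statistics

# Hypothetical detector errors in pixels (detected minus true coordinate).
errors = [4.0, -7.5, 2.0, 9.0, -3.5, -6.0, 5.5, -1.0]

sigma = statistics.pstdev(errors)   # standard deviation of the errors
variance = sigma ** 2               # this is what goes on the diagonal of R

# Independent errors give a diagonal covariance matrix.
R = [[variance if i == j else 0.0 for j in range(4)] for i in range(4)]
print(round(sigma, 2), round(variance, 2))
```

<p>The value used in this article corresponds to a standard deviation of 6.5, which squares to 42.25.</p>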
<p>Decreasing the values in <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7BR%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{R}}' title='{&#92;mathbf{R}}' class='latex' /> means we are optimistically assuming our measurements are more accurate, so the filter performs less smoothing and the predicted signal will follow the observed signal more closely. Conversely, increasing <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7BR%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{R}}' title='{&#92;mathbf{R}}' class='latex' /> means we have less confidence in the accuracy of the measurements, so more smoothing is performed.</p>
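<p>The effect is easiest to see in a one-dimensional simplification, where the Kalman gain reduces to a single number between 0 and 1. The sketch below is only an illustration of that scalar case. The estimate variance of 100 is an arbitrary value chosen for the example:</p>

```python
def scalar_gain(P, R):
    """Kalman gain for a directly observed scalar state: the weight given to the measurement."""
    return P / (P + R)

P = 100.0  # arbitrary estimate variance, for illustration only
gains = {R: scalar_gain(P, R) for R in (1.0, 42.25, 1000.0)}
for R, gain in gains.items():
    print(R, round(gain, 3))
```

<p>A gain near 1 means the estimate follows the measurement closely, while a gain near 0 means the measurement is mostly smoothed away.</p>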
<p>The estimate covariance matrix <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7BP%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{P}}' title='{&#92;mathbf{P}}' class='latex' /> is a measure of the estimated accuracy of <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7Bx%7D_%7Bt%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{x}_{t}}' title='{&#92;mathbf{x}_{t}}' class='latex' /> at time <img src='http://s0.wp.com/latex.php?latex=%7Bt%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{t}' title='{t}' class='latex' />. It is adjusted over time by the filter, so we only need to supply a reasonable initial value. If we know for certain the exact state variable at start-up, then we can initialise <img src='http://s0.wp.com/latex.php?latex=%7B%5Cmathbf%7BP%7D%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;mathbf{P}}' title='{&#92;mathbf{P}}' class='latex' /> to a matrix of all zeros. Otherwise, it should be initialised as a diagonal matrix with a large value along the diagonal:</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Cmathbf%7BP%7D%3D%5Cleft%5B%5Cbegin%7Barray%7D%7Bcccccc%7D+1+%26+0+%26+0+%26+0+%26+0+%26+0%5C%5C+0+%26+1+%26+0+%26+0+%26+0+%26+0%5C%5C+0+%26+0+%26+1+%26+0+%26+0+%26+0%5C%5C+0+%26+0+%26+0+%26+1+%26+0+%26+0%5C%5C+0+%26+0+%26+0+%26+0+%26+1+%26+0%5C%5C+0+%26+0+%26+0+%26+0+%26+0+%26+1%5Cend%7Barray%7D%5Cright%5D%5Ctimes%5Cepsilon&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='&#92;displaystyle &#92;mathbf{P}=&#92;left[&#92;begin{array}{cccccc} 1 &amp; 0 &amp; 0 &amp; 0 &amp; 0 &amp; 0&#92;&#92; 0 &amp; 1 &amp; 0 &amp; 0 &amp; 0 &amp; 0&#92;&#92; 0 &amp; 0 &amp; 1 &amp; 0 &amp; 0 &amp; 0&#92;&#92; 0 &amp; 0 &amp; 0 &amp; 1 &amp; 0 &amp; 0&#92;&#92; 0 &amp; 0 &amp; 0 &amp; 0 &amp; 1 &amp; 0&#92;&#92; 0 &amp; 0 &amp; 0 &amp; 0 &amp; 0 &amp; 1&#92;end{array}&#92;right]&#92;times&#92;epsilon' title='&#92;displaystyle &#92;mathbf{P}=&#92;left[&#92;begin{array}{cccccc} 1 &amp; 0 &amp; 0 &amp; 0 &amp; 0 &amp; 0&#92;&#92; 0 &amp; 1 &amp; 0 &amp; 0 &amp; 0 &amp; 0&#92;&#92; 0 &amp; 0 &amp; 1 &amp; 0 &amp; 0 &amp; 0&#92;&#92; 0 &amp; 0 &amp; 0 &amp; 1 &amp; 0 &amp; 0&#92;&#92; 0 &amp; 0 &amp; 0 &amp; 0 &amp; 1 &amp; 0&#92;&#92; 0 &amp; 0 &amp; 0 &amp; 0 &amp; 0 &amp; 1&#92;end{array}&#92;right]&#92;times&#92;epsilon' class='latex' /></p>
<p>where <img src='http://s0.wp.com/latex.php?latex=%7B%5Cepsilon%5Cgg0%7D&amp;bg=f9f7f5&amp;fg=444444&amp;s=0' alt='{&#92;epsilon&#92;gg0}' title='{&#92;epsilon&#92;gg0}' class='latex' />. The filter will then prefer the information from the first few measurements over the information already in the model.</p>
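<p>To see how quickly a large initial value is corrected, the following sketch repeats the update step for a single directly measured variable. It is a scalar simplification that ignores the process noise, with the starting value of 10000 and the measurement variance of 42.25 taken from this article:</p>

```python
def update_variance(P, R):
    """Scalar Kalman update of the estimate variance P given measurement variance R."""
    K = P / (P + R)      # gain: how much the measurement is trusted
    return (1 - K) * P   # variance of the estimate after the update

R = 42.25
P = 1e4
history = [P]
for _ in range(5):
    P = update_variance(P, R)
    history.append(P)
print([round(p, 2) for p in history])
```

<p>After a single measurement the estimate variance has already dropped below the measurement variance, and it keeps shrinking with each further update.</p>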
<h2>Implementing the face tracker</h2>
<p>The following script implements the system we have defined above. It loads the face detection results from a CSV file, performs the Kalman filtering, and displays the filtered bounding boxes.</p>
<p><pre class="brush: matlabkey;">% read in the detected face locations
fid = fopen('detect_faces.csv');
fgetl(fid); %ignore the header
detections = textscan(fid, '%[^,] %d %d %d %d', 'delimiter', ',');
fclose(fid);

% define the filter
x = [ 0; 0; 0; 0; 0; 0 ];
F = [ 1 0 0 0 1 0 ; ...
      0 1 0 0 0 1 ; ...
      0 0 1 0 1 0 ; ...
      0 0 0 1 0 1 ; ...
      0 0 0 0 1 0 ; ...
      0 0 0 0 0 1 ];
Q = [ 1/4  0   0   0  1/2  0  ; ...
       0  1/4  0   0   0  1/2 ; ...
       0   0  1/4  0  1/2  0  ; ...
       0   0   0  1/4  0  1/2 ; ...
      1/2  0  1/2  0   1   0  ; ...
       0  1/2  0  1/2  0   1  ] * 1e-2;
H = [ 1 0 0 0 0 0 ; ...
      0 1 0 0 0 0 ; ...
      0 0 1 0 0 0 ; ...
      0 0 0 1 0 0 ];
R = eye(4) * 42.25;
P = eye(6) * 1e4;

nsamps = numel(detections{1});
for n = 1:nsamps

    % read the next detected face location
    meas_x1 = detections{2}(n);
    meas_y1 = detections{3}(n);
    meas_x2 = detections{4}(n);
    meas_y2 = detections{5}(n);
    % the order must match the state vector [x1; y1; x2; y2; dx; dy]
    z = double([meas_x1; meas_y1; meas_x2; meas_y2]);

    % step 1: predict
    [x,P] = kalman_predict(x,P,F,Q);

    % step 2: update (if measurement exists)
    if all(z &gt; 0)
        [x,P] = kalman_update(x,P,z,H,R);
    end

    % draw a bounding box around the estimated face location
    img = imread(detections{1}{n});
    imshow(img);
    est_z = H*x;
    est_x1 = est_z(1);
    est_y1 = est_z(2);
    est_x2 = est_z(3);
    est_y2 = est_z(4);
    if all(est_z &gt; 0) &amp;&amp; est_x2 &gt; est_x1 &amp;&amp; est_y2 &gt; est_y1
        rectangle('Position', [est_x1 est_y1 est_x2-est_x1 est_y2-est_y1], 'EdgeColor', 'g', 'LineWidth', 3);
    end
    drawnow;

end</pre></p>
<p>The results of running this script are shown in the following video:</p>
<span style="text-align:center; display: block;"><a href="http://blog.cordiner.net/2011/05/03/object-tracking-using-a-kalman-filter-matlab/"><img src="http://img.youtube.com/vi/z-fHB-vTKPg/2.jpg" alt="" /></a></span>
<p>Clearly we can see that this video has a much smoother and more accurate bounding box around the face than the unfiltered version shown previously, and the video no longer has frames with missing detections.</p>
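<p>The way the filter fills in missing detections can be sketched with a simplified one-dimensional analogue in Python. This is not the script's <code>kalman_predict</code> and <code>kalman_update</code> functions. It tracks a single position and velocity with a time step of 1, the measurements are made up, and a diagonal process noise is assumed to keep the arithmetic short:</p>

```python
def predict(x, P, q):
    """Constant-velocity prediction for x = [position, velocity] with a time step of 1."""
    p, v = x
    (a, b), (c, d) = P
    x_pred = [p + v, v]
    # F P F^T + Q with F = [[1, 1], [0, 1]] and Q = q * identity (a simplification)
    P_pred = [[a + b + c + d + q, b + d],
              [c + d, d + q]]
    return x_pred, P_pred

def update(x, P, z, r):
    """Measurement update when only the position is observed (H = [1, 0])."""
    p, v = x
    (a, b), (c, d) = P
    s = a + r                  # innovation variance
    k0, k1 = a / s, c / s      # Kalman gain
    y = z - p                  # innovation
    x_new = [p + k0 * y, v + k1 * y]
    # (I - K H) P
    P_new = [[(1 - k0) * a, (1 - k0) * b],
             [c - k1 * a, d - k1 * b]]
    return x_new, P_new

# Made-up measurements; None marks a frame with no detection.
measurements = [0.0, 1.0, 2.0, None, None, 5.0, 6.0]
x, P = [0.0, 0.0], [[1e4, 0.0], [0.0, 1e4]]
track = []
for z in measurements:
    x, P = predict(x, P, q=1e-2)
    if z is not None:
        x, P = update(x, P, z, r=42.25)
    track.append(x[0])
print([round(p, 2) for p in track])
```

<p>On the frames without a measurement only the prediction step runs, so the estimate keeps moving at the last estimated velocity and every frame still gets a position.</p>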
<h2>Closing remarks</h2>
<p>In the future, I aim to write an article on the <a href="http://en.wikipedia.org/wiki/Extended_Kalman_filter">extended Kalman filter (EKF)</a> and <a href="http://en.wikipedia.org/wiki/Kalman_filter#Unscented_Kalman_filter">unscented Kalman filter (UKF)</a> (and the similar <a href="http://en.wikipedia.org/wiki/Particle_filter">particle filter</a>). These are both non-linear versions of the Kalman filter. Although face trackers are usually implemented using the linear Kalman filter, the non-linear versions have some other interesting applications in image and signal processing.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.cordiner.net/2011/05/03/object-tracking-using-a-kalman-filter-matlab/feed/</wfw:commentRss>
		<slash:comments>23</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/93fc6f53b58e6130c8c3f279ec355e02?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">acordiner</media:title>
		</media:content>
	</item>
		<item>
		<title>Exploring your Gmail social network (Python)</title>
		<link>http://blog.cordiner.net/2011/01/10/explorin-your-gmail-social-network-python/</link>
		<comments>http://blog.cordiner.net/2011/01/10/explorin-your-gmail-social-network-python/#comments</comments>
		<pubDate>Mon, 10 Jan 2011 10:11:11 +0000</pubDate>
		<dc:creator>alister</dc:creator>
				<category><![CDATA[Data mining]]></category>
		<category><![CDATA[Graph theory]]></category>
		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://blog.cordiner.net/?p=1096</guid>
		<description><![CDATA[In Malcolm Gladwell&#8217;s bestseller The Tipping Point, he outlines a theory called &#8220;The Law of the Few&#8221; that identifies key types of people in social networks. One of the important ones is the connector. Connectors are hubs in the network. The idea is that, rather than society consisting of one large sparsely connected network, it [...]]]></description>
			<content:encoded><![CDATA[<p>In Malcolm Gladwell&#8217;s bestseller <a href="http://www.gladwell.com/tippingpoint/">The Tipping Point</a>, he outlines a theory called &#8220;The Law of the Few&#8221; that identifies key types of people in social networks. One of the important ones is the <em>connector</em>. Connectors are hubs in the network. The idea is that, rather than society consisting of one large sparsely connected network, it consists of many smaller, densely connected networks, and these smaller networks are in turn connected together by the connectors forming a network of networks known as a <a href="http://en.wikipedia.org/wiki/Small-world_network">&#8220;small-world network&#8221;</a>. Most people know at least one connector, and it is through these key people and their fellow connectors that we are connected to the rest of the world.</p>
<p>An interesting experiment is to explore your own personal social network to identify the connectors. An easy way to get an idea of your network is from your email mailbox. If we assume that the sender and all of the recipients of each group email we send or receive know each other, we can scan our mailbox and build a graph with email addresses as the nodes and edges joining the email addresses of people who know each other. In a <a href="http://blog.cordiner.net/2009/12/16/accessing-your-gmail-messages-in-matlab/">previous post</a>, I showed how Gmail messages can be easily accessed from many programming languages using an <a href="http://www.sqlite.org/cvstrac/wiki?p=SqliteWrappers">SQLite driver</a>. Using this, we can find our social network from our Gmail mailbox.</p>
<h3>Generating the graph</h3>
<p>First we need to connect to the Gmail SQLite database with the standard <a href="http://docs.python.org/library/sqlite3.html">sqlite3</a> library in Python. You need to set up Gmail Offline and locate your Gmail data file, as described in my <a href="http://blog.cordiner.net/2009/12/16/accessing-your-gmail-messages-in-matlab/">previous post</a>. I also suggest that you increase the Gmail Offline recent message range and re-sync your messages so that you have a larger collection of messages to work with (instructions can be found <a href="http://mail.google.com/support/bin/answer.py?hl=en&amp;answer=161965">here</a> &#8211; I set mine to 3 months).</p>
<p>To represent and analyse the network, I have used the excellent Python <a href="http://networkx.lanl.gov/">NetworkX library</a> from the Los Alamos National Lab, which you will have to install, as well as <a href="http://matplotlib.sourceforge.net/">matplotlib</a> to render the network figures.</p>
<p>The following function will read your Gmail messages and build a NetworkX <a href="http://networkx.lanl.gov/reference/classes.graph.html"><code>Graph</code></a> object:<br />
<pre class="brush: python;">import email.utils
import itertools
import networkx
import sqlite3

def get_email_graph(db, ignored_addresses=()):

    cur = db.cursor()
    cur.execute(&quot;SELECT c4FromAddress, c5ToAddresses, c6CcAddresses, c7BccAddresses FROM MessagesFT_content&quot;)

    graph = networkx.Graph()

    for address_fields in cur: # each row holds the from, to, cc and bcc address fields
        fields = [field for field in address_fields if field] # skip empty fields
        addresses = set(address for name, address in email.utils.getaddresses(fields) if address != '') # parse the addresses
        if len(addresses) &lt;= 2 or len(addresses) &gt; 10:
            continue # ignore any emails with a single recipient or very large group emails
        for address in ignored_addresses:
            addresses.discard(address) # remove any ignored addresses
        graph.add_edges_from(itertools.combinations(addresses, 2))

    return graph</pre><br />
You can then call the function like this to build your social network graph (remembering to change the path to the location of your Gmail SQLite database):<br />
<pre class="brush: python;">db = sqlite3.connect(&quot;C:/path/to/mail.google.com/http_80/myemail@gmail.com-GoogleMail#database[1]&quot;)
graph = get_email_graph(db)</pre></p>
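<p>To see what a single group email contributes to the graph, here is a small standalone sketch of the parsing and edge-building steps. The header values are made up for illustration and are not read from a mailbox:</p>

```python
import email.utils
import itertools

# Hypothetical header values for one group email.
from_field = "Alice <alice@example.com>"
to_field = "Bob <bob@example.com>, carol@example.com"

# getaddresses returns (name, address) pairs for all of the fields.
parsed = email.utils.getaddresses([from_field, to_field])
addresses = sorted({address for name, address in parsed if address != ""})

# Every pair of people on the email is assumed to know each other.
edges = list(itertools.combinations(addresses, 2))
print(edges)
```

<p>An email with three participants therefore adds three edges, and in general an email with n participants adds n(n-1)/2.</p>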
<h3>Excluding yourself from the graph</h3>
<p>The <code>get_email_graph</code> function above has a second parameter, <code>ignored_addresses</code>, which can be used for excluding particular email addresses from the graph. Because the social network is built from your own mailbox, you will be at the centre of the graph and connected to every other node, and will always appear to be a large connector. For this reason, I removed myself from the network to see how everyone else in the network is connected to one another.</p>
<p>In addition to my primary Gmail address, I have multiple additional email addresses that forward to my Gmail account. I have set up my Gmail to be able to send from these addresses (if you are not familiar with this feature, see <a href="http://mail.google.com/support/bin/answer.py?hl=en&amp;answer=22370">this article</a>). Gmail Offline stores your primary email address and any additional outgoing email addresses as <a href="http://tools.ietf.org/html/rfc4627">JSON</a>-encoded records in the <code>DataArrays</code> table. If you have Gmail set up with your additional email addresses, the function below will return a list of all of your email addresses so that you can exclude them from the graph:<br />
<pre class="brush: python;">import json
def get_my_addresses(db):
    cur = db.cursor()
    # fetch the primary Gmail address
    cur.execute(&quot;SELECT Value FROM DataArrays WHERE Type = 'ui'&quot;)
    json_str = cur.fetchone()[0]
    primary_address = json.loads(json_str)[1]
    # get any additional email addresses
    cur.execute(&quot;SELECT Value FROM DataArrays WHERE Type = 'cfs'&quot;)
    json_str = cur.fetchone()[0]
    additional_addresses = [address for name, address, misc1, misc2 in json.loads(json_str)[1]]
    return set([primary_address,] + additional_addresses)</pre><br />
You should then use this list when calling <code>get_email_graph</code>, like this:<br />
<pre class="brush: python;">db = sqlite3.connect(&quot;C:/path/to/mail.google.com/http_80/myemail@gmail.com-GoogleMail#database[1]&quot;)
ignored_addresses = get_my_addresses(db)
graph = get_email_graph(db, ignored_addresses)</pre><br />
If you have multiple email addresses but don&#8217;t have Gmail configured with all of your forwarded email addresses, you can manually specify all of your email addresses as a list and then pass it to the <code>get_email_graph</code> function as above.</p>
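<p>If the indexing in <code>get_my_addresses</code> looks opaque, the sketch below runs the same parsing steps on two made-up JSON strings. These are hypothetical stand-ins rather than real Gmail Offline records. Only the positions that the function reads (index 1, and the name and address within each four-element record) are assumed:</p>

```python
import json

# Hypothetical stand-ins for the two DataArrays values.
ui_value = '["ui", "me@gmail.com", 0]'
cfs_value = '["cfs", [["Me", "me@work.example", 0, 0], ["Me", "me@uni.example", 0, 0]]]'

# The primary address is the second element of the first record.
primary_address = json.loads(ui_value)[1]

# The second element of the other record is a list of four-element entries.
additional_addresses = [address for name, address, misc1, misc2 in json.loads(cfs_value)[1]]

my_addresses = set([primary_address] + additional_addresses)
print(sorted(my_addresses))
```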
<h3>Visualising the graph</h3>
<p>Now that we have loaded our social network into a graph object, we can display it as a figure with the following code:<br />
<pre class="brush: python;">from matplotlib import pyplot
networkx.draw(graph, node_size=5, font_size=8, width=.2)
pyplot.show()</pre><br />
This is the output from my graph (with the email addresses obfuscated using a <a href="http://docs.python.org/library/hashlib.html">hash function</a>):</p>
<p style="text-align:center;"><a href="http://thedeadbeef.files.wordpress.com/2011/01/full_network_labelled.png"><img class="aligncenter size-full wp-image-1118" title="Full network" src="http://thedeadbeef.files.wordpress.com/2011/01/full_network_labelled.png?w=497" alt=""   /></a></p>
<p>Clearly we can see that, rather than a single network, it is composed of multiple separate connected components. In my case, it was immediately apparent that these components correspond to different groups of people that I regularly deal with &#8211; one is my friends and family, another my work colleagues, and the remaining two are classmates and staff at two different universities I am associated with. Above I have labelled what each component corresponds to.</p>
<p>We can use the <code><a href="http://networkx.lanl.gov/reference/generated/networkx.algorithms.components.connected.connected_components.html#networkx.algorithms.components.connected.connected_components">connected_components</a></code> function to extract these distinct components in order to plot and analyse them separately. The code below will loop through them generating two plots for each component &#8211; one using the default layout and the other using a circular layout &#8211; with each node sized in proportion to the number of connections it has:<br />
<pre class="brush: python;">for connected_group in networkx.connected_components(graph):
    subgraph = networkx.subgraph(graph, connected_group)
    node_list, node_size = zip(*dict(networkx.degree(subgraph)).items()) # dict() also copes with the degree view returned by newer NetworkX releases
    pyplot.figure()
    pyplot.subplot(121)
    networkx.draw(subgraph, nodelist=node_list, node_size=node_size, font_size=5, width=.2)
    pyplot.subplot(122)
    pyplot.axis('equal')
    networkx.draw_circular(subgraph, nodelist=node_list, node_size=node_size, font_size=8, width=.2)
pyplot.show()</pre><br />
Below are examples of two component plots generated for my social network. The first is my friends &amp; family network and the second is my work network.</p>
<div id="attachment_1115" class="wp-caption aligncenter" style="width: 507px"><a href="http://thedeadbeef.files.wordpress.com/2011/01/figure1-friends.png"><img class="size-full wp-image-1115" title="Friends sub-network" src="http://thedeadbeef.files.wordpress.com/2011/01/figure1-friends.png?w=497&h=372" alt="" width="497" height="372" /></a><p class="wp-caption-text">Friends &amp; family network</p></div>
<div id="attachment_1116" class="wp-caption aligncenter" style="width: 507px"><a href="http://thedeadbeef.files.wordpress.com/2011/01/figure2-work.png"><img class="size-full wp-image-1116" title="Work sub-network" src="http://thedeadbeef.files.wordpress.com/2011/01/figure2-work.png?w=497&h=372" alt="" width="497" height="372" /></a><p class="wp-caption-text">Work network</p></div>
<h3>Analysing your social network</h3>
<p>The NetworkX library has a wealth of <a href="http://networkx.lanl.gov/reference/algorithms.html">graph algorithms</a> that can be used to calculate interesting statistics on the graphs, such as the <a href="http://networkx.lanl.gov/reference/algorithms.shortest_paths.html">shortest path</a> between two nodes and various <a href="http://networkx.lanl.gov/reference/algorithms.centrality.html">centrality</a> measures. In the figure below, the friends and family network has been enlarged and the email address labels removed. We can see that the network contains a handful of large nodes with large numbers of connections, with the majority having only a few connections. These large nodes are clearly the <em>connectors</em> amongst my friends and family network that Gladwell discussed.</p>
<p><a href="http://thedeadbeef.files.wordpress.com/2011/01/figure1-friends-nolabel.png"><img class="aligncenter size-full wp-image-1132" title="Friend sub-network (zoomed)" src="http://thedeadbeef.files.wordpress.com/2011/01/figure1-friends-nolabel.png?w=497&h=493" alt="" width="497" height="493" /></a></p>
<p>The number of connections an email address has is its <a href="http://en.wikipedia.org/wiki/Degree_(graph_theory)">degree</a>, and can be calculated using the <a href="http://networkx.lanl.gov/reference/generated/networkx.Graph.degree.html"><code>degree</code></a> function. Below is a histogram of the node degrees:</p>
<p><a href="http://thedeadbeef.files.wordpress.com/2011/01/hist12.png"><img class="aligncenter size-full wp-image-1150" title="Friends sub-network histogram" src="http://thedeadbeef.files.wordpress.com/2011/01/hist12.png?w=497&h=372" alt="" width="497" height="372" /></a></p>
<p>The majority of the nodes have a degree of less than 20. The <a href="http://en.wikipedia.org/wiki/Six_degrees_of_separation">&#8220;six degrees of separation&#8221; theory</a> claims that every person in the world is connected to every other person by a chain of no more than approximately six connections. In graph theory terms, this means that the longest of the shortest paths between any two nodes in the graph (its <em>diameter</em>) is no more than 6. My friends &amp; family graph has a maximum shortest path length of 5 and an average of 2.1 (calculated with the <a href="http://networkx.lanl.gov/reference/generated/networkx.shortest_path_length.html"><code>shortest_path_length</code></a> function), which seems to agree with this theory.</p>
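<p>To make the last calculation concrete, here is a minimal pure-Python sketch of how the degree, the longest shortest path and the average shortest path could be computed with a breadth-first search. The toy <code>graph</code> dictionary and the helper name <code>shortest_path_lengths</code> are hypothetical and only for illustration (NetworkX provides its own <code>shortest_path_length</code>), and the sketch assumes a single connected component:</p>
<pre class="brush: python;">from collections import deque

def shortest_path_lengths(graph, source):
    """Breadth-first search: hop count from source to every reachable node."""
    lengths = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in lengths:
                lengths[neighbour] = lengths[node] + 1
                queue.append(neighbour)
    return lengths

# hypothetical toy network: each key maps to the set of its direct contacts
graph = {
    'a': {'b', 'c'},
    'b': {'a', 'd'},
    'c': {'a', 'd'},
    'd': {'b', 'c', 'e'},
    'e': {'d'},
}

# degree of a node = number of direct contacts
degrees = dict((node, len(neighbours)) for node, neighbours in graph.items())

# shortest path lengths between every ordered pair of distinct nodes
all_lengths = [
    length
    for source in graph
    for target, length in shortest_path_lengths(graph, source).items()
    if target != source
]
longest = max(all_lengths)
average = sum(all_lengths) / float(len(all_lengths))</pre>
<p>For this toy network the longest shortest path is 3 hops (between <code>a</code> and <code>e</code>) and the average is 1.6.</p>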
<h3>Closing thoughts</h3>
<p>Social networking websites such as <a href="http://computer.howstuffworks.com/internet/social-networking/networks/friendster3.htm">Friendster</a> and <a href="http://computer.howstuffworks.com/internet/social-networking/networks/linkedin3.htm">LinkedIn</a> have leveraged the idea of &#8220;six degrees of separation&#8221; by allowing you to build your social network by connecting to your friends&#8217; connections. Calculating the degrees of separation has been solved by <a href="http://en.wikipedia.org/wiki/Shortest_path_problem#Algorithms">multiple well-studied algorithms</a>. However, these algorithms often do not scale well &#8211; for instance, the Floyd-Warshall algorithm has <em>O</em>(<em>n</em><sup>3</sup>) time and <em>O</em>(<em>n</em><sup>2</sup>) space complexity. Implementing an algorithm to find the <em>n</em>-level network of each person to every other person in a parallel and <a href="http://en.wikipedia.org/wiki/Dynamic_problem_(algorithms)">dynamic manner</a> and on an extremely large network consisting of many thousands or millions of nodes, as must be the case for many social networking websites, is certainly an <a href="http://stackoverflow.com/questions/1556451/how-do-sites-like-linkedin-efficiently-display-1st-2nd-3rd-level-relationship-nex">interesting</a> and <a href="http://stackoverflow.com/questions/2076715/challenge-how-to-implement-an-algorithm-for-six-degree-of-separation">challenging problem</a> &#8211; one which I plan to write a future article on.</p>
<p>The approach I have used represents the social network as an <a href="http://en.wikipedia.org/wiki/Glossary_of_graph_theory#Weighted_graphs_and_networks">unweighted</a> and <a href="http://en.wikipedia.org/wiki/Directed_graph">undirected</a> graph. One extension to the approach above could be to use a weighted graph, where the weights are the total number of emails between each pair of people. This would have the advantage of giving a higher weighting to connections between people who frequently communicate and would prevent a single large group email from giving all of its recipients a high degree. Another extension could be to use a directed graph, where each node points to the node it received the email from. This would result in the edges tending to point to frequent communicators, and an algorithm such as <a href="http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.31.1768">PageRank</a> (which is <a href="http://networkx.lanl.gov/reference/generated/networkx.pagerank.html">implemented in NetworkX</a>) could be used to analyse the most &#8220;important&#8221; nodes.</p>
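<p>As a rough sketch of the weighted extension, the edge weights could be accumulated with a counter. The <code>emails</code> list below is made-up data, and the assumption that every pair of addresses on the same email shares an edge is mine, purely for illustration:</p>
<pre class="brush: python;">from collections import Counter
from itertools import combinations

# hypothetical data: each entry lists the addresses appearing on one email
emails = [
    ['alice', 'bob'],
    ['alice', 'bob'],
    ['alice', 'bob', 'carol', 'dave'],   # one large group email
    ['carol', 'dave'],
]

# edge weight = number of emails on which both addresses appear
edge_weights = Counter()
for addresses in emails:
    for pair in combinations(sorted(set(addresses)), 2):
        edge_weights[pair] += 1

# weighted degree of a node = sum of the weights of its edges
strength = Counter()
for (first, second), weight in edge_weights.items():
    strength[first] += weight
    strength[second] += weight</pre>
<p>Here the frequent correspondents <code>alice</code> and <code>bob</code> end up with an edge of weight 3, while pairs that only ever met on the group email, such as <code>alice</code> and <code>carol</code>, get a weight of 1.</p>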
]]></content:encoded>
			<wfw:commentRss>http://blog.cordiner.net/2011/01/10/explorin-your-gmail-social-network-python/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/93fc6f53b58e6130c8c3f279ec355e02?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">acordiner</media:title>
		</media:content>

		<media:content url="http://thedeadbeef.files.wordpress.com/2011/01/full_network_labelled.png" medium="image">
			<media:title type="html">Full network</media:title>
		</media:content>

		<media:content url="http://thedeadbeef.files.wordpress.com/2011/01/figure1-friends.png" medium="image">
			<media:title type="html">Friends sub-network</media:title>
		</media:content>

		<media:content url="http://thedeadbeef.files.wordpress.com/2011/01/figure2-work.png" medium="image">
			<media:title type="html">Work sub-network</media:title>
		</media:content>

		<media:content url="http://thedeadbeef.files.wordpress.com/2011/01/figure1-friends-nolabel.png" medium="image">
			<media:title type="html">Friend sub-network (zoomed)</media:title>
		</media:content>

		<media:content url="http://thedeadbeef.files.wordpress.com/2011/01/hist12.png" medium="image">
			<media:title type="html">Friends sub-network histogram</media:title>
		</media:content>
	</item>
		<item>
		<title>Eigenfaces face recognition (MATLAB)</title>
		<link>http://blog.cordiner.net/2010/12/02/eigenfaces-face-recognition-matlab/</link>
		<comments>http://blog.cordiner.net/2010/12/02/eigenfaces-face-recognition-matlab/#comments</comments>
		<pubDate>Wed, 01 Dec 2010 14:03:22 +0000</pubDate>
		<dc:creator>alister</dc:creator>
				<category><![CDATA[Image processing]]></category>
		<category><![CDATA[Machine learning]]></category>
		<category><![CDATA[MATLAB]]></category>
		<category><![CDATA[face recognition]]></category>
		<category><![CDATA[matlab]]></category>
		<category><![CDATA[pca]]></category>

		<guid isPermaLink="false">http://blog.cordiner.net/?p=200</guid>
		<description><![CDATA[Eigenfaces is a well studied method of face recognition based on principal component analysis (PCA), popularised by the seminal work of Turk &#38; Pentland. Although the approach has now largely been superseded, it is still often used as a benchmark to compare the performance of other algorithms against, and serves as a good introduction to subspace-based [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.cordiner.net&#038;blog=18553497&#038;post=200&#038;subd=thedeadbeef&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><a href="http://en.wikipedia.org/wiki/Eigenface">Eigenfaces</a> is a well studied method of face recognition based on <a href="http://en.wikipedia.org/wiki/Principal_component_analysis">principal component analysis (PCA)</a>, popularised by the seminal work of <a href="http://www.mitpressjournals.org/doi/abs/10.1162/jocn.1991.3.1.71">Turk &amp; Pentland</a>. Although the approach has now largely been superseded, it is still often used as a benchmark to compare the performance of other algorithms against, and serves as a good introduction to subspace-based approaches to face recognition. In this post, I&#8217;ll provide a very simple implementation of eigenfaces face recognition using MATLAB.</p>
<p>PCA is a method of transforming a number of correlated variables into a smaller number of uncorrelated variables. Similar to how Fourier analysis is used to decompose a signal into a set of additive orthogonal sinusoids of varying frequencies, PCA decomposes a signal (or image) into a set of additive orthogonal basis vectors or <em>eigenvectors</em>. The main difference is that, while Fourier analysis uses a fixed set of basis functions, the PCA basis vectors are learnt from the data set via unsupervised training. PCA can be applied to the task of face recognition by converting the pixels of an image into a number of eigenface feature vectors, which can then be compared to measure the similarity of two face images.</p>
<p><strong>Note:</strong> This code requires the <a href="http://www.mathworks.com/products/statistics/">Statistics Toolbox</a>. If you don&#8217;t have this, you could take a look at this <a href="http://www.cs.ait.ac.th/~mdailey/matlab/">excellent article by Matthew Dailey</a>, which I discovered while writing this post. He implements the PCA functions manually, so his code doesn&#8217;t require any toolboxes.</p>
<h3>Loading the images</h3>
<p>The first step is to load the training images. You can obtain faces from a variety of publicly available <a href="http://www.face-rec.org/databases/">face databases</a>. In these examples, I have used a cropped version of the <a href="http://www.vision.caltech.edu/html-files/archive.html">Caltech 1999 face database</a>. The main requirements are that the face images must be:</p>
<ul>
<li><strong>Greyscale images with a consistent resolution.</strong> If using colour images, convert them to greyscale first with <a href="http://www.mathworks.com/help/toolbox/images/ref/rgb2gray.html"><code>rgb2gray</code></a>. I used a resolution of 64 × 48 pixels.</li>
<li><strong>Cropped to only show the face. </strong> If the images include background, the face recognition will not work properly, as the background will be incorporated into the classifier. I also usually try to avoid hair, since a person&#8217;s hair style can change significantly (or they could wear a hat).</li>
<li><strong>Aligned based on facial features.</strong> Because PCA is <a href="http://en.wikipedia.org/wiki/Translation_(geometry)">translation</a> variant, the faces must be frontal and well aligned on facial features such as the eyes, nose and mouth. Most face databases have ground truth available so you don&#8217;t need to label these features by hand. The <a href="http://www.mathworks.com/products/image/">Image Processing Toolbox</a> provides some <a href="http://www.mathworks.com/help/toolbox/images/ref/f3-23960.html#f3-23515">handy functions for image registration</a>.</li>
</ul>
<p>Each image is converted into a column vector and then the images are loaded into a matrix of size <em>n × m</em>, where <em>n</em> is the number of pixels in each image and <em>m</em> is the total number of images. The following code reads in all of the PNG images from the directory specified by <code>input_dir</code> and scales all of the images to the size specified by <code>image_dims</code>:<br />
<pre class="brush: matlabkey;">input_dir = '/path/to/my/images';
image_dims = [48, 64];

filenames = dir(fullfile(input_dir, '*.png'));
num_images = numel(filenames);
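% preallocate one column per image
images = zeros(prod(image_dims), num_images);
for n = 1:num_images
    filename = fullfile(input_dir, filenames(n).name);
    img = imread(filename);
    % scale to a common size (imresize requires the Image Processing Toolbox)
    img = imresize(img, image_dims);
    % flatten the image into a column vector
    images(:, n) = img(:);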
end</pre></p>
<h3>Training</h3>
<p>Training the face recogniser requires the following steps (compare to <a href="http://en.wikipedia.org/wiki/Principal_component_analysis#Computing_PCA_using_the_covariance_method">the steps to perform PCA</a>):</p>
<ol>
<li>Calculate the mean of the input face images</li>
<li>Subtract the mean from the input images to obtain the mean-shifted images</li>
<li>Calculate the eigenvectors and eigenvalues of the covariance matrix of the mean-shifted images</li>
<li>Order the eigenvectors by their corresponding eigenvalues, in decreasing order</li>
<li>Retain only the eigenvectors with the largest eigenvalues (the <em>principal components</em>)</li>
<li>Project the mean-shifted images into the eigenspace using the retained eigenvectors</li>
</ol>
<p>The code is shown below:<br />
<pre class="brush: matlabkey;">% steps 1 and 2: find the mean image and the mean-shifted input images
mean_face = mean(images, 2);
shifted_images = images - repmat(mean_face, 1, num_images);

% steps 3 and 4: calculate the ordered eigenvectors and eigenvalues
[evectors, score, evalues] = princomp(images');

% step 5: only retain the top 'num_eigenfaces' eigenvectors (i.e. the principal components)
num_eigenfaces = 20;
evectors = evectors(:, 1:num_eigenfaces);

% step 6: project the images into the subspace to generate the feature vectors
features = evectors' * shifted_images;</pre><br />
Steps 1 and 2 allow us to obtain zero-mean face images. Calculating the eigenvectors and eigenvalues in steps 3 and 4 can be achieved using the <a href="http://www.mathworks.com/help/toolbox/stats/princomp.html"><code>princomp</code></a> function. This function also takes care of mean-shifting the input, so you do not need to perform this manually before calling the function. However, I have still performed the mean-shifting in steps 1 and 2 since it is required for step 6, and the eigenvalues are still calculated as they will be used later to investigate the eigenvectors. The output from step 4 is a matrix of eigenvectors. Since the <code>princomp</code> function already sorts the eigenvectors by their eigenvalues, step 5 is accomplished simply by truncating the number of columns in the eigenvector matrix. Here we will truncate it to 20 principal components, which is set by the variable <code>num_eigenfaces</code>; this number was selected somewhat arbitrarily, but I will show you later how you can perform some analysis to make a more educated choice for this value. Step 6 is achieved by projecting the mean-shifted input images into the subspace defined by our truncated set of eigenvectors. For each input image, this projection will generate a feature vector of <code>num_eigenfaces</code> elements.</p>
<h3>Classification</h3>
<p>Once the face images have been projected into the eigenspace, the similarity between any pair of face images can be calculated by finding the Euclidean distance <img src="http://sciencesoft.at/image/latexurl/image.png?dpi=85&amp;template=inlinemath&amp;src=\left\Vert \mathbf{y}_{1}-\mathbf{y}_{2}\right\Vert" /> between their corresponding feature vectors <img src="http://sciencesoft.at/image/latexurl/image.png?dpi=85&amp;template=inlinemath&amp;src=\mathbf{y}_1" /> and <img src="http://sciencesoft.at/image/latexurl/image.png?dpi=85&amp;template=inlinemath&amp;src=\mathbf{y}_2" />; the smaller the distance between the feature vectors, the more similar the faces. We can define a simple similarity score <img src="http://sciencesoft.at/image/latexurl/image.png?dpi=85&amp;template=inlinemath&amp;src=s\left(\mathbf{y}_1,\mathbf{y}_2\right)" /> based on the inverse Euclidean distance:</p>
<p style="text-align:center;"><img src="http://sciencesoft.at/image/latexurl/image.png?dpi=85&amp;template=inlinemath&amp;src=s\left(\mathbf{y}_1,\mathbf{y}_2\right)=\frac{1}{1+\left\Vert \mathbf{y}_{1}-\mathbf{y}_{2}\right\Vert } \in\left[0,1\right]" /></p>
<p>To perform face recognition, the similarity score is calculated between an input face image and each of the training images. The matched face is the one with the highest similarity, and the magnitude of the similarity score indicates the confidence of the match (with a unit value indicating an exact match).</p>
<p>Given an input image <code>input_image</code> with the same dimensions <code>image_dims</code> as your training images, the following code will calculate the similarity score to each training image and display the best match:<br />
<pre class="brush: matlabkey;">% calculate the similarity of the input to each training image
% convert to double first, since imread typically returns integer pixel values
feature_vec = evectors' * (double(input_image(:)) - mean_face);
similarity_score = arrayfun(@(n) 1 / (1 + norm(features(:,n) - feature_vec)), 1:num_images);

% find the image with the highest similarity
[match_score, match_ix] = max(similarity_score);

% display the result
figure, imshow([input_image reshape(images(:,match_ix), image_dims)]);
title(sprintf('matches %s, score %f', filenames(match_ix).name, match_score));</pre><br />
Below is an example of a true positive match that was found on my training set with a score of 0.4425:</p>
<p><a href="http://thedeadbeef.files.wordpress.com/2010/12/match21.png"><img class="aligncenter size-full wp-image-898" title="Eigenfaces match" src="http://thedeadbeef.files.wordpress.com/2010/12/match21.png?w=497" alt=""   /></a></p>
<p>To detect cases where no matching face exists in the training set, you can set a minimum threshold for the similarity score and ignore any matches below this score.</p>
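<p>A minimal sketch of this thresholding idea, written in plain Python rather than MATLAB so that it needs no toolboxes. The function names, the toy two-element feature vectors and the threshold values are all hypothetical; a suitable threshold would have to be tuned on your own data:</p>
<pre class="brush: python;">import math

def similarity(y1, y2):
    """Inverse-distance score 1 / (1 + ||y1 - y2||)."""
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(y1, y2)))
    return 1.0 / (1.0 + distance)

def best_match(features, probe, min_score):
    """Return (index, score) of the closest training vector.

    The index is None when the best score falls below min_score,
    i.e. when no face in the training set is similar enough.
    """
    scores = [similarity(f, probe) for f in features]
    score = max(scores)
    if score >= min_score:
        return scores.index(score), score
    return None, score

# hypothetical feature vectors of two training images
features = [[0.0, 0.0], [3.0, 4.0]]</pre>
<p>With these values, <code>best_match(features, [3.0, 0.0], 0.3)</code> rejects the probe because its best score is only 0.25, whereas lowering the threshold to 0.2 accepts the first training image as the match.</p>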
<h3>Further analysis</h3>
<p>It can be useful to take a look at the eigenvectors or &#8220;eigenfaces&#8221; that are generated during training:<br />
<pre class="brush: matlabkey;">% display the eigenvectors
figure;
for n = 1:num_eigenfaces
    subplot(2, ceil(num_eigenfaces/2), n);
    evector = reshape(evectors(:,n), image_dims);
    % scale the display range to the data, since eigenvector values are small and can be negative
    imshow(evector, []);
end</pre><br />
<a href="http://thedeadbeef.files.wordpress.com/2010/12/eigenvectors21.png"><img class="aligncenter size-full wp-image-902" title="Eigenvectors" src="http://thedeadbeef.files.wordpress.com/2010/12/eigenvectors21.png?w=497" alt=""   /></a></p>
<p>Above are the 20 eigenfaces that my training set generated. The subspace projection we performed in the final step of training generated a feature vector of 20 coefficients for each image. The feature vectors represent each image as a linear combination of the eigenfaces defined by the coefficients in the feature vector; if we multiply each eigenface by its corresponding coefficient and then sum these weighted eigenfaces together, we can roughly reconstruct the input image. The feature vectors can be thought of as a type of compressed representation of the input images.</p>
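<p>The reconstruction described above can be sketched in a few lines of plain Python. The four-pixel &#8220;images&#8221; and the two orthonormal eigenfaces are invented toy values, and the helper names <code>project</code> and <code>reconstruct</code> are hypothetical. Note that the mean face is added back at the end, since the projection was taken from the mean-shifted image:</p>
<pre class="brush: python;"># hypothetical toy data: 4-pixel images and two orthonormal eigenfaces
mean_face = [10.0, 20.0, 30.0, 40.0]
eigenfaces = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, -0.5, 0.5, -0.5],
]
image = [12.0, 20.0, 34.0, 40.0]

def project(image, mean_face, eigenfaces):
    """Feature vector: dot product of the mean-shifted image with each eigenface."""
    shifted = [p - m for p, m in zip(image, mean_face)]
    return [sum(e * s for e, s in zip(face, shifted)) for face in eigenfaces]

def reconstruct(coefficients, mean_face, eigenfaces):
    """Mean face plus the coefficient-weighted sum of the eigenfaces."""
    pixels = list(mean_face)
    for coefficient, face in zip(coefficients, eigenfaces):
        pixels = [p + coefficient * e for p, e in zip(pixels, face)]
    return pixels

coefficients = project(image, mean_face, eigenfaces)
approximation = reconstruct(coefficients, mean_face, eigenfaces)</pre>
<p>The two coefficients are both 3.0, and the reconstruction is <code>[13.0, 20.0, 33.0, 40.0]</code>: close to the original image but not identical, because only two of the four possible basis vectors were retained.</p>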
<p>Notice that the different eigenfaces shown above seem to accentuate different features of the face. Some focus more on the eyes, others on the nose or mouth, and some a combination of them. If we generated more eigenfaces, they would slowly begin to accentuate noise and high frequency features. I mentioned earlier that our choice of 20 principal components was somewhat arbitrary. Increasing this number would mean that we would retain a larger set of eigenvectors that capture more of the variance within the data set. We can make a more informed choice for this number by examining how much variability each eigenvector accounts for.  This variability is given by the eigenvalues. The plot below shows the cumulative eigenvalues for the first 30 principal components:<br />
<pre class="brush: matlabkey;">% display the eigenvalues
normalised_evalues = evalues / sum(evalues);
figure, plot(cumsum(normalised_evalues));
xlabel('No. of eigenvectors'), ylabel('Variance accounted for');
xlim([1 30]), ylim([0 1]), grid on;</pre><br />
<a href="http://thedeadbeef.files.wordpress.com/2010/12/eigenvalues21.png"><img class="aligncenter size-full wp-image-932" title="Eigenvalues" src="http://thedeadbeef.files.wordpress.com/2010/12/eigenvalues21.png?w=497" alt=""   /></a></p>
<p>We can see that the first eigenvector accounts for 50% of the variance in the data set, while the first 20 eigenvectors together account for just over 85%, and the first 30 eigenvectors for 90%. Increasing the number of eigenvectors generally increases recognition accuracy but also increases computational cost. Note, however, that using more principal components does not necessarily lead to higher accuracy, since we eventually reach a point of diminishing returns where the low-eigenvalue components begin to capture unwanted within-class scatter. The ideal number of eigenvectors to retain will depend on the application and the data set, but in general a size that captures around 90% of the variance is usually a reasonable trade-off.</p>
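<p>Picking the number of components for a target share of the variance is easy to automate. The following is a small Python sketch; the function name <code>components_for_variance</code> and the eigenvalues are made up for illustration, and the eigenvalues are assumed to be sorted in decreasing order already:</p>
<pre class="brush: python;">def components_for_variance(eigenvalues, target):
    """Smallest number of leading eigenvalues whose share of the total reaches target."""
    total = float(sum(eigenvalues))
    running = 0.0
    for count, value in enumerate(eigenvalues, 1):
        running += value
        if running / total >= target:
            return count
    return len(eigenvalues)

# hypothetical eigenvalues, already sorted in decreasing order
eigenvalues = [50.0, 20.0, 10.0, 8.0, 6.0, 4.0, 2.0]
num_components = components_for_variance(eigenvalues, 0.9)</pre>
<p>For these values, five components are needed to capture at least 90% of the variance.</p>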
<h3>Closing remarks</h3>
<p>The eigenfaces approach is now largely superseded in the face recognition literature. However, it serves as a good introduction to the many other similar subspace-based face recognition algorithms. Usually these algorithms differ in the objective function that is used to select the subspace projection. Some subspace-based techniques that are quite popular include:</p>
<ul>
<li><strong><a href="http://en.wikipedia.org/wiki/Independent_component_analysis">Independent component analysis (ICA)</a></strong> &#8211; selects subspace projections that maximise the statistical independence of the dimensions</li>
<li><strong><a href="http://en.wikipedia.org/wiki/Linear_discriminant_analysis">Fisher&#8217;s linear discriminant (FLD)</a></strong> &#8211; selects subspace projections that maximise the ratio of between-class to within-class scatter</li>
<li><strong><a href="http://en.wikipedia.org/wiki/Non-negative_matrix_factorization">Non-negative matrix factorisation (NMF)</a></strong> &#8211; selects subspace projections that generate non-negative basis vectors</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://blog.cordiner.net/2010/12/02/eigenfaces-face-recognition-matlab/feed/</wfw:commentRss>
		<slash:comments>48</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/93fc6f53b58e6130c8c3f279ec355e02?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">acordiner</media:title>
		</media:content>

		<media:content url="http://sciencesoft.at/image/latexurl/image.png?dpi=85&#38;template=inlinemath&#38;src=leftVertmathbfy_1-mathbfy_2rightVert" medium="image" />

		<media:content url="http://sciencesoft.at/image/latexurl/image.png?dpi=85&#38;template=inlinemath&#38;src=mathbfy_1" medium="image" />

		<media:content url="http://sciencesoft.at/image/latexurl/image.png?dpi=85&#38;template=inlinemath&#38;src=mathbfy_2" medium="image" />

		<media:content url="http://sciencesoft.at/image/latexurl/image.png?dpi=85&#38;template=inlinemath&#38;src=sleft(mathbfy_1,mathbfy_2right)" medium="image" />

		<media:content url="http://sciencesoft.at/image/latexurl/image.png?dpi=85&#38;template=inlinemath&#38;src=sleft(mathbfy_1,mathbfy_2right)=frac11+leftVertmathbfy_1-mathbfy_2rightVertinleft0,1right" medium="image" />

		<media:content url="http://thedeadbeef.files.wordpress.com/2010/12/match21.png" medium="image">
			<media:title type="html">Eigenfaces match</media:title>
		</media:content>

		<media:content url="http://thedeadbeef.files.wordpress.com/2010/12/eigenvectors21.png" medium="image">
			<media:title type="html">Eigenvectors</media:title>
		</media:content>

		<media:content url="http://thedeadbeef.files.wordpress.com/2010/12/eigenvalues21.png" medium="image">
			<media:title type="html">Eigenvalues</media:title>
		</media:content>
	</item>
		<item>
		<title>Calculating variance and mean with MapReduce (Python)</title>
		<link>http://blog.cordiner.net/2010/06/16/calculating-variance-and-mean-with-mapreduce-python/</link>
		<comments>http://blog.cordiner.net/2010/06/16/calculating-variance-and-mean-with-mapreduce-python/#comments</comments>
		<pubDate>Wed, 16 Jun 2010 13:08:34 +0000</pubDate>
		<dc:creator>alister</dc:creator>
				<category><![CDATA[Data mining]]></category>
		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://blog.cordiner.net/?p=764</guid>
		<description><![CDATA[I&#8217;m half way through Peter Seibel&#8217;s Coders At Work and one of the recurring topics seems to be the difficulty of parallel computing. Nearly every developer interviewed in the book so far claims that, in their experience, the bugs in multithreaded and multiprocess applications have been the most difficult to track down and fix. This [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.cordiner.net&#038;blog=18553497&#038;post=764&#038;subd=thedeadbeef&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m half way through <a href="http://www.codersatwork.com/">Peter Seibel&#8217;s </a><em><a href="http://www.codersatwork.com/">Coders At Work</a></em> and one of the recurring topics seems to be the difficulty of <a href="http://en.wikipedia.org/wiki/Parallel_computing">parallel computing</a>. Nearly every developer interviewed in the book so far claims that, in their experience, the bugs in multithreaded and multiprocess applications have been the most difficult to track down and fix. This is such an old topic that you&#8217;d think we would have gotten it right by now. But we haven&#8217;t, and the recent growth of cloud computing and multi-core processors has revived the topic of parallelism. One of the more intuitive approaches in recent years has been <a href="http://labs.google.com/papers/mapreduce.html">MapReduce</a>, a framework introduced by researchers at Google. Google uses it in their search engine for tasks such as counting the frequency of particular words in a large number of documents.</p>
<h3>MapReduce</h3>
<p>If you&#8217;re not familiar with MapReduce, it is a standardised approach to implementing massively parallel data processing on a cluster or grid of machines. The input data is broken down into processing units or blocks. Each block is processed with a <code>map()</code> function, and then the results of all of the blocks are combined with a <code>reduce()</code> function. The name comes from the map and reduce functions found in functional programming languages such as Lisp; Python provides them as <code><a href="http://docs.python.org/library/functions.html#map">map()</a></code> and <code><a href="http://docs.python.org/library/functions.html#reduce">reduce()</a></code>, and the general concept can be implemented in any language with similar functions, such as C#, Java and C++.</p>
<p>In the case of calculating the statistics on a large dataset, such as the frequency, mean and variance of a set of numbers, a MapReduce-based implementation would slice up the data and process the slices in parallel. The bulk of the actual parallel processing occurs in the map step. The reduce step is usually a minimal step to combine these independently calculated results, which can be as simple as adding the results from each of the map outputs.</p>
<h3>A simple example</h3>
<p>Let&#8217;s say that we wanted to count the total number of characters in an array of strings. Here, the map function will find the length of each string, and then the reduce function will add these together. In Python, one possible implementation would be:<br />
<pre class="brush: python;">import operator

strings = ['string 1', 'string xyz', 'foobar']
print reduce(operator.add, map(len, strings))</pre><br />
This code does not really achieve anything special &#8211; counting characters is a trivial task in nearly any language, and is achievable in a variety of different ways. The advantage of this particular solution is that it&#8217;s extremely scalable, because the map calls can easily be executed in parallel. Each call to <code>len()</code> is independent of the others, so each of the strings could be sent to a separate processor or compute node; and because addition is <a href="http://en.wikipedia.org/wiki/Associativity">associative</a> and <a href="http://en.wikipedia.org/wiki/Commutativity">commutative</a>, the partial results can be combined in any order. In Python 2.6 and later, you can use the <code><a href="http://docs.python.org/dev/library/multiprocessing.html#multiprocessing.pool.multiprocessing.Pool.map">multiprocessing.Pool.map()</a></code> function as a drop-in replacement for the map function (and there are <a href="http://wiki.python.org/moin/ParallelProcessing#SymmetricMultiprocessing">numerous other parallel implementations</a> of the map function available). The following is a modification to our program which will distribute this task across two worker processes:<br />
<pre class="brush: python;">import operator
from multiprocessing import Pool

strings = ['string 1', 'string xyz', 'foobar']
pool = Pool(processes=2)
print reduce(operator.add, pool.map(len, strings))</pre></p>
<h3>Parallel statistics</h3>
<p>As a real-world example, say that we have a large array of random floats loaded into memory and wish to calculate the total count, mean and variance of them. Using the MapReduce paradigm, our map function could make use of Numpy&#8217;s <code><a href="http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.size.html">size()</a></code>, <code><a href="http://docs.scipy.org/doc/numpy/reference/generated/numpy.mean.html">mean()</a></code> and <code><a href="http://docs.scipy.org/doc/numpy/reference/generated/numpy.var.html">var()</a></code> functions. The reduce function needs to implement a way to combine these statistics in parallel. It needs to accept, for example, the means of two samples, and find their combined mean.</p>
<p>Determining the combined number of samples is, of course, extremely simple:</p>
<p style="text-align:center;"><img src="http://sciencesoft.at/image/latexurl/image.png?dpi=85&amp;template=inlinemath&amp;src=n_{ab}=n_{a}+n_{b}" /></p>
<p>Combining two mean values is also fairly trivial. We find the mean of the two means, weighted by the number of samples:</p>
<p style="text-align:center;"><img src="http://sciencesoft.at/image/latexurl/image.png?dpi=85&amp;template=inlinemath&amp;src=\mu_{ab}=\frac{n_{a}\mu_{a}+n_{b}\mu_{b}}{n_{a}+n_{b}}" /></p>
<p>The variance is a little trickier. If both samples have a similar mean, a sample size-weighted mean would provide a reasonable estimate for the combined variance. However, we can calculate the precise variance as:</p>
<p style="text-align:center;"><img src="http://sciencesoft.at/image/latexurl/image.png?dpi=85&amp;template=inlinemath&amp;src=\sigma_{ab}^{2}=\frac{n_{a}\sigma_{a}^{2}+n_{b}\sigma_{b}^{2}}{n_{a}+n_{b}}+n_{a}n_{b}\left(\frac{\mu_{b}-\mu_{a}}{n_{a}+n_{b}}\right)^{2}" /></p>
<p>which is based on a <a href="http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Parallel_algorithm">pairwise variance algorithm</a>.</p>
<p>An example implementation in Python is shown below:<br />
<pre class="brush: python;">import numpy

def mapFunc(row):
    &quot;Calculate the statistics for a single row of data.&quot;
    return (numpy.size(row), numpy.mean(row), numpy.var(row))

def reduceFunc(row1, row2):
    &quot;Calculate the combined statistics from two rows of data.&quot;
    n_a, mean_a, var_a = row1
    n_b, mean_b, var_b = row2
    n_ab = n_a + n_b
    mean_ab = ((mean_a * n_a) + (mean_b * n_b)) / n_ab
    var_ab = (((n_a * var_a) + (n_b * var_b)) / n_ab) + ((n_a * n_b) * ((mean_b - mean_a) / n_ab)**2)
    return (n_ab, mean_ab, var_ab)

numRows = 100
numSamplesPerRow = 500
x = numpy.random.rand(numRows, numSamplesPerRow)
y = reduce(reduceFunc, map(mapFunc, x))
print &quot;n=%d, mean=%f, var=%f&quot; % y</pre><br />
The output after running this program five times was:</p>
<pre>n=50000, mean=0.497709, var=0.082983
n=50000, mean=0.498162, var=0.082474
n=50000, mean=0.498098, var=0.083814
n=50000, mean=0.498482, var=0.083203
n=50000, mean=0.499027, var=0.083813</pre>
<p>You can verify these results by running the following from the interactive shell, which should produce the same output:</p>
<pre>print "n=%d, mean=%f, var=%f" % (numpy.size(x), numpy.mean(x), numpy.var(x))</pre>
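<p>The same pairwise combination works whichever map implementation produces the per-slice statistics. The following is a minimal sketch, assuming plain Python lists instead of Numpy arrays so that it has no dependencies; <code>combinedStats</code> and its <code>mapImpl</code> parameter are names invented for this illustration:<br />
<pre class="brush: python;">from functools import reduce  # built in on Python 2, imported on Python 3

def mapFunc(row):
    'Return (count, mean, population variance) for one slice of data.'
    n = len(row)
    mean = sum(row) / float(n)
    var = sum((v - mean) ** 2 for v in row) / n
    return (n, mean, var)

def reduceFunc(a, b):
    'Combine two (count, mean, variance) tuples using the formulas above.'
    n_a, mean_a, var_a = a
    n_b, mean_b, var_b = b
    n_ab = n_a + n_b
    mean_ab = (n_a * mean_a + n_b * mean_b) / float(n_ab)
    var_ab = (n_a * var_a + n_b * var_b) / float(n_ab) + n_a * n_b * ((mean_b - mean_a) / float(n_ab)) ** 2
    return (n_ab, mean_ab, var_ab)

def combinedStats(rows, mapImpl=map):
    'mapImpl is any map-compatible callable (hypothetical parameter for this sketch).'
    return reduce(reduceFunc, mapImpl(mapFunc, rows))

# Slices of different lengths, to exercise the sample-size weighting.
rows = [[1.0, 2.0, 3.0], [10.0, 20.0], [4.0]]
stats = combinedStats(rows)
print('n=%d, mean=%f, var=%f' % stats)</pre><br />
Passing <code>Pool(processes=2).map</code> as <code>mapImpl</code> should distribute the map step across two worker processes, as in the character-counting example earlier, while the reduce step stays unchanged.</p>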
<p>MapReduce is interesting because a large class of problems can be solved by restating them as MapReduce programs. It&#8217;s no silver bullet for writing bug-free parallel software, but you may find that many existing solutions in your code base can be easily parallelised using the MapReduce approach.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.cordiner.net/2010/06/16/calculating-variance-and-mean-with-mapreduce-python/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/93fc6f53b58e6130c8c3f279ec355e02?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">acordiner</media:title>
		</media:content>

		<media:content url="http://sciencesoft.at/image/latexurl/image.png?dpi=85&#38;template=inlinemath&#38;src=n_ab=n_a+n_b" medium="image" />

		<media:content url="http://sciencesoft.at/image/latexurl/image.png?dpi=85&#38;template=inlinemath&#38;src=mu_ab=fracn_amu_a+n_bmu_bn_a+n_b" medium="image" />

		<media:content url="http://sciencesoft.at/image/latexurl/image.png?dpi=85&#38;template=inlinemath&#38;src=sigma_ab=fracn_asigma_a+n_bsigma_bn_a+n_b+n_an_bleft(fracmu_b-mu_an_a+n_bright)2" medium="image" />
	</item>
		<item>
		<title>Web development made easy</title>
		<link>http://blog.cordiner.net/2010/05/11/web-development-made-easy/</link>
		<comments>http://blog.cordiner.net/2010/05/11/web-development-made-easy/#comments</comments>
		<pubDate>Tue, 11 May 2010 13:21:28 +0000</pubDate>
		<dc:creator>alister</dc:creator>
				<category><![CDATA[JavaScript]]></category>
		<category><![CDATA[Random thoughts]]></category>

		<guid isPermaLink="false">http://blog.cordiner.net/?p=742</guid>
		<description><![CDATA[Here&#8217;s a funny browser trick a colleague emailed to me. Cut and paste the following into the address bar: javascript:document.body.contentEditable='true'; document.designMode='on'; void 0 And voila, you can now edit the page! These features form part of the upcoming HTML5 standard, which will make embedding rich text editors in web applications a snap. For some really cool demos [...]]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s a funny browser trick a colleague emailed to me. Cut and paste the following into the address bar:</p>
<pre>javascript:document.body.contentEditable='true'; document.designMode='on'; void 0</pre>
<p>And voila, you can now edit the page! These features form part of the upcoming <a href="http://www.w3.org/TR/html5/">HTML5 standard</a>, which will make embedding rich text editors in web applications a snap.</p>
<p>For some really cool demos of HTML5&#8217;s capabilities, including a 3-D version of Tetris, check out <a href="http://www.benjoffe.com/code/">Ben Joffe&#8217;s page</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.cordiner.net/2010/05/11/web-development-made-easy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/93fc6f53b58e6130c8c3f279ec355e02?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">acordiner</media:title>
		</media:content>
	</item>
		<item>
		<title>Wake-on-LAN (C#)</title>
		<link>http://blog.cordiner.net/2010/03/06/wake-on-lan-c/</link>
		<comments>http://blog.cordiner.net/2010/03/06/wake-on-lan-c/#comments</comments>
		<pubDate>Fri, 05 Mar 2010 13:04:08 +0000</pubDate>
		<dc:creator>alister</dc:creator>
				<category><![CDATA[C#]]></category>
		<category><![CDATA[Code snippets]]></category>

		<guid isPermaLink="false">http://blog.cordiner.net/?p=580</guid>
		<description><![CDATA[Wake-on-LAN is a nifty feature of some network cards that allows you to remotely power on a workstation sitting on the local network. This is an OS-agnostic feature that works by broadcasting a specially crafted &#8220;magic&#8221; packet at the data link layer. The target computer sits in a low-power state with only its network card [...]]]></description>
			<content:encoded><![CDATA[<p>Wake-on-LAN is a nifty feature of some network cards that allows you to remotely power on a workstation sitting on the local network. This is an OS-agnostic feature that works by broadcasting a specially crafted &#8220;magic&#8221; packet at the <a href="http://en.wikipedia.org/wiki/Data_Link_Layer">data link layer</a>. The target computer sits in a low-power state with only its network card switched on, and when it receives the magic packet, the network card &#8220;wakes up&#8221; the computer, powering it on and booting it up.</p>
<p>Wake-on-LAN is a handy tool and could serve a number of purposes. For instance, it could be combined with a remote shutdown command (like <a href="http://support.microsoft.com/kb/317371"><tt>shutdown.exe</tt></a> for Windows) to easily turn any computers in your office on and off at the click of a button. (I&#8217;m sure you can think of other more creative uses!)</p>
<h3>Enabling Wake-on-LAN</h3>
<p>The first step is to check that your computer supports Wake-on-LAN. There are a few things to check:</p>
<ol>
<li>Your network card must support Wake-on-LAN</li>
<li>Your power supply must support Wake-on-LAN</li>
<li>Wake-on-LAN must be enabled in BIOS</li>
<li>Your router must be configured to forward broadcast packets</li>
<li>Your OS must be configured to enable Wake-on-LAN</li>
</ol>
<p>Google is your friend here; I won&#8217;t delve into the specifics of enabling Wake-on-LAN, and will assume you have successfully set it up in the following discussion.</p>
<h3>Remotely waking a computer</h3>
<p>As explained above, when a computer with Wake-on-LAN enabled is switched off, its network card waits in a low-power mode listening for a special &#8220;magic&#8221; packet that signals the computer to power on. In order to wake up a computer, you need to know its <a href="http://en.wikipedia.org/wiki/MAC_address">MAC address</a> (the computer is not switched on, so it doesn&#8217;t have an IP address yet). There are various ways of checking the MAC address of a computer. In Windows, the simplest way is to type &#8220;<tt>ipconfig /all</tt>&#8221; in a command prompt and look for the &#8220;physical address&#8221; listed for your network adapter. In Linux, the equivalent is &#8220;<tt>ifconfig</tt>&#8221;.</p>
<p>Now that we have the MAC address, let&#8217;s define a simple C# class for storing MAC addresses:<br />
<pre class="brush: csharp;">public class MACAddress
{
    private byte[] bytes;

    public MACAddress(byte[] bytes)
    {
        if (bytes.Length != 6)
            throw new System.ArgumentException(&quot;MAC address must have 6 bytes&quot;);
        this.bytes = bytes;
    }

    public byte this[int i]
    {
        get { return this.bytes[i]; }
        set { this.bytes[i] = value; }
    }

    public override string ToString()
    {
        // Fully qualified, since this snippet has no "using System;" directive.
        return System.BitConverter.ToString(this.bytes, 0, 6);
    }
}</pre><br />
Our next step is to send the <a href="http://en.wikipedia.org/wiki/Wake-on-LAN#Magic_packet">magic packet</a> to a given MAC address. The magic packet is a very simple fixed size frame that consists of 6 bytes of ones (<tt>FF FF FF FF FF FF</tt>) followed by sixteen repetitions of the target MAC address, and is sent as a broadcast UDP packet, like so:<br />
<pre class="brush: csharp;">using System.Net;
using System.Net.Sockets;

...

public static void WakeUp(MACAddress mac)
{
    byte[] packet = new byte[17*6];

    for(int i = 0; i &lt; 6; i++)
        packet[i] = 0xff;

    for (int i = 1; i &lt;= 16; i++)
        for (int j = 0; j &lt; 6; j++)
            packet[i * 6 + j] = mac[j];

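    // Port 9 (discard) is a conventional destination for magic packets;
    // port 0 is reserved and may be rejected by the network stack.
    using (UdpClient client = new UdpClient())
    {
        client.EnableBroadcast = true;
        client.Connect(IPAddress.Broadcast, 9);
        client.Send(packet, packet.Length);
    }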
}</pre><br />
For example, to use this code to wake up the computer with the MAC address of <tt>00-01-4A-18-9B-0C</tt>, you would call the <tt>WakeUp()</tt> function as follows:<br />
<pre class="brush: csharp;">WakeUp(new MACAddress(new byte[]{0x00, 0x01, 0x4A, 0x18, 0x9B, 0x0C}));</pre><br />
If you correctly enabled Wake-on-LAN on your target host, it should power on shortly after receiving this packet.</p>
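<p>For readers who want to check the packet layout without compiling anything, here is a minimal Python sketch of the same idea. The function names <code>buildMagicPacket</code> and <code>wakeUp</code> are invented for this illustration, and the choice of UDP port 9 is an assumption (a common convention) rather than something Wake-on-LAN itself requires:<br />
<pre class="brush: python;">import socket

def buildMagicPacket(mac):
    'Build the magic packet for a MAC address string such as 00-01-4A-18-9B-0C.'
    macBytes = bytearray.fromhex(mac.replace('-', '').replace(':', ''))
    if len(macBytes) != 6:
        raise ValueError('MAC address must have 6 bytes')
    # Six 0xFF bytes followed by sixteen repetitions of the MAC address.
    return bytes(bytearray([0xff] * 6) + macBytes * 16)

def wakeUp(mac, port=9):
    'Broadcast the magic packet over UDP (port 9 is an assumed convention).'
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(buildMagicPacket(mac), ('255.255.255.255', port))
    finally:
        sock.close()

# Only the packet is built here; nothing is sent until wakeUp() is called.
packet = buildMagicPacket('00-01-4A-18-9B-0C')
print(len(packet))</pre><br />
The packet is 102 bytes long (6 + 16 &times; 6), matching the <tt>17*6</tt> buffer allocated in the C# version.</p>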
<h3>Remotely obtaining MAC addresses</h3>
<p>Although you can manually obtain the MAC address of each individual computer on your network (e.g. by running &#8220;<tt>ipconfig /all</tt>&#8221; on each one), this can be a time-consuming process if you need to wake up a large number of computers. A more convenient method of remotely checking the MAC addresses of computers on your LAN is using <a href="http://en.wikipedia.org/wiki/Address_Resolution_Protocol">ARP</a>. This is a link-layer protocol that is used to request the MAC address of a computer on the local network from its IP address. Your OS caches the MAC address of each local host it communicates with. For example, try pinging a computer and then immediately afterwards type &#8220;<tt>arp -a</tt>&#8221; at the command prompt. This is the output from my computer after pinging my router and immediately checking the ARP cache:</p>
<pre>C:\&gt;ping 192.168.0.1 -n 1 &amp;&amp; arp -a

Pinging 192.168.0.1 with 32 bytes of data:

Reply from 192.168.0.1: bytes=32 time=1ms TTL=64

Ping statistics for 192.168.0.1:
    Packets: Sent = 1, Received = 1, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 1ms, Maximum = 1ms, Average = 1ms

Interface: 192.168.0.183 --- 0x20002
  Internet Address      Physical Address      Type
  192.168.0.1           00-1c-f0-02-65-69     dynamic</pre>
<p>It shows that the MAC address of my router is <tt>00-1C-F0-02-65-69</tt>. In C#, we can send ARP requests using the <a href="http://msdn.microsoft.com/en-us/library/aa366358%28VS.85%29.aspx"><tt>SendARP</tt></a> function provided by the <a href="http://msdn.microsoft.com/en-us/library/aa366073%28VS.85%29.aspx">Windows IP Helper API</a>. The following code allows us to resolve an IP address to a MAC address using ARP:<br />
<pre class="brush: csharp;">using System.Net;
using System.Runtime.InteropServices;

...

[DllImport(&quot;iphlpapi.dll&quot;, ExactSpelling=true)]
private static extern int SendARP(int DestIP, int SrcIP,
	[Out] byte[] pMacAddr, ref int PhyAddrLen);

public static MACAddress IpToMacAddress(IPAddress ipAddress)
{
	byte[] mac = new byte[6];
	int len = mac.Length;
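	// IPAddress.Address is obsolete; build the same IPv4 value from the address bytes.
	int res = SendARP(System.BitConverter.ToInt32(ipAddress.GetAddressBytes(), 0), 0, mac, ref len);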
	if(res != 0)
		throw new WebException(&quot;Error &quot; + res + &quot; looking up &quot; + ipAddress.ToString());
	return new MACAddress(mac);
}</pre><br />
For example, if I wanted to obtain the MAC address of the host <tt>192.168.0.1</tt>:<br />
<pre class="brush: csharp;">MACAddress foo = IpToMacAddress(IPAddress.Parse(&quot;192.168.0.1&quot;));</pre><br />
After shutting down this host, I could then power it back on with the following:<br />
<pre class="brush: csharp;">WakeUp(foo);</pre></p>
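<p>If you would rather read the MAC addresses straight out of the ARP cache, the <tt>arp -a</tt> output shown earlier can also be parsed as text. The following Python sketch assumes the Windows table layout from that listing (dotted IP address, then a dash-separated MAC address); <code>parseArpTable</code> is a name invented for this illustration, and other platforms format the table differently:<br />
<pre class="brush: python;">import re

# One table row: a dotted IPv4 address followed by a dash-separated MAC address.
ARP_ROW = re.compile(r'^\s*(\d+\.\d+\.\d+\.\d+)\s+((?:[0-9A-Fa-f]{2}-){5}[0-9A-Fa-f]{2})\s')

def parseArpTable(text):
    'Map each IP address found in Windows-style arp -a output to its MAC address.'
    table = {}
    for line in text.splitlines():
        match = ARP_ROW.match(line)
        if match:
            table[match.group(1)] = match.group(2).upper()
    return table

# Sample text copied from the listing above; header lines are skipped.
sample = '''Interface: 192.168.0.183 --- 0x20002
  Internet Address      Physical Address      Type
  192.168.0.1           00-1c-f0-02-65-69     dynamic
'''
print(parseArpTable(sample))</pre><br />
In practice the text would come from running <tt>arp -a</tt> (for example via Python&#8217;s <code>subprocess</code> module) after pinging each host of interest.</p>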
<h3>Closing remarks</h3>
<p>An important thing to note is that Wake-on-LAN operates below the <a href="http://en.wikipedia.org/wiki/Internet_Layer">IP level</a>. This means that the sending machine needs to be on the same LAN as the target, so magic packets cannot be sent directly over remote IP-based connections, such as over SSH or VPN. Also, while this article only dealt with Wake-on-LAN over a wired local network, it is possible to perform wake-up over wi-fi (try Googling for &#8220;WoWLAN&#8221; if you are trying to wake up computers over wi-fi).</p>
<p>You may have also noticed that Wake-on-LAN is very insecure. Armed with only a MAC address, any host on the LAN can wake up any other computer that has Wake-on-LAN enabled. Some NICs do allow password-protected WOL but, as far as I know, this is not widely implemented.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.cordiner.net/2010/03/06/wake-on-lan-c/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/93fc6f53b58e6130c8c3f279ec355e02?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">acordiner</media:title>
		</media:content>
	</item>
		<item>
		<title>OpenCV Viola &amp; Jones object detection in MATLAB</title>
		<link>http://blog.cordiner.net/2010/02/15/opencv-viola-jones-object-detection-in-matlab/</link>
		<comments>http://blog.cordiner.net/2010/02/15/opencv-viola-jones-object-detection-in-matlab/#comments</comments>
		<pubDate>Mon, 15 Feb 2010 11:16:15 +0000</pubDate>
		<dc:creator>alister</dc:creator>
				<category><![CDATA[Image processing]]></category>
		<category><![CDATA[MATLAB]]></category>

		<guid isPermaLink="false">http://blog.cordiner.net/?p=230</guid>
		<description><![CDATA[In image processing, one of the most successful object detectors devised is the Viola and Jones detector, proposed in their seminal CVPR paper in 2001. A popular implementation used by image processing researchers and implementers is provided by the OpenCV library. In this post, I&#8217;ll show you how to run the OpenCV object detector in MATLAB [...]]]></description>
			<content:encoded><![CDATA[<p>In image processing, one of the most successful <a href="http://en.wikipedia.org/wiki/Object_detection">object detectors</a> devised is the <a href="http://en.wikipedia.org/wiki/Viola-Jones_object_detection_framework">Viola and Jones detector</a>, proposed in their seminal <a href="http://doi.ieeecomputersociety.org/10.1109/CVPR.2001.990517">CVPR paper in 2001</a>. A popular implementation used by image processing researchers and implementers is provided by the <a href="http://opencv.willowgarage.com/documentation/object_detection.html#haar-feature-based-cascade-classifier-for-object-detection">OpenCV library</a>. In this post, I&#8217;ll show you how to run the OpenCV object detector in MATLAB for Windows. You should have some familiarity with OpenCV and with the Viola and Jones detector to work through this tutorial.</p>
<h3>Steps in the object detector</h3>
<p>MATLAB is able to <a href="http://www.mathworks.com/access/helpdesk/help/techdoc/matlab_external/f43202.html">call functions in shared libraries</a>. This means that, using the compiled OpenCV DLLs, we are able to directly call various OpenCV functions from within MATLAB. The flow of our MATLAB program, including the required OpenCV external function calls (based on <a href="http://opencv.willowgarage.com/wiki/FaceDetection">this example</a>), will go something like this:</p>
<ol>
<li><code>cvLoadHaarClassifierCascade:</code> Load object detector cascade</li>
<li><code>cvCreateMemStorage:</code> Allocate memory for detector</li>
<li><code>cvLoadImage:</code> Load image from disk</li>
<li><code>cvHaarDetectObjects:</code> Perform object detection</li>
<li>For each detected object:
<ol>
<li><code>cvGetSeqElem:</code> Get next detected object of type <code>cvRect</code></li>
<li>Display this detection result in MATLAB</li>
</ol>
</li>
<li><code>cvReleaseImage:</code> Unload the image from memory</li>
<li><code>cvReleaseMemStorage:</code> De-allocate memory for detector</li>
<li><code>cvReleaseHaarClassifierCascade:</code> Unload the cascade from memory</li>
</ol>
<h3>Loading shared libraries</h3>
<p>The first step is to load the OpenCV shared libraries using MATLAB&#8217;s <a href="http://www.mathworks.com/access/helpdesk/help/techdoc/ref/loadlibrary.html"><code>loadlibrary()</code></a> function. To use the functions listed in the object detector steps above, we need to load the OpenCV libraries <a href="http://opencv.willowgarage.com/documentation/cxcore._the_core_functionality.html"><code>cxcore100.dll</code></a>, <a href="http://opencv.willowgarage.com/documentation/cv._image_processing_and_computer_vision.html"><code>cv100.dll</code></a> and <a href="http://opencv.willowgarage.com/documentation/highgui._high-level_gui_and_media_io.html"><code>highgui100.dll</code></a>. Assuming that OpenCV has been installed to <code>"C:\Program Files\OpenCV"</code>, the libraries are loaded like this:<br />
<pre class="brush: matlabkey;">opencvPath = 'C:\Program Files\OpenCV';
includePath = fullfile(opencvPath, 'cxcore\include');

loadlibrary(...
	fullfile(opencvPath, 'bin\cxcore100.dll'), ...
	fullfile(opencvPath, 'cxcore\include\cxcore.h'), ...
		'alias', 'cxcore100', 'includepath', includePath);
loadlibrary(...
	fullfile(opencvPath, 'bin\cv100.dll'), ...
	fullfile(opencvPath, 'cv\include\cv.h'), ...
		'alias', 'cv100', 'includepath', includePath);
loadlibrary(...
	fullfile(opencvPath, 'bin\highgui100.dll'), ...
	fullfile(opencvPath, 'otherlibs\highgui\highgui.h'), ...
		'alias', 'highgui100', 'includepath', includePath);</pre><br />
You will get some warnings; these can be ignored for our purposes. You can display the list of functions that a particular shared library exports with the <code>libfunctions()</code> command in MATLAB. For example, to list the functions exported by the <a href="http://opencv.willowgarage.com/documentation/highgui._high-level_gui_and_media_io.html"><code>highgui</code></a> library:<br />
<pre class="brush: matlabkey;">&gt;&gt; libfunctions('highgui100')

Functions in library highgui100:

cvConvertImage             cvQueryFrame
cvCreateCameraCapture      cvReleaseCapture
cvCreateFileCapture        cvReleaseVideoWriter
cvCreateTrackbar           cvResizeWindow
cvCreateVideoWriter        cvRetrieveFrame
cvDestroyAllWindows        cvSaveImage
cvDestroyWindow            cvSetCaptureProperty
cvGetCaptureProperty       cvSetMouseCallback
cvGetTrackbarPos           cvSetPostprocessFuncWin32
cvGetWindowHandle          cvSetPreprocessFuncWin32
cvGetWindowName            cvSetTrackbarPos
cvGrabFrame                cvShowImage
cvInitSystem               cvStartWindowThread
cvLoadImage                cvWaitKey
cvLoadImageM               cvWriteFrame
cvMoveWindow
cvNamedWindow</pre><br />
The first step in our object detector is to load a detector cascade. We are going to load one of the frontal face detector cascades that is provided with a normal OpenCV installation:<br />
<pre class="brush: matlabkey;">classifierFilename = 'C:/Program Files/OpenCV/data/haarcascades/haarcascade_frontalface_alt.xml';
cvCascade = calllib('cv100', 'cvLoadHaarClassifierCascade', classifierFilename, ...
	libstruct('CvSize',struct('width',int16(100),'height',int16(100))));</pre><br />
The function <a href="http://www.mathworks.com/access/helpdesk/help/techdoc/ref/calllib.html"><code>calllib()</code></a> returns a <a href="http://www.mathworks.com/access/helpdesk/help/techdoc/matlab_external/f42650.html"><code>libpointer</code></a> structure containing two fairly self-explanatory fields, <code>DataType</code> and <code>Value</code>. To display the return value from <code>cvLoadHaarClassifierCascade()</code>, we can run:<br />
<pre class="brush: matlabkey;">&gt;&gt; cvCascade.Value

ans =

               flags: 1.1125e+009
               count: 22
    orig_window_size: [1x1 struct]
    real_window_size: [1x1 struct]
               scale: 0
    stage_classifier: [1x1 struct]
         hid_cascade: []</pre><br />
The above output shows that MATLAB has successfully loaded the cascade file and returned a pointer to an OpenCV <code>CvHaarClassifierCascade</code> object.</p>
<h3>Prototype M-files</h3>
<p>We could now continue implementing all of our OpenCV function calls from the object detector steps like this; however, we will run into a problem when <code>cvGetSeqElem</code> is called. To see why, try this:</p>
<pre>libfunctions('cxcore100', '-full')</pre>
<p>The <code>-full</code> option lists the signatures for each imported function. The signature for the function <code>cvGetSeqElem()</code> is listed as:</p>
<pre>[cstring, CvSeqPtr] cvGetSeqElem(CvSeqPtr, int32)</pre>
<p>This shows that the return value for the imported <code>cvGetSeqElem()</code> function will be a pointer to a character (<code>cstring</code>). This is based on the function declaration in the <code>cxcore.h</code> header file:</p>
<pre>CVAPI(char*)  cvGetSeqElem( const CvSeq* seq, int index );</pre>
<p>However, in step 5.1 of our object detector steps, we require a <code>CvRect</code> object. Normally in C++ you would simply cast the character pointer return value to a <code>CvRect</code> object, but MATLAB does not support casting of return values from <code>calllib()</code>, so there is no way we can cast this to a <code>CvRect</code>.</p>
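<p>To illustrate the kind of cast that MATLAB lacks, the following sketch uses Python&#8217;s <code>ctypes</code> module (not MATLAB, and not part of the detector) to reinterpret a <code>char</code> pointer as a <code>CvRect</code> pointer. The <code>CvRect</code> class here is a hand-written stand-in, assumed to match OpenCV&#8217;s struct of four ints, and no OpenCV function is actually called:<br />
<pre class="brush: python;">import ctypes

class CvRect(ctypes.Structure):
    'Stand-in for the OpenCV CvRect struct: four C ints (an assumption of this sketch).'
    _fields_ = [('x', ctypes.c_int), ('y', ctypes.c_int),
                ('width', ctypes.c_int), ('height', ctypes.c_int)]

# Stand-in for the memory that cvGetSeqElem() would return a char pointer to.
original = CvRect(10, 20, 30, 40)
charPtr = ctypes.cast(ctypes.pointer(original), ctypes.POINTER(ctypes.c_char))

# Reinterpret the char pointer as a CvRect pointer, as a C-style cast would.
rect = ctypes.cast(charPtr, ctypes.POINTER(CvRect)).contents
print((rect.x, rect.y, rect.width, rect.height))</pre></p>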
<p>The solution is what is referred to as a prototype M-file. By constructing a prototype M-file, we can define our own signatures for the imported functions rather than using the declarations from the C++ header file.</p>
<p>Let&#8217;s generate the prototype M-file now:<br />
<pre class="brush: matlabkey;">loadlibrary(...
	fullfile(opencvPath, 'bin\cxcore100.dll'), ...
	fullfile(opencvPath, 'cxcore\include\cxcore.h'), ...
		'mfilename', 'proto_cxcore');</pre><br />
This will automatically generate a prototype M-file named <code>proto_cxcore.m</code> based on the C++ header file. Open this file up and find the function signature for <code>cvGetSeqElem</code> and replace it with the following:<br />
<pre class="brush: matlabkey;">% char * cvGetSeqElem ( const CvSeq * seq , int index );

fcns.name{fcnNum}='cvGetSeqElem'; fcns.calltype{fcnNum}='cdecl'; fcns.LHS{fcnNum}='CvRectPtr'; fcns.RHS{fcnNum}={'CvSeqPtr', 'int32'};fcnNum=fcnNum+1;</pre><br />
This changes the return type for <code>cvGetSeqElem()</code> from a <code>char</code> pointer to a <code>CvRect</code> pointer.</p>
<p>We can now load the library using the new prototype:<br />
<pre class="brush: matlabkey;">loadlibrary(...
	fullfile(opencvPath, 'bin\cxcore100.dll'), ...
		@proto_cxcore);</pre></p>
<h3>An example face detector</h3>
<p>We now have all the pieces ready to write a complete object detector. The code listing below implements the object detector steps listed above to perform face detection on an image. Additionally, the image is displayed in MATLAB and a box is drawn around any detected faces.<br />
<pre class="brush: matlabkey;">opencvPath = 'C:\Program Files\OpenCV';
includePath = fullfile(opencvPath, 'cxcore\include');
inputImage = 'lenna.jpg';

%% Load the required libraries

if libisloaded('highgui100'), unloadlibrary highgui100, end
if libisloaded('cv100'), unloadlibrary cv100, end
if libisloaded('cxcore100'), unloadlibrary cxcore100, end

loadlibrary(...
	fullfile(opencvPath, 'bin\cxcore100.dll'), @proto_cxcore);
loadlibrary(...
	fullfile(opencvPath, 'bin\cv100.dll'), ...
	fullfile(opencvPath, 'cv\include\cv.h'), ...
		'alias', 'cv100', 'includepath', includePath);
loadlibrary(...
	fullfile(opencvPath, 'bin\highgui100.dll'), ...
	fullfile(opencvPath, 'otherlibs\highgui\highgui.h'), ...
		'alias', 'highgui100', 'includepath', includePath);

%% Load the cascade

classifierFilename = 'C:/Program Files/OpenCV/data/haarcascades/haarcascade_frontalface_alt.xml';
cvCascade = calllib('cv100', 'cvLoadHaarClassifierCascade', classifierFilename, ...
	libstruct('CvSize',struct('width',int16(100),'height',int16(100))));

%% Create memory storage

cvStorage = calllib('cxcore100', 'cvCreateMemStorage', 0);

%% Load the input image

cvImage = calllib('highgui100', ...
	'cvLoadImage', inputImage, int16(1));
if ~cvImage.Value.nSize
	error('Image could not be loaded');
end

%% Perform object detection

cvSeq = calllib('cv100', ...
	'cvHaarDetectObjects', cvImage, cvCascade, cvStorage, 1.1, 2, 0, ...
	libstruct('CvSize',struct('width',int16(40),'height',int16(40))));

%% Loop through the detections and display bounding boxes

imshow(imread(inputImage)); %load and display image in MATLAB
for n = 1:cvSeq.Value.total
	cvRect = calllib('cxcore100', ...
		'cvGetSeqElem', cvSeq, int32(n - 1)); % OpenCV sequence indices are zero-based
	rectangle('Position', ...
		[cvRect.Value.x cvRect.Value.y ...
		cvRect.Value.width cvRect.Value.height], ...
		'EdgeColor', 'r', 'LineWidth', 3);
end

%% Release resources

calllib('cxcore100', 'cvReleaseImage', cvImage);
calllib('cxcore100', 'cvReleaseMemStorage', cvStorage);
calllib('cv100', 'cvReleaseHaarClassifierCascade', cvCascade);</pre><br />
As an example, the following is the output after running the detector above on a greyscale version of the <a href="http://en.wikipedia.org/wiki/Lenna">Lenna test image</a>:</p>
<p style="text-align:center;"><a href="http://thedeadbeef.files.wordpress.com/2010/02/lenna_output2.jpg"><img class="alignnone size-medium wp-image-523" title="Face detection on Lenna test image" src="http://thedeadbeef.files.wordpress.com/2010/02/lenna_output2.jpg?w=297&h=300" alt="" width="297" height="300" /></a></p>
<p style="text-align:left;">Note: If you get a segmentation fault attempting to run the code above, try <a href="http://www.mathworks.com/access/helpdesk/help/techdoc/matlab_env/brqxeeu-259.html#brqxeeu-293">evaluating the cells one-by-one</a> (e.g. by pressing Ctrl-Enter) &#8211; it seems to fix the problem.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.cordiner.net/2010/02/15/opencv-viola-jones-object-detection-in-matlab/feed/</wfw:commentRss>
		<slash:comments>34</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/93fc6f53b58e6130c8c3f279ec355e02?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">acordiner</media:title>
		</media:content>

		<media:content url="http://thedeadbeef.files.wordpress.com/2010/02/lenna_output2.jpg?w=297" medium="image">
			<media:title type="html">Face detection on Lenna test image</media:title>
		</media:content>
	</item>
		<item>
		<title>Mapping urban noise levels using smartphones</title>
		<link>http://blog.cordiner.net/2010/02/04/mapping-urban-noise-levels-using-smartphones/</link>
		<comments>http://blog.cordiner.net/2010/02/04/mapping-urban-noise-levels-using-smartphones/#comments</comments>
		<pubDate>Thu, 04 Feb 2010 11:18:19 +0000</pubDate>
		<dc:creator>alister</dc:creator>
				<category><![CDATA[In the news]]></category>

		<guid isPermaLink="false">http://blog.cordiner.net/?p=481</guid>
		<description><![CDATA[The Sony Lab in Paris recently released a free smartphone app called NoiseTube which uses your smartphone&#8217;s microphone and GPS to measure noise levels as you walk around. This data is combined with data collected from other users in order to plot the current noise levels on a city map, a technique dubbed &#8220;participatory sensing&#8221;. [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright" title="Participatory sensing" src="http://noisetube.net/images/participatory_sensing.png" alt="Participatory sensing" width="362" height="80" />The <a href="http://www.csl.sony.fr/">Sony Lab in Paris</a> recently released a free smartphone app called <a href="http://noisetube.net/">NoiseTube</a> which uses your smartphone&#8217;s microphone and GPS to measure noise levels as you walk around. This data is combined with data collected from other users in order to plot the current noise levels on a city map, a technique dubbed &#8220;participatory sensing&#8221;. Anyone can sign up to <a href="http://noisetube.net/signup">download the application and contribute data</a>. I doubt it will really take off, but either way it&#8217;s an interesting concept that makes very clever use of crowdsourcing and repurposing of existing technology. The <a href="http://www.newscientist.com/article/mg20427346.900-cellphone-app-to-make-maps-of-noise-pollution.html">goal is to meet EU requirements</a> of member countries to periodically measure noise pollution levels, but the website is open to users from any country. Currently you can view <a href="http://noisetube.net/users">noise data for individual users</a> (making your data public is optional) and you can <a href="http://noisetube.net/cities">download Google Earth KML data for various cities</a>, but I&#8217;d love to see someone create a Google Maps mashup of this!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.cordiner.net/2010/02/04/mapping-urban-noise-levels-using-smartphones/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/93fc6f53b58e6130c8c3f279ec355e02?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">acordiner</media:title>
		</media:content>

		<media:content url="http://noisetube.net/images/participatory_sensing.png" medium="image">
			<media:title type="html">Participatory sensing</media:title>
		</media:content>
	</item>
		<item>
		<title>Configuring TortoiseCVS to use TortoiseMerge</title>
		<link>http://blog.cordiner.net/2010/01/22/configuring-tortoisecvs-to-use-tortoisemerge/</link>
		<comments>http://blog.cordiner.net/2010/01/22/configuring-tortoisecvs-to-use-tortoisemerge/#comments</comments>
		<pubDate>Fri, 22 Jan 2010 09:35:05 +0000</pubDate>
		<dc:creator>alister</dc:creator>
				<category><![CDATA[Coding tools]]></category>

		<guid isPermaLink="false">http://blog.cordiner.net/?p=428</guid>
		<description><![CDATA[I&#8217;m a big fan of TortoiseSVN for working with Subversion repositories in Windows. Recently I&#8217;ve had to work on a software project using a CVS repository, so I naturally decided to use TortoiseCVS. TortoiseSVN was originally inspired by TortoiseCVS, but despite the similarity in the names, they have no affiliation and function quite differently. One [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m a big fan of <a href="http://tortoisesvn.tigris.org/">TortoiseSVN</a> for working with <a href="http://subversion.tigris.org/">Subversion</a> repositories in Windows. Recently I&#8217;ve had to work on a software project using a <a href="http://www.nongnu.org/cvs/">CVS</a> repository, so I naturally decided to use <a href="http://www.tortoisecvs.org/">TortoiseCVS</a>. TortoiseSVN was originally inspired by TortoiseCVS, but despite the similarity in the names, they have no affiliation and function quite differently.</p>
<p>One of the first things that I noticed was that TortoiseCVS does not seem to come bundled with a Windows diff tool, unlike the handy <a href="http://tortoisesvn.tigris.org/TortoiseMerge.html">TortoiseMerge</a> that comes with TortoiseSVN. The TortoiseCVS FAQ recommends a couple of <a href="http://www.tortoisecvs.org/faq.html#recommenddiff">third-party diff tools</a> which integrate with it quite nicely. However, if you already have TortoiseSVN installed, there&#8217;s an easy way to configure TortoiseCVS to use TortoiseMerge.</p>
<p>First, open TortoiseCVS&#8217;s settings by right-clicking on the desktop, and then &#8220;CVS&#8221; &gt; &#8220;Preferences&#8230;&#8221;:</p>
<p><a href="http://thedeadbeef.files.wordpress.com/2010/01/step13.gif"><img class="aligncenter size-full wp-image-433" title="TortiseCVS diff - step 1" src="http://thedeadbeef.files.wordpress.com/2010/01/step13.gif?w=497" alt=""   /></a></p>
<p>Click on the &#8220;Tools&#8221; tab. Under &#8220;Diff application&#8221;, browse to the TortoiseMerge.exe executable, which is in the TortoiseSVN bin folder. In my installation, this was:</p>
<pre style="text-align:center;">C:\Program Files\TortoiseSVN\bin\TortoiseMerge.exe</pre>
<p>For &#8220;two-way diff parameters&#8221;, enter the following:</p>
<pre style="text-align:center;">/base:"%1" /mine:"%2"</pre>
<p>Click OK and that&#8217;s it! The image below shows what your preferences should look like:</p>
<p><a href="http://thedeadbeef.files.wordpress.com/2010/01/step21.gif"><img class="aligncenter size-full wp-image-432" title="TortiseCVS diff - step 2" src="http://thedeadbeef.files.wordpress.com/2010/01/step21.gif?w=497" alt=""   /></a></p>
<p>Now you can right-click on any modified text file in a checked-out CVS repository, click &#8220;CVS Diff&#8221; and it will fire up TortoiseMerge to show you the differences between your locally modified copy and the last committed version in the repository.</p>
<p><strong>Note:</strong> You should be able to use TortoiseMerge as your TortoiseCVS merge application too. I haven&#8217;t tested this, but the &#8220;two-way merge parameters&#8221; should be similar to those used above for the diff application.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.cordiner.net/2010/01/22/configuring-tortoisecvs-to-use-tortoisemerge/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/93fc6f53b58e6130c8c3f279ec355e02?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">acordiner</media:title>
		</media:content>

		<media:content url="http://thedeadbeef.files.wordpress.com/2010/01/step13.gif" medium="image">
			<media:title type="html">TortiseCVS diff - step 1</media:title>
		</media:content>

		<media:content url="http://thedeadbeef.files.wordpress.com/2010/01/step21.gif" medium="image">
			<media:title type="html">TortiseCVS diff - step 2</media:title>
		</media:content>
	</item>
		<item>
		<title>Parsing English numbers with Perl</title>
		<link>http://blog.cordiner.net/2010/01/02/parsing-english-numbers-with-perl/</link>
		<comments>http://blog.cordiner.net/2010/01/02/parsing-english-numbers-with-perl/#comments</comments>
		<pubDate>Sat, 02 Jan 2010 04:29:08 +0000</pubDate>
		<dc:creator>alister</dc:creator>
				<category><![CDATA[Language processing]]></category>
		<category><![CDATA[Perl]]></category>

		<guid isPermaLink="false">http://blog.cordiner.net/?p=6</guid>
		<description><![CDATA[Note: The problem described here has already been solved with libraries such as Lingua::EN::FindNumber and Lingua::EN::Words2Nums. For production software, I&#8217;d recommend you look at using those modules instead of re-inventing the wheel. This article is only intended for those interested in learning how this type of parsing works. In a project I was recently working [...]]]></description>
			<content:encoded><![CDATA[<p><em><strong>Note:</strong> The problem described here has already been solved with libraries such as <a href="http://search.cpan.org/perldoc/Lingua::EN::FindNumber">Lingua::EN::FindNumber</a> and <a href="http://search.cpan.org/perldoc/Lingua::EN::Words2Nums">Lingua::EN::Words2Nums</a>. For production software, I&#8217;d recommend you look at using those modules instead of re-inventing the wheel. </em><em>This article is only intended for those interested in learning how this type of parsing works.</em></p>
<p>In a project I was recently working on, there was a need to perform <a href="http://en.wikipedia.org/wiki/Named_entity_recognition">named entity recognition</a> on natural language text fields in order to convert natural language numbers in the field, such as &#8220;two hundred and thirteen&#8221;, into their numerical values (e.g. 213). Google does a great job of this; for example, try out <a href="http://www.google.com/search?q=two+hundred+and+seven+million+thirteen+thousand+two+hundred+and+ninety+eight+in+decimal">this search</a> to convert a string to a number, and the <a href="http://www.google.com/search?q=207013298+in+english">reverse</a>. In this post, I&#8217;ll discuss how this conversion functionality can be achieved with the nifty Perl recursive descent parser generator <a href="http://search.cpan.org/dist/Parse-RecDescent/">Parse::RecDescent</a>.</p>
<p>The parsing is a two-step process. First, each of the number words needs to be matched and its value looked up in a dictionary. For example, the word &#8220;two&#8221; needs to be matched and recognised as &#8220;2&#8221;, &#8220;hundred&#8221; as &#8220;100&#8221; and &#8220;thirteen&#8221; as &#8220;13&#8221;. In parsing parlance, this step is known as <em>lexical analysis</em>. Second, we need to calculate the total number that the entire sentence represents by accumulating all of the matched values. This is known as <em>syntactic analysis</em>.</p>
<p>Consider the following input string:</p>
<p style="text-align:center;">&#8220;two hundred and seven million thirteen thousand two hundred and ninety eight&#8221;</p>
<p>The first step is to tokenise the string and remove the word &#8220;and&#8221; anywhere it appears. (Breaking a number up with the word &#8220;and&#8221; is a British English convention; American English usually omits the instances of &#8220;and&#8221;.) We then end up with the following list of token values:</p>
<p style="text-align:center;">&#8220;two&#8221;, &#8220;hundred&#8221;, &#8220;seven&#8221;, &#8220;million&#8221;, &#8220;thirteen&#8221;, &#8220;thousand&#8221;, &#8220;two&#8221;, &#8220;hundred&#8221;, &#8220;ninety&#8221;, &#8220;eight&#8221;</p>
<p>If we convert each token value into its numeric equivalent, this becomes:</p>
<p style="text-align:center;">2, 100, 7, 1000000, 13, 1000, 2, 100, 90, 8</p>
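<p>As a rough illustration of these first two steps, the Perl sketch below splits the sentence on whitespace, drops the word &#8220;and&#8221; and looks each remaining token up in a small hash. The hash <code>%value_of</code> and the other variable names are my own choices for this example, and the dictionary deliberately holds only the words this one sentence needs.</p>
<pre class="brush: perl;">use strict;

# Hypothetical lookup table: only the words used in the example sentence
my %value_of = (
	two =&gt; 2, seven =&gt; 7, eight =&gt; 8, thirteen =&gt; 13, ninety =&gt; 90,
	hundred =&gt; 100, thousand =&gt; 1e3, million =&gt; 1e6,
);

my $sentence = 'two hundred and seven million thirteen thousand '
	. 'two hundred and ninety eight';

# Tokenise on whitespace and discard the word 'and'
my @tokens = grep { $_ ne 'and' } split ' ', lc $sentence;

# Replace each remaining token with its numeric value
my @values = map { $value_of{$_} } @tokens;

print join(', ', @values), &quot;\n&quot;;</pre>
<p>Run as-is, this should print the same list of values as shown above.</p>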
<p>Finally, in order to find the total, we calculate:</p>
<p style="text-align:center;">((2 × 100 + 7) × 1000000) + (13 × 1000) + (2 × 100 + 90 + 8) = 207,013,298</p>
<p>This matching and conversion is achieved with a parser generator. A parser generator takes a formal grammar as its input and produces a parser; that parser then turns the input text into a parse tree, which is a structured representation of the text. Here, the grammar encodes the rules for expressing numbers in English.</p>
<h3>Syntactic analysis</h3>
<p>For the syntactic analysis, I based my approach on an excellent <a href="http://stackoverflow.com/questions/70161/">discussion on this topic</a> on Stack Overflow. One of the posters suggested a very simple algorithm to perform the calculation. I found the provided pseudocode a little confusing, so here is my own version:</p>
<pre>total = 0, prior = 0

for each word in sentence

   value = dictionary[word]

   if prior = 0
      prior = value
   else if prior &gt; value
      prior = prior + value
   else
      prior = prior * value

   if value &gt;= 1000 or last word in sentence
      total = total + prior
      prior = 0</pre>
<p>This algorithm works by retaining two variables, <code>prior</code> and <code>total</code>. The <code>prior</code> variable accumulates the value of the group currently being read, i.e. how many millions, thousands or units. When a multiplier word of a thousand or more (or the end of the sentence) closes the group, <code>prior</code> is added to <code>total</code> and reset to zero. The table below shows the algorithm in action for the input string of &#8220;two hundred and seven million thirteen thousand two hundred and ninety eight&#8221;.</p>
<table border="0">
<thead>
<tr>
<th>Word</th>
<th>Value</th>
<th>Prior</th>
<th>Total</th>
</tr>
</thead>
<tbody>
<tr>
<td>-</td>
<td>-</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>two</td>
<td>2</td>
<td>2</td>
<td>0</td>
</tr>
<tr>
<td>hundred</td>
<td>100</td>
<td>200</td>
<td>0</td>
</tr>
<tr>
<td>seven</td>
<td>7</td>
<td>207</td>
<td>0</td>
</tr>
<tr>
<td>million</td>
<td>1,000,000</td>
<td>0</td>
<td>207,000,000</td>
</tr>
<tr>
<td>thirteen</td>
<td>13</td>
<td>13</td>
<td>207,000,000</td>
</tr>
<tr>
<td>thousand</td>
<td>1,000</td>
<td>0</td>
<td>207,013,000</td>
</tr>
<tr>
<td>two</td>
<td>2</td>
<td>2</td>
<td>207,013,000</td>
</tr>
<tr>
<td>hundred</td>
<td>100</td>
<td>200</td>
<td>207,013,000</td>
</tr>
<tr>
<td>ninety</td>
<td>90</td>
<td>290</td>
<td>207,013,000</td>
</tr>
<tr>
<td>eight</td>
<td>8</td>
<td>298</td>
<td>207,013,000</td>
</tr>
<tr>
<td>-</td>
<td>-</td>
<td>0</td>
<td>207,013,298</td>
</tr>
</tbody>
</table>
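<p>To reproduce a trace like this yourself, here is a minimal Perl sketch of the same accumulation loop. It assumes the words have already been converted to their values; the function name <code>combine</code> is only a label for this example. Because it prints after any reset, the last two rows of the table collapse into a single final line.</p>
<pre class="brush: perl;">use strict;

# Accumulate a list of number-word values into a single total,
# printing the state of the algorithm after each value
sub combine {
	my @values = @_;
	my ($prior, $total) = (0, 0);

	foreach my $i (0 .. $#values) {
		my $value = $values[$i];

		if ($prior == 0)           { $prior  = $value; } # start a new group
		elsif ($prior &gt; $value) { $prior += $value; } # e.g. 200 + 7
		else                       { $prior *= $value; } # e.g. 2 * 100

		# Close the group on a thousand/million word or at the end of the input
		if ($value &gt;= 1e3 || $i == $#values) {
			$total += $prior;
			$prior  = 0;
		}

		printf &quot;%-8s prior=%-4d total=%d\n&quot;, $value, $prior, $total;
	}

	return $total;
}

# Values for the example sentence, with the word 'and' already removed
print combine(2, 100, 7, 1e6, 13, 1e3, 2, 100, 90, 8), &quot;\n&quot;;</pre>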
<h3>Lexical analysis</h3>
<p>Lexical analysis involves defining a grammar for the formation of English number words and matching an input string against this grammar. A simplistic approach is to define a dictionary of all possible number words, such as &#8220;two&#8221;, &#8220;hundred&#8221; and &#8220;million&#8221;, and then accept any string that contains only these words. With this approach, the algorithm for syntactic analysis described above will still work; however, the lexical analysis stage will also accept invalid sentences that don&#8217;t mean anything in English, such as &#8220;hundred thirteen seven&#8221;, and feed these into the syntactic analyser with unpredictable results.</p>
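<p>The weakness of a dictionary-only check is easy to demonstrate. The sketch below is illustrative only: the hash <code>%is_number_word</code> and the function <code>looks_like_number</code> are names I made up, and the word list is deliberately incomplete.</p>
<pre class="brush: perl;">use strict;

# Hypothetical, deliberately incomplete dictionary of number words
my %is_number_word = map { $_ =&gt; 1 }
	qw(two seven thirteen ninety hundred thousand million);

# Naive check: accept the sentence if every word is in the dictionary
sub looks_like_number {
	my ($sentence) = @_;
	my @words = split ' ', lc $sentence;
	return 0 unless @words;
	foreach my $word (@words) {
		return 0 unless $is_number_word{$word};
	}
	return 1;
}

foreach my $s ('two hundred seven', 'hundred thirteen seven', 'two hundred cats') {
	printf &quot;%-24s %s\n&quot;, $s, looks_like_number($s) ? 'accepted' : 'rejected';
}</pre>
<p>The second sentence is accepted even though it is not a valid English number, because the check looks only at individual words and never at their order.</p>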
<p>A more intelligent, but more complicated approach is to ensure that the English number words appear in a valid sequence. This can be defined by the grammar. An example of such a grammar can be found in the appendix of a <a href="http://www.eecs.harvard.edu/~shieber/Biblio/Papers/shieber-scansoft-96.pdf">paper by Stuart Shieber</a>. Inspired by this approach, I wrote the following grammar, which matches numbers smaller than one billion (10<sup>9</sup>):</p>
<pre>&lt;number&gt; ::= ((number_1to999 number_1e6)? (number_1to999 number_1e3)? number_1to999?) | number_0

&lt;number_0&gt; ::= "zero"
&lt;number_1to9&gt; ::= "one" | "two" | "three" | "four" | "five" | "six" | "seven" | "eight" | "nine"
&lt;number_10to19&gt; ::= "ten" | "eleven" | "twelve" | "thirteen" | "fourteen" | "fifteen"
	| "sixteen" | "seventeen" | "eighteen" | "nineteen"
&lt;number_1to999&gt; ::= (number_1to9? number_100)? (number_1to9 | number_10to19 | (number_tens number_1to9?))?
&lt;number_tens&gt; ::= "twenty" | "thirty" | "forty" | "fifty" | "sixty" | "seventy" | "eighty" | "ninety"
&lt;number_100&gt; ::= "hundred"
&lt;number_1e3&gt; ::= "thousand"
&lt;number_1e6&gt; ::= "million"</pre>
<p>To visualise what it is doing, the syntax diagram below shows the main parts of the grammar (generated using Franz Braun&#8217;s <a href="http://www-cgi.uni-regensburg.de/~brf09510/syntax.html">CGI diagram generator</a>):</p>
<p><img class="alignnone size-full wp-image-78" title="Number parser syntax diagram" src="http://thedeadbeef.files.wordpress.com/2010/01/syntax_diagram11.gif?w=497" alt="Number parser syntax diagram"   /></p>
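<p>One lightweight way to experiment with a grammar like this, without a parser generator, is to map every word to a one-letter class and check the resulting string against a regular expression that mirrors the rules. This is only a sketch of that idea and not the approach used in the rest of this post; the class letters and the names <code>%class_of</code> and <code>is_valid_number</code> are arbitrary choices of mine.</p>
<pre class="brush: perl;">use strict;

# Map each number word to a one-letter class (the letters are arbitrary labels)
my %class_of = (
	(map { $_ =&gt; 'u' } qw(one two three four five six seven eight nine)),
	(map { $_ =&gt; 't' } qw(ten eleven twelve thirteen fourteen fifteen
		sixteen seventeen eighteen nineteen)),
	(map { $_ =&gt; 'T' } qw(twenty thirty forty fifty sixty seventy eighty ninety)),
	zero =&gt; 'z', hundred =&gt; 'h', thousand =&gt; 'k', million =&gt; 'm',
);

# number_1to999: optional hundreds part, then optional units, teens or tens part
my $below_1000 = qr/(?:u?h)?(?:u|t|Tu?)?/;
# number: optional millions group, optional thousands group, remainder; or zero
my $number = qr/^(?:(?:${below_1000}m)?(?:${below_1000}k)?$below_1000|z)$/;

sub is_valid_number {
	my ($sentence) = @_;
	my $classes = '';
	foreach my $word (split ' ', lc $sentence) {
		next if $word eq 'and';                 # ignore the word 'and'
		return 0 unless exists $class_of{$word}; # not a number word at all
		$classes .= $class_of{$word};
	}
	# Every part of the rule is optional, so an empty sequence is accepted too
	return $classes =~ $number ? 1 : 0;
}

foreach my $s ('two hundred and ninety eight', 'hundred thirteen seven') {
	printf &quot;%-30s %s\n&quot;, $s, is_valid_number($s) ? 'valid' : 'invalid';
}</pre>
<p>With this check, &#8220;hundred thirteen seven&#8221; is reported as invalid, because no rule allows a units word to follow a teens word.</p>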
<h3>Wrapping it all up</h3>
<p>The Perl implementation is shown below. There are a few caveats to this implementation (many of them are identified in the <a href="http://stackoverflow.com/questions/70161/how-to-read-values-from-numbers-written-as-words">Stack Overflow discussion</a>). Because it simply discards the word &#8220;and&#8221; anywhere in the sentence, it doesn&#8217;t distinguish between separate numbers; for example, &#8220;twenty and five&#8221; will be treated as &#8220;twenty five&#8221;. The implementation only recognises numbers up to the millions; if it were extended to billions and above, it would need some method of dealing with <a href="http://en.wikipedia.org/wiki/Long_and_short_scales">short and long scales</a>. Furthermore, it only accepts integers and doesn&#8217;t accept ordinals. It also does not support vernacular forms of numbers, such as &#8220;fifteen hundred&#8221;, &#8220;three-sixty-five&#8221;, &#8220;a hundred&#8221; or &#8220;one point two million&#8221; (these and other nuances of English numerals are described <a href="http://en.wikipedia.org/wiki/Names_of_numbers_in_English#Cardinal_numbers">here</a>).<br />
<pre class="brush: perl;">#!/usr/bin/perl

use strict;
use Parse::RecDescent;

my $sentence = $ARGV[0] || die &quot;Must pass an argument&quot;;

# Define the grammar
$::RD_AUTOACTION = q { [@item] };
my $parser = Parse::RecDescent-&gt;new(q(

	startrule : (number_1to999 number_1e6)(?) (number_1to999 number_1e3)(?) number_1to999(?) | number_0
	number_0: &quot;zero&quot; {0}
	number_1to9: &quot;one&quot; {1} | &quot;two&quot; {2} | &quot;three&quot; {3} | &quot;four&quot; {4} | &quot;five&quot; {5} | &quot;six&quot; {6} | &quot;seven&quot; {7}
		| &quot;eight&quot; {8} | &quot;nine&quot; {9}
	number_10to19: &quot;ten&quot; {10} | &quot;eleven&quot; {11} | &quot;twelve&quot; {12} | &quot;thirteen&quot; {13} | &quot;fourteen&quot; {14}
		| &quot;fifteen&quot; {15} | &quot;sixteen&quot; {16} | &quot;seventeen&quot; {17} | &quot;eighteen&quot; {18} | &quot;nineteen&quot; {19}
	number_1to999: (number_1to9(?) number_100)(?)
		(number_1to9 | number_10to19 | number_10s number_1to9(?))(?)
	number_10s: &quot;twenty&quot; {20} | &quot;thirty&quot; {30} | &quot;forty&quot; {40} | &quot;fifty&quot; {50} | &quot;sixty&quot; {60} |
		&quot;seventy&quot; {70} | &quot;eighty&quot; {80} | &quot;ninety&quot; {90}
	number_100: &quot;hundred&quot; {100}
	number_1e3: &quot;thousand&quot; {1e3}
	number_1e6: &quot;million&quot; {1e6}

));

# Perform lexical analysis
$sentence =~ s/(\W)and(\W)/$1$2/gi; #remove the word &quot;and&quot;
my $parseTree = $parser-&gt;startrule(lc $sentence);

# Perform syntactic analysis
my @numbers = flattenParseTree($parseTree); # flatten the tree to a sequence of numbers
my $number = combineNumberSequence(\@numbers); # process the sequence of numbers to find the total

print $number;

sub flattenParseTree($) {

	my $parseTree = shift || return;
	my @tokens = ();
	if(UNIVERSAL::isa( $parseTree, &quot;ARRAY&quot;)) {
		push(@tokens, flattenParseTree($_)) foreach(@{$parseTree});
	} elsif($parseTree &gt; 0) {
		return $parseTree;
	}
	return @tokens;

}

sub combineNumberSequence($) {

	my $numbers = shift || return;
	my $prior = 0;
	my $total = 0;

	for(my $i=0; $i &lt;= $#$numbers; $i++) {
 		if($prior == 0) {
 			$prior = $numbers-&gt;[$i];
		} elsif($prior &gt; $numbers-&gt;[$i]) {
			$prior += $numbers-&gt;[$i];
		} else {
			$prior *= $numbers-&gt;[$i];
		}

		if(($numbers-&gt;[$i] &gt;= 1e3) || ($i == $#$numbers)) {
			$total += $prior;
			$prior = 0;
		}

	}

	return $total;

}</pre></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.cordiner.net/2010/01/02/parsing-english-numbers-with-perl/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/93fc6f53b58e6130c8c3f279ec355e02?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">acordiner</media:title>
		</media:content>

		<media:content url="http://thedeadbeef.files.wordpress.com/2010/01/syntax_diagram11.gif" medium="image">
			<media:title type="html">Number parser syntax diagram</media:title>
		</media:content>
	</item>
	</channel>
</rss>
