Monday, June 05, 2006

Gather round kids - it's geek story time

If my PhD were a soccer match, then I think I just kicked a goal. I just found out a paper I submitted to one of the top international robotics conferences (IEEE Intelligent Robots and Systems) has been accepted for oral presentation and publication, which is very pleasing news indeed! Perhaps just as exciting is the location of this conference, Beijing, which should hopefully see me traveling there in October to present my work.

For those that care, my paper is about how to dock a mobile robot (or any autonomous vehicle) with a looming surface using the video feed from a single digital camera. The paper provides a theoretical argument for why it pays to track the focus of expansion in the acquired images when estimating the time-to-contact with the looming surface.

Let me try and explain it (as it's good to practice this sort of thing).

You can think of the focus of expansion as the point in the image that everything seems to expand from as you move towards a large surface. Now, imagine you wanted to dock with that surface so that you stopped as close to it as possible, without actually making contact (such as when parking your car in your garage). That requires decelerating in a highly controlled manner, particularly when you're in close proximity to the surface. The natural approach - and the one most of us use - is to decelerate the vehicle in proportion to its estimated distance from the object. We are very good at doing this, because we have a pair of cameras in our head that provide us with stereo vision, making distance estimation and depth perception very easy (we also have a hell of a lot of cognitive help as well). On a robot with a single camera (or even with two), distance estimation ain't easy! So here's where the research comes in.
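For the extra geeky, here's a toy Python sketch of why the focus of expansion sits where it does. The pinhole camera model and all the numbers are my illustration, not anything from the paper: when the camera moves straight ahead, every point's depth shrinks, so every image point slides radially away from the image centre - and that centre is the focus of expansion.

```python
# Toy pinhole camera: a point (X, Y, Z) projects to (f*X/Z, f*Y/Z).
# Move the camera 1 unit forward (Z shrinks) and watch each image
# point slide radially away from the origin - the focus of expansion.
f = 1.0                               # focal length (arbitrary units)
points = [(1.0, 0.5, 10.0), (-2.0, 1.0, 10.0), (0.5, -1.5, 10.0)]

def project(X, Y, Z):
    return (f * X / Z, f * Y / Z)

for X, Y, Z in points:
    x0, y0 = project(X, Y, Z)         # image position now
    x1, y1 = project(X, Y, Z - 1.0)   # after moving 1 unit forward
    # Each new position is the old one scaled up away from the origin.
    print((x0, y0), "->", (x1, y1))
```

The flow vectors all point away from one spot in the image, which is exactly what makes the focus of expansion findable in the first place.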

The first question is: do we actually need to know our distance to the object to achieve such a docking maneuver anyway? Sure, we need some gauge of our proximity to the surface, but does it actually need to be an absolute measure of distance? Well, the answer (of course) is no, we don't! Biological systems don't need it, so why should robots?

Another approach that gets the job done (inspired by honeybees) is to measure the expansion of the projected image (the divergence) of the object as you approach it. If you measure the divergence generated from one frame to the next as you move a camera at constant speed towards a surface, you will see that as the surface looms closer, the divergence increases (i.e. things appear to get bigger, faster). Now, think about what would happen if you tried to keep this rate of expansion constant as you approached the surface. Remember, from one frame to the next, the divergence will increase if your speed is constant. Therefore, the only way to maintain a constant divergence (apart from just stopping straight away) is to slow down! Of course, as you continue towards the surface at your slower speed, the divergence will again begin to increase, but that's ok - you just make another adjustment and slow down some more. The point is that if you keep doing this (what we call a control law), you will eventually come to a complete halt and, in theory at least, be pretty darn close to the surface you are docking with.
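If you like, here's a tiny Python simulation of that idea. The loop rate and target divergence are just numbers I picked for illustration: divergence for motion straight at a wall is roughly speed divided by distance (the inverse of time-to-contact), so holding it at a constant value forces speed to shrink in proportion to distance.

```python
# A minimal sketch of the constant-divergence control law
# (parameter values are mine, not the paper's).
dt = 0.05        # control-loop period (seconds)
k = 0.5          # target divergence (1/s), i.e. speed/distance held at k
d = 5.0          # starting distance to the surface (metres)

for _ in range(400):
    v = k * d            # command a speed so that v / d stays equal to k
    d -= v * dt          # move forward for one time step
print(d)                 # the robot has crept to within a hair of the wall
```

Each pass through the loop the robot slows down a little, so it never slams into the wall - it just oozes to a stop right in front of it, exactly the behaviour described above.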

Simple isn't it!?

Well, of course, it's not quite that simple. It's that "in theory" bit that never quite translates to real world conditions. My paper, in a nutshell, is focused on how to make the above strategy work reliably under real world conditions. I came up with a strategy that makes it work a whole lot better by accounting for the small bumps and rotations that occur as you approach. You can do this, as it turns out, by tracking the focus of expansion in each frame as it comes in. The images acquired will be subject to noisy bumps and rotations as the robot proceeds forward, so the focus of expansion will shift from one frame to the next. The thing is, the focus of expansion represents the point on the surface you are predominantly heading towards, and so it makes a lot of sense to measure your time-to-contact at that point in the image. I guess tracking the focus of expansion is somewhat analogous to moving your eyes to fixate on an object, even though your head is subject to shocks and bumps as you move through your environment. OK, this is probably too much detail now, but hopefully you get the gist.
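And for the truly dedicated, here's a toy Python sketch of one way you *could* locate the focus of expansion from noisy flow vectors - a plain least-squares fit I'm using purely for illustration, not the actual method in the paper. For pure forward motion each flow vector points away from the focus of expansion, so each one defines a line the focus must lie on, and we can intersect those lines in a least-squares sense.

```python
# Toy focus-of-expansion estimation from noisy optic flow
# (my own illustrative formulation, not the paper's method).
import random

random.seed(0)
true_foe = (0.2, -0.1)   # where the camera is actually heading (image coords)

# Synthesise noisy flow vectors radiating from the true focus of expansion.
samples = []
for _ in range(50):
    x, y = random.uniform(-1, 1), random.uniform(-1, 1)
    u = (x - true_foe[0]) + random.gauss(0, 0.01)
    v = (y - true_foe[1]) + random.gauss(0, 0.01)
    samples.append((x, y, u, v))

# Each flow vector (u, v) at (x, y) gives one linear constraint on the
# focus (fx, fy):  -v*fx + u*fy = -v*x + u*y.  Accumulate the 2x2 normal
# equations and solve them with Cramer's rule.
a11 = a12 = a22 = b1 = b2 = 0.0
for x, y, u, v in samples:
    a, b, c = -v, u, -v * x + u * y
    a11 += a * a; a12 += a * b; a22 += b * b
    b1 += a * c;  b2 += b * c
det = a11 * a22 - a12 * a12
fx = (b1 * a22 - b2 * a12) / det
fy = (a11 * b2 - a12 * b1) / det
print((fx, fy))          # lands close to the true focus of expansion
```

In the paper's setting the flow is far messier (those bumps and rotations), which is exactly why tracking the focus of expansion frame by frame matters.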

My paper is mostly a lot of maths, and a few graphs - so I don't expect to see it on Oprah's book club list.

I do have this video (WMV) though, that shows how the basic strategy works for a mobile robot (though I admit, not particularly real world conditions).

If you made it this far down, I applaud (and thank) you!

Want to read more about my research interests?


Anonymous mum said...

Hi Chris Congratulations on the paper you're becoming quite the jet setter travelling the globe to present your papers...well done mum

6/06/2006 09:43:00 AM

Blogger Frank and Sue said...

Well Chris, made it to the end without my brain exploding too much from having to concentrate soooo long......very interesting and fascinating work. Also enjoyed the video BUT could not quite understand the subtle differences between them. Was it the rate of deceleration or closeness to the wall you were trying to demonstrate? (or both)
The guy I share my office with commented on the Lexus reverse parking technology they are talking about offering in 2008--- how is this different to what you are achieving? Cheers and beers

6/06/2006 09:52:00 AM

Blogger macca said...

thanks Mum.

6/06/2006 10:13:00 AM

Blogger macca said...

thanks, and well done Frank! glad your head didn't explode.

I wouldn't pay any attention to the different runs in the video. The video was actually demonstrating something a bit different (comparing methods for estimating the visual motion from the camera), but all were using the basic strategy I describe here. For the record, there are no amazing differences between each of the runs, which is kind of nice because it means the control law I am using to decelerate the robot works for a variety of techniques that estimate the visual motion.

I am not really aware of what Lexus are planning to incorporate into their cars, but there is certainly a push towards equipping cars with digital cameras to assist with things like parking, and obstacle/pedestrian detection. As prices come down, the automobile industry is a growing marketplace for computer vision and robotics technology.

I should say that driver assistance technology is pretty peripheral to my project. I am very much about autonomous robot navigation using biologically inspired ideas (particularly bees).

6/06/2006 10:26:00 AM

