robot vision land
After a few months of working for real money, I’m back (probably) to cooling my heels and puttering around with my metal friends.
A recent job involved writing “face recognition” s/w. I.e. not just a matter of locating a human face inside an image (not a difficult task these days with stuff like OpenCV around), but picking, out of the 1000+ people it *could* be, the one it most likely is.
It turned out that task was not too difficult, either. There is a lot of information in the placement and size (all as detected by my simple s/w) of just the eyes, nose and mouth. So much, in fact, that it seems the “1000+” can easily be extended to “100,000” with no significant loss in reliability.
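To make the "placement and size" idea concrete, here is a rough sketch of how such a matcher could look. It is not the s/w from the job, just one plausible reading of it: the landmark names, the `(cx, cy, w, h)` box format, and the functions `face_features` and `identify` are all made up for illustration, and it assumes some detector has already found the eyes, nose and mouth. Each part's position and size is expressed relative to the eye midpoint and divided by the distance between the eyes, so the numbers do not change when the face is nearer or further from the camera. Recognition is then just a nearest-neighbour lookup.

```python
import math

# Hypothetical landmark names; each maps to a box (centre_x, centre_y, width, height)
PARTS = ("left_eye", "right_eye", "nose", "mouth")


def face_features(landmarks):
    """Turn detected part boxes into a scale- and shift-independent vector.

    Assumption: the face is roughly upright, so no rotation correction is done.
    """
    lx, ly = landmarks["left_eye"][:2]
    rx, ry = landmarks["right_eye"][:2]
    scale = math.hypot(rx - lx, ry - ly)  # inter-eye distance
    if scale == 0:
        raise ValueError("eye centres coincide, cannot normalise")
    ox, oy = (lx + rx) / 2, (ly + ry) / 2  # midpoint between the eyes
    feats = []
    for name in PARTS:
        cx, cy, w, h = landmarks[name]
        feats += [(cx - ox) / scale, (cy - oy) / scale, w / scale, h / scale]
    return feats


def identify(probe, gallery):
    """Return (name, distance) of the gallery entry closest to the probe.

    gallery maps a person's name to a feature vector from face_features().
    """
    return min(
        ((name, math.dist(probe, feats)) for name, feats in gallery.items()),
        key=lambda pair: pair[1],
    )
```

Usage would be along the lines of building `gallery = {name: face_features(boxes)}` once per known person, then calling `identify(face_features(new_boxes), gallery)` for each detected face. A linear scan like this is fine for a thousand people; a much larger gallery would probably want some kind of index, and a distance threshold for "nobody I know".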
I’ve been tinkering lately with getting the face recog into an embedded device or 2, since we all want to be able to search our DVDs for faces of Dick Cheney, or an image with Angelina and Brad in the same frame, or maybe ID those people on that talk show.
Some examples of my recent musings are on my YouTube pages, but here’s a brief list:
http://www.youtube.com/watch?v=rtIUWYdVoZI&feature=channel_page
http://www.youtube.com/watch?v=yutPC27ovLs&feature=channel_page
http://www.youtube.com/watch?v=YhYYpZclBww&feature=channel_page
The clips are mostly experiments with RECOGNISING faces, not with tracking them. So things jerk around quite a bit, and the recognition is done on each vid frame separately and is not “smoothed” out. E.g. it can change its mind from frame to frame, rather than understanding that if X was the guy in the image 1/25th sec ago, and there seems to have been no cutaway to a new scene, then the guy close to that position this time is probably X again.
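The missing "smoothing" could be sketched roughly as below. This is only an illustration of the reasoning above, not something the clips do: the class name `IdentitySmoother`, the window of 25 frames (about a second at 25 fps), the `max_jump` distance in pixels and the `scene_cut` flag are all assumptions, and detecting a cutaway is left to some other piece of code. The idea is a majority vote over the last few per-frame guesses, with the history thrown away whenever the scene changes or the face jumps too far to plausibly be the same person.

```python
from collections import Counter, deque


class IdentitySmoother:
    """Majority vote over recent per-frame labels for one face position."""

    def __init__(self, window=25, max_jump=40.0):
        self.max_jump = max_jump              # assumed pixel threshold
        self.history = deque(maxlen=window)   # recent raw labels
        self.last_pos = None                  # face centre in previous frame

    def update(self, label, pos, scene_cut=False):
        """Feed one frame's raw guess; return the smoothed label."""
        jumped = False
        if self.last_pos is not None:
            dx = pos[0] - self.last_pos[0]
            dy = pos[1] - self.last_pos[1]
            jumped = (dx * dx + dy * dy) ** 0.5 > self.max_jump
        if scene_cut or jumped:
            # New scene or a different face: old votes no longer apply
            self.history.clear()
        self.history.append(label)
        self.last_pos = pos
        # Most frequent label in the window wins
        return Counter(self.history).most_common(1)[0][0]
```

Fed the raw guesses X, X, Y, X for a face that barely moves, this keeps answering X throughout, and it only switches once a cut is flagged or the position jumps. Tracking several faces at once would need one smoother per face plus some way of pairing faces between frames, which this sketch does not attempt.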