ANN ARBOR—Contrary to what you might see in police dramas, you don't have to be Jason Bourne to shake off a computer tracking you through a video feed. Cross paths with someone who vaguely resembles you, and the computer is likely to swap your labels.
But researchers at the University of Michigan have found a way to improve a computer's human-tracking accuracy by more than 30 percent by looking not only at where the targets are going, but also at what they're doing.
"By creating software that understands which activity a person or a group of people is performing, we can obtain much more robust and stable tracking results," says Silvio Savarese, professor of electrical engineering and computer science. "This is a new way of solving the tracking problem and can potentially revolutionize the way researchers look at the tracking problem in general."
Savarese and doctoral student Wongun Choi conducted the research.
To say anything useful about what appears in its video feed, a computer must be able to track the people and objects moving through the scene. Computers can already identify and track people who are standing or walking, but camera movement and obstacles that temporarily hide targets can throw them off. To make tracking more reliable, Savarese's team taught the software to recognize interactions such as people walking together, standing in a line or crossing a street.
An individual's motions give information about his or her interactions, and those interactions help predict the individual's future behavior. For instance, when two people appear to be walking and talking together, the computer can connect their tracks. If the pair then passes behind an obstacle along with a third person, the computer can predict that when all three reappear, the two in conversation will probably still be together.
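As a rough illustration of that idea (a toy sketch, not the researchers' software), the example below matches tracks to the detections that reappear after an obstacle. The function name `assign_after_occlusion`, the cost terms and the `weight` parameter are all assumptions made for this example: each candidate matching is scored by how far people are from where they were expected, plus how far apart any pair that had been walking together ends up.

```python
from itertools import permutations
from math import dist

def assign_after_occlusion(predicted, detections, together, weight=1.0):
    """Match each track to one detection; result[i] is the detection index for track i.

    predicted  -- expected (x, y) position of each track after the obstacle
    detections -- observed (x, y) positions, same length as predicted
    together   -- pairs of track indices that were walking together (hypothetical input)
    """
    def cost(perm):
        # Motion term: distance between each track's expected and observed position.
        motion = sum(dist(predicted[i], detections[j]) for i, j in enumerate(perm))
        # Interaction term: companions are expected to reappear near each other.
        social = sum(dist(detections[perm[a]], detections[perm[b]]) for a, b in together)
        return motion + weight * social
    # Exhaustive search over all matchings; only practical for a handful of people.
    return min(permutations(range(len(detections))), key=cost)

# Tracks 0 and 1 were walking together; track 2 is a stranger.
predicted = [(0.0, 0.0), (2.0, 1.0), (2.0, -1.0)]
detections = [(3.0, 0.0), (0.0, 0.0), (1.0, 0.0)]
match = assign_after_occlusion(predicted, detections, together=[(0, 1)])
```

In this made-up scene, tracks 1 and 2 are equally far from the two remaining detections, so position alone cannot tell them apart. The interaction term settles it by keeping track 1 next to its companion.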
If such software tried to follow its targets' behavior by "brute force," that is, by considering every possible interaction that could occur and deciding which is most likely, Savarese says it could take years to produce a tracking solution. Ideally, for applications like vehicle collision prevention, such software should run in real time.
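To get a feel for why exhaustive search scales so badly, the sketch below counts hypotheses under a simplified model that is an assumption for this example, not the researchers' formulation: in a single frame, every track can be matched to any detection, and every pair of people is given one of a fixed number of interaction labels. The function name `hypotheses_per_frame` is hypothetical.

```python
from math import comb, factorial

def hypotheses_per_frame(num_people, num_interaction_labels):
    """Count joint hypotheses for one frame under the simplified model above."""
    # Every way of matching tracks to detections, one-to-one.
    assignments = factorial(num_people)
    # Every way of labeling each pair of people with an interaction type.
    pair_labelings = num_interaction_labels ** comb(num_people, 2)
    return assignments * pair_labelings

for people in (2, 4, 8, 12):
    print(people, hypotheses_per_frame(people, num_interaction_labels=3))
```

Even in this single-frame model the count grows faster than exponentially with the number of people, and a real video multiplies the problem across many frames.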
To speed things up, the team taught their software to think more like a human. They fed it example videos with targets and behaviors labeled. That way, the computer could capitalize on the analytical strength of our brains: recognizing patterns based on previous experience.
"Our method reduces the computational complexity and makes it possible to solve the problem of inferring what a person will do based on their activities as an individual, their interactions with other individuals and their behavior in larger groups," Savarese says.
His research team will continue to speed up the process, and he reports that a simplified version of the tracking software is approaching real-time operation.
The problem of tracking targets is significant in many fields, from robot vision to observing animal herds in the wild and picking out suspicious activity in a crowd. The application Savarese has in mind would aid drivers, keeping an eye on the pedestrians and sounding an alarm or braking if one of them is about to take an unexpected step into the street.
The research was presented this week at the European Conference on Computer Vision 2012 in Florence, Italy. It was supported by the Office of Naval Research and Toyota Motor Corp.
- Silvio Savarese: http://web.eecs.umich.edu/~silvio