
Rapid Recognition of Faces from Video

An innovative key frame analysis algorithm tracks faces across video frames and automatically selects the frame with the best face quality, based on carefully selected metrics, for use in face recognition.

Recently, the need for rapid recognition of faces in video has become increasingly important for national and industrial security. Facial recognition in video requires processing huge volumes of video using video analytics. Surveillance and security cameras typically produce streams at 30 frames per second (FPS), so real-time face tracking on live video or on a large video repository already poses considerable computational challenges. Adding face recognition on top of tracking increases the load further, and it often requires the computing power of a supercomputer or a cloud computing platform.

To address these challenges, CITeR researchers have developed an innovative key frame analysis algorithm. It tracks faces across frames and automatically selects the frames with the best face quality based on carefully selected metrics. The researchers optimized the algorithm around a state-of-the-art graphics processing unit (GPU) as the hardware computing engine. GPUs were traditionally used to drive computer displays, but they are better suited to parallel computation than traditional central processing units (CPUs). The team takes advantage of this capability to accelerate face detection and image quality analysis and so meet the increased computational requirements. The team has demonstrated processing speeds of over 100 FPS, well above the real-time video frame rate of 30 FPS. With a high-end GPU, the approach can be used for offline processing of large video repositories. It can also be used for online processing with a mobile GPU co-located with the video camera, which reduces the need to stream the entire video sequence to a back-end server for processing.
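The selection step can be illustrated with a minimal sketch. The article does not say which quality metrics or weights the researchers use, so the metric names (sharpness, frontality, size), the weights, and every function and class name below are hypothetical. The sketch only shows the general idea of keeping, for each tracked face, the frame with the highest face quality score.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class FaceObservation:
    """One detected face in one frame (all fields are illustrative)."""
    track_id: int      # identity assigned by a face tracker
    frame_index: int   # position of the frame in the video
    sharpness: float   # hypothetical metric in [0, 1]
    frontality: float  # hypothetical metric in [0, 1]; 1.0 = fully frontal
    size: float        # hypothetical metric in [0, 1]; relative face size


# Hypothetical weights; the source does not state the actual metrics.
WEIGHTS = {"sharpness": 0.5, "frontality": 0.3, "size": 0.2}


def quality(obs: FaceObservation) -> float:
    """Combine the per-face metrics into one score (weighted sum)."""
    return (WEIGHTS["sharpness"] * obs.sharpness
            + WEIGHTS["frontality"] * obs.frontality
            + WEIGHTS["size"] * obs.size)


def select_key_frames(
    observations: Iterable[FaceObservation],
) -> Dict[int, FaceObservation]:
    """Return the highest-quality observation for each tracked face."""
    best: Dict[int, FaceObservation] = {}
    for obs in observations:
        current = best.get(obs.track_id)
        if current is None or quality(obs) > quality(current):
            best[obs.track_id] = obs
    return best


if __name__ == "__main__":
    observations = [
        FaceObservation(1, 0, 0.2, 0.9, 0.5),
        FaceObservation(1, 1, 0.8, 0.9, 0.5),
        FaceObservation(2, 1, 0.6, 0.4, 0.3),
        FaceObservation(1, 2, 0.7, 0.5, 0.6),
        FaceObservation(2, 2, 0.9, 0.8, 0.4),
    ]
    for track_id, obs in sorted(select_key_frames(observations).items()):
        print(f"track {track_id}: key frame {obs.frame_index}")
```

Only the selected key frames would then be passed to a face matcher, so the expensive recognition step runs once per tracked face rather than once per frame. In a real system the metric values would come from GPU-accelerated detection and image quality analysis, which this sketch does not attempt to reproduce.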

The quality of the face captured from the video stream has a large impact on face recognition accuracy. There has been extensive research on face matching algorithms based on still images. This research instead addresses how to extract the best-quality face from video streams in order to improve the overall performance of a face-in-video recognition system.

There are many applications in which real-time processing of faces in video is needed. For example, face analysis in videos can be useful for solving crimes that involve video evidence. Airport terminals and railway stations need real-time face detection and recognition to identify subjects of interest in a timely manner. Law enforcement agencies need to scan huge repositories rapidly for usable face images, whether for crime prevention or evidence gathering. Industry and government affiliates are also interested in this technology for use in their operations. This project has been selected to be permanently on display as an active demonstration at FBI Headquarters.

Economic Impact:

For face-in-video recognition, the traditional approach is human-based: a human examiner goes through video streams looking for subjects of interest. This is the most labor-intensive method and, unfortunately, it is also very error-prone. A second approach is to process the video frame by frame using computers. This can be suitable for offline processing, but processing speed becomes a problem for large video repositories. A third approach is to sample the video stream and process only the selected frames, which reduces the computational demand relative to the second approach. There is, however, no guarantee that a sampled frame gives the best results for identification, because the sampling is commonly based on frame quality rather than face quality, which is what matters most for face recognition. The new key frame analysis algorithm and GPU-accelerated technology avoid these drawbacks. The automated process frees human examiners from the tedious work of going through the video streams. This lets them focus on more important tasks, such as connecting the dots in a crime or using the information to locate suspects more quickly. All of this can decrease costs and make better use of human capabilities.

The sensor system consists of a surveillance camera and a mobile GPU. Such units can be widely dispersed to perform real-time face recognition in a distributed manner. For back-end processing of large video repositories, high-performance GPU-based platforms (a computer with multiple GPU cards or a set of GPU-based computers) let researchers process large numbers of video streams at high speed. This greatly improves the system’s capability to identify suspects efficiently and effectively. It also significantly reduces the labor cost associated with face recognition. The bottom line is that, for the same normalized computational capacity, GPUs are more economical than traditional CPU-based technologies.
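To give a rough sense of what these rates mean for repository processing, the sketch below works through the arithmetic using the two figures stated in this article: 30 FPS video and a demonstrated processing speed of over 100 FPS. The function names are hypothetical, 100 FPS is treated as a lower-bound estimate, and the assumption that throughput scales linearly with the number of GPUs is ours, not a result reported by the researchers.

```python
def frames_in_video(duration_hours: float, video_fps: float = 30.0) -> float:
    """Total number of frames in a recording of the given length."""
    return duration_hours * 3600.0 * video_fps


def processing_hours(
    duration_hours: float,
    video_fps: float = 30.0,
    processing_fps: float = 100.0,
    num_gpus: int = 1,
) -> float:
    """Estimated hours needed to process a recording.

    Assumes (hypothetically) that throughput scales linearly with the
    number of GPUs; real systems will lose some efficiency to I/O.
    """
    if processing_fps <= 0 or num_gpus < 1:
        raise ValueError("processing_fps must be > 0 and num_gpus >= 1")
    total_frames = frames_in_video(duration_hours, video_fps)
    return total_frames / (processing_fps * num_gpus) / 3600.0


def realtime_streams(video_fps: float = 30.0,
                     processing_fps: float = 100.0) -> int:
    """Whole number of live streams one GPU could keep up with."""
    return int(processing_fps // video_fps)


if __name__ == "__main__":
    print(f"24 h of video, 1 GPU:  {processing_hours(24):.1f} h")
    print(f"24 h of video, 4 GPUs: {processing_hours(24, num_gpus=4):.1f} h")
    print(f"Live 30 FPS streams per GPU: {realtime_streams()}")
```

Under these assumptions, a day of 30 FPS footage takes well under a day to scan on a single GPU, and one GPU has headroom for about three live streams. These numbers are estimates derived from the stated frame rates, not measured results.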

This would transform the current surveillance system from a passive system whose only purpose is recording video streams into an active system that can detect suspects in real time, putting face recognition at practitioners’ fingertips. Finally, when deploying this technology, care should be taken to protect the privacy of the general public.

For more information, contact Chen Liu at Clarkson University, cliu@clarkson.edu, Bio http://www.clarkson.edu/ece/faculty_staff/faculty/liu.html, 315.268.2090.

CITeR-2016.pdf