Facial recognition is a hot topic that is poised to impact all of our lives, as businesses and organizations launch new tools powered by this innovative technology.
Recently, NEC announced its plan to use facial recognition to identify and authenticate over 300,000 athletes, volunteers, media personnel, and guests at the upcoming 2020 Tokyo Olympics. Meanwhile, Uber’s Real-Time ID Check uses driver selfies to verify driver identities before approving passenger rides. And, just three days after installing facial recognition software at leading US airports, US immigration authorities intercepted an imposter posing as a French passport holder at the Washington Dulles International Airport.
Facial recognition comes naturally to humans (unless you suffer from prosopagnosia), but how do machines do it? All facial recognition technology follows the same basic procedure: detect a face in an image; normalize the face for lighting, pose, size, and so on; mathematically extract distinctive features; and compare the resulting data to information in a database.
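The final comparison step can be sketched in a few lines of code. This is a minimal illustration, not a production system: it assumes the detection, normalization, and extraction steps have already turned each face into a fixed-length feature vector, which we fake here with random placeholder vectors.

```python
import numpy as np

# Stand-in for steps 1-3 (detect, normalize, extract): a real system runs a
# face detector and feature extractor to get a fixed-length vector per face.
# Here we fabricate deterministic 128-dimensional vectors for illustration.
def extract_features(face_id):
    rng = np.random.default_rng(face_id)
    v = rng.normal(size=128)
    return v / np.linalg.norm(v)  # unit length, so distances are comparable

# Step 4: compare a probe face against an enrolled database.
database = {name: extract_features(i)
            for i, name in enumerate(["alice", "bob", "carol"])}

def identify(probe, database, threshold=0.6):
    # Smaller Euclidean distance means a more similar face; reject the match
    # entirely if even the closest enrolled face is too far away.
    best_name, best_dist = min(
        ((name, np.linalg.norm(probe - feats))
         for name, feats in database.items()),
        key=lambda pair: pair[1])
    return best_name if best_dist < threshold else None

print(identify(extract_features(1), database))  # prints "bob"
```

The threshold is the key tuning knob: lower it and the system rejects more impostors but also more genuine users; raise it and the reverse happens.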
This is very similar to how security services manually match fingerprints. The key difference is that facial recognition is automated, allowing nearly real-time use on a massive scale. A variety of signals, such as still images, video, or body heat, are used to power technologies that include 2D and 3D facial recognition, thermal imaging, and skin texture analysis. In this blog post, we demystify these technologies and explore the facial recognition ecosystem in China.
2D facial recognition identifies distinctive features (e.g., eyes, nose, and cheekbones) from an image of a user’s face. The relative size, position, and shape of these features are then compared to information in a database. This is done by either directly matching features (geometric matching method) or by converting the features into values (photometric approach). Facebook uses 2D facial recognition to tag photos—it suggests friends you might want to tag or photos in which you might want to tag yourself. However, the technology has its limitations: recognition accuracy varies with pose, capture angle, and image lighting.
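The geometric matching method described above can be illustrated with a toy example. The landmark coordinates below are made up for demonstration; real systems detect dozens of landmarks automatically. The core idea is to express landmark positions relative to a facial anchor point and scale, so the signature does not change when the photo is shifted or resized.

```python
import numpy as np

def geometric_signature(landmarks):
    left_eye, right_eye = landmarks["left_eye"], landmarks["right_eye"]
    iod = np.linalg.norm(right_eye - left_eye)  # inter-ocular distance
    anchor = (left_eye + right_eye) / 2         # midpoint between the eyes
    # Express every other landmark relative to the eye midpoint, scaled by
    # the inter-ocular distance, making the signature invariant to image
    # translation and scale.
    return np.concatenate([(landmarks[k] - anchor) / iod
                           for k in ("nose", "mouth_left", "mouth_right")])

# Toy enrolled face: (x, y) landmark positions in pixels.
enrolled = {k: np.array(v, float) for k, v in {
    "left_eye": (100, 100), "right_eye": (160, 100), "nose": (130, 140),
    "mouth_left": (112, 170), "mouth_right": (148, 170)}.items()}

# The same face photographed at twice the scale and shifted 50 px right.
probe = {k: v * 2 + np.array([50.0, 0.0]) for k, v in enrolled.items()}

diff = np.linalg.norm(geometric_signature(probe) - geometric_signature(enrolled))
print(round(diff, 6))  # prints 0.0 -- the normalized signatures coincide
```

Note that this toy signature handles translation and scale but not head rotation, which is exactly the pose sensitivity the paragraph above mentions; real 2D systems add further normalization, and 3D methods sidestep the problem.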
3D facial recognition resolves these restrictions by capturing a facial image in three dimensions. It does this using multiple images taken at various angles or by using structured light (i.e., projecting a known pattern of light) to measure and render a 3D image. Apple has applied this technology in its Face ID feature to create 3D digital images of iPhone users' faces: infrared dots are projected onto the face and picked up by a camera, and the resulting pattern is measured against a stored image to authenticate the user. Apple stands by the security of Face ID—even in the dark.
Skin texture analysis goes even further. It scans small patches of skin, at the level of fine lines, pores, and texture, and translates these into unique mathematical measurements. This makes skin texture facial recognition 20–30% more accurate than 3D facial recognition. It’s even capable of distinguishing between identical twins.
Thermal imaging comes in handy in low-light or nighttime conditions. Powerful sensors capture the body's thermal infrared radiation, which is then converted into a facial image using deep neural networks (also see How do neural networks mimic the human brain?). DJI, a leading drone manufacturer, uses this approach to power its nighttime search and rescue missions.
While many cities, states, and security agencies are only beginning to use facial recognition technology, China has already taken it to the next level. Using an estimated 176 million surveillance cameras (with an additional 450 million expected by 2020), China has fine-tuned facial recognition for domestic security. In one recent experiment, demonstrating the capabilities of Guiyang’s facial recognition system, a BBC reporter was found within the city in just seven minutes.
By 2020, China expects to implement a national social credit system that will calculate a social credit score for each citizen based on a wide range of data, from income tax, utility, and credit bills to criminal records, traffic violations, shopping habits, and social media posts. Once established, social credit ratings can be used to blacklist or approve citizens for travel, insurance, loans, employment, internet access, and more.
The technology powering these endeavors is deep-learning and computer-vision artificial intelligence (AI). In China, SenseTime has established itself as the country's “largest and self-developed AI company.” While other AI companies use popular open-source frameworks, such as Caffe, Torch, or TensorFlow, SenseTime has built its own AI platform, Parrots.
This platform boasts an astounding 1,207 network layers that can simultaneously train on up to 2 billion facial images. According to SenseTime, its facial recognition technology can detect up to 240 key facial characteristics in a millisecond and can recognize over 10 facial attributes, such as gender, age, expression, and facial hair. More impressively, it does all this with a claimed error rate of less than 1 in 10,000,000. Currently valued at $4.5 billion USD, SenseTime claims to have processed 500 million identities for facial recognition purposes.
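It is worth pausing on what a 1-in-10,000,000 error rate means at this scale. The back-of-the-envelope calculation below is our own sketch, assuming the claimed rate applies per comparison and that comparisons are independent; under those assumptions, the expected number of false matches in a one-to-many search grows linearly with the size of the identity database.

```python
# Back-of-the-envelope: expected false matches in one search of a large
# gallery, assuming an independent per-comparison false-accept rate.
far = 1e-7            # claimed error rate: less than 1 in 10,000,000
gallery = 500_000_000  # identities, the figure cited for SenseTime

expected_false_matches = far * gallery
print(expected_false_matches)  # prints 50.0
```

In other words, even an extremely low per-comparison error rate can still produce dozens of expected false matches per search at national scale, which is why deployed systems typically combine the automated match with human review or a second factor.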
Current facial recognition systems are far from perfect, and there are real and valid concerns regarding accuracy, bias, and abuse of the technology that must be addressed. In fact, Microsoft recently published a strong call for public regulation and corporate responsibility around facial recognition. But challenges like these are a rite of passage in the evolution of any new technology. Eventually, accuracy will meet performance expectations, governments will establish protective standards, and millions of end users will benefit from mainstream adoption of facial recognition technology.
The question is not “if,” but “when.”
We would like to sincerely thank George Huang, CHRO, Global Head of Business Development and Managing Partner, AI Industries Fund at SenseTime, for generously donating his time and insights to this blog post.