You are here
Improving face image extraction by using deep learning technique.
The National Library of Medicine (NLM) has made a collection of over a 1.2 million research articles containing 3.2 million figure images searchable using the Open-i SM multimodal (text+image) search engine. Many images are visible light photographs, some of which are images containing faces (“face images”). Some of these face images are acquired in unconstrained settings, while others are studio photos. To extract the face regions in the images, we first applied one of the most widely-used face detectors, a pre-trained Viola-Jones detector implemented in Matlab and OpenCV. The Viola-Jones detector was trained for unconstrained face image detection, but the results for the NLM database included many false positives, which resulted in a very low precision. To improve this performance, we applied a deep learning technique, which reduced the number of false positives and as a result, the detection precision was improved significantly. (For example, the classification accuracy for identifying whether the face regions output by this Viola-Jones detector are true positives or not in a test set is about 96%.) By combining these two techniques (Viola-Jones and deep learning) we were able to increase the system precision considerably, while avoiding the need to manually construct a large training set by manual delineation of the face regions.