Computer program NEIL runs 24/7 to create visual database
Though common sense can sometimes seem rare in the human population, it is almost never associated with computers. Carnegie Mellon researchers, however, are currently running a program that allows a computer to label images and acquire common sense almost entirely on its own. The program, called the Never Ending Image Learner (NEIL), runs constantly on Carnegie Mellon’s campus and analyzes thousands of images daily. The researchers aim to create the world’s largest visual database and ultimately improve computer vision.
The research team is led by Abhinav Gupta, an assistant research professor in the Robotics Institute, and includes Xinlei Chen, a Ph.D. student in Carnegie Mellon’s Language Technologies Institute, and Abhinav Shrivastava, a Ph.D. student in robotics.
Their work builds on existing efforts to assemble visual knowledge bases, such as ImageNet and Visipedia. The limitation of these projects, the researchers explain, is that they rely too heavily on human labeling to keep pace with the vast amount of visual data available on the Internet. Because NEIL can operate largely on its own, building visual knowledge bases becomes far more feasible.
Having a computer learn visually is far superior to having it learn from text alone. Purely analyzing text references, for example, can lead to incorrect assumptions about objects that would be obvious to anyone who can see them. In this way, NEIL learns in a similar fashion to humans. Babies make visual connections long before they can read; NEIL works on the same basic principle.
Since the project began in July, NEIL has analyzed over 5 million pictures. By looking through all of these images, NEIL makes common-sense connections. For example, after examining thousands of pictures of wheels and cars, NEIL can conclude, in its own language, that “Car” can have a part called a “Wheel.” The program has made over 3,000 of these connections and categorized approximately 500,000 objects.
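Relationships like the Car/Wheel example above are naturally represented as (subject, relation, object) triples. The minimal sketch below shows one way such a knowledge base could be stored and queried; the class and relation names are illustrative assumptions, not NEIL’s actual internal representation.

```python
# Hypothetical sketch of a triple-based knowledge base, NOT NEIL's real code.
class KnowledgeBase:
    def __init__(self):
        self.triples = set()

    def add(self, subject, relation, obj):
        """Record a discovered relationship, e.g. ("Car", "has_part", "Wheel")."""
        self.triples.add((subject, relation, obj))

    def query(self, subject=None, relation=None, obj=None):
        """Return all stored triples matching the fields that were given."""
        return [t for t in self.triples
                if (subject is None or t[0] == subject)
                and (relation is None or t[1] == relation)
                and (obj is None or t[2] == obj)]

kb = KnowledgeBase()
kb.add("Car", "has_part", "Wheel")
kb.add("Corolla", "is_a", "Car")
print(kb.query(subject="Car"))  # [('Car', 'has_part', 'Wheel')]
```

A flat set of triples like this also makes it easy to count connections, which is how a figure such as “over 3,000 relationships” could be tallied.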
The technical approach behind NEIL is complex. The researchers started with thousands of “seed images” gathered from Google Images in order to train NEIL to recognize patterns. They then used a clustering approach to build specific models that NEIL can use for future searches, and extended these models to develop relationships between objects. Each time NEIL identifies a new object or relationship, it adds it to its body of knowledge and becomes better at making visual connections.
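The iterative idea described above resembles what machine-learning researchers call self-training: a model trained on a few seed examples labels new data, and the most confident new labels are folded back into the training set. The toy sketch below illustrates that loop with made-up two-dimensional “features” and a nearest-centroid model; all names and data are assumptions, not NEIL’s actual pipeline.

```python
# Toy self-training loop, loosely inspired by the bootstrapping idea
# described in the article. Features and labels are invented for illustration.
import math

def centroid(points):
    """Mean of a list of equal-length feature vectors."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def bootstrap(seeds, unlabeled, rounds=3):
    """seeds: {label: [feature_vectors]}; grows each class from unlabeled data."""
    labeled = {k: list(v) for k, v in seeds.items()}
    pool = list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        # Retrain: one centroid "model" per label from everything labeled so far.
        models = {k: centroid(v) for k, v in labeled.items()}
        # Label only the single most confident example (closest to any centroid).
        best = min(pool, key=lambda p: min(math.dist(p, m) for m in models.values()))
        label = min(models, key=lambda k: math.dist(best, models[k]))
        labeled[label].append(best)
        pool.remove(best)
    return labeled

seeds = {"car": [(0.9, 0.1)], "wheel": [(0.1, 0.9)]}
unlabeled = [(0.8, 0.2), (0.2, 0.8), (0.85, 0.15)]
result = bootstrap(seeds, unlabeled)
print({k: len(v) for k, v in result.items()})  # {'car': 3, 'wheel': 2}
```

Taking only the most confident example each round is what keeps a loop like this from drifting; an aggressive version that labels everything at once would compound its own mistakes.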
“What we have learned in the last 5-10 years of computer vision research is that the more data you have, the better computer vision becomes,” Gupta said in a university press release.
The public can follow NEIL in real time through the website www.neil-kb.com. The website categorizes NEIL’s knowledge base into objects such as “1950s_car,” scenes such as “Alaska,” and attributes such as “brown.” As NEIL’s analysis becomes increasingly exhaustive, it will further develop the capability to sub-categorize objects and to identify relationships that go beyond simple object-to-object connections. Visitors to the site can also submit a phrase for NEIL to look up.
The program isn’t perfect, of course. NEIL can make mistakes during some searches, especially with homonyms. For example, the researchers mentioned that “Pink” could pose a problem, as NEIL might be unsure whether the term refers to the pop star or the color. Additionally, because NEIL runs all day, it is computationally intensive, requiring more than 200 processing cores spread across two computer clusters.
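One conceivable way for a system to notice a homonym like “Pink” is to check whether the images retrieved for a term split into visually distinct groups. The sketch below clusters made-up one-dimensional “redness” scores with a toy two-means routine; the feature, the data, and the approach are all illustrative assumptions, not NEIL’s actual disambiguation method.

```python
# Toy example: split images retrieved for "pink" into two visual senses.
# The "redness" feature values are invented for illustration.
def two_means(values, iterations=10):
    """Partition a list of numbers into two clusters by simple 2-means."""
    lo, hi = min(values), max(values)
    for _ in range(iterations):
        a = [v for v in values if abs(v - lo) <= abs(v - hi)]
        b = [v for v in values if abs(v - lo) > abs(v - hi)]
        lo, hi = sum(a) / len(a), sum(b) / len(b)
    return a, b

# Hypothetical scores: color-swatch images rate high on "redness",
# photos of the singer rate low.
redness = [0.95, 0.9, 0.92, 0.2, 0.15, 0.25]
person_sense, color_sense = two_means(redness)
print(len(person_sense), len(color_sense))  # 3 3
```

Finding two well-separated clusters under one label would at least flag the term as ambiguous, even if naming the two senses still required outside knowledge.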
Moving forward, the team is excited about NEIL’s potential contributions to the fields of scene classification and object classification. They will travel to Sydney, Australia, in December to present their current findings at the Institute of Electrical and Electronics Engineers (IEEE) International Conference on Computer Vision.