How should you train your AI system? This question is pertinent, because many deep learning systems are still black boxes. Computer scientists from the Netherlands and Spain have now determined how a deep learning system well suited for image recognition learns to recognize its surroundings. They were able to simplify the learning process by forcing the system’s focus toward secondary characteristics.
Convolutional Neural Networks (CNNs) are a form of bio-inspired deep learning in artificial intelligence. The interaction of thousands of ‘neurons’ mimics the way our brain learns to recognize images. ‘These CNNs are successful, but we don’t fully understand how they work’, says Estefanía Talavera Martinez, lecturer and researcher at the Bernoulli Institute for Mathematics, Computer Science and Artificial Intelligence of the University of Groningen in the Netherlands.
She has made use of CNNs herself to analyse images made by wearable cameras in the study of human behaviour. Among other things, Talavera Martinez has been studying our interactions with food, so she wanted the system to recognize the different settings in which people encounter food. ‘I noticed that the system made errors in the classification of some pictures and needed to know why this happened.’
By using heat maps, she analysed which parts of the images were used by the CNNs to identify the setting. ‘This led to the hypothesis that the system was not looking at enough details’, she explains. For example, if an AI system has taught itself to use mugs to identify a kitchen, it will wrongly classify living rooms, offices and other places where mugs are used as kitchens. Talavera Martinez developed a solution together with her colleagues David Morales (Andalusian Research Institute in Data Science and Computational Intelligence, University of Granada) and Beatriz Remeseiro (Department of Computer Science, Universidad de Oviedo), both in Spain: distract the system from its primary targets.
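To make the idea of a heat map concrete, here is a minimal sketch in plain Python. It uses occlusion sensitivity: hide one patch of the image at a time and record how much the classifier's score drops. This is only one simple way to build such a map, and the article does not say which technique the researchers used. The function names, the patch size and the toy scoring function standing in for a trained CNN are all assumptions made for illustration.

```python
from typing import Callable, List

Image = List[List[float]]  # greyscale image as rows of pixel values


def occlusion_heat_map(
    image: Image,
    score: Callable[[Image], float],
    patch: int = 2,
    fill: float = 0.0,
) -> Image:
    """Return a map where each pixel holds the score drop caused by
    hiding the patch that covers it (larger drop = more important)."""
    height, width = len(image), len(image[0])
    baseline = score(image)
    heat = [[0.0] * width for _ in range(height)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            rows = range(top, min(top + patch, height))
            cols = range(left, min(left + patch, width))
            # Copy the image and blank out one patch.
            occluded = [row[:] for row in image]
            for y in rows:
                for x in cols:
                    occluded[y][x] = fill
            drop = baseline - score(occluded)
            for y in rows:
                for x in cols:
                    heat[y][x] = drop
    return heat


def toy_kitchen_score(image: Image) -> float:
    """Hypothetical stand-in for a CNN that has learned to look only
    at the top-left 2x2 corner (the 'mug') to decide on 'kitchen'."""
    return sum(image[y][x] for y in range(2) for x in range(2)) / 4.0


image = [[1.0] * 4 for _ in range(4)]
heat = occlusion_heat_map(image, toy_kitchen_score)
for row in heat:
    print(row)
```

In this toy case only the top-left corner lights up, which is the kind of pattern that would suggest a classifier is relying on a single object rather than on the scene as a whole.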
They trained CNNs using a standard image set of planes or cars and identified through heat maps which parts of the images were used for classification. These parts were then blurred in the image set, which was used for a second round of training. ‘This forces the system to look elsewhere for identifiers. And by using this extra information, it becomes more fine-grained in its classification.’
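The blurring step described here can be sketched in a few lines of plain Python. The sketch assumes a heat map is already available with the same dimensions as the image, and it replaces every pixel in the hottest regions with the mean of its 3x3 neighbourhood. The function name, the relative threshold and the choice of a simple box blur are illustrative assumptions; the article does not specify how the regions were selected or which blur was applied.

```python
from typing import List

Image = List[List[float]]  # greyscale image as rows of pixel values


def blur_salient_regions(
    image: Image, heat: Image, threshold: float = 0.5
) -> Image:
    """Return a copy of `image` in which pixels whose heat is at least
    `threshold` times the peak heat are replaced by a 3x3 box-blur value."""
    height, width = len(image), len(image[0])
    peak = max(max(row) for row in heat)
    blurred = [row[:] for row in image]
    if peak <= 0:
        return blurred  # nothing stands out, so leave the image as it is
    for y in range(height):
        for x in range(width):
            if heat[y][x] >= threshold * peak:
                # Average over the neighbours that fall inside the image,
                # always reading from the original, unblurred pixels.
                neighbours = [
                    image[ny][nx]
                    for ny in range(max(0, y - 1), min(height, y + 2))
                    for nx in range(max(0, x - 1), min(width, x + 2))
                ]
                blurred[y][x] = sum(neighbours) / len(neighbours)
    return blurred


# Toy example: one bright pixel that the heat map marks as decisive.
image = [[0.0] * 4 for _ in range(4)]
image[1][1] = 9.0
heat = [[0.0] * 4 for _ in range(4)]
heat[1][1] = 1.0

blurred = blur_salient_regions(image, heat)
print(blurred[1][1])
```

In a real pipeline the blurred images would be used as training data for the second round, so that the network can no longer lean on the regions it preferred in the first round.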
The approach worked well on the standard image sets, and was also successful on the images Talavera Martinez had collected herself from the wearable cameras. ‘Our training regime gives us results similar to other approaches, but is much simpler and requires less computing time.’ Previous attempts to improve fine-grained classification included combining different sets of CNNs. The approach developed by Talavera Martinez and her colleagues is much more lightweight. ‘This study gave us a better idea of how these CNNs learn, and that has helped us to improve the training program.’