Convolutional Neural Networks

Recent advances in Computer Vision have started a new era of neural network research, shaping what is nowadays called Deep Learning. Deep neural networks have improved the state of the art in many tasks, such as image classification, reinforcement learning and text sentiment analysis. In Computer Vision, at the heart of most advances over the past five years lies an architecture called the Convolutional Neural Network (CNN). CNNs leverage one of the most traditional operations in Digital Image Processing, the convolution, to map objects from the image domain to probabilities that can then be associated with categories.
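
To make the idea concrete, the sketch below shows a small CNN in PyTorch that maps an image to class probabilities. It is only illustrative: the layer sizes and the number of classes are arbitrary and not taken from any particular model.

    import torch
    from torch import nn

    class SmallCNN(nn.Module):
        def __init__(self, num_classes=3):
            super().__init__()
            # Two convolution + pooling stages extract local image features
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
                nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
            )
            # A linear layer maps the pooled features to one score per class
            self.classifier = nn.Linear(32, num_classes)

        def forward(self, x):
            x = self.features(x)        # (N, 32, H/4, W/4)
            x = x.mean(dim=(2, 3))      # global average pooling -> (N, 32)
            return self.classifier(x)   # class scores (logits)

    # The scores become probabilities with a softmax
    scores = SmallCNN()(torch.randn(1, 3, 64, 64))
    probs = torch.softmax(scores, dim=1)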

Three common tasks tackled with CNNs are i) image classification, ii) object detection and iii) image segmentation. The first concerns associating tags ("cat", "dog", etc.) with whole images. Object detection involves identifying the relevant objects in a given image (e.g., this image contains two dogs, one cat and a person). Image segmentation, arguably the hardest of the three, entails classifying every pixel in the image into one of a pre-defined set of classes. An example of image segmentation using the U-Net architecture, applied frame by frame to a video, can be seen below:

[Animation: U-Net segmentation of a video (unet.gif)]
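
For reference, a heavily simplified encoder-decoder in the spirit of U-Net can be sketched in PyTorch as below. This is not the original architecture: there is a single downsampling stage and a single skip connection, and the channel counts and number of classes are arbitrary.

    import torch
    from torch import nn

    class TinyUNet(nn.Module):
        def __init__(self, num_classes=2):
            super().__init__()
            self.enc = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
            self.down = nn.MaxPool2d(2)
            self.mid = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
            self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
            self.dec = nn.Sequential(nn.Conv2d(32, 16, 3, padding=1), nn.ReLU())
            # One output channel per class, predicted for every pixel
            self.head = nn.Conv2d(16, num_classes, 1)

        def forward(self, x):
            e = self.enc(x)                     # features at full resolution
            m = self.mid(self.down(e))          # features at half resolution
            u = self.up(m)                      # upsample back to full resolution
            d = self.dec(torch.cat([u, e], 1))  # skip connection from the encoder
            return self.head(d)                 # per-pixel class scores

    logits = TinyUNet()(torch.randn(1, 3, 64, 64))  # shape (1, 2, 64, 64)
    pred = logits.argmax(dim=1)                     # segmentation map, shape (1, 64, 64)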

At INAG we are usually concerned with tasks ii) and iii). In particular, segmenting the objects in an image allows the analysis of important characteristics such as their shape and texture, which can reveal interesting insights about the objects.
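
As a sketch of this kind of analysis, the snippet below measures a few shape descriptors from a synthetic binary segmentation mask with scikit-image; the chosen descriptors are just examples.

    import numpy as np
    from skimage import measure

    mask = np.zeros((64, 64), dtype=np.uint8)
    mask[20:40, 10:50] = 1                  # a fake segmented object (a rectangle)

    labels = measure.label(mask)            # one label per connected object
    for region in measure.regionprops(labels):
        print(region.area,                  # size in pixels
              region.perimeter,             # boundary length
              region.eccentricity)          # elongation of the fitted ellipse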

Some useful code for training CNNs can be found in the following GitHub repository: https://github.com/chcomin/torchtrainer

As a side note, we are also currently interested in Graph Neural Networks (a subfield of Geometric Deep Learning), which allow the classification of objects that, besides having a set of properties, are also interrelated (e.g., classifying people's tastes while also taking into account the tastes of their friends).
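
To illustrate the underlying idea, the sketch below implements a single graph-convolution step on a toy 4-node graph with made-up feature sizes: each node is classified from its own features averaged with those of its neighbours.

    import torch
    from torch import nn

    num_nodes, in_feats, out_feats = 4, 8, 2
    adj = torch.tensor([[0., 1., 1., 0.],
                        [1., 0., 0., 1.],
                        [1., 0., 0., 1.],
                        [0., 1., 1., 0.]])
    adj = adj + torch.eye(num_nodes)          # each node also keeps its own features
    adj = adj / adj.sum(dim=1, keepdim=True)  # row-normalized neighbourhood averaging

    x = torch.randn(num_nodes, in_feats)      # one feature vector per node (per person)
    layer = nn.Linear(in_feats, out_feats)
    scores = layer(adj @ x)                   # node classification scores, shape (4, 2)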