<<Include(Publications/PerrinetBednar15)>>

A new study of how we detect an animal in a visual scene helps unravel some of the brain's mysteries.

Scientists from Aix-Marseille University and the University of Edinburgh in Scotland modelled how we can pick out animals in an image within a fraction of a second. They found that this classification is possible at a very primitive level of representation, and not, as is generally assumed, after a long series of increasingly abstract visual analyses (detecting eyes and limbs, then the head and body, and so on).

The study shows that when people look at an image, their brain very quickly forms a first idea of its content, complementing the slower, increasingly refined stages of processing.

The researchers used previously recorded data in which volunteers viewed and classified hundreds of images. They then used mathematical models of how images are represented in the primary visual area, and in particular of the interrelations between neighbouring contour elements. Using this primitive representation, they showed that a very simple program could readily classify images as containing an animal or not, without needing any more elaborate knowledge of an animal's characteristics, such as its position, size, or orientation in the image.

This finding could accelerate the development of image-based queries in search engines such as Google and Facebook, since it enables simple and robust classification from low-level statistical features based on the geometry of objects, and could thus improve the efficiency of such algorithms.

The study was funded by grants from the European Union and the French National Research Agency (ANR), and is published in the journal Scientific Reports of the Nature Publishing Group.

According to Laurent Perrinet, a researcher at the Institut de Neurosciences de la Timone, who led the study in collaboration with James Bednar of the University of Edinburgh: "The results of this study have direct applications to image classification, but also unexpected consequences for our understanding of visual mechanisms. They show that, in the blink of an eye, we are able to extract a first impression of a scene by exploiting simple statistical regularities, before proceeding to a more complex analysis of the scene. Even more surprisingly, we found that when humans err by wrongly classifying an image as containing an animal, the model we built errs in the same way!"

Edge co-occurrences can account for rapid categorization of natural versus animal images

Edge co-occurrences. (A) An example image with the list of extracted edges overlaid. (B) Definition of edge co-occurrences (click on the figure for more details).

Making a judgment about the semantic category of a visual scene, such as whether it contains an animal, is typically assumed to involve high-level associative brain areas. Previous explanations require progressively analyzing the scene hierarchically at increasing levels of abstraction, from edge extraction to mid-level object recognition and then object categorization. Here we show that the statistics of edge co-occurrences alone are sufficient to perform a rough yet robust (translation, scale, and rotation invariant) scene categorization. We first extracted the edges from images using a scale-space analysis coupled with a sparse coding algorithm. We then computed the "association field" for different categories (natural, man-made, or containing an animal) by computing the statistics of edge co-occurrences. These differed strongly, with animal images having more curved configurations. We show that this geometry alone is sufficient for categorization, and that the pattern of errors made by humans is consistent with this procedure. Because these statistics could be measured as early as the primary visual cortex, the results challenge widely held assumptions about the flow of computations in the visual system. The results also suggest new algorithms for image classification and signal processing that exploit correlations between low-level structure and the underlying semantic category.
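The core quantity in this pipeline, the second-order statistics of edge co-occurrences, can be sketched as a two-dimensional histogram over all pairs of extracted edges. The sketch below is a minimal illustration, not the authors' implementation: the angle-wrapping convention, bin count, and function names are assumptions, and the full representation in the paper also includes distance and scale dimensions.

```python
import numpy as np

def wrap(a):
    """Wrap an angle into [-pi/2, pi/2): edge orientations are undirected."""
    return (a + np.pi / 2) % np.pi - np.pi / 2

def chevron_histogram(x, y, theta, n_bins=12):
    """Chevron map: joint histogram of (psi, dtheta) over all ordered pairs
    of edges, where dtheta is the difference between the two edges'
    orientations and psi is the azimuth of edge j seen from edge i,
    relative to edge i's orientation. Returns a normalized
    (n_bins, n_bins) array."""
    x, y, theta = (np.asarray(v, dtype=float) for v in (x, y, theta))
    i, j = np.where(~np.eye(len(x), dtype=bool))  # all ordered pairs i != j
    psi = wrap(np.arctan2(y[j] - y[i], x[j] - x[i]) - theta[i])
    dtheta = wrap(theta[j] - theta[i])
    h, _, _ = np.histogram2d(psi, dtheta, bins=n_bins,
                             range=[[-np.pi / 2, np.pi / 2]] * 2)
    return h / h.sum()

# Three collinear horizontal edges: all probability mass falls in the
# central (co-linear: psi = 0, dtheta = 0) bin.
p = chevron_histogram([0, 1, 2], [0, 0, 0], [0, 0, 0])
```

With 12 bins over [-pi/2, pi/2), the value 0 lands in bin index 6, so `p[6, 6]` carries all the mass for this collinear toy configuration.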

Figures

figure_chevrons.png

Figure 2: The probability distribution function $p(\psi, \theta)$ represents the distribution of the different geometrical arrangements of edges' angles, which we call a chevron map. We show here the histogram for non-animal natural images, illustrating the preference for co-linear edge configurations. For each chevron configuration, deeper and deeper red circles indicate configurations that are more and more likely with respect to a uniform prior (with a maximum of about $3$ times more likely on average), and deeper and deeper blue circles indicate configurations less likely than a flat prior (with a minimum of about $0.8$ times as likely). Conveniently, this chevron map shows in one graph that non-animal natural images have on average a preference for co-linear and parallel edges (the horizontal middle axis) and orthogonal angles (the top and bottom rows), along with a slight preference for co-circular configurations (for $\psi=0$ and $\psi=\pm \frac \pi 2$, just above and below the central row). We compare chevron maps for different image categories in Figure 3.
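The red/blue coding of the chevron map is simply the ratio of the measured probabilities to a flat prior: a value of 3 means a configuration is three times more likely than under a uniform distribution. A minimal sketch, assuming `p` is a normalized chevron-map histogram as above (the toy values are illustrative):

```python
import numpy as np

def likelihood_ratio_map(p):
    """Ratio of observed chevron-map probabilities to a uniform prior.
    Values > 1 (red in Figure 2) mark over-represented configurations,
    values < 1 (blue) mark under-represented ones."""
    uniform = 1.0 / p.size
    return p / uniform

# Toy map peaked on the central co-linear bin.
p = np.full((12, 12), 0.5 / 143)  # spread half the mass over 143 bins
p[6, 6] = 0.5                     # concentrate the other half on co-linearity
r = likelihood_ratio_map(p)
```

Here the co-linear bin comes out strongly over-represented (`r[6, 6]` is 72, i.e. 0.5 times the 144 bins), while every other bin is slightly below 1.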

figure_chevrons2.png

Figure 3: As in Figure 2, we show the probability of edge configurations as chevron maps for two databases (man-made, animal). Here, we show the ratio of histogram counts relative to that of the non-animal natural image dataset. Deeper and deeper red circles indicate configurations that are more and more likely (and blue, respectively, less likely) with respect to the histogram computed for non-animal images. In the left plot, the animal images exhibit relatively more circular continuations and converging angles (red chevrons on the central vertical axis) than non-animal natural images, at the expense of co-linear, parallel, and orthogonal configurations (blue circles along the middle horizontal axis). The man-made images have strikingly more co-linear features (central circle), which reflects the prevalence of long, straight lines in the cage images in that dataset. We use this representation to categorize images from these different categories in Figure 4.

figure_results.png

Figure 4: Classification results. To quantify the difference in low-level feature statistics across categories (see Figure 3), we used a standard Support Vector Machine (SVM) classifier to measure how each representation affected the classifier's reliability in identifying the image category. For each individual image, we constructed a vector of features as either (FO) the first-order statistics, i.e., the histogram of edges' orientations, (CM) the chevron-map subset of the second-order statistics (i.e., the two-dimensional histogram of relative orientation and azimuth; see Figure 2), or (SO) the full, four-dimensional histogram of second-order statistics (i.e., all parameters of the edge co-occurrences). We gathered these vectors for each class of images and report here the results of the SVM classifier as an F1 score (50% represents chance level). While it was expected that differences would be clear between non-animal natural images and laboratory (man-made) images, results are still quite high for classifying animal images versus non-animal natural images, and are in the range reported by Serre et al. (2007) (F1 score of 80% for human observers and 82% for their model), even using the CM features alone. We further extend these results to the psychophysical results of Serre et al. (2007) in Figure 5.
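This classification step can be sketched with any off-the-shelf SVM. Below, scikit-learn's `SVC` stands in for the paper's classifier, and the feature vectors are random stand-ins for flattened chevron maps: the Dirichlet parameters, class sizes, and kernel choice are all illustrative assumptions, not the paper's actual data or settings.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n, d = 100, 144  # images per class, flattened 12x12 chevron map

# Stand-in histograms: the two classes are drawn from Dirichlet
# distributions with different concentration parameters, loosely
# mimicking statistical differences between class-average chevron maps.
animal = rng.dirichlet(np.full(d, 0.5), n)
non_animal = rng.dirichlet(np.full(d, 2.0), n)
X = np.vstack([animal, non_animal])
y = np.r_[np.ones(n), np.zeros(n)]

# Mean F1 score over 5-fold cross-validation (0.5 is chance level).
f1 = cross_val_score(SVC(kernel='rbf'), X, y, cv=5, scoring='f1').mean()
print(f"mean F1: {f1:.2f}")
```

The F1 score, rather than raw accuracy, matches the metric reported in Figure 4 and in Serre et al. (2007).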

figure_FA_humans.png

Figure 5: To test whether the patterns of errors made by humans are consistent with our model, we studied the second-order statistics of the 50 non-animal images that human subjects in Serre et al. (2007) most commonly falsely reported as containing an animal. We call this set of images the false-alarm image dataset. (Left) This chevron-map plot shows the ratio between the second-order statistics of the false-alarm images and the full non-animal natural image dataset, computed as in Figure 3 (left). Just as for the images that actually do contain animals (Figure 3, left), the images falsely reported as containing animals have more co-circular and converging configurations (red chevrons) and fewer co-linear and orthogonal configurations (blue chevrons). (Right) To quantify this similarity, we computed the Kullback-Leibler distance between the histogram of each image from the false-alarm dataset and the average histogram of each class. The difference between these two distances gives a quantitative measure of how close each image is to the average histogram of each class. Consistent with the idea that humans use edge co-occurrences for rapid image categorization, the 50 worst-classified non-animal images are biased toward the animal histogram ($d' = 1.04$), while the 550 best-classified non-animal images are closer to the non-animal histogram.
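The per-image comparison above can be sketched with plain numpy: compute the KL divergence of each image's histogram to the two class averages and take the difference. A minimal sketch, assuming normalized histograms; the function names, smoothing constant, and toy values are illustrative, not the paper's exact procedure.

```python
import numpy as np

def kl(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) between two histograms,
    smoothed by eps to avoid division by zero on empty bins."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def animal_bias(h_image, h_animal, h_non_animal):
    """Positive when the image's histogram is closer (in KL terms) to the
    average animal histogram than to the non-animal one, i.e. when the
    image's low-level statistics are biased toward the animal class."""
    return kl(h_image, h_non_animal) - kl(h_image, h_animal)

# Toy class-average histograms over 3 bins.
h_animal = np.array([0.7, 0.2, 0.1])
h_non_animal = np.array([0.1, 0.2, 0.7])
bias = animal_bias(h_animal, h_animal, h_non_animal)  # animal-like: positive
```

Averaging this signed difference over an image set, and normalizing by the pooled spread, gives a sensitivity measure in the spirit of the $d'$ reported in Figure 5.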

Reference

  • Laurent U. Perrinet, James A. Bednar. Edge co-occurrences can account for rapid categorization of natural versus animal images. Scientific Reports, 5:11400, 2015.


All material (c) L. Perrinet. Please check the copyright notice.


This work was supported by ANR project "BalaV1" N° ANR-13-BSV4-0014-02.


This work was supported by European Union project Number FP7-269921, "BrainScales".


TagYear15 TagBrainScales TagPublicationsArticles TagAnrBalaV1 TagSparse TagBicv
