DOI: https://doi.org/10.29363/nanoge.neumatdecas.2023.054
Publication date: 9th January 2023
Although neuromorphic cameras were originally inspired by the retina’s neuromorphology, there is potential to go in the reverse direction and use event-based data to understand the brain’s processing of visual saliency. Visual saliency is the distinct subjective perceptual quality that distinguishes an object from its neighbors and surroundings [1]. Scientists know that movement, which is captured by photometric silicon retinae, has a strong impact on saliency. Event data might therefore be sufficient for tracking a salient object, curbing the need for conventional cameras.
In this talk, we will discuss unsupervised learning methods that exploit the combination of spatial and temporal channels to track salient objects [2]. By using different kernels to induce temporality in the spikes, we find that objects can be tracked, using several inexpensive distance metrics (e.g., determinant comparison), with accuracy rivaling algorithms deployed on conventional camera data. Taking the spatiotemporal representation of spikes in pixel neighborhoods, we then use three different decision trees to modify tracking templates for object tracking in real-time applications (e.g., drones or teletourism): a relevance tree measuring the difference between the filters, a high-variance tree ensuring generalizability of the filter, and a recency tree measuring how similar the filter is to filters several timestamps in the past. We show that our method can easily integrate new features without losing old ones, exhibiting the stability-plasticity balance necessary for online learning [3].
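As a rough illustration of the first stage, the sketch below builds an exponentially decayed "time surface" from events and compares two patches with a determinant-based distance. The decay constant, the Gram-matrix regularization, and all names here are illustrative assumptions, not the exact formulation used in [2]:

```python
import numpy as np

def time_surface(events, t_now, shape, tau=50e3):
    """Exponentially decayed map of the most recent spike at each pixel.

    events: (N, 3) array of (x, y, t) rows, timestamps in microseconds.
    tau: decay constant; smaller values make old spikes fade faster,
    which is one way a kernel can induce temporality in the spikes.
    """
    surface = np.zeros(shape)
    for x, y, t in events:
        decay = np.exp(-(t_now - t) / tau)
        surface[int(y), int(x)] = max(surface[int(y), int(x)], decay)
    return surface

def det_distance(patch_a, patch_b, eps=1e-6):
    """One possible reading of 'determinant comparison': compare the
    log-determinants of the regularized Gram matrices of two patches."""
    gram_a = patch_a @ patch_a.T + eps * np.eye(patch_a.shape[0])
    gram_b = patch_b @ patch_b.T + eps * np.eye(patch_b.shape[0])
    return abs(np.linalg.slogdet(gram_a)[1] - np.linalg.slogdet(gram_b)[1])
```

A patch cut from the current surface around the last known object position can then be matched against a stored template with det_distance.

The template-maintenance stage can likewise be sketched with three checks standing in for the relevance, high-variance, and recency trees; the thresholds, the blending weights, and the reduction of each tree to a single rule are simplifying assumptions made for brevity:

```python
from collections import deque

import numpy as np

class TemplateUpdater:
    """Sketch of template maintenance: three threshold rules stand in for
    the relevance, high-variance, and recency decision trees."""

    def __init__(self, relevance_thr=0.5, variance_thr=0.01,
                 recency_thr=0.8, history=10):
        self.relevance_thr = relevance_thr
        self.variance_thr = variance_thr
        self.recency_thr = recency_thr
        self.past = deque(maxlen=history)  # filters from earlier timestamps
        self.template = None

    def update(self, candidate):
        """Fold a candidate filter into the template if all checks pass."""
        if self.template is None:
            self.template = candidate.copy()
            self.past.append(candidate)
            return True
        # Relevance: is the candidate meaningfully different from the template?
        relevant = np.linalg.norm(candidate - self.template) > self.relevance_thr
        # High variance: does the candidate carry enough structure to generalize?
        general = candidate.var() > self.variance_thr
        # Recency: does it still resemble a filter several timestamps back?
        oldest = self.past[0]
        cos_sim = np.dot(candidate.ravel(), oldest.ravel()) / (
            np.linalg.norm(candidate) * np.linalg.norm(oldest) + 1e-9)
        recent = cos_sim > self.recency_thr
        if relevant and general and recent:
            # Blend rather than overwrite: old features persist (stability)
            # while new ones are absorbed (plasticity).
            self.template = 0.9 * self.template + 0.1 * candidate
            self.past.append(candidate)
            return True
        return False
```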
However, validating these learning models on event data can be difficult since there is no established ground truth for saliency. Therefore, we will also present an ongoing experiment to collect eye-tracking data from human participants. The participants are shown event data captured by imagers such as the ATIS and DAVIS cameras. This dataset is intended to serve as a benchmark for future visual saliency algorithms written for event data. If event-based cameras continue to prove sufficient for identifying salient objects, this dataset will lay the foundation for future algorithm development.
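Once collected, such gaze data could score an event-based saliency map directly against human fixations. A minimal sketch of one possible metric, an assumed top-percentile hit rate rather than any official evaluation protocol for this dataset, is shown below:

```python
import numpy as np

def fixation_hit_rate(saliency_map, fixations, top_frac=0.05):
    """Fraction of human fixations landing in the most salient pixels.

    saliency_map: 2-D array produced by an event-based saliency algorithm.
    fixations: iterable of (x, y) gaze points from the eye tracker.
    top_frac: assumed fraction of pixels counted as 'salient'.
    """
    threshold = np.quantile(saliency_map, 1.0 - top_frac)
    hits = sum(saliency_map[int(y), int(x)] >= threshold for x, y in fixations)
    return hits / max(len(fixations), 1)
```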