Video frame interpolation aims to synthesize nonexistent frames between the original frames. While significant advances have been made with deep convolutional neural networks, the quality of interpolation is often reduced by large object motion or occlusion. In this work, we propose to explicitly detect occlusion by exploring the depth cue in frame interpolation. Specifically, we develop a depth-aware flow projection layer to synthesize intermediate flows that preferentially sample closer objects over farther ones. In addition, we learn hierarchical features as the contextual information. The proposed model then warps the input frames, depth maps, and contextual features based on the optical flow and local interpolation kernels to synthesize the output frame. Our model is compact, efficient, and fully differentiable, which allows all the components to be optimized. We conduct extensive experiments to analyze the effect of the depth-aware flow projection layer and hierarchical contextual features. Quantitative and qualitative results demonstrate that the proposed model performs favorably against state-of-the-art frame interpolation methods on a wide variety of datasets. Read More
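The idea of a flow projection that favors closer objects can be illustrated with a small NumPy sketch. This is not the authors' layer: the function name `depth_aware_flow_projection`, the inverse-depth weighting, the nearest-pixel rounding, and the zero fill for unreached pixels are all assumptions made for illustration. Each source pixel is pushed along its flow to time `t`, and flows that land on the same target pixel are averaged with weights of 1/depth, so nearer pixels dominate.

```python
import numpy as np


def depth_aware_flow_projection(flow_0_to_1, depth_0, t=0.5):
    """Project flow from frame 0 to an intermediate time t (illustrative sketch).

    flow_0_to_1: (H, W, 2) array of (dx, dy) displacements from frame 0 to frame 1.
    depth_0:     (H, W) array of positive depths for frame 0 (smaller = closer).
    Returns an (H, W, 2) array approximating the flow from time t back to frame 0.
    Pixels that no source pixel lands on are left at zero (an assumption).
    """
    h, w, _ = flow_0_to_1.shape
    ys, xs = np.mgrid[0:h, 0:w]

    # Where each frame-0 pixel lands at time t, rounded to the nearest pixel.
    tx = np.rint(xs + t * flow_0_to_1[..., 0]).astype(int)
    ty = np.rint(ys + t * flow_0_to_1[..., 1]).astype(int)
    valid = (tx >= 0) & (tx < w) & (ty >= 0) & (ty < h)

    # Assumed weighting: inverse depth, so closer pixels count more.
    weights = 1.0 / depth_0

    weighted_sum = np.zeros((h, w, 2))
    weight_total = np.zeros((h, w))
    # np.add.at accumulates correctly when several sources hit the same target.
    np.add.at(weighted_sum, (ty[valid], tx[valid]),
              weights[valid][:, None] * flow_0_to_1[valid])
    np.add.at(weight_total, (ty[valid], tx[valid]), weights[valid])

    projected = np.zeros((h, w, 2))
    hit = weight_total > 0
    # Reverse and scale the averaged flow to point from time t back to frame 0.
    projected[hit] = -t * weighted_sum[hit] / weight_total[hit][:, None]
    return projected
```

When a near pixel and a far pixel project to the same location, the result sits much closer to the near pixel's flow than a plain average would, which is the behavior the abstract describes for occluded regions.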
Tag Archives: Vision
Rhythm and Synchrony in a Cortical Network Model
We studied mechanisms for gamma-band activity in the cerebral cortex and identified neurobiological factors that affect such activity. This was done by analyzing the behavior of a previously developed, data-driven, large-scale network model that simulated many visual functions of monkey V1 cortex (Chariker et al., 2016). Gamma activity was an emergent property of the model. The model’s gamma activity, like that of the real cortex, was (1) episodic, (2) variable in frequency and phase, and (3) graded in power with stimulus variables like orientation. The spike firing of the model’s neuronal population was only partially synchronous during multiple firing events (MFEs) that occurred at gamma rates. Detailed analysis of the model’s MFEs showed that gamma-band activity was multidimensional in its sources. Most spikes were evoked by excitatory inputs. A large fraction of these inputs came from recurrent excitation within the local circuit, but feedforward and feedback excitation also contributed, either through direct pulsing or by raising the overall baseline. Inhibition was responsible for ending MFEs, but disinhibition led directly to only a small minority of the synchronized spikes. As a potential explanation for the wide range of gamma characteristics observed in different parts of cortex, we found that the relative rise times of AMPA and GABA synaptic conductances have a strong effect on the degree of synchrony in gamma. Read More
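To make the notion of a conductance rise time concrete, here is a minimal sketch that assumes a difference-of-exponentials time course, a common textbook form and not necessarily the one used in the model. The function names and the idea of comparing peak times are illustrative assumptions. For g(t) = exp(-t/tau_decay) - exp(-t/tau_rise), the peak occurs at t = (tau_rise * tau_decay / (tau_decay - tau_rise)) * ln(tau_decay / tau_rise), so a shorter rise time moves the peak earlier.

```python
import math


def time_to_peak(tau_rise, tau_decay):
    """Time at which exp(-t/tau_decay) - exp(-t/tau_rise) reaches its maximum."""
    if not 0 < tau_rise < tau_decay:
        raise ValueError("expected 0 < tau_rise < tau_decay")
    return (tau_rise * tau_decay / (tau_decay - tau_rise)) * math.log(tau_decay / tau_rise)


def conductance(t, tau_rise, tau_decay):
    """Difference-of-exponentials conductance, normalized so its peak equals 1."""
    if t < 0:
        return 0.0  # no conductance before the presynaptic spike
    peak = time_to_peak(tau_rise, tau_decay)
    norm = math.exp(-peak / tau_decay) - math.exp(-peak / tau_rise)
    return (math.exp(-t / tau_decay) - math.exp(-t / tau_rise)) / norm
```

Calling `time_to_peak` with two hypothetical parameter sets, one for an excitatory and one for an inhibitory conductance, gives the lag between their peaks, which is one simple way to compare relative rise times.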
Orientation Selectivity from Very Sparse LGN Inputs in a Comprehensive Model of Macaque V1 Cortex
A new computational model of the primary visual cortex (V1) of the macaque monkey was constructed to reconcile the visual functions of V1 with anatomical data on its LGN input, the extreme sparseness of which presented serious challenges to theoretically sound explanations of cortical function. We demonstrate that, even with such sparse input, it is possible to produce robust orientation selectivity, as well as continuity in the orientation map. We went beyond that to find plausible dynamic regimes of our new model that emulate simultaneously experimental data for a wide range of V1 phenomena, beginning with orientation selectivity but also including diversity in neuronal responses, bimodal distributions of the modulation ratio (the simple/complex classification), and dynamic signatures, such as gamma-band oscillations. Intracortical interactions play a major role in all aspects of the visual functions of the model. Read More
A Mathematical Model Unlocks the Secrets of Vision
This is the great mystery of human vision: Vivid pictures of the world appear before our mind’s eye, yet the brain’s visual system receives very little information from the world itself. Much of what we “see” we conjure in our heads.
“A lot of the things you think you see you’re actually making up,” said Lai-Sang Young, a mathematician at New York University. “You don’t actually see them.” Read More
New brain map could improve AI algorithms for machine vision
Despite years of research, the brain still contains broad areas of uncharted territory. A team of scientists, led by neuroscientists from Cold Spring Harbor Laboratory and the University of Sydney, recently found new evidence that revises the traditional view of how the primate brain’s visual system is organized, using data from marmosets. This remapping of the brain could serve as a future reference for understanding how the highly complex visual system works, and potentially influence the design of artificial neural networks for machine vision. Read More
Emotion schemas are embedded in the human visual system
Theorists have suggested that emotions are canonical responses to situations ancestrally linked to survival. If so, then emotions may be afforded by features of the sensory environment. However, few computational models describe how combinations of stimulus features evoke different emotions. Here, we develop a convolutional neural network that accurately decodes images into 11 distinct emotion categories. We validate the model using more than 25,000 images and movies and show that image content is sufficient to predict the category and valence of human emotion ratings. In two functional magnetic resonance imaging studies, we demonstrate that patterns of human visual cortex activity encode emotion category–related model output and can decode multiple categories of emotional experience. These results suggest that rich, category-specific visual features can be reliably mapped to distinct emotions, and they are coded in distributed representations within the human visual system. Read More
Restoring Vision With Bionic Eyes: No Longer Science Fiction
Bionic vision might sound like science fiction, but Dr. Michael Beyeler is working on just that.
Originally from Switzerland, Dr. Beyeler is wrapping up his postdoctoral fellowship at the University of Washington before moving to the University of California, Santa Barbara this fall to head up the newly formed Bionic Vision Lab in the Departments of Computer Science and Psychological & Brain Sciences.
We spoke with him about his “deep fascination with the brain” and how he hopes his work will eventually be able to restore vision to the blind. Read More
Fully automatic landing with optically supported navigation for small aircraft
Challenges and opportunities with Computer Vision at the Edge
Neuromorphic Chips and the Future of Your Cell Phone
This article is particularly fun for me since it brings together two developments that I didn’t see coming together: real time computer vision (RTCV) and neuromorphic neural nets (aka spiking neural nets).
We’ve been following neuromorphic nets for a few years now (additional references at the end of this article) and viewed them as the next generation (3rd generation) of neural nets. This was mostly in the context of the pursuit of Artificial General Intelligence (AGI), which is the holy grail (or terrifying terminator) of all we’ve been doing.
Where we got off track was in thinking that neuromorphic nets, which are still in their infancy, were only for AGI. It turns out that they facilitate a lot of nearer-term capabilities, and among them could be real time computer vision (RTCV). Why that’s true has more to do with how neuromorphics are structured than with what fancy things they may be able to do. Here’s the story. Read More