Human intelligence is a byproduct of thousands of years of evolution. Our ability to decipher visual stimuli, interpret sound waves, and rationalize everyday activities is something machines still cannot be programmed to mimic. After years of research in neuroscience and computational science, there is only so much we know about the human brain. The concept of intelligence is, to some extent, still a black box from a computational perspective, with much left to explore and unravel through layers of intricacy and abstraction. “Understanding vision and building visual systems is really understanding intelligence”, says Dr. Fei-Fei Li, Director of the Stanford AI Lab and one of the pioneers of modern-day Computer Vision. In recent years, the boom in Deep Learning has enabled researchers around the world to solve core computer vision problems far more effectively, and at far larger scale, than the methods of the classic computer vision era. Our research interests lie at the intersection of Deep Learning and Computer Vision. Empirically, we work with discriminative models such as Convolutional Neural Networks, clustering architectures, residual variants of autoencoders, and different architectural proposals for convenient signal propagation. Our experiments also involve statistical models such as Bayesian Neural Networks and generative models such as Generative Adversarial Networks.
Here at VIRG [Visual Intelligence Research Group], our mantra is to elevate machine vision and the state of visual intelligence. We draw on Deep Learning, Linear Algebra, Probability and Statistics, and Multivariable Calculus to solve different problems in computer vision. Our research focuses on core computer vision problems, namely:
- Object Classification and Detection
- Semantic, Instance and Panoptic Segmentation
- Object Localization and Pose Estimation
- Image Captioning
- Visual Question Answering
- Automated Diagnosis from Biomedical Imagery
- Attention-based Residual Scene Understanding
While much of our research centers on machine vision, we also work on a few core research topics in Deep Learning. In some of our projects we seek solutions to the problem of gradient degradation in neural network training, while others focus on reducing memory and run-time complexity overhead. We are also working on (a) making the neural activation a fully learnable function rather than a hand-tuned one, in pursuit of better generalization and intelligent rationalization from data, (b) the intersection of Ordinary Differential Equations and Deep Neural Networks, (c) a loss function with learnable weights for training pixel-wise classification models, and (d) an interpretable, interactive deep learning library for computer vision aimed at people from non-computing backgrounds.
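To make idea (a) concrete: one of the simplest instances of a learnable activation is the parametric ReLU, where the negative-side slope is a trained parameter rather than a fixed constant. The sketch below is illustrative only (the function names and the toy objective are our own, not a published model); it shows the slope parameter receiving a gradient and an SGD update like any other weight.

```python
import numpy as np

def prelu(x, a):
    """Parametric ReLU: identity for x > 0, learnable slope a for x <= 0."""
    return np.where(x > 0, x, a * x)

def prelu_grad_a(x, a):
    """Gradient of the PReLU output with respect to the learnable slope a."""
    return np.where(x > 0, 0.0, x)

# One SGD step on a toy objective (mean of y**2), just to show that the
# activation's slope is updated by gradient descent like any other parameter.
x = np.array([-2.0, -1.0, 3.0])
a = 0.25
y = prelu(x, a)                                   # [-0.5, -0.25, 3.0]
grad = np.mean(2.0 * y * prelu_grad_a(x, a))      # chain rule through mean(y**2)
a -= 0.1 * grad                                   # the activation itself learns
```

In a full network the same update would flow through backpropagation, with one slope per channel or per layer rather than a single scalar.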
Beyond visual perception and imagery data, we also work with time-series data and Generative Adversarial Networks. In one recent project we are generating accurate sound from silent videos, using ideas from both supervised and semi-supervised learning. In another project we are using deep learning to suppress noise in real time during mobile phone conversations.