Awesome Computer Vision – Massive Collection of Resources
For a list of people in computer vision with their academic genealogy, please visit here
Books
Computer Vision
- Computer Vision: Models, Learning, and Inference – Simon J. D. Prince 2012
- Computer Vision: Algorithms and Applications – Rick Szeliski 2010
- Computer Vision: A Modern Approach (2nd edition) – David Forsyth and Jean Ponce 2011
- Multiple View Geometry in Computer Vision – Richard Hartley and Andrew Zisserman 2004
- Computer Vision – Linda G. Shapiro 2001
- Vision Science: Photons to Phenomenology – Stephen E. Palmer 1999
- Visual Object Recognition synthesis lecture – Kristen Grauman and Bastian Leibe 2011
- Computer Vision for Visual Effects – Richard J. Radke, 2012
- High dynamic range imaging: acquisition, display, and image-based lighting – Reinhard, E., Heidrich, W., Debevec, P., Pattanaik, S., Ward, G., Myszkowski, K 2010
- Numerical Algorithms: Methods for Computer Vision, Machine Learning, and Graphics – Justin Solomon 2015
- Image Processing and Analysis – Stan Birchfield 2018
- Computer Vision, From 3D Reconstruction to Recognition – Silvio Savarese 2018
OpenCV Programming
- Learning OpenCV: Computer Vision with the OpenCV Library – Gary Bradski and Adrian Kaehler
- Practical Python and OpenCV – Adrian Rosebrock
- OpenCV Essentials – Oscar Deniz Suarez, Mª del Milagro Fernandez Carrobles, Noelia Vallez Enano, Gloria Bueno Garcia, Ismael Serrano Gracia
Machine Learning
- Pattern Recognition and Machine Learning – Christopher M. Bishop 2007
- Neural Networks for Pattern Recognition – Christopher M. Bishop 1995
- Probabilistic Graphical Models: Principles and Techniques – Daphne Koller and Nir Friedman 2009
- Pattern Classification – Peter E. Hart, David G. Stork, and Richard O. Duda 2000
- Machine Learning – Tom M. Mitchell 1997
- Gaussian processes for machine learning – Carl Edward Rasmussen and Christopher K. I. Williams 2005
- Learning From Data – Yaser S. Abu-Mostafa, Malik Magdon-Ismail and Hsuan-Tien Lin 2012
- Neural Networks and Deep Learning – Michael Nielsen 2014
- Bayesian Reasoning and Machine Learning – David Barber, Cambridge University Press, 2012
Fundamentals
- Linear Algebra and Its Applications – Gilbert Strang 1995
Courses
Computer Vision
- EENG 512 / CSCI 512 – Computer Vision – William Hoff (Colorado School of Mines)
- Visual Object and Activity Recognition – Alexei A. Efros and Trevor Darrell (UC Berkeley)
- Computer Vision – Steve Seitz (University of Washington)
- Visual Recognition Spring 2016, Fall 2016 – Kristen Grauman (UT Austin)
- Language and Vision – Tamara Berg (UNC Chapel Hill)
- Convolutional Neural Networks for Visual Recognition – Fei-Fei Li and Andrej Karpathy (Stanford University)
- Computer Vision – Rob Fergus (NYU)
- Computer Vision – Derek Hoiem (UIUC)
- Computer Vision: Foundations and Applications – Kalanit Grill-Spector and Fei-Fei Li (Stanford University)
- High-Level Vision: Behaviors, Neurons and Computational Models – Fei-Fei Li (Stanford University)
- Advances in Computer Vision – Antonio Torralba and Bill Freeman (MIT)
- Computer Vision – Bastian Leibe (RWTH Aachen University)
- Computer Vision 2 – Bastian Leibe (RWTH Aachen University)
- Computer Vision – Pascal Fua (EPFL)
- Computer Vision 1 – Carsten Rother (TU Dresden)
- Computer Vision 2 – Carsten Rother (TU Dresden)
- Multiple View Geometry – Daniel Cremers (TU Munich)
Computational Photography
- Image Manipulation and Computational Photography – Alexei A. Efros (UC Berkeley)
- Computational Photography – Alexei A. Efros (CMU)
- Computational Photography – Derek Hoiem (UIUC)
- Computational Photography – James Hays (Brown University)
- Digital & Computational Photography – Fredo Durand (MIT)
- Computational Camera and Photography – Ramesh Raskar (MIT Media Lab)
- Computational Photography – Irfan Essa (Georgia Tech)
- Courses in Graphics – Stanford University
- Computational Photography – Rob Fergus (NYU)
- Introduction to Visual Computing – Kyros Kutulakos (University of Toronto)
- Computational Photography – Kyros Kutulakos (University of Toronto)
- Computer Vision for Visual Effects – Rich Radke (Rensselaer Polytechnic Institute)
- Introduction to Image Processing – Rich Radke (Rensselaer Polytechnic Institute)
Machine Learning and Statistical Learning
- Machine Learning – Andrew Ng (Stanford University)
- Learning from Data – Yaser S. Abu-Mostafa (Caltech)
- Statistical Learning – Trevor Hastie and Rob Tibshirani (Stanford University)
- Statistical Learning Theory and Applications – Tomaso Poggio, Lorenzo Rosasco, Carlo Ciliberto, Charlie Frogner, Georgios Evangelopoulos, Ben Deen (MIT)
- Statistical Learning – Genevera Allen (Rice University)
- Practical Machine Learning – Michael Jordan (UC Berkeley)
- Course on Information Theory, Pattern Recognition, and Neural Networks – David MacKay (University of Cambridge)
- Methods for Applied Statistics: Unsupervised Learning – Lester Mackey (Stanford)
- Machine Learning – Andrew Zisserman (University of Oxford)
- Intro to Machine Learning – Sebastian Thrun (Stanford University)
- Machine Learning – Charles Isbell, Michael Littman (Georgia Tech)
- (Convolutional) Neural Networks for Visual Recognition – Fei-Fei Li, Andrej Karpathy, Justin Johnson (Stanford University)
- Machine Learning for Computer Vision – Rudolph Triebel (TU Munich)
Optimization
- Convex Optimization I – Stephen Boyd (Stanford University)
- Convex Optimization II – Stephen Boyd (Stanford University)
- Convex Optimization – Stephen Boyd (Stanford University)
- Optimization at MIT – (MIT)
- Convex Optimization – Ryan Tibshirani (CMU)
Papers
Conference papers on the web
- CVPapers – Computer vision papers on the web
- SIGGRAPH Paper on the web – Graphics papers on the web
- NIPS Proceedings – NIPS papers on the web
- Computer Vision Foundation open access
- Annotated Computer Vision Bibliography – Keith Price (USC)
- Calendar of Computer Image Analysis, Computer Vision Conferences – (USC)
Survey Papers
- Visionbib Survey Paper List
- Foundations and Trends® in Computer Graphics and Vision
- Computer Vision: A Reference Guide
Tutorials and talks
Computer Vision
- Computer Vision Talks – Lectures, keynotes, panel discussions on computer vision
- The Three R’s of Computer Vision – Jitendra Malik (UC Berkeley) 2013
- Applications to Machine Vision – Andrew Blake (Microsoft Research) 2008
- The Future of Image Search – Jitendra Malik (UC Berkeley) 2008
- Should I do a PhD in Computer Vision? – Fatih Porikli (Australian National University)
- Graduate Summer School 2013: Computer Vision – IPAM, 2013
Recent Conference Talks
- CVPR 2015 – Jun 2015
- ECCV 2014 – Sep 2014
- CVPR 2014 – Jun 2014
- ICCV 2013 – Dec 2013
- ICML 2013 – Jul 2013
- CVPR 2013 – Jun 2013
- ECCV 2012 – Oct 2012
- ICML 2012 – Jun 2012
- CVPR 2012 – Jun 2012
3D Computer Vision
- 3D Computer Vision: Past, Present, and Future – Steve Seitz (University of Washington) 2011
- Reconstructing the World from Photos on the Internet – Steve Seitz (University of Washington) 2013
Internet Vision
- The Distributed Camera – Noah Snavely (Cornell University) 2011
- Planet-Scale Visual Understanding – Noah Snavely (Cornell University) 2014
- A Trillion Photos – Steve Seitz (University of Washington) 2013
Computational Photography
- Reflections on Image-Based Modeling and Rendering – Richard Szeliski (Microsoft Research) 2013
- Photographing Events over Time – William T. Freeman (MIT) 2011
- Old and New algorithm for Blind Deconvolution – Yair Weiss (The Hebrew University of Jerusalem) 2011
- A Tour of Modern “Image Processing” – Peyman Milanfar (UC Santa Cruz/Google) 2010
- Topics in image and video processing – Andrew Blake (Microsoft Research) 2007
- Computational Photography – William T. Freeman (MIT) 2012
- Revealing the Invisible – Frédo Durand (MIT) 2012
- Overview of Computer Vision and Visual Effects – Rich Radke (Rensselaer Polytechnic Institute) 2014
Learning and Vision
- Where machine vision needs help from machine learning – William T. Freeman (MIT) 2011
- Learning in Computer Vision – Simon Lucey (CMU) 2008
- Learning and Inference in Low-Level Vision – Yair Weiss (The Hebrew University of Jerusalem) 2009
Object Recognition
- Object Recognition – Larry Zitnick (Microsoft Research)
- Generative Models for Visual Objects and Object Recognition via Bayesian Inference – Fei-Fei Li (Stanford University)
Graphical Models
- Graphical Models for Computer Vision – Pedro Felzenszwalb (Brown University) 2012
- Graphical Models – Zoubin Ghahramani (University of Cambridge) 2009
- Machine Learning, Probability and Graphical Models – Sam Roweis (NYU) 2006
- Graphical Models and Applications – Yair Weiss (The Hebrew University of Jerusalem) 2009
Machine Learning
- A Gentle Tutorial of the EM Algorithm – Jeff A. Bilmes (UC Berkeley) 1998
- Introduction To Bayesian Inference – Christopher Bishop (Microsoft Research) 2009
- Support Vector Machines – Chih-Jen Lin (National Taiwan University) 2006
- Bayesian or Frequentist, Which Are You? – Michael I. Jordan (UC Berkeley)
Optimization
- Optimization Algorithms in Machine Learning – Stephen J. Wright (University of Wisconsin-Madison)
- Convex Optimization – Lieven Vandenberghe (University of California, Los Angeles)
- Continuous Optimization in Computer Vision – Andrew Fitzgibbon (Microsoft Research)
- Beyond stochastic gradient descent for large-scale machine learning – Francis Bach (INRIA)
- Variational Methods for Computer Vision – Daniel Cremers (Technische Universität München) (lecture 18 missing from playlist)
Deep Learning
- A tutorial on Deep Learning – Geoffrey E. Hinton (University of Toronto)
- Deep Learning – Ruslan Salakhutdinov (University of Toronto)
- Scaling up Deep Learning – Yoshua Bengio (University of Montreal)
- ImageNet Classification with Deep Convolutional Neural Networks – Alex Krizhevsky (University of Toronto)
- The Unreasonable Effectiveness Of Deep Learning – Yann LeCun (NYU/Facebook Research) 2014
- Deep Learning for Computer Vision – Rob Fergus (NYU/Facebook Research)
- High-dimensional learning with deep network contractions – Stéphane Mallat (Ecole Normale Superieure)
- Graduate Summer School 2012: Deep Learning, Feature Learning – IPAM, 2012
- Workshop on Big Data and Statistical Machine Learning
- Machine Learning Summer School – Reykjavik, Iceland 2014
- Deep Learning Session 1 – Yoshua Bengio (University of Montreal)
- Deep Learning Session 2 – Yoshua Bengio (University of Montreal)
- Deep Learning Session 3 – Yoshua Bengio (University of Montreal)
Software
Annotation tools
External Resource Links
- Computer Vision Resources – Jia-Bin Huang (UIUC)
- Computer Vision Algorithm Implementations – CVPapers
- Source Code Collection for Reproducible Research – Xin Li (West Virginia University)
- CMU Computer Vision Page
General Purpose Computer Vision Library
- OpenCV
- mexopencv
- SimpleCV
- Open source Python module for computer vision
- ccv: A Modern Computer Vision Library
- VLFeat
- Matlab Computer Vision System Toolbox
- Piotr’s Computer Vision Matlab Toolbox
- PCL: Point Cloud Library
- ImageUtilities
Multiple-view Computer Vision
- MATLAB Functions for Multiple View Geometry
- Peter Kovesi’s Matlab Functions for Computer Vision and Image Analysis
- OpenGV – geometric computer vision algorithms
- MinimalSolvers – Minimal problems solver
- Multi-View Environment
- Visual SFM
- Bundler SFM
- openMVG: open Multiple View Geometry – Multiple View Geometry; Structure from Motion library & software
- Patch-based Multi-view Stereo V2
- Clustering Views for Multi-view Stereo
- Floating Scale Surface Reconstruction
- Large-Scale Texturing of 3D Reconstructions
- Awesome 3D reconstruction list
Feature Detection and Extraction
- VLFeat
- SIFT
- David G. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision, 60, 2 (2004), pp. 91-110.
- SIFT++
- BRISK
- Stefan Leutenegger, Margarita Chli and Roland Siegwart, “BRISK: Binary Robust Invariant Scalable Keypoints”, ICCV 2011
- SURF
- Herbert Bay, Andreas Ess, Tinne Tuytelaars, Luc Van Gool, “SURF: Speeded Up Robust Features”, Computer Vision and Image Understanding (CVIU), Vol. 110, No. 3, pp. 346–359, 2008
- FREAK
- A. Alahi, R. Ortiz, and P. Vandergheynst, “FREAK: Fast Retina Keypoint”, CVPR 2012
- AKAZE
- Pablo F. Alcantarilla, Adrien Bartoli and Andrew J. Davison, “KAZE Features”, ECCV 2012
- Local Binary Patterns
High Dynamic Range Imaging
Semantic Segmentation
Low-level Vision
Stereo Vision
- Middlebury Stereo Vision
- The KITTI Vision Benchmark Suite
- LIBELAS: Library for Efficient Large-scale Stereo Matching
- Ground Truth Stixel Dataset
Optical Flow
- Middlebury Optical Flow Evaluation
- MPI-Sintel Optical Flow Dataset and Evaluation
- The KITTI Vision Benchmark Suite
- HCI Challenge
- Coarse2Fine Optical Flow – Ce Liu (MIT)
- Secrets of Optical Flow Estimation and Their Principles
- C++/MatLab Optical Flow by C. Liu (based on Brox et al. and Bruhn et al.)
- Parallel Robust Optical Flow by Sánchez Pérez et al.
Image Denoising
- BM3D, KSVD
Super-resolution
- Multi-frame image super-resolution
- Pickup, L. C. Machine Learning in Multi-frame Image Super-resolution, PhD thesis 2008
- Markov Random Fields for Super-Resolution
- W. T Freeman and C. Liu. Markov Random Fields for Super-resolution and Texture Synthesis. In A. Blake, P. Kohli, and C. Rother, eds., Advances in Markov Random Fields for Vision and Image Processing, Chapter 10. MIT Press, 2011
- Sparse regression and natural image prior
- K. I. Kim and Y. Kwon, “Single-image super-resolution using sparse regression and natural image prior”, IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 32, no. 6, pp. 1127-1133, 2010.
- Single-Image Super Resolution via a Statistical Model
- T. Peleg and M. Elad, A Statistical Prediction Model Based on Sparse Representations for Single Image Super-Resolution, IEEE Transactions on Image Processing, Vol. 23, No. 6, Pages 2569-2582, June 2014
- Sparse Coding for Super-Resolution
- R. Zeyde, M. Elad, and M. Protter On Single Image Scale-Up using Sparse-Representations, Curves & Surfaces, Avignon-France, June 24-30, 2010 (appears also in Lecture-Notes-on-Computer-Science – LNCS).
- Patch-wise Sparse Recovery
- Jianchao Yang, John Wright, Thomas Huang, and Yi Ma. Image super-resolution via sparse representation. IEEE Transactions on Image Processing (TIP), vol. 19, issue 11, 2010.
- Neighbor embedding
- H. Chang, D.Y. Yeung, Y. Xiong. Super-resolution through neighbor embedding. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), vol.1, pp.275-282, Washington, DC, USA, 27 June – 2 July 2004.
- Deformable Patches
- Yu Zhu, Yanning Zhang and Alan Yuille, Single Image Super-resolution using Deformable Patches, CVPR 2014
- SRCNN
- Chao Dong, Chen Change Loy, Kaiming He, Xiaoou Tang, Learning a Deep Convolutional Network for Image Super-Resolution, in ECCV 2014
- A+: Adjusted Anchored Neighborhood Regression
- R. Timofte, V. De Smet, and L. Van Gool. A+: Adjusted Anchored Neighborhood Regression for Fast Super-Resolution, ACCV 2014
- Transformed Self-Exemplars
- Jia-Bin Huang, Abhishek Singh, and Narendra Ahuja, Single Image Super-Resolution using Transformed Self-Exemplars, IEEE Conference on Computer Vision and Pattern Recognition, 2015
Image Deblurring
Non-blind deconvolution
- Spatially variant non-blind deconvolution
- Handling Outliers in Non-blind Image Deconvolution
- Hyper-Laplacian Priors
- From Learning Models of Natural Image Patches to Whole Image Restoration
- Deep Convolutional Neural Network for Image Deconvolution
- Neural Deconvolution
Blind deconvolution
- Removing Camera Shake From A Single Photograph
- High-quality motion deblurring from a single image
- Two-Phase Kernel Estimation for Robust Motion Deblurring
- Blur kernel estimation using the radon transform
- Fast motion deblurring
- Blind Deconvolution Using a Normalized Sparsity Measure
- Blur-kernel estimation from spectral irregularities
- Efficient marginal likelihood optimization in blind deconvolution
- Unnatural L0 Sparse Representation for Natural Image Deblurring
- Edge-based Blur Kernel Estimation Using Patch Priors
- Blind Deblurring Using Internal Patch Recurrence
Non-uniform Deblurring
- Non-uniform Deblurring for Shaken Images
- Single Image Deblurring Using Motion Density Functions
- Image Deblurring using Inertial Measurement Sensors
- Fast Removal of Non-uniform Camera Shake
Image Completion
Image Retargeting
Alpha Matting
- Alpha Matting Evaluation
- Closed-form image matting
- Spectral Matting
- Learning-based Matting
- Improving Image Matting using Comprehensive Sampling Sets
Image Pyramid
Edge-preserving image processing
- Fast Bilateral Filter
- O(1) Bilateral Filter
- Recursive Bilateral Filtering
- Rolling Guidance Filter
- Relative Total Variation
- L0 Gradient Optimization
- Domain Transform
- Adaptive Manifold
- Guided image filtering
Intrinsic Images
- Recovering Intrinsic Images with a global Sparsity Prior on Reflectance
- Intrinsic Images by Clustering
Contour Detection and Image Segmentation
- Mean Shift Segmentation
- Graph-based Segmentation
- Normalized Cut
- Grab Cut
- Contour Detection and Image Segmentation
- Structured Edge Detection
- Pointwise Mutual Information
- SLIC Super-pixel
- QuickShift
- TurboPixels
- Entropy Rate Superpixel
- Contour Relaxed Superpixels
- SEEDS
- SEEDS Revised
- Multiscale Combinatorial Grouping
- Fast Edge Detection Using Structured Forests
Interactive Image Segmentation
- Random Walker
- Geodesic Segmentation
- Lazy Snapping
- Power Watershed
- Geodesic Graph Cut
- Segmentation by Transduction
Video Segmentation
- Video Segmentation with Superpixels
- Efficient hierarchical graph-based video segmentation
- Object segmentation in video
- Streaming hierarchical video segmentation
Camera calibration
- Camera Calibration Toolbox for Matlab
- Camera calibration With OpenCV
- Multiple Camera Calibration Toolbox
Simultaneous localization and mapping
SLAM community:
Tracking/Odometry:
- LIBVISO2: C++ Library for Visual Odometry 2
- PTAM: Parallel tracking and mapping
- KFusion: Implementation of KinectFusion
- kinfu_remake: Lightweight, reworked and optimized version of Kinfu.
- LVR-KinFu: kinfu_remake based Large Scale KinectFusion with online reconstruction
- InfiniTAM: Implementation of multi-platform large-scale depth tracking and fusion
- VoxelHashing: Large-scale KinectFusion
- SLAMBench: Multiple-implementation of KinectFusion
- SVO: Semi-direct visual odometry
- DVO: dense visual odometry
- FOVIS: RGB-D visual odometry
Graph Optimization:
- GTSAM: General smoothing and mapping library for Robotics and SFM — Georgia Institute of Technology
- G2O: General framework for graph optimization
Loop Closure:
- FabMap: appearance-based loop closure system – also available in OpenCV2.4.11
- DBoW2: binary bag-of-words loop detection system
Localization & Mapping:
Single-view Spatial Understanding
- Geometric Context – Derek Hoiem (CMU)
- Recovering Spatial Layout – Varsha Hedau (UIUC)
- Geometric Reasoning – David C. Lee (CMU)
- RGBD2Full3D – Ruiqi Guo (UIUC)
Object Detection
- INRIA Object Detection and Localization Toolkit
- Discriminatively trained deformable part models
- VOC-DPM
- Histograms of Sparse Codes for Object Detection
- R-CNN: Regions with Convolutional Neural Network Features
- SPP-Net
- BING: Objectness Estimation
- Edge Boxes
- ReInspect
Nearest Neighbor Search
General purpose nearest neighbor search
- ANN: A Library for Approximate Nearest Neighbor Searching
- FLANN – Fast Library for Approximate Nearest Neighbors
- Fast k nearest neighbor search using GPU
Nearest Neighbor Field Estimation
- PatchMatch
- Generalized PatchMatch
- Coherency Sensitive Hashing
- PMBP: PatchMatch Belief Propagation
- TreeCANN
Visual Tracking
- Visual Tracker Benchmark
- Visual Tracking Challenge
- Kanade-Lucas-Tomasi Feature Tracker
- Extended Lucas-Kanade Tracking
- Online-boosting Tracking
- Spatio-Temporal Context Learning
- Locality Sensitive Histograms
- Enhanced adaptive coupled-layer LGTracker++
- TLD: Tracking – Learning – Detection
- CMT: Clustering of Static-Adaptive Correspondences for Deformable Object Tracking
- Kernelized Correlation Filters
- Accurate Scale Estimation for Robust Visual Tracking
- Multiple Experts using Entropy Minimization
- TGPR
- CF2: Hierarchical Convolutional Features for Visual Tracking
- Modular Tracking Framework
Saliency Detection
Attributes
Action Recognition
Egocentric cameras
Human-in-the-loop systems
Image Captioning
Optimization
- Ceres Solver – Nonlinear least-square problem and unconstrained optimization solver
- NLopt – Nonlinear least-square problem and unconstrained optimization solver
- OpenGM – Factor graph based discrete optimization and inference solver
- GTSAM – Factor graph based least-squares optimization solver
Deep Learning
Machine Learning
- Awesome Machine Learning
- Bob: a free signal processing and machine learning toolbox for researchers
- LIBSVM — A Library for Support Vector Machines
Datasets
External Dataset Link Collection
- CV Datasets on the web – CVPapers
- Are we there yet? – Which paper provides the best results on standard dataset X?
- Computer Vision Dataset on the web
- Yet Another Computer Vision Index To Datasets
- ComputerVisionOnline Datasets
- CVOnline Dataset
- CV datasets
- visionbib
- VisualData
Low-level Vision
Stereo Vision
- Middlebury Stereo Vision
- The KITTI Vision Benchmark Suite
- LIBELAS: Library for Efficient Large-scale Stereo Matching
- Ground Truth Stixel Dataset
Optical Flow
- Middlebury Optical Flow Evaluation
- MPI-Sintel Optical Flow Dataset and Evaluation
- The KITTI Vision Benchmark Suite
- HCI Challenge
Video Object Segmentation
Change Detection
- Labeled and Annotated Sequences for Integral Evaluation of SegmenTation Algorithms
- ChangeDetection.net
Image Super-resolution
Intrinsic Images
- Ground-truth dataset and baseline evaluations for intrinsic image algorithms
- Intrinsic Images in the Wild
- Intrinsic Image Evaluation on Synthetic Complex Scenes
Material Recognition
Multi-view Reconstruction
Saliency Detection
Visual Tracking
- Visual Tracker Benchmark
- Visual Tracker Benchmark v1.1
- VOT Challenge
- Princeton Tracking Benchmark
- Tracking Manipulation Tasks (TMT)
Visual Surveillance
Saliency Detection
Change detection
Visual Recognition
Image Classification
Scene Recognition
Object Detection
Semantic labeling
Multi-view Object Detection
Fine-grained Visual Recognition
Pedestrian Detection
Action Recognition
Image-based
Video-based
Image Deblurring
Image Captioning
Scene Understanding
- SUN RGB-D – A RGB-D Scene Understanding Benchmark Suite
- NYU depth v2 – Indoor Segmentation and Support Inference from RGBD Images
Aerial images
- Aerial Image Segmentation – Learning Aerial Image Segmentation From Online Maps
Resources for students
Resource link collection
- Resources for students – Frédo Durand (MIT)
- Advice for Graduate Students – Aaron Hertzmann (Adobe Research)
- Graduate Skills Seminars – Yashar Ganjali, Aaron Hertzmann (University of Toronto)
- Research Skills – Simon Peyton Jones (Microsoft Research)
- Resource collection – Tao Xie (UIUC) and Yuan Xie (UCSB)
Writing
- Write Good Papers – Frédo Durand (MIT)
- Notes on writing – Frédo Durand (MIT)
- How to Write a Bad Article – Frédo Durand (MIT)
- How to write a good CVPR submission – William T. Freeman (MIT)
- How to write a great research paper – Simon Peyton Jones (Microsoft Research)
- How to write a SIGGRAPH paper – SIGGRAPH ASIA 2011 Course
- Writing Research Papers – Aaron Hertzmann (Adobe Research)
- How to Write a Paper for SIGGRAPH – Jim Blinn
- How to Get Your SIGGRAPH Paper Rejected – Jim Kajiya (Microsoft Research)
- How to write a SIGGRAPH paper – Li-Yi Wei (The University of Hong Kong)
- How to Write a Great Paper – Martin Hering-Bertram (Hochschule Bremen University of Applied Sciences)
- How to have a paper get into SIGGRAPH? – Takeo Igarashi (The University of Tokyo)
- Good Writing – Marc H. Raibert (Boston Dynamics, Inc.)
- How to Write a Computer Vision Paper – Derek Hoiem (UIUC)
- Common mistakes in technical writing – Wojciech Jarosz (Dartmouth College)
Presentation
- Giving a Research Talk – Frédo Durand (MIT)
- How to give a good talk – David Fleet (University of Toronto) and Aaron Hertzmann (Adobe Research)
- Designing conference posters – Colin Purrington
Research
- How to do research – William T. Freeman (MIT)
- You and Your Research – Richard Hamming
- Warning Signs of Bogus Progress in Research in an Age of Rich Computation and Information – Yi Ma (UIUC)
- Seven Warning Signs of Bogus Science – Robert L. Park
- Five Principles for Choosing Research Problems in Computer Graphics – Thomas Funkhouser (Cornell University)
- How To Do Research In the MIT AI Lab – David Chapman (MIT)
- Recent Advances in Computer Vision – Ming-Hsuan Yang (UC Merced)
- How to Come Up with Research Ideas in Computer Vision? – Jia-Bin Huang (UIUC)
- How to Read Academic Papers – Jia-Bin Huang (UIUC)
Time Management
- Time Management – Randy Pausch (CMU)
Blogs
- Learn OpenCV – Satya Mallick
- Tombone’s Computer Vision Blog – Tomasz Malisiewicz
- Computer vision for dummies – Vincent Spruyt
- Andrej Karpathy blog – Andrej Karpathy
- AI Shack – Utkarsh Sinha
- Computer Vision Talks – Eugene Khvedchenya
- Computer Vision Basics with Python Keras and OpenCV – Jason Chin (University of Western Ontario)