Datasets¶

wiki: List of datasets for machine learning research

Image¶

MNIST¶

MNIST: handwritten digits

ImageNet¶

14 million images
20,000 categories
Labeled objects, bounding boxes, descriptive words, SIFT features

CIFAR-10/100¶

CIFAR-10 and CIFAR-100 datasets
Low resolution. Good for generative models. Harder to learn than face datasets.
CIFAR-10: 10 classes, with 6000 images per class
CIFAR-100: 100 classes, with 600 images per class
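
As a quick sanity check on the class counts above, the per-class figures imply that both datasets have the same total size. A minimal sketch in Python; the dictionary and variable names are illustrative only:

```python
# Classes and images per class, as listed above.
cifar = {
    "CIFAR-10": {"classes": 10, "images_per_class": 6000},
    "CIFAR-100": {"classes": 100, "images_per_class": 600},
}

# Total images = number of classes * images per class.
totals = {
    name: spec["classes"] * spec["images_per_class"]
    for name, spec in cifar.items()
}

for name, total in totals.items():
    print(f"{name}: {total} images")
```

CIFAR-100 trades images per class for a finer label set, so each class has a tenth of the examples that a CIFAR-10 class has.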

NUS-WIDE¶

Common Objects in Context (COCO)¶

COCO - Complex Adaptive Systems Laboratory
Number of images in the dataset: 330,000, of which more than 200,000 are labeled (roughly equal halves for training and validation+test)
Number of classes: 80 object categories, 91 stuff categories
Image resolution: 640×480

Open Images¶

Open Images Dataset V5 by Google: labels, bounding boxes, segmentation masks, and relationship annotations

Image Segmentation¶

Dataset	Images	Obj. Inst.	Obj. Cls.	Part Inst.	Part Cls.	Obj. Cls. per Img.
COCO	123,287	886,284	91	0	0	3.5
ImageNet∗	476,688	534,309	200	0	0	1.7
NYU Depth V2	1,449	34,064	894	0	0	14.1
Cityscapes	25,000	65,385	30	0	0	12.2
SUN	16,873	313,884	4,479	0	0	9.8
OpenSurfaces	22,214	71,460	160	0	0	N/A
PascalContext	10,103	∼104,398∗∗	540	181,770	40	5.1
ADE20K	22,210	434,826	2,693	175,961	476	9.9
Source: Scene Parsing through ADE20K Dataset
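
The table gives total images and total object instances but not their ratio. The sketch below derives object instances per image from those two columns. This derived figure is not part of the source table, and the helper name `to_int` is an illustrative choice.

```python
import re

# Rows copied from the table above: dataset, images, object instances.
ROWS = """\
COCO\t123,287\t886,284
ImageNet\t476,688\t534,309
NYU Depth V2\t1,449\t34,064
Cityscapes\t25,000\t65,385
SUN\t16,873\t313,884
OpenSurfaces\t22,214\t71,460
PascalContext\t10,103\t∼104,398∗∗
ADE20K\t22,210\t434,826
"""


def to_int(cell: str) -> int:
    """Drop thousands separators and footnote marks, keeping only digits."""
    return int(re.sub(r"\D", "", cell))


instances_per_image = {}
for line in ROWS.splitlines():
    name, images, instances = line.split("\t")
    instances_per_image[name] = to_int(instances) / to_int(images)

# Print datasets from densest to sparsest annotation.
for name, ratio in sorted(instances_per_image.items(), key=lambda kv: -kv[1]):
    print(f"{name:14s} {ratio:6.1f}")
```

The PascalContext value is approximate in the source (marked with ∼), so its ratio is approximate as well.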

RGB-D¶

SUN RGB-D¶

A RGB-D Scene Understanding Benchmark Suite
Project page
Challenge

NYU dataset¶

https://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html

Object Tracking¶

youtube-bb¶

MOT Challenge: Multiple Object Tracking¶

MOT Challenge: detections are provided

OTB2015¶

OTB2015: one bounding box is provided in a reference frame; the tracker then follows that object through the remaining frames.
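
The single-box protocol can be sketched as follows. This is a toy illustration under stated assumptions, not the benchmark's toolkit: the `(x, y, width, height)` box layout is assumed, `static_tracker` is a hypothetical baseline that simply repeats the reference box, the ground-truth track is made up, and intersection-over-union (IoU) is used here only as a common way to compare a predicted box with an annotated one.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height); assumed layout


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def static_tracker(init_box: Box, num_frames: int) -> List[Box]:
    """Hypothetical baseline: repeat the reference box for every frame."""
    return [init_box] * num_frames


# Made-up ground truth: the object drifts 5 pixels to the right per frame.
ground_truth = [(10.0 + 5 * t, 20.0, 40.0, 40.0) for t in range(4)]

# The tracker only sees the box from the reference (first) frame.
predictions = static_tracker(ground_truth[0], len(ground_truth))

overlaps = [iou(p, g) for p, g in zip(predictions, ground_truth)]
print([round(o, 3) for o in overlaps])
```

The overlap starts at 1.0 on the reference frame and falls as the object moves away, which is the drift a real tracker has to correct.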

Face Datasets¶

Labeled Faces in the Wild (LFW)¶

YouTube Faces (YTF)¶

MegaFace Challenge¶

Pose¶

Human Activity¶

ActivityNet

HDR Datasets¶

Fairchild¶

Fairchild: tone mapping with multiple exposures, 106 images

Video¶

Optical Flow¶

Video Deblurring¶

Hand-held camera, from Deep Video Deblurring for Hand-held Cameras (CVPR 2017)
GoPro, from Deep Multi-Scale Convolutional Neural Network for Dynamic Scene Deblurring (CVPR 2017)
REDS from NTIRE 2019

Video Restoration¶

REDS¶

from NTIRE 2019

sharp (ground truth)
blur
blur+compression
low resolution
blur + low resolution

Vimeo90K¶

from Video Enhancement with Task-Oriented Flow (IJCV 2019)

temporal frame interpolation
video denoising
video deblocking
video super-resolution

Image grading¶

MIT-Adobe FiveK Dataset¶

Adobe FiveK
5,000 photos in DNG format
An Adobe Lightroom catalog with renditions by 5 experts
Semantic information about each photo

Image Compression¶

Image Compression