Datasets

wiki: List of datasets for machine learning research

Image

MNIST

MNIST: handwritten digits

ImageNet

More than 14 million images
More than 20,000 categories
Labeled objects, bounding boxes, descriptive words, SIFT features

CIFAR-10/100

CIFAR-10 and CIFAR-100 datasets
Low resolution (32×32 pixels). Good for generative models, but harder to learn than face datasets.
CIFAR-10: 10 classes, with 6,000 images per class
CIFAR-100: 100 classes, with 600 images per class
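The per-class counts above imply the same total size for both datasets. A minimal sketch of that arithmetic, where the dictionary layout and the helper name `total_images` are illustrative assumptions and not part of any CIFAR loader:

```python
# Per-class counts as quoted in the notes above.
# The dict layout and helper name are hypothetical, for illustration only.
CIFAR = {
    "CIFAR-10": {"classes": 10, "images_per_class": 6000},
    "CIFAR-100": {"classes": 100, "images_per_class": 600},
}


def total_images(name: str) -> int:
    """Total number of images implied by classes * images per class."""
    spec = CIFAR[name]
    return spec["classes"] * spec["images_per_class"]


for name in CIFAR:
    print(name, total_images(name))
```

Both variants come out to 60,000 images, so CIFAR-100 trades samples per class for a finer label set.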

NUS-WIDE

Common Objects in Context (COCO)

COCO - Complex Adaptive Systems Laboratory
Number of images: 330,000, of which more than 200,000 are labeled (roughly equal halves for training and validation+test)
Number of classes: 80 object categories, 91 stuff categories
Image resolution: 640×480

Open Images

Open Images Dataset V5 by Google: labels, bounding boxes, segmentation masks, and relationship annotations

Image Segmentation

Dataset         Images    Obj. Inst.    Obj. Cls.   Part Inst.   Part Cls.   Obj. Cls. per Img.
COCO            123,287   886,284       91          0            0           3.5
ImageNet∗       476,688   534,309       200         0            0           1.7
NYU Depth V2    1,449     34,064        894         0            0           14.1
Cityscapes      25,000    65,385        30          0            0           12.2
SUN             16,873    313,884       4,479       0            0           9.8
OpenSurfaces    22,214    71,460        160         0            0           N/A
PascalContext   10,103    ∼104,398∗∗    540         181,770      40          5.1
ADE20K          22,210    434,826       2,693       175,961      476         9.9
from Scene Parsing through ADE20K Dataset
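One quantity the table does not list directly is annotation density, i.e. object instances per image. A rough sketch of how it could be derived from the Images and Obj. Inst. columns; the names `TABLE` and `instances_per_image` are illustrative, and the two rows carrying footnote markers (ImageNet, PascalContext) are left out because their counts are qualified:

```python
# (images, object instances) copied from the table above.
# Rows with footnote markers (ImageNet, PascalContext) are omitted on purpose.
TABLE = {
    "COCO": (123_287, 886_284),
    "NYU Depth V2": (1_449, 34_064),
    "Cityscapes": (25_000, 65_385),
    "SUN": (16_873, 313_884),
    "OpenSurfaces": (22_214, 71_460),
    "ADE20K": (22_210, 434_826),
}


def instances_per_image(name: str) -> float:
    """Average number of annotated object instances per image."""
    images, instances = TABLE[name]
    return instances / images


# Print datasets from densest to sparsest annotation.
for name in sorted(TABLE, key=instances_per_image, reverse=True):
    print(f"{name:<14}{instances_per_image(name):6.2f}")
```

This is a different measure from the last column of the table, which counts distinct object classes per image, not instances.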

RGB-D

SUN RGB-D

A RGB-D Scene Understanding Benchmark Suite
Project page
Challenge

NYU dataset

https://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html

Object Tracking

YouTube-BB

MOT Challenge: Multiple Object Tracking

MOT Challenge: detections are provided

OTB2015

OTB2015: one bounding box is provided in a reference frame, and the tracker must then follow that object through the following frames
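In this setting a tracker outputs one box per frame, and a common way to score it is the overlap between predicted and ground-truth boxes. A minimal sketch, assuming boxes are given as (x, y, w, h) tuples and using a 0.5 overlap threshold; the box format, the threshold, and the function names are assumptions made for illustration, not details taken from the notes:

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned (x, y, w, h) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def success_rate(predicted, ground_truth, threshold=0.5):
    """Fraction of frames whose overlap reaches the (assumed) threshold."""
    hits = sum(iou(p, g) >= threshold for p, g in zip(predicted, ground_truth))
    return hits / len(ground_truth)


# Toy sequence: frame 1 is tracked perfectly, frame 2 has drifted by half a box.
gt = [(0, 0, 10, 10), (0, 0, 10, 10)]
pred = [(0, 0, 10, 10), (5, 0, 10, 10)]
print(success_rate(pred, gt))
```

Sweeping the threshold from 0 to 1 and plotting the resulting success rate gives a curve that summarizes tracker quality without committing to a single cutoff.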

Face Datasets

Labeled Faces in the Wild (LFW)

YouTube Faces (YTF)

MegaFace Challenge

Pose

Human Activity

ActivityNet

HDR Datasets

Fairchild

Fairchild: tone mapping with multiple exposures, 106 images

Video

Optical Flow

Video Deblurring

  • Hand-held camera, from Deep Video Deblurring for Hand-held Cameras (CVPR 2017)
  • GoPro, from Deep Multi-Scale Convolutional Neural Network for Dynamic Scene Deblurring (CVPR 2017)
  • REDS from NTIRE 2019

Video Restoration

REDS

from NTIRE 2019

  1. sharp (ground truth)
  2. blur
  3. blur+compression
  4. low resolution
  5. blur + low resolution

Vimeo90K

from Video Enhancement with Task-Oriented Flow (IJCV 2019)

  1. temporal frame interpolation
  2. video denoising
  3. video deblocking
  4. video super-resolution

Image grading

MIT-Adobe FiveK Dataset

Adobe FiveK
5,000 photos in DNG format
An Adobe Lightroom catalog with renditions by 5 experts
Semantic information about each photo

Image Compression