Transfer Learning

How transferable are features in deep neural networks? (NIPS 2014)
Example: given labelled grayscale MNIST and unlabelled color MNIST, train a classifier for color MNIST without labelling any color MNIST images.


Self-taught learning: transfer learning from unlabeled data (NIPS 2007)
Learn a feature representation from unlabelled data (e.g. with an auto-encoder), then train the classifier on that representation.

DaNN: Domain Adaptive Neural Networks for Object Recognition (PRICAI 2014)
Maximum Mean Discrepancy (MMD) measures the difference between two probability distributions using only samples drawn from them. It is an effective criterion because it compares the distributions without first estimating their density functions.
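
A minimal sketch of how MMD can be estimated from two samples, using only the Python standard library. The Gaussian (RBF) kernel, the `gamma` value, the biased estimator, and the toy data are assumptions made for illustration; they are not taken from the DaNN paper.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def mean_kernel(xs, ys, gamma):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))

def mmd_squared(source, target, gamma=1.0):
    """Biased estimate of the squared MMD between two samples."""
    return (mean_kernel(source, source, gamma)
            + mean_kernel(target, target, gamma)
            - 2.0 * mean_kernel(source, target, gamma))

# Toy data (hypothetical): one sample near the origin, one shifted copy.
source = [[0.0, 0.0], [0.1, -0.1], [-0.1, 0.1]]
shifted = [[2.0, 2.0], [2.1, 1.9], [1.9, 2.1]]

print(mmd_squared(source, source))   # identical samples give zero
print(mmd_squared(source, shifted))  # shifted sample gives a clearly positive value
```

No density is estimated anywhere: the estimate only needs pairwise kernel values between samples, which is why it can be used directly as a loss term on feature vectors.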


Deep domain confusion: Maximizing for domain invariance (2014)
Adds an adaptation layer together with a domain confusion loss based on MMD.


Train a big model first, then use it as a teacher for a small model (which has faster inference).

  1. Do Deep Nets Really Need to be Deep? (NIPS 2014): the small model learns the teacher's values before the softmax (the logits); unlabelled data can also be added.
  2. Distilling the Knowledge in a Neural Network (NIPS 2014): the small model learns the teacher's soft targets. This works better than training the small model directly on labelled data, probably because distillation prevents overfitting.
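
The two training signals above can be sketched in plain Python. This is an illustration under assumptions, not either paper's exact recipe: the temperature value, the mean-squared-error form of the logit loss, and the example logits are all hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with temperature; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def logit_matching_loss(student_logits, teacher_logits):
    """Item 1: regress the values before the softmax (mean squared error)."""
    diffs = [(s - t) ** 2 for s, t in zip(student_logits, teacher_logits)]
    return sum(diffs) / len(diffs)

def soft_target_loss(student_logits, teacher_logits, temperature=2.0):
    """Item 2: cross-entropy of the student's softened output against the teacher's."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))

teacher_logits = [5.0, 1.0, -2.0]   # hypothetical teacher output for one example
student_logits = [2.0, 1.5, 0.0]    # hypothetical student output for the same example

print(softmax(teacher_logits, temperature=1.0))  # nearly one-hot
print(softmax(teacher_logits, temperature=4.0))  # softer: keeps relative class similarity
print(logit_matching_loss(student_logits, teacher_logits))
print(soft_target_loss(student_logits, teacher_logits))
```

Neither loss needs ground-truth labels, only the teacher's outputs, which is why unlabelled data can be used to train the small model.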


Domain-Adversarial Neural Networks (NIPS 2014) - Hana Ajakan
Unsupervised Domain Adaptation by Backpropagation (ICML 2015) - Yaroslav Ganin
Domain-Adversarial Training of Neural Networks (JMLR 2016) - Yaroslav Ganin, Hana Ajakan

Module              Function
feature extractor   model to be transferred and fine-tuned
label predictor     predicts the output label
domain classifier   predicts whether an input comes from the source or the target domain. When it can tell the two domains apart, the feature extractor is penalised (high loss), which forces it to learn features that mix the two domains
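
In the Ganin papers listed above, the three modules are connected through a gradient reversal layer placed between the feature extractor and the domain classifier. The sketch below only illustrates that idea with plain lists of numbers: the class name, the `lam` weight, and the gradient values are hypothetical, and a real implementation would hook into a framework's automatic differentiation.

```python
class GradientReversal:
    """Identity in the forward pass; flips and scales the gradient in the backward pass."""

    def __init__(self, lam=1.0):
        self.lam = lam  # weight of the domain term (assumed hyperparameter)

    def forward(self, features):
        # Features reach the domain classifier unchanged.
        return list(features)

    def backward(self, grad_output):
        # The gradient coming back from the domain classifier is negated, so the
        # feature extractor is updated to make the domains harder to tell apart.
        return [-self.lam * g for g in grad_output]

def feature_gradient(grad_from_label, grad_from_domain, grl):
    """Total gradient reaching the feature extractor from both heads."""
    reversed_grad = grl.backward(grad_from_domain)
    return [a + b for a, b in zip(grad_from_label, reversed_grad)]

grl = GradientReversal(lam=0.5)
features = [1.0, -2.0]
grad_from_label = [1.0, 1.0]     # hypothetical gradient of the label loss
grad_from_domain = [0.2, -0.4]   # hypothetical gradient of the domain loss

print(grl.forward(features))
print(feature_gradient(grad_from_label, grad_from_domain, grl))
```

The domain classifier itself still receives the ordinary gradient and keeps trying to separate the domains; only the signal passed back to the feature extractor is reversed.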