Tag archive: few-shot learning

Free Lunch for Few-shot Learning: Distribution Calibration

Learning from a limited number of samples is challenging since the learned model can easily become overfitted based on the biased distribution formed by only a few training examples. In this paper, we calibrate the distribution of these few-sample classes by transferring statistics from the classes with sufficient examples, then an adequate number of examples can be sampled from the calibrated distribution to expand the inputs to the classifier. We assume every dimension in the feature representation follows a Gaussian distribution so that the mean and the variance of the distribution can borrow from that of similar classes whose statistics are better estimated with an adequate number of samples. Our method can be built on top of off-the-shelf pretrained feature extractors and classification models without extra parameters. We show that a simple logistic regression classifier trained using the features sampled from our calibrated distribution can outperform the state-of-the-art accuracy on two datasets (~5% improvement on miniImageNet compared to the next best). The visualization of these generated features demonstrates that our calibrated distribution is an accurate estimation.

https://arxiv.org/abs/2101.06395

Few-shot learning has long been a challenging task, because a model trained on only a few samples easily overfits to the biased distribution they form. In this paper, we transfer statistics from data-rich classes to calibrate the biased distribution of the few-shot classes, and then sample a sufficient number of examples from the calibrated distribution for training. We assume that every dimension of the feature representation follows a Gaussian distribution, which means we can borrow the means and variances of similar classes estimated from abundant data. Our method can be used on top of off-the-shelf pretrained feature extractors and classifiers without introducing extra parameters. Experiments show that a simple logistic regression classifier trained on features from the calibrated distribution achieves about a 5% accuracy gain on miniImageNet.
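The calibration step is simple enough to sketch end to end. Below is a minimal Python illustration of the idea described above: per-dimension Gaussian statistics are borrowed from the nearest base classes, extra features are sampled from the calibrated Gaussian, and a plain logistic regression is trained on the enlarged set. All names and numbers here (base_means, k=2, alpha=0.2, the synthetic features) are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of distribution calibration on synthetic features.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_base, dim = 64, 640

# Hypothetical base-class statistics, each estimated from many samples:
# per-dimension mean and variance of each base class in feature space.
base_means = rng.normal(size=(n_base, dim))
base_vars = rng.uniform(0.5, 1.5, size=(n_base, dim))

def calibrate(support_feature, k=2, alpha=0.2):
    """Transfer statistics from the k base classes whose means are closest
    to the support feature; alpha adds a constant to the borrowed variance."""
    nearest = np.argsort(np.linalg.norm(base_means - support_feature, axis=1))[:k]
    mean = np.mean(np.vstack([base_means[nearest], support_feature[None]]), axis=0)
    var = np.mean(base_vars[nearest], axis=0) + alpha
    return mean, var

# Two novel classes, one labeled feature each (a toy 2-way 1-shot task).
supports = rng.normal(size=(2, dim))
X, y = [], []
for label, s in enumerate(supports):
    mean, var = calibrate(s)
    # Sample extra features from the calibrated diagonal Gaussian,
    # matching the per-dimension Gaussian assumption.
    sampled = rng.normal(mean, np.sqrt(var), size=(100, dim))
    X.append(np.vstack([s[None], sampled]))
    y.append(np.full(101, label))

# A simple logistic-regression classifier trained on support + sampled features.
clf = LogisticRegression(max_iter=1000).fit(np.vstack(X), np.concatenate(y))
```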

Learning from Very Few Samples: A Survey

Few sample learning (FSL) is significant and challenging in the field of machine learning. The capability of learning and generalizing from very few samples successfully is a noticeable demarcation separating artificial intelligence and human intelligence, since humans can readily establish their cognition of novelty from just a single or a handful of examples, whereas machine learning algorithms typically entail hundreds or thousands of supervised samples to guarantee generalization ability. Despite the long history dating back to the early 2000s and the widespread attention in recent years with booming deep learning technologies, few surveys or reviews of FSL are available until now. In this context, we extensively review 200+ papers on FSL spanning from the 2000s to 2019 and provide a timely and comprehensive survey of FSL. In this survey, we review the evolution history as well as the current progress of FSL, categorize FSL approaches into generative model based and discriminative model based kinds in principle, and emphasize particularly the meta learning based FSL approaches. We also summarize several recently emerging extensional topics of FSL and review the latest advances on these topics. Furthermore, we highlight the important FSL applications covering many research hotspots in computer vision, natural language processing, audio and speech, reinforcement learning and robotics, data analysis, etc. Finally, we conclude the survey with a discussion on promising trends, in the hope of providing guidance and insights to follow-up research.

This paper is a survey of few-shot learning. In machine learning, learning from very few samples is an extremely difficult and challenging task. The ability to learn and generalize under few-sample conditions has long been one of the yardsticks by which artificial intelligence is measured against human intelligence, since humans can quickly form a concept of a new object from only a handful of examples, whereas machine-learning models typically need hundreds or thousands of samples to reach reliable performance. From the early 2000s to the present, few-shot learning has attracted growing attention alongside the rise of deep learning, yet surveys of the topic remain scarce. In this paper, we review a large body of literature and summarize the history and development trends of few-shot learning.

Boosting Few-Shot Visual Learning with Self-Supervision

Few-shot learning and self-supervised learning address different facets of the same problem: how to train a model with little or no labeled data. Few-shot learning aims for optimization methods and models that can learn efficiently to recognize patterns in the low data regime. Self-supervised learning focuses instead on unlabeled data and looks into it for the supervisory signal to feed high capacity deep neural networks. In this work we exploit the complementarity of these two domains and propose an approach for improving few-shot learning through self-supervision. We use self-supervision as an auxiliary task in a few-shot learning pipeline, enabling feature extractors to learn richer and more transferable visual representations while still using few annotated samples. Through self-supervision, our approach can be naturally extended towards using diverse unlabeled data from other datasets in the few-shot setting. We report consistent improvements across an array of architectures, datasets and self-supervision techniques.

Few-shot learning and self-supervised learning address different facets of the same problem: how to train a model with little or no labeled data. Few-shot learning aims at optimization methods and models that can learn efficiently to recognize patterns in the low-data regime. Self-supervised learning instead focuses on unlabeled data, mining it for a supervisory signal to feed deep neural networks. In this work, we exploit the complementarity of these two fields and propose an approach that improves few-shot learning through self-supervision. We use self-supervision as an auxiliary task in the few-shot learning pipeline, enabling the feature extractor to learn richer and more transferable visual representations while still using only a few annotated samples. Through self-supervision, our approach naturally extends to using diverse unlabeled data from other datasets in the few-shot setting.

Paper: https://openaccess.thecvf.com/content_ICCV_2019/papers/Gidaris_Boosting_Few-Shot_Visual_Learning_With_Self-Supervision_ICCV_2019_paper.pdf

Code: https://github.com/valeoai/BF3S
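To make the auxiliary-task setup concrete, here is a minimal PyTorch sketch of one common pretext task, rotation prediction: the backbone is shared between a classification head and a rotation head, and the two cross-entropy losses are summed. The module names, the toy backbone, and the weight lambda_ss are illustrative assumptions, not the BF3S implementation.

```python
# Minimal sketch: self-supervised rotation prediction as an auxiliary loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self, n_classes, feat_dim=64):
        super().__init__()
        # Shared feature extractor (toy stand-in for the paper's backbone).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, feat_dim, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.cls_head = nn.Linear(feat_dim, n_classes)  # few-shot classifier
        self.rot_head = nn.Linear(feat_dim, 4)          # 0/90/180/270 degrees

    def forward(self, x):
        f = self.backbone(x)
        return self.cls_head(f), self.rot_head(f)

def rotations(x):
    """Return the batch rotated by 0, 90, 180, 270 degrees, with rotation labels."""
    rot = torch.cat([torch.rot90(x, k, dims=(2, 3)) for k in range(4)])
    labels = torch.arange(4).repeat_interleave(x.size(0))
    return rot, labels

model = Net(n_classes=5)
images = torch.randn(8, 3, 32, 32)        # toy batch
targets = torch.randint(0, 5, (8,))
lambda_ss = 1.0                           # weight of the auxiliary loss

rot_images, rot_labels = rotations(images)
cls_logits, _ = model(images)
_, rot_logits = model(rot_images)

# Joint objective: supervised few-shot loss + self-supervised rotation loss.
loss = F.cross_entropy(cls_logits, targets) + \
       lambda_ss * F.cross_entropy(rot_logits, rot_labels)
loss.backward()
```

Note that the rotation labels come for free from the transformation itself, which is why the auxiliary loss can also be computed on unlabeled images from other datasets, as the abstract describes.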