Paper Summary on Noise, Anomalies, Adversaries, Robust Learning, and Generalization in Blogs
means being highly related to my personal research interests.
ICCV 2019 on label noise, …
 Deep Self-Learning From Noisy Labels: The proposed SMP trains in an iterative manner with two phases: the first phase trains a network with both the original noisy labels and the corrected labels generated in the second phase.
 Co-Mining: Deep Face Recognition With Noisy Labels: Proposes a novel co-mining framework that employs two peer networks to detect noisy faces, exchanges the high-confidence clean faces, and re-weights the clean faces in a mini-batch fashion.
 NLNL: Negative Learning for Noisy Labels: Positive Learning (PL) trains with "the input image belongs to this label"; Negative Learning (NL) trains CNNs with a complementary label, as in "the input image does not belong to this complementary label."
 Symmetric Cross Entropy for Robust Learning With Noisy Labels: Already compared against in our method.
 O2U-Net: A Simple Noisy Label Detection Approach for Deep Neural Networks: It only requires adjusting the hyper-parameters of the deep network to make its status transfer cyclically from overfitting to underfitting (O2U). The loss of each sample is recorded during the iterations; the higher a sample's normalized average loss, the higher the probability that its label is noisy.
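The O2U-Net ranking idea can be sketched in a few lines (my own illustration, not the authors' code; the per-epoch mean-subtraction is one plausible choice of normalization):

```python
import numpy as np

def rank_noisy_by_o2u(loss_history):
    """Rank samples by normalized average loss, in the spirit of O2U-Net.

    loss_history: array of shape (num_epochs, num_samples) holding the
    per-sample training loss recorded at each epoch of the cyclic schedule.
    Returns sample indices sorted from most to least likely noisy.
    """
    losses = np.asarray(loss_history, dtype=float)
    # Normalize each epoch's losses by subtracting the epoch mean, so that
    # epochs with globally higher loss do not dominate the ranking.
    normalized = losses - losses.mean(axis=1, keepdims=True)
    avg_loss = normalized.mean(axis=0)
    # Higher normalized average loss => higher chance the label is noisy.
    return np.argsort(-avg_loss)

# Toy example: sample 2 has a consistently higher loss across epochs.
history = [[0.2, 0.3, 1.5],
           [0.1, 0.2, 1.2],
           [0.3, 0.4, 1.8]]
print(rank_noisy_by_o2u(history))  # prints [2 1 0]
```

Samples at the top of the ranking are candidates for relabeling or removal before retraining.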
NeurIPS 2019 - Meta-Weight-Net: Learning an Explicit Mapping For Sample Weighting
 Targeted problems: (1) corrupted labels; (2) class imbalance.
 Methodology: Guided by a small amount of unbiased meta-data, learn an explicit weighting function that takes training losses as input and outputs example weights.
 Code: https://github.com/xjtushujun/meta-weight-net
 Introduction: Why are the targeted problems important? In practice, such biased training data are commonly encountered. For instance, practically collected training samples often contain corrupted labels [10, 11, 12, 13, 14, 15, 16, 17]. A typical example is a dataset roughly collected from a crowdsourcing system [18] or search engines [19, 20], which may yield a large number of noisy labels. Another popular type of biased training data is data with class imbalance. Real-world datasets usually exhibit skewed, long-tailed distributions: a few classes account for most of the data, while most classes are under-represented. Effective learning with such training data, whose distribution is biased relative to the evaluation/test data, is thus an important yet challenging issue in machine learning [1, 21].
 There exist two entirely contradictory ideas for constructing such a loss-weight mapping:
 Emphasize harder samples: Make the learning emphasize samples with larger loss values, since they are more likely to be uncertain hard samples located near the classification boundary. Typical methods in this category include AdaBoost [22, 23], hard negative mining [24], and focal loss [25]. This weighting manner is known to be necessary for class-imbalance problems, since it can prioritize the minority classes, which have relatively higher training losses.
 Emphasize easier samples: The rationale is that these samples are more likely to be high-confidence ones with clean labels. Typical methods include self-paced learning (SPL) [26], iterative reweighting [27, 17], and multiple variants [28, 29, 30]. This weighting strategy is especially used in noisy-label cases, since it tends to suppress the effect of samples with extremely large loss values, which possibly have corrupted labels.
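The two contradictory weighting schemes can be contrasted with a toy sketch (hypothetical weighting functions for illustration only; the cited methods each define their own precise form):

```python
import numpy as np

def focal_style_weight(losses, gamma=1.0):
    """Emphasize harder samples: the weight grows with the loss
    (in the spirit of focal loss / hard negative mining)."""
    losses = np.asarray(losses, dtype=float)
    return (losses / losses.max()) ** gamma

def self_paced_weight(losses, threshold=1.0):
    """Emphasize easier samples: keep only samples whose loss is below a
    threshold (the hard 0/1 weighting used by self-paced learning)."""
    losses = np.asarray(losses, dtype=float)
    return (losses < threshold).astype(float)

losses = [0.1, 0.5, 2.0]           # the last sample is "hard" (or mislabeled)
print(focal_style_weight(losses))  # the hard sample gets the largest weight
print(self_paced_weight(losses))   # the hard sample gets weight 0
```

The same high-loss sample is up-weighted by the first scheme (useful under class imbalance) and discarded by the second (useful under label noise), which is exactly the conflict Meta-Weight-Net tries to resolve by learning the mapping from data.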

Deficiencies of these conventional methods:
 They cannot handle the case where the training set is both imbalanced and noisy.
 They inevitably involve hyper-parameters that must be manually preset or tuned by cross-validation.
 Experiments of this work:
 Class Imbalance Experiments
 ResNet-32 on long-tailed CIFAR-10 and CIFAR-100.
 Corrupted Label Experiments on CIFAR-10 and CIFAR-100
 WRN-28-10 with varying noise rates under uniform noise.
 ResNet-32 with varying noise rates under flip (non-uniform) noise.
 Real-world data: Clothing1M with ResNet-50
 We use the 7k clean data as the meta dataset.
 Problems of this work:
 For the case where the training set is both imbalanced and noisy, the authors mention in the introduction that conventional methods cannot address it; however, there is no experiment to demonstrate that this method works in that case.
 Conventional methods inevitably involve hyper-parameters tuned by cross-validation, but the proposed method requires unbiased meta-data, which is an even more expensive factor in practice: tuning hyper-parameters is cheaper than collecting unbiased meta-data for training the weighting function.
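The explicit weighting function itself is tiny; a minimal sketch of its forward pass (my own illustration with fixed random parameters; in the paper the MLP's parameters are meta-learned on the unbiased meta-data):

```python
import numpy as np

class MetaWeightNet:
    """Sketch of Meta-Weight-Net's weighting function: a one-hidden-layer
    MLP that maps a sample's training loss to a weight in (0, 1)."""

    def __init__(self, hidden=100, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(scale=0.1, size=(1, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(scale=0.1, size=(hidden, 1))
        self.b2 = np.zeros(1)

    def __call__(self, losses):
        x = np.asarray(losses, dtype=float).reshape(-1, 1)
        h = np.maximum(x @ self.w1 + self.b1, 0.0)        # ReLU hidden layer
        out = h @ self.w2 + self.b2
        return (1.0 / (1.0 + np.exp(-out))).reshape(-1)   # sigmoid -> (0, 1)

net = MetaWeightNet()
weights = net([0.1, 0.5, 2.0])  # one weight per training loss
```

The weighted training objective is then the sum of weight times per-sample loss; the weighting net's parameters are updated by a meta-gradient step on the meta-data rather than by hand-tuning.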
ICML 2019 - Better generalization with less data using robust gradient descent
GANs, Adversarial Examples, Adversarial Machine Learning
Label Noise
 NeurIPS 2019 - L_DMI: A Novel Information-Theoretic Loss Function for Training Deep Nets Robust to Label Noise
 NeurIPS 2019 - Are Anchor Points Really Indispensable in Label-Noise Learning?
 NeurIPS 2019 - Combinatorial Inference against Label Noise
NeurIPS 2019 - Noise-tolerant fair classification
NOTE: Existing work on the problem operates under the assumption that the sensitive feature available in one’s training sample is perfectly reliable. This assumption may be violated in many real-world cases: for example, respondents to a survey may choose to conceal or obfuscate their group identity out of fear of potential discrimination. This poses the question of whether one can still learn fair classifiers given noisy sensitive features.
NeurIPS 2019 - Neural networks grown and self-organized by noise
NOTE: Living neural networks emerge through a process of growth and self-organization that begins with a single cell and results in a brain, an organized and functional computational device. Artificial neural networks, however, rely on human-designed, hand-programmed architectures for their remarkable performance. Can we develop artificial computational devices that can grow and self-organize without human intervention? In this paper, we propose a biologically inspired developmental algorithm that can ‘grow’ a functional, layered neural network from a single initial cell. The algorithm organizes inter-layer connections to construct a convolutional pooling layer, a key constituent of convolutional neural networks (CNNs). Our approach is inspired by the mechanisms employed by the early visual system to wire the retina to the lateral geniculate nucleus (LGN), days before animals open their eyes. The key ingredients for robust self-organization are an emergent spontaneous spatiotemporal activity wave in the first layer and a local learning rule in the second layer that ‘learns’ the underlying activity pattern in the first layer. The algorithm is adaptable to a wide range of input-layer geometries, robust to malfunctioning units in the first layer, and can thus be used to successfully grow and self-organize pooling architectures of different pool sizes and shapes. The algorithm provides a primitive procedure for constructing layered neural networks through growth and self-organization. Broadly, our work shows that biologically inspired developmental algorithms can be applied to autonomously grow functional ‘brains’ in silico.
Stochastic Gradient Noise
 ICML 2019 - A Tail-Index Analysis of Stochastic Gradient Noise in Deep Neural Networks
 NeurIPS 2019 - First Exit Time Analysis of Stochastic Gradient Descent Under Heavy-Tailed Gradient Noise
Denoiser, Noise Removal
 NeurIPS 2019 - Extending Stein’s unbiased risk estimator to train deep denoisers with correlated pairs of noisy images
 NeurIPS 2019 - Variational Denoising Network: Toward Blind Noise Modeling and Removal