Title Page
Contents
ABSTRACT 8
Abstract (in Korean) 9
Chapter 1. Introduction 10
Chapter 2. Related Works 13
Chapter 3. Problem Formulation 15
3.1. Filter Pruning in a CNN 15
3.2. Problem of Restoring a Pruned Network without Data and Fine-Tuning 17
Chapter 4. Proposed Method of Restoring Pruned Filters 19
4.1. Data-Independent Reconstruction Loss 19
4.2. Our Train-Free Recovery Method with a Closed-Form Solution 24
Chapter 5. Experiments 26
5.1. Experiments using CIFAR-10 and CIFAR-100 26
5.2. Experiments using ImageNet (ILSVRC2012) 27
5.3. Effectiveness in terms of Reconstruction 28
5.4. Practical Test with Data and Fine-Tuning 29
Chapter 6. Conclusion 31
Chapter 7. Appendix 32
7.1. Notation Table 32
7.2. Proofs of Our Theoretical Results 33
7.2.1. Proof of Lemma 4.2 34
7.2.2. Proof of Theorem 4.4 35
7.2.3. Proof of Theorem 4.5 35
7.3. Experimental Details 37
Bibliography 40
Curriculum Vitae (in Korean) 46
List of Tables
TABLE 5.1. Recovery results of VGG-16 on CIFAR-10 28
TABLE 5.2. Recovery results of ResNet-34 on ImageNet 28
TABLE 5.3. Recovery results of ResNet-101 on ImageNet 28
TABLE 5.4. Fine-tuned or trained accuracies of ResNet-50 on CIFAR-100 using the L2-norm criterion, where we fine-tune each model for 20 epochs and train the... 30
TABLE 7.1. Table of notations 32
TABLE 7.2. Recovery results of ResNet-50 on CIFAR-100 37
TABLE 7.3. Recovery results of MobileNet-V2 on ImageNet 37
TABLE 7.4. Hyperparameters of VGG-16 on CIFAR-10 38
TABLE 7.5. Hyperparameters of ResNet-50 on CIFAR-100 38
TABLE 7.6. Hyperparameters of ResNet-34 on ImageNet 38
TABLE 7.7. Hyperparameters of ResNet-101 on ImageNet 39
List of Figures
FIGURE 1.1. The conceptual overview of our LBYL method, showing how the original output resulting from a pruned filter at the ℓ-th layer, that is, the output of the (ℓ+1)-th... 12
FIGURE 3.1. Comparison between the pruning matrix and the delivery matrix, where the 4th and 6th filters are pruned among 6 original filters 18
FIGURE 5.1. Comparison of the three error components with NM, where each m n k on the x-axis represents the k-th conv module in the n-th block at the m-th layer... 29
FIGURE 5.2. Comparison of the learning curves of fine-tuning restored networks for 20 epochs and of training the same-sized small architecture from scratch for 80 epochs... 30