Title Page
Contents
ABSTRACT 8
Abstract (in Korean) 9
Chapter 1. Introduction 10
Chapter 2. Related Works 13
Chapter 3. Problem Formulation 15
3.1. Filter Pruning in a CNN 15
3.2. Problem of Restoring a Pruned Network without Data and Fine-Tuning 17
Chapter 4. Proposed Method of Restoring Pruned Filters 19
4.1. Data-Independent Reconstruction Loss 19
4.2. Our Train-Free Recovery Method with a Closed-Form Solution 24
Chapter 5. Experiments 26
5.1. Experiments using CIFAR-10 and CIFAR-100 26
5.2. Experiments using ImageNet (ILSVRC2012) 27
5.3. Effectiveness in terms of Reconstruction 28
5.4. Practical Test with Data and Fine-Tuning 29
Chapter 6. Conclusion 31
Chapter 7. Appendix 32
7.1. Notation Table 32
7.2. Proofs of Our Theoretical Results 33
7.2.1. Proof of Lemma 4.2 34
7.2.2. Proof of Theorem 4.4 35
7.2.3. Proof of Theorem 4.5 35
7.3. Experimental Details 37
Bibliography 40
Curriculum Vitae (in Korean) 46
List of Tables
TABLE 5.1. Recovery results of VGG-16 on CIFAR-10 28
TABLE 5.2. Recovery results of ResNet-34 on ImageNet 28
TABLE 5.3. Recovery results of ResNet-101 on ImageNet 28
TABLE 5.4. Fine-tuned or trained accuracies of ResNet-50 on CIFAR-100 using the L2-norm criterion, where we fine-tune each model for 20 epochs and train the... 30
TABLE 7.1. Table of notations 32
TABLE 7.2. Recovery results of ResNet-50 on CIFAR-100 37
TABLE 7.3. Recovery results of MobileNet-V2 on ImageNet 37
TABLE 7.4. Hyperparameters of VGG-16 on CIFAR-10 38
TABLE 7.5. Hyperparameters of ResNet-50 on CIFAR-100 38
TABLE 7.6. Hyperparameters of ResNet-34 on ImageNet 38
TABLE 7.7. Hyperparameters of ResNet-101 on ImageNet 39
List of Figures
FIGURE 1.1. The conceptual overview of our LBYL method, showing how the original output resulting from a pruned filter at the ℓ-th layer, that is, the output of the (ℓ+1)-th... 12
FIGURE 3.1. Comparison between the pruning matrix and the delivery matrix, where the 4th and 6th filters are pruned among 6 original filters 18
FIGURE 5.1. Comparison of the three error components with NM, where each m n k on the x-axis represents the k-th conv module in the n-th block at the m-th layer... 29
FIGURE 5.2. Comparison of the learning curves of fine-tuning restored networks for 20 epochs and of training the same-sized small architecture from scratch for 80 epochs... 30