Table of Contents

Title Page

Contents

ABSTRACT 9

Abstract in Korean 11

CHAPTER 1. Introduction 13

1.1. Study Context and Objectives 13

1.2. Core Topics of the Thesis 15

CHAPTER 2. Related Works 17

2.1. Time-Explicit Video Prediction Models 17

2.2. Time-Implicit Video Prediction Models 17

2.3. CNN Models for Video Prediction 18

2.4. Time Series Forecasting Methods 18

CHAPTER 3. Methods 20

3.1. TDM-VP Architecture 20

3.2. Decomposition of 1x1 Convolution 24

3.2.1. Conventional Method on Spatio-Temporal Channels 24

3.2.2. Proposed Spatio-Temporal Decomposition 25

3.3. Time Series Forecasting on Temporal Channels 27

3.3.1. Proposed Trend-Remainder Decomposition 27

CHAPTER 4. Experiments 29

4.1. Configuration 29

4.1.1. Datasets 29

4.1.2. Evaluation Metrics 30

4.1.3. Experimental Details 30

4.2. Performance Comparisons 33

4.2.1. Quantitative Analysis 33

4.2.2. Qualitative Analysis 34

4.3. Ablation Study 35

4.3.1. Results of Structural Adjustments 35

CHAPTER 5. Conclusion 37

5.1. Contribution 37

5.2. Future Work 38

REFERENCES 39

[Table 4-1] Benchmark Dataset Description 29

[Table 4-2] Settings of Each Component 30

[Table 4-3] Quantitative Results on Moving MNIST Dataset 33

[Table 4-4] Ablation Study of Proposed Method 35

[Figure 1-1] Temporal Processing Methods Comparison 14

[Figure 1-2] Proposed Method for Spatio-Temporal Channel Processing 15

[Figure 3-1] Overall Structure 20

[Figure 3-2] Translator Block 21

[Figure 3-3] Spatial Aggregation Block 22

[Figure 3-4] Channel Aggregation Block 23

[Figure 3-5] Standard 1x1 Convolution on Spatio-Temporal Channels 24

[Figure 3-6] Decomposition of 1x1 Convolution 25

[Figure 3-7] Decomposition of Temporal Channels 27

[Figure 4-1] Moving MNIST Example 29

[Figure 4-2] Moving MNIST Qualitative Comparisons 34