Title Page
ABSTRACT
Contents
1. Introduction 13
1.1. Pedestrian Guidance System 13
1.2. Research Content 15
1.3. Research Target 16
2. Literature Review 17
2.1. Attention Mechanisms in Computer Vision 17
2.2. Squeeze-and-Excitation Networks 19
2.3. Convolutional Block Attention Modules 20
2.3.1. Spatial Attention Module (SAM) 21
2.3.2. Channel Attention Module (CAM) 22
2.4. Perspective Transformation in Image Processing 23
3. Real-Time Semantic Segmentation 25
3.1. Speed-Accuracy Trade-offs in Existing Models 25
3.2. Bilateral Segmentation Network (BiSeNet) 26
3.2.1. Attention Refinement Module 26
3.2.2. Feature Fusion Module 26
3.3. STDC2 27
3.3.1. STDC Module 27
4. Datasets 28
4.1. CityScapes 28
4.2. Korean Pedestrian Dataset 30
4.3. Training with Korean Pedestrian Dataset 31
4.4. Warp Perspective Transform 34
5. Experiment 37
5.1. Proposed Method 37
5.2. Experiment 38
5.2.1. Environment and setting 38
5.2.2. Result 38
6. Conclusion 40
References 41
Table 1. Categories of CityScapes dataset 29
Table 2. Categories of Korean Pedestrian dataset 31
Table 3. Categories of Korean Pedestrian dataset after label merging 32
Table 4. Perspective comparison between datasets 33
Table 5. Comparison of Warp Perspective Transform 35
Table 6. Comparison of training results with stacked / non-stacked images 35
Table 7. IoU of each class for stacked / non-stacked images 35
Table 7. Result comparison of Perspective transform training 36
Table 8. Experiment model structure 37
Table 9. Environment and setting 38
Table 10. Experiment Result 38
Table 11. Class accuracy for each experiment case 39
Fig 1. Example screen of the Pedestrian Guidance System 14
Fig 2. Illustration of human vision focus 17
Fig 3. Illustration of attention changes in human vision 18
Fig 4. Transformer Architecture, Scaled Dot Product Attention, and Multi-Head Attention 18
Fig 5. Convolutional Block Attention Module layout 20
Fig 6. Feature Maps representation as a Tensor 21
Fig 7. Spatial Attention Module 21
Fig 8. Channel Attention Module 22
Fig 9. Example of CityScapes Dataset 28
Fig 10. Inference image from STDC2 trained on CityScapes dataset, applied to Korean pedestrian environment 30
Fig 11. Example of feature redundancy in Korean Pedestrian dataset 1 31
Fig 12. Example of feature redundancy in Korean Pedestrian dataset 2 32
Fig 13. Result of STDC2 trained on Korean Pedestrian dataset after label merging 34