Title Page
Contents
Abstract 9
Ⅰ. Introduction 11
Ⅱ. Background 16
2.1. Sensor-Based HAR 16
2.2. Convolutional Neural Network 17
2.3. Transformer Model 18
2.4. Position Embedding 20
2.5. Long-Range and Local Dependencies 21
2.6. Related Work on Existing HAR Models 22
Ⅲ. Human Activity Recognition Model 24
3.1. Dataset Description 24
3.1.1. KU-HAR dataset 24
3.1.2. UniMiB SHAR dataset 25
3.1.3. USC-HAD dataset 25
3.1.4. Dataset Preprocessing 27
3.2. Proposed Model Description 28
3.2.1. Input Definition 30
3.2.2. Convolutional Features Extractor Block 30
3.2.3. Multi-Head Self-Attention 31
3.2.4. Vector-based Relative Position Embedding 32
3.2.5. Feed-Forward Network 33
Ⅳ. Experiments & Results 34
4.1. Evaluation Metrics 34
4.2. Baseline Model 35
4.3. Experimental Setup Details 36
4.4. Experimental Results 38
Ⅴ. Discussion 41
5.1. Ablation Studies of the Improvement Methods 42
5.1.1. Effect of the Convolutional Feature Extractor Block 42
5.1.2. Effect of Vector-based Relative Position Embedding 44
5.2. Ablation Studies of Hyper-Parameters 47
5.2.1. Impact of the Number of Convolutional Layers 47
5.2.2. Impact of the Number of Convolutional Filters 48
5.2.3. Impact of the Number of Attention Heads 50
Ⅵ. Conclusion 51
References 53
Abstract (in Korean) 64
〈Table 3-1-1〉 Main information of the datasets. 24
〈Table 3-1-2〉 Settings of data pre-processing. "-": overlap rate is zero. 28
〈Table 4-3-1〉 Experimental setup details. Smoothing factor: the smoothing factor used for the label smoothing regularization technique. 37
〈Table 4-4-1〉 Experimental results of different model structures on the USC-HAD, UniMiB SHAR, and KU-HAR datasets. The symbol "-" denotes no result. Where,... 40
〈Table 5-1〉 Experimental results after combining different modules. 41
〈Table 5-1-1〉 Experimental test results of the CFEB structure. 44
〈Table 5-1-2〉 Performance test results of the baseline model using different position embedding methods. 45
〈Table 5-2-1〉 Performance of our model with different numbers of convolutional layers. 47
〈Table 5-2-2〉 Performance of our model with different numbers of convolutional filters. 48
〈Table 5-2-3〉 Performance of our model with different numbers of attention heads. 50
〈Figure 2-1-1〉 Example of a sensor-based HAR system. 17
〈Figure 2-3-1〉 Transformer model architecture. 19
〈Figure 3-1-1〉 Activity class distribution of the datasets. (a) KU-HAR, (b) UniMiB SHAR, (c) USC-HAD 26
〈Figure 3-2-1〉 Overall Architecture of the Human Activity Classification Model. The right dashed box indicates the Convolutional Feature Extractor Block (CFEB).... 29
〈Figure 3-2-2〉 Self-attention module with relative position embedding using vector parameters (vRPE-SA). Newly added parts are shown in the grey area. Firstly,... 33
〈Figure 4-4-1〉 Validation accuracy and loss curves for the two models on the USC-HAD, KU-HAR, and UniMiB SHAR datasets. 38
〈Figure 5-1-1〉 Attention score visualization of the baseline model (a) and the enhanced version (b). By using the Convolutional Feature Extractor Block module, the local... 43
〈Figure 5-1-2〉 Confusion matrices: baseline model (a), baseline model with initial relative position embedding (b), and baseline model with vRPE (c). 46
〈Figure 5-2-1〉 Validation-set loss on the UniMiB SHAR dataset, showing the impact of different numbers of convolutional filters on the proposed model. 49