Contents

Title Page

Contents

I. Introduction 5

II. System Overview 7

2.1. Hidden Markov Network 7

2.1.1. Successive State Splitting (SSS) Algorithm 7

2.1.2. Problems with SSS 11

2.1.3. Phonetic Decision Tree (PDT) 13

2.1.4. Relationship Between PDT and SSS 14

2.1.5. PDT-SSS Algorithm 15

III. Speech Signals in Noisy Environments 21

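3.1. Classification of Noise 21

3.1.1. Additive Noise 21

3.1.2. Channel Distortion 22

3.1.3. Lombard Effect 22

3.2. Conventional Noise Processing Techniques 23

3.2.1. Spectral Subtraction 23

3.2.2. Modified Wiener Filtering (MWF) 24

3.2.3. NOVO (Noise and Voice) 26

3.2.4. Combined Noise Processing Techniques 28

3.3. Speech Recognition by Combining Noise Processing Techniques 31

3.3.1. Feature Parameters of Speech 32

3.3.2. Combination of Noise Processing Techniques 35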

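IV. Recognition Experiments and Discussion 50

4.1. Speech Data and Experimental Conditions 50

4.2. Configuration of the Speech Recognizer 51

4.3. Recognition Results and Discussion 53

V. Conclusion 56

References 58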

Abstract 60

Table 3.1. Recognition rate for different values of (1-α). 36

Table 3.2. Recognition rate for MSS-NOVO. 49

Table 4.1. Conditions for test. 51

Fig. 2.1. Training of an initial model. 8

Fig. 2.2. Calculation of the distribution size. 8

Fig. 2.3. Split on the contextual domain. 10

Fig. 2.4. Split on the temporal domain. 11

Fig. 2.5. The Successive State Splitting algorithm. 12

Fig. 2.6. An example of a phonetic decision tree. 14

Fig. 2.7. The structure of initial HM-Net model. 16

Fig. 2.8. An example of a /b/ model. 20

Fig. 3.1. Model of noisy speech. 21

Fig. 3.4. Block diagram of spectral subtraction. 24

Fig. 3.5. Basic NOVO process. 27

Fig. 3.6. NOVO transform. 27

Fig. 3.7. Block diagram of SS-NOVO. 30

Fig. 3.8. Block diagram of MWF-NOVO. 31

Fig. 3.9. MFCC processing. 33

Fig. 3.10. Block diagram of MSS (Modified Spectral Subtraction). 36

Fig. 3.11. Speech signal before and after SS, MWF and MSS. 38

Fig. 3.12. Distance between parameters of clean speech and speech denoised by SS-, MWF-, and MSS-based noise reduction, for Kth's utterance. 41

Fig. 3.13. Distance between parameters of clean speech and speech denoised by SS-, MWF-, and MSS-based noise reduction, for Cyj's utterance. 43

Fig. 3.14. Comparison of log frame power between clean speech and speech denoised by SS-, MWF-, and MSS-based noise reduction, for Kth's utterance. 45

Fig. 3.15. Comparison of log frame power between clean speech and speech denoised by SS-, MWF-, and MSS-based noise reduction, for Cyj's utterance. 46

Fig. 3.16. Block diagram of MSS-NOVO. 48

Fig. 4.1. Training processing. 52

Fig. 4.2. Recognition processing. 52

Fig. 4.3. Recognition rate using noise reduction methods. 54

Fig. 4.4. Recognition rate using combined methods. 55