Title Page
Contents
ABSTRACT 5
Abbreviations 11
I. INTRODUCTION 12
II. METHODS AND EXPERIMENTS 18
II.1. CADD (Computer-Aided Drug Design) 18
II.1.1. Machine learning 18
II.1.2. Deep learning 19
II.2. Ligand filtering 21
II.3. Training set and test set 22
II.4. Programming languages (Python, R) 22
II.5. Converting SMILES code to numeric data 22
III. Results 32
III.1. Bayesian 32
III.2. Neural network 36
IV. Conclusion 37
V. References 38
VI. Appendix 41
VI.1. Code 41
VI.1.1. Extract only data with all activity values from the JAK data and export it to Excel - R 41
VI.1.2. Searching for the types of elements in the SMILES code (duplicate removal) - Python 42
VI.1.3. SMILES code conversion - Python 42
VI.1.4. Neural network - Python 43
VII. ABSTRACT IN KOREAN 48
Table 1. Superimposition of JAK proteins. 15
Table 2. JAK protein binding site sequence... 17
Table 3. The number of JAK inhibitor data in the range at... 21
Table 4. Table used to convert SMILES codes to numbers. Beige colors... 25
Table 5. Chiral structures with different activity values. 26
Table 6. Neural network conditions. The JAK1 data set was selected and the results were compared. 28
Table 7. JAK1 Bayesian model analysis result 32
Table 8. JAK2 Bayesian model analysis result 33
Table 9. JAK3 Bayesian model analysis result 34
Table 10. TYK2 Bayesian model analysis result 35
Table 11. The result of deep learning by saving the descriptor... 36
Figure 1. JAK domain structure and residue number. JAK is a protein composed of... 12
Figure 2. JAK-related cytokine signaling. Through this pathway, it controls cell activity. 13
Figure 3. Example of receptors of cytokine/hormone that carry signals using... 14
Figure 4. Superimposition analysis of JAK (PDB ID: JAK1(AAH), JAK2(6AAJ),... 15
Figure 5. JAK binding site sequence. The red box represents the binding site. 16
Figure 6. Conceptual analogy between real neurons (A) and artificial... 20
Figure 7. Comparison between shallow learning and deep learning in neural... 20
Figure 8. The process of dividing the training set and test set in the data set.... 22
Figure 9. Chemical motif detection by CNN in comparison with sequence motif... 24
Figure 10. SMILES code conversion. This is an example of the numeric conversion of the... 25
Figure 11. 2D and 3D descriptor values for the structures in Table 5. 2D... 26
Figure 12. The neural network results are shown graphically. Orange represents... 37