Title Page
Abstract
Contents
Chapter 1. Introduction 16
1.1. Optimization in organic synthesis experiments 17
1.1.1. Challenges in searching for suitable chemical reaction conditions 17
1.1.2. Design of experiment in organic chemistry 18
1.1.3. Bayesian optimization for exploring suitable reaction conditions 19
1.1.4. Data-driven approaches to predict optimal reaction conditions 21
1.1.5. Computational challenges in advanced approaches: five types of critical challenges 21
1.2. Formulating computational problems for the reaction optimization questions 22
1.2.1. Notations and descriptions 22
1.2.2. Computational equations for the main methods in the dissertation 25
1.3. Three computational problems for the optimization challenges in organic synthesis 26
1.4. Outline of the dissertation 31
Chapter 2. Generative Modeling to Predict Multiple Suitable Conditions for Chemical Reactions 32
2.1. Motivation 33
2.1.1. Suggestion of reaction conditions 35
2.1.2. Existing method 35
2.1.3. Two major drawbacks 36
2.1.4. An aim 37
2.2. Methods 37
2.2.1. Problem Formulation 37
2.2.2. Prediction Model 39
2.2.3. Training 39
2.2.4. Inference 40
2.3. Experiments 41
2.3.1. Datasets 41
2.3.2. Implementation 44
2.3.3. Baseline Methods 45
2.3.4. Evaluation Protocol 46
2.3.5. Results and Discussion 47
2.4. Conclusion 51
Chapter 3. Uncertainty-Aware Prediction of Chemical Reaction Yields with Graph Neural Networks 52
3.1. Motivation 53
3.2. Methods 56
3.2.1. Data Representation 56
3.2.2. Prediction Model 57
3.2.3. Uncertainty-Aware Learning 59
3.2.4. Uncertainty-Aware Inference 60
3.2.5. Experimental Investigation 63
3.3. Results and Discussion 65
3.3.1. Prediction and Uncertainty Quantification 65
3.3.2. Out-Of-Sample Prediction 68
3.3.3. Selective Prediction with Rejection 68
3.4. Conclusion 74
Chapter 4. Exploring Optimal Reaction Conditions Guided by Graph Neural Networks and Bayesian Optimization 75
4.1. Motivation 76
4.2. Methods 82
4.2.1. Overview of hybrid-type dynamic optimization for exploring suitable chemical reaction 82
4.2.2. Dataset and graph-type representation for training MPNN 85
4.2.3. Message Passing Neural Networks for predicting suitable reagents 86
4.2.4. Bayesian optimization in HDO 88
4.2.5. Acquisition function for HDO and rules for expanding the search space 90
4.3. Performance benchmarking results 91
4.3.1. Details of the list of candidates for optimization 95
4.3.2. Performance of MPNN condition prediction models 101
4.3.3. Task 1: Optimization of reaction conditions to benchmark the performance 105
4.3.4. Task 2: Validation of the HDO compared to five human chemists 114
4.4. Results and discussion 121
Chapter 5. Conclusions 123
Bibliography 130
Abstract in Korean 144
Table 1.1. Notations. 24
Table 2.1. Description of cross-coupling reaction datasets extracted from Reaxys database 43
Table 2.2. Performance comparison results on chemical reactions in the test set 49
Table 2.3. Performance comparison results on chemical reactions with more than one ground-truth reaction condition 50
Table 3.1. Description of benchmark datasets 62
Table 3.2. Comparison of prediction and uncertainty quantification performance on benchmark datasets (Buchwald-Hartwig) 66
Table 3.3. Comparison of prediction and uncertainty quantification performance on benchmark datasets (Suzuki-Miyaura) 67
Table 3.4. Comparison of prediction and uncertainty quantification performance on out-of-sample splits of Buchwald-Hartwig dataset 69
Table 3.5. Comparison of selective prediction performance in terms of MAE (%p) 72
Table 3.6. Comparison of selective prediction performance in terms of RMSE (%p) 73
Table 4.1. Details of the two performance benchmarking tasks 93
Table 4.2. Validation dataset for MPNN's performance. 102
Table 4.3. Performance of accuracy in Suzuki-Miyaura reaction and Buchwald-Hartwig reaction 103
Table 4.4. Performance of accuracy in Ullmann reaction and Chan-Lam reaction 104
Table 4.5. Comparison of reaction optimization performance with the baselines on Task 1 106
Figure 1.1. Prototypical chemical process optimization problem 18
Figure 1.2. Design of experiment. 19
Figure 1.3. Bayesian optimization: one-dimensional visualization. 20
Figure 2.1. Schematic comparison between existing methods and proposed method 36
Figure 2.2. Example of a chemical reaction and its ground-truth reaction conditions 38
Figure 2.3. Architecture of the VAE used in this study 44
Figure 2.4. Unique count of predicted reaction conditions per reaction according to T for ReactionVAE 48
Figure 3.1. Illustrative example of the graph representation for a molecule. 56
Figure 3.2. Architecture of the prediction model. 57
Figure 3.3. Monte Carlo (MC) dropout. 61
Figure 3.4. Summary for comparison of selective prediction performance on benchmark datasets: (a) MAE (%p) on Buchwald-Hartwig; (b) MAE (%p) on... 71
Figure 4.1. Biased dataset in chemical reaction (Reaxys) 77
Figure 4.2. (a) Given a reaction representation, HDO specifies a search space using the best combination of conditions predicted by MPNN. (b) Initial exper-... 84
Figure 4.3. Illustration of the process of MPNN models to predict suitable reaction conditions given graph-type reaction representations g (⊕ denotes the... 88
Figure 4.4. Suzuki-Miyaura reaction with search space in Task 1 96
Figure 4.5. Buchwald-Hartwig reaction with search space in Task 1 97
Figure 4.6. Arylation reaction with search space in Task 1 98
Figure 4.7. The candidates of conditions for Suzuki-Miyaura and Buchwald-Hartwig reaction in Task 2 99
Figure 4.8. The candidates of conditions for Ullmann and Chan-Lam reaction in Task 2 100
Figure 4.9. Box-and-Whisker plot for comparison of performance (top-5%). 109
Figure 4.10. Task 1: Detailed results for Suzuki-Miyaura reaction experiments (1a-1) 110
Figure 4.11. Task 1: Detailed results for Suzuki-Miyaura reaction experiments (1a-2) 111
Figure 4.12. Task 1: Detailed results for Buchwald-Hartwig experiments (2a-e) 112
Figure 4.13. Performance comparison of the BO, 50 expert chemists, and HDO for the Arylation reaction. The results of 50 experiments with HDO and BO are... 113
Figure 4.14. Average cumulative maximum observed yields using the HDO (blue curve), and the average yield of the combination of conditions proposed... 115
Figure 4.15. Performance of HDO for the four named reactions. Comparison of the yield results of HDO with the conditions proposed by the experts for the... 116
Figure 4.16. Details of optimization for 4a-j, 5a-h, 6a-b, and 7a-b. 118
Figure 4.17. Task 2: The details of Suzuki-Miyaura reaction 119
Figure 4.18. 4(a) is an example of a Suzuki-Miyaura coupling reaction. The HDO found the same conditions as 5 experts in two trials. 5(b) showed that in just... 120