Title Page
Abstract
Contents
List of abbreviations 11
Chapter 1. Introduction 12
1.1. Background 12
1.2. Problem statement 16
1.3. Research objective 17
1.3.1. Research questions 18
Chapter 2. Literature Review 19
Chapter 3. Methodology 24
3.1. Data description 25
3.1.1. Population 25
3.1.2. Dataset construction 27
3.1.3. Exploratory data analysis 29
3.2. Machine Learning analysis framework 30
3.2.1. Predictive models 30
3.2.2. Data pre-processing 30
3.2.3. Data splitting and Cross-validation 30
Chapter 4. Experiment results 31
4.1. Determining indicator characteristics through exploratory data analysis (EDA) 31
4.2. Prediction of cocoa production profit by using Machine Learning models 37
4.2.1. Determining organic and conventional farming main indicator by SHAP analysis 43
4.2.2. Analysis of the change over time of organic and conventional farming main indicators 47
Chapter 5. Discussion 52
Chapter 6. Conclusions 54
References 63
국문초록 75
Appendices 76
Table 3.1. Research universe described by type of farming method and variety. 26
Table 3.2. Indicators by sustainability dimension 28
Table 3.3. Groups of variety and farming method. 29
Table 4.1. Performance of ML models based on CV test 37
Table 4.2. Performance parameters among ML models 38
Table 4.3. Model performance on organic and conventional farming profit prediction. 45
Table 4.4. Cocoa production indicators' change over time 49
Table 5.1. Correlation values for General data with profit (Y) 56
Table 5.2. Correlation values for organic cocoa farming with profit (Y) 57
Table 5.3. Correlation values for conventional cocoa farming with profit (Y) 57
Figure 3.1. Methodology framework 24
Figure 4.1. Distribution of indicators 32
Figure 4.2. Distribution of output indicator "Profit" 33
Figure 4.3. Correlation between indicators 34
Figure 4.4. Irrigated area by groups 35
Figure 4.5. Cocoa production in quantity by groups. 36
Figure 4.6. Decision Tree from Random Forest predicting the profitability of cocoa production 39
Figure 4.7. Actual vs predicted values for RF, LG and kNN based on global dataset 40
Figure 4.8. Indicators importance based on SHAP values for the global dataset 41
Figure 4.9. Indicator importance based on SHAP values for organic and conventional farms 43
Figure 4.10. Actual vs predicted values for RF, LG and kNN based on organic and conventional farms 46
Figure 4.11. Conventional farming main indicators changes over time 47
Figure 4.12. Organic farming main indicators changes over time 47
Equation 3.1. Eyendi's method Equation 29