This study proposes a novel Named Entity Recognition labelling process using large language models and demonstrates its potential to compensate for incomplete datasets and improve labelling quality. We show that the process can proactively label and calibrate datasets even when only 10% of the responses in the training dataset are correct.
Furthermore, this work provides important insights into the feasibility of integrating active learning with committee-based tagging. In doing so, we offer a perspective on named entity recognition with large language models and propose strategies for correctly tagging unlabelled data in a cost-effective manner, thereby improving the model's understanding of the data. We also show that our method plays an important role in minimising the cost of data labelling and in maintaining a certain level of model performance even with insufficient data. Based on these findings, this thesis aims to provide practical guidance for the effective use of large language models and the improvement of labelling tasks.
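The committee-based tagging idea mentioned above can be illustrated with a minimal sketch. The committee members, agreement threshold, and dictionary-based taggers below are all hypothetical stand-ins (in practice each member would be an LLM prompt or a fine-tuned NER model); the sketch only shows the voting and review-queue mechanics, not the method used in this study.

```python
from collections import Counter

# Hypothetical committee of taggers: each maps a token to an entity label.
# Dictionaries stand in for LLM-based taggers so the sketch runs offline.
tagger_a = {"Paris": "LOC", "Smith": "PER", "Acme": "ORG"}
tagger_b = {"Paris": "LOC", "Smith": "PER", "Acme": "LOC"}
tagger_c = {"Paris": "LOC", "Smith": "ORG", "Acme": "MISC"}
committee = [tagger_a, tagger_b, tagger_c]

def committee_label(tokens, committee, agreement=2 / 3):
    """Accept a label when enough taggers agree; otherwise queue the
    token for human review (the active-learning step)."""
    accepted, review_queue = {}, []
    for tok in tokens:
        votes = Counter(t.get(tok, "O") for t in committee)
        label, count = votes.most_common(1)[0]
        if count / len(committee) >= agreement:
            accepted[tok] = label
        else:
            review_queue.append(tok)  # high-disagreement tokens go to annotators
    return accepted, review_queue

accepted, queue = committee_label(["Paris", "Smith", "Acme"], committee)
# "Paris" is unanimous, "Smith" reaches the 2/3 threshold, "Acme" is disputed
# and is routed to human annotation.
```

Routing only the disputed tokens to annotators is what keeps the labelling cost low: human effort is spent where the committee is least certain.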