As bio-signal measurement technology develops, interest in bio-signal-based human-computer interfaces (HCIs) is increasing. Among these, applications based on the electrooculogram (EOG) serve as a means of communication for patients with degenerative neurological syndromes or quadriplegia. Various studies have investigated the recognition of eye-written characters from EOG signals. However, EOG-based eye-writing datasets remain small because of constraints on data collection.
In this study, we address this scarcity of data. We propose a recognition model that combines the idea of reference data with a ViT (Vision Transformer)-based Siamese network. A Siamese network is a learning approach that determines whether two inputs belong to the same class. We adopt the ViT because it is well suited to time-series data analysis. We introduce reference data to extend the Siamese network, which by itself performs only binary classification (judging whether an input pair matches in class), into a model that makes unambiguous multi-class predictions.
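The reference-data idea above can be sketched in a few lines: a shared encoder embeds both the input signal and one labelled reference sample per class, and the class whose reference embedding is most similar to the input embedding becomes the prediction. The sketch below is a minimal, hypothetical illustration; `embed` is a stand-in for the shared ViT encoder, and all names and dimensions are assumptions, not the authors' implementation.

```python
import math


def embed(x, weights):
    # Hypothetical stand-in for the shared ViT encoder: a fixed linear
    # projection followed by tanh. Both Siamese branches use the SAME
    # weights, which is what makes the network "Siamese".
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]


def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(ai * bi for ai, bi in zip(a, b))
    na = math.sqrt(sum(ai * ai for ai in a))
    nb = math.sqrt(sum(bi * bi for bi in b))
    return dot / (na * nb + 1e-9)


def predict(signal, references, weights):
    # Multi-class prediction via reference data: embed the input once,
    # embed one labelled reference sample per class, and return the label
    # whose reference embedding is most similar to the input embedding.
    e = embed(signal, weights)
    sims = {label: cosine(e, embed(ref, weights))
            for label, ref in references.items()}
    return max(sims, key=sims.get)
```

Because prediction only compares embeddings against reference samples, a new class can be recognized by adding a reference for it without retraining the encoder, which is the property underlying the zero-shot behaviour discussed below.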
In this study, we use a dataset of 10 Arabic numerals and a dataset of 12 Katakana strokes. The experimental results show that the ViT-based Siamese network achieves high performance, with recognition accuracy of up to 91.9% on the Arabic numeral dataset and up to 84.7% on the Katakana stroke dataset. We also found that the Siamese network retains robust recognition performance under zero-shot learning, one of the main strengths of Siamese networks, reaching about 90% accuracy for all but a few classes.