KazBERT
KazBERT is a BERT-based model designed and fine-tuned for Kazakh-language tasks.
Achievements:
- Over 14,000 downloads
- 14 likes on Hugging Face
The model is trained with Masked Language Modeling (MLM) on a multilingual corpus of Kazakh, Russian, and English text.
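To make the MLM objective concrete, here is a minimal sketch of how training inputs are typically corrupted. It assumes the recipe from the original BERT paper (select about 15% of positions; of those, 80% become `[MASK]`, 10% become a random token, 10% stay unchanged); whether KazBERT used exactly these rates is not stated here. The function name `mask_tokens`, the toy vocabulary, and the whitespace tokenization are illustrative assumptions, since a real BERT pipeline operates on WordPiece token IDs.

```python
import random

MASK_TOKEN = "[MASK]"


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=None):
    """Corrupt a token sequence for MLM training (hypothetical helper).

    Returns (corrupted, labels). labels[i] is the original token at the
    positions the model must predict, and None everywhere else.
    """
    rng = random.Random(seed)
    corrupted = list(tokens)
    labels = [None] * len(tokens)
    for i, token in enumerate(tokens):
        if rng.random() >= mask_prob:
            continue  # position not selected for prediction
        labels[i] = token
        roll = rng.random()
        if roll < 0.8:
            corrupted[i] = MASK_TOKEN  # 80%: replace with [MASK]
        elif roll < 0.9:
            corrupted[i] = rng.choice(vocab)  # 10%: replace with a random token
        # remaining 10%: keep the original token unchanged
    return corrupted, labels


# Toy usage: whitespace tokens stand in for WordPiece tokens (assumption).
sentence = "Алматы Қазақстанның ең үлкен қаласы".split()
toy_vocab = ["қала", "тіл", "кітап", "city", "город"]
corrupted, labels = mask_tokens(sentence, toy_vocab, mask_prob=0.3, seed=0)
print(corrupted)
print(labels)
```

During training, the loss is computed only at positions where `labels` is not `None`, so the model learns to recover the original tokens from their surrounding context.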
Scientific Citations & Impact:
KazBERT has been used in several peer-reviewed publications:
- LLM-Assisted Weak Supervision for Low-Resource Kazakh Sequence Labeling: Synthetic Annotation and CRF-Refined NER/POS Models (MDPI Applied Sciences)
- Hybrid artificial intelligence architectures for automatic text correction in the Kazakh language (Frontiers in Artificial Intelligence)
- Application of Vector Models in Intelligent Information Retrieval Systems (Academic Scientific Journal of Computer Science)