Victory at SemEval-2026 Task 13 (Subtask C)
Published:
I am thrilled to announce that my teammate Agzam Shamsadinov and I won the prestigious SemEval-2026 Task 13 (Subtask C) international competition!
SemEval (the International Workshop on Semantic Evaluation) is a long-running international series of NLP (Natural Language Processing) shared tasks and workshops. In Task 13, Subtask C, we had to classify each source code sample by authorship into one of four classes: Human, AI, Hybrid, and Adversarial AI.
Our Solution: For this task, we developed a Multi-Modal Ensemble that performs an in-depth analysis of the source code from three different perspectives:
- Code Semantics - Using the UniXcoder language model with Multiple Instance Learning (MIL), which treats each file as a bag of smaller instances, so that long files are handled properly.
- Textual Patterns - Employing our MazgaBERT model to highlight structural and textual writing patterns.
- Statistical Anomalies - Using classic XGBoost trained on hand-crafted features derived from stylistic and structural properties of the code.
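To make the idea more concrete, here is a minimal sketch of how such a pipeline could be wired together. It is an illustration only, not our actual implementation: the function names, the overlapping-window chunking, the mean pooling over chunks, and the weighted averaging of the three models' probabilities are all assumptions chosen for simplicity, and the probability values are made up for the example.

```python
from typing import List, Sequence

# The four target classes named in the task description.
CLASSES = ["Human", "AI", "Hybrid", "Adversarial AI"]


def split_into_chunks(tokens: Sequence[str], chunk_size: int, stride: int) -> List[List[str]]:
    """Split a long token sequence into overlapping windows (the MIL 'instances').

    Hypothetical helper: the chunk size and stride are illustrative parameters.
    """
    if chunk_size <= 0 or stride <= 0:
        raise ValueError("chunk_size and stride must be positive")
    chunks = []
    start = 0
    while True:
        chunks.append(list(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the file
        start += stride
    return chunks


def mil_mean_pool(chunk_probs: Sequence[Sequence[float]]) -> List[float]:
    """Aggregate per-chunk class probabilities into one file-level distribution.

    Assumption: simple mean pooling; other MIL poolings (max, attention) exist.
    """
    n = len(chunk_probs)
    num_classes = len(chunk_probs[0])
    return [sum(p[k] for p in chunk_probs) / n for k in range(num_classes)]


def weighted_ensemble(model_probs: Sequence[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Combine the class probabilities of several models by a weighted average."""
    total = sum(weights)
    num_classes = len(model_probs[0])
    return [
        sum(w * p[k] for w, p in zip(weights, model_probs)) / total
        for k in range(num_classes)
    ]


def predict_label(probs: Sequence[float]) -> str:
    """Return the class name with the highest probability."""
    best = max(range(len(probs)), key=lambda k: probs[k])
    return CLASSES[best]


if __name__ == "__main__":
    # Made-up per-chunk outputs standing in for a semantic model run on two chunks.
    semantic = mil_mean_pool([[0.2, 0.8, 0.0, 0.0], [0.6, 0.4, 0.0, 0.0]])
    # Made-up file-level outputs standing in for the other two models.
    textual = [0.1, 0.7, 0.1, 0.1]
    statistical = [0.3, 0.5, 0.1, 0.1]
    combined = weighted_ensemble([semantic, textual, statistical], weights=[1.0, 1.0, 1.0])
    print(predict_label(combined))
```

In this sketch, only the model that cannot see a whole file at once needs the chunk-and-pool step; the other two models are assumed to produce one probability vector per file.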
Our system paper, which describes the solution in detail, will be presented at the SemEval workshop and subsequently published in the official ACL Anthology. More updates to come!