---
license: apache-2.0
language:
- en
- ru
library_name: gigacheck
tags:
- token-classification
- detr
- ai-detection
- multilingual
- gigacheck
datasets:
- iitolstykh/LLMTrace_detection
base_model:
- mistralai/Mistral-7B-v0.3
---

# GigaCheck-Detector-Multi

🌐 LLMTrace Website | 📜 LLMTrace Paper on arXiv | 🤗 LLMTrace - Detection Dataset | GitHub

## Model Card

### Model Description

This is the official `GigaCheck-Detector-Multi` model from the `LLMTrace` project. It is a multilingual transformer-based model trained for **AI interval detection**. Its purpose is to identify and localize the specific spans of text within a document that were generated by an AI.

The model was trained jointly on the English and Russian portions of the `LLMTrace Detection dataset`, which includes human, fully AI, and mixed-authorship texts with character-level annotations.

For complete details on the training data, methodology, and evaluation, please refer to our research paper: link (coming soon)

### Intended Use & Limitations

This model is intended for fine-grained analysis of documents, academic integrity tools, and research into human-AI collaboration.

**Limitations:**

* The model's performance may degrade on text generated by LLMs released after its training date (September 2025).
* It is not infallible and may miss some AI-generated spans or incorrectly flag human-written parts.
* The boundary predictions may not be perfectly precise in all cases.

## Evaluation

The model was evaluated on the test split of the `LLMTrace Detection dataset`. The performance is measured using standard mean Average Precision (mAP) metrics for object detection, adapted for text spans.
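
For readers unfamiliar with IoU on text, the overlap criterion behind these metrics can be sketched as below. This is only an illustration under stated assumptions, not the project's evaluation code: spans are assumed to be half-open character intervals `(start, end)`, predictions are assumed to be sorted by confidence, and the helper names `span_iou` and `count_true_positives` are hypothetical. The exact matching protocol used in the paper may differ.

```python
from typing import List, Tuple

# Assumption: a span is a half-open character interval (start, end).
Span = Tuple[int, int]


def span_iou(a: Span, b: Span) -> float:
    """Intersection-over-union of two character intervals."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def count_true_positives(
    preds: List[Span], gold: List[Span], threshold: float = 0.5
) -> int:
    """Greedily match predictions to annotated spans, one-to-one.

    `preds` is assumed to be ordered from most to least confident.
    A prediction counts as a true positive if its best still-unmatched
    annotated span has IoU >= threshold.
    """
    unmatched = list(gold)
    true_positives = 0
    for pred in preds:
        if not unmatched:
            break
        best = max(unmatched, key=lambda g: span_iou(pred, g))
        if span_iou(pred, best) >= threshold:
            unmatched.remove(best)
            true_positives += 1
    return true_positives


if __name__ == "__main__":
    predicted = [(0, 10), (20, 30)]
    annotated = [(0, 12), (40, 50)]
    print(span_iou(predicted[0], annotated[0]))
    print(count_true_positives(predicted, annotated, threshold=0.5))
```

Under this reading, `mAP @ IoU=0.5` accepts a predicted span when it overlaps an annotated span by at least half of their union, while `mAP @ IoU=0.5:0.95` averages over progressively stricter thresholds, so it rewards tighter boundary predictions.
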

| Metric             | Value  |
|--------------------|--------|
| mAP @ IoU=0.5      | 0.8976 |
| mAP @ IoU=0.5:0.95 | 0.7921 |

## Citation

If you use this model in your research, please cite our papers:

```bibtex
@article{Layer2025LLMTrace,
  title = {{LLMTrace: A Corpus for Classification and Fine-Grained Localization of AI-Written Text}},
  author = {Irina Tolstykh and Aleksandra Tsybina and Sergey Yakubson and Maksim Kuprashevich},
  year = {2025},
  eprint = {arXiv:2509.21269}
}

@article{tolstykh2024gigacheck,
  title = {{GigaCheck: Detecting LLM-generated Content}},
  author = {Irina Tolstykh and Aleksandra Tsybina and Sergey Yakubson and Aleksandr Gordeev and Vladimir Dokholyan and Maksim Kuprashevich},
  journal = {arXiv preprint arXiv:2410.23728},
  year = {2024}
}
```