This year AINL Eval drew more than a dozen participants. We were flattered that the GigaCheck team decided to take part in our challenge.
The challenge was not only to identify whether a text was human-written or AI-generated, but also to pinpoint exactly which model (e.g., GPT-4 Turbo, Gemma 2-27B) produced it.
🥇The GigaCheck team took 1st place. By enhancing the GigaCheck tool with an additional classification layer, they achieved:
✅ 91% accuracy on public test data, including texts from an unfamiliar model
✅ 86% accuracy on private test sets with previously unseen domains
🥈The 2nd-place team is from HSE and ReText.Ai. They combined statistical and neural model features to improve overall detection performance.
With this approach, the team achieved 85% accuracy on private test sets🚀
🥉The third-place team achieved 82% accuracy, beating our baseline (81% accuracy).
Details of the competition will be described in a paper in the AINL proceedings. Stay tuned!