title: RefCheck
emoji: π
colorFrom: blue
colorTo: indigo
sdk: gradio
app_file: app.py
python_version: 3.11
suggested_hardware: cpu-basic
fullWidth: true
short_description: Upload BibTeX, validate citations, download fixes.
tags:
- bibtex
- citations
- academic
- bibliography
RefCheck
A Citation Hallucination Detector & Auto-Fixer
Validate and automatically correct your BibTeX bibliography against multiple academic databases.
Why RefCheck?
Academic papers often contain citation errors: wrong titles, incorrect authors, mismatched years, or even completely fabricated references (hallucinations from AI tools). RefCheck automatically:
- Validates each citation against 6 academic databases
- Auto-fixes metadata mismatches (title, authors, year, DOI)
- Removes unverifiable/hallucinated entries
- Reports a clear verification summary
Features
Multi-Source Verification
RefCheck cross-references your citations against:
| Source | Lookup Methods |
|---|---|
| arXiv | arXiv ID, Title search |
| CrossRef | DOI, Title search |
| DBLP | Title search |
| Semantic Scholar | DOI, Title search |
| OpenAlex | DOI, Title search |
| Google Scholar | Title search (disabled by default) |
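The table above can be read as a lookup plan: for each source, use an identifier when the entry has one and fall back to a title search otherwise. The following is a minimal sketch of that reading, not RefCheck's actual code. The names `LOOKUP_METHODS`, `plan_lookups`, and `crossref_doi_url`, the entry keys (`doi`, `arxiv_id`, `title`), and the "identifier first" preference are all assumptions made for illustration. Only the source list, the Google Scholar default, and the public CrossRef `works` endpoint are taken as given.

```python
from urllib.parse import quote

# Lookup methods per source, transcribed from the table above.
# Identifier-based methods are listed first (an assumed preference).
LOOKUP_METHODS = {
    "arXiv": ("arxiv_id", "title"),
    "CrossRef": ("doi", "title"),
    "DBLP": ("title",),
    "Semantic Scholar": ("doi", "title"),
    "OpenAlex": ("doi", "title"),
    "Google Scholar": ("title",),
}

# Google Scholar is disabled by default, per the table.
DEFAULT_DISABLED = {"Google Scholar"}


def plan_lookups(entry, enabled_extra=()):
    """Return (source, method) pairs that could be tried for one entry.

    `entry` is a hypothetical dict with optional keys
    "doi", "arxiv_id", and "title".
    """
    plan = []
    for source, methods in LOOKUP_METHODS.items():
        if source in DEFAULT_DISABLED and source not in enabled_extra:
            continue
        for method in methods:
            if entry.get(method):
                plan.append((source, method))
                break  # first usable method wins for this source
    return plan


def crossref_doi_url(doi):
    """Build the public CrossRef REST URL for a DOI lookup."""
    return "https://api.crossref.org/works/" + quote(doi, safe="/")
```

For an entry with a DOI and a title but no arXiv ID, this plan uses the DOI for CrossRef, Semantic Scholar, and OpenAlex, and a title search for arXiv and DBLP.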
Two-Pass Workflow
- Pass 1 (Validate & Fix): Checks each entry, auto-corrects metadata, and removes invalid citations
- Pass 2 (Verify): Re-validates the cleaned file to confirm all entries are correct
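The two passes can be sketched as a small function. This is an illustrative outline under assumptions, not the code in main.py: `run_two_pass`, the `check` callback, and the status strings `"verified"`, `"fixed"`, and `"not_found"` are hypothetical names, and the toy `check` below stands in for real database lookups.

```python
def run_two_pass(entries, check):
    """Run a validate-and-fix pass, then a verification pass.

    `check(entry)` is a hypothetical callback returning
    (status, entry_to_keep), with status one of
    "verified", "fixed", or "not_found".
    """
    cleaned = []
    summary = {"verified": 0, "fixed": 0, "removed": 0}

    # Pass 1: validate, fix, and drop entries that cannot be found.
    for entry in entries:
        status, result = check(entry)
        if status == "not_found":
            summary["removed"] += 1
            continue
        summary["fixed" if status == "fixed" else "verified"] += 1
        cleaned.append(result)

    # Pass 2: re-check the cleaned list; every entry should now verify.
    all_verified = all(check(e)[0] == "verified" for e in cleaned)
    return cleaned, summary, all_verified


# Toy stand-in for the real lookups: a dict of known publication years.
KNOWN_YEARS = {"a": 2020, "b": 2021}


def toy_check(entry):
    year = KNOWN_YEARS.get(entry["key"])
    if year is None:
        return "not_found", entry
    if entry["year"] != year:
        return "fixed", {**entry, "year": year}
    return "verified", entry
```

Running the second pass over the already-corrected entries is what lets the summary report that every remaining entry is verified, rather than assuming the fixes took effect.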
Installation
# Clone the repository
git clone https://github.com/voidful/RefCheck.git
cd RefCheck
# Install dependencies
pip install -r requirements.txt
Requirements
- Python 3.9+
- Dependencies: bibtexparser, requests, beautifulsoup4, rich, Unidecode, lxml
Usage
Hugging Face Space
This repository is ready to run as a Gradio Space. Create a Hugging Face Space with the Gradio SDK, push these files, and the Space will launch app.py.
The Space UI accepts a .bib upload and returns:
- a corrected BibTeX file
- a Markdown validation report
- a list of entries that still need manual review
Basic Usage
# Validate and auto-fix a bib file
python main.py --bib references.bib
Command-Line Options
| Option | Short | Description |
|---|---|---|
| --bib | -b | Path to your .bib file (required) |
| --output | -o | Output report path (optional) |
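The two options in the table map directly onto a standard `argparse` definition. The snippet below is a sketch of how such a parser could be declared. The function name `build_parser` and the help strings are assumptions, and main.py may define its arguments differently.

```python
import argparse


def build_parser():
    """Declare the two options listed in the table (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Validate and auto-fix a BibTeX file.",
    )
    # Required input file, long and short form.
    parser.add_argument("--bib", "-b", required=True,
                        help="Path to your .bib file")
    # Optional report path; None means no custom path was given.
    parser.add_argument("--output", "-o", default=None,
                        help="Output report path")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.bib, args.output)
```

With this definition, `-b refs.bib -o report.md` and `--bib refs.bib --output report.md` parse to the same values.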
Example
# Process your bibliography
python main.py --bib paper/references.bib
# With custom output path
python main.py --bib refs.bib --output validation_report.md
How It Works
┌─────────────────┐
│  Load .bib file │
└────────┬────────┘
         ▼
┌─────────────────────────────────────────┐
│  For each entry:                        │
│  1. Query academic databases            │
│  2. Compare metadata (title, author, yr)│
│  3. Calculate confidence score          │
└────────┬────────────────────────────────┘
         ▼
┌─────────────────────────────────────────┐
│  Decision:                              │
│  • confidence > 85% → Auto-fix metadata │
│  • Match found → Keep as-is             │
│  • No match → Remove entry              │
└────────┬────────────────────────────────┘
         ▼
┌─────────────────────────────────────────┐
│  Save updated .bib file                 │
│  Run verification pass                  │
└─────────────────────────────────────────┘
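The scoring and decision steps in the diagram can be illustrated with a short sketch. How RefCheck actually computes its confidence score is not stated here, so this example swaps in `difflib.SequenceMatcher` on normalized titles as a stand-in. The function names, the entry keys, and the reading of the three branches (below the threshold: remove; above it with identical metadata: keep; above it with differing metadata: fix) are assumptions. Only the 85% threshold comes from the diagram.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return " ".join(cleaned.split())


def title_confidence(a, b):
    """Similarity of two titles in [0, 1] (stand-in for the real score)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def decide(entry, candidate, threshold=0.85):
    """Return "keep", "fix", or "remove" for one entry.

    `entry` and `candidate` are hypothetical dicts with "title" and "year";
    `candidate` is None when no database returned a result.
    """
    if candidate is None:
        return "remove"
    if title_confidence(entry["title"], candidate["title"]) <= threshold:
        return "remove"  # best candidate is too dissimilar to trust
    same_title = normalize(entry["title"]) == normalize(candidate["title"])
    same_year = entry.get("year") == candidate.get("year")
    return "keep" if same_title and same_year else "fix"
```

Normalizing before comparison keeps differences in casing and punctuation from lowering the score, so only substantive title differences push an entry below the threshold.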
Output
RefCheck displays real-time progress and a final summary:
BibGuard - Auto-Fix & Verify
Target: references.bib
Found 42 entries. Running validation and auto-fix...
Validating & Fixing ━━━━━━━━━━━━━━━━━ 100% 42/42 ✓ 38 ⚠ 2 ✗ 2
Updates:
- Fixed 2 entries (metadata updated)
- Removed 2 invalid/hallucinated entries
✓ File saved.
Double checking (Re-validation)...
Verifying ━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 40/40 ✓ 40
==================================================
Final Status
==================================================
Total: 40
✓ Verified: 40
⚠ Issues: 0
✗ Not found: 0
Status Meanings
| Symbol | Meaning |
|---|---|
| ✅ Verified | Entry matches a known publication |
| ⚠️ Fixed | Metadata was auto-corrected |
| ❌ Removed | Entry could not be verified (likely hallucination) |
Project Structure
RefCheck/
├── main.py              # Entry point & workflow orchestration
├── requirements.txt     # Python dependencies
├── README.md
└── src/
    ├── fetcher.py       # API clients for academic databases
    ├── comparator.py    # Metadata comparison & scoring
    ├── parser.py        # BibTeX parsing & saving
    └── utils.py         # Progress display & text utilities
License
MIT License. See LICENSE for details.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Acknowledgments
Built with:
- bibtexparser for BibTeX handling
- Rich for beautiful terminal output
- APIs from arXiv, CrossRef, DBLP, Semantic Scholar, and OpenAlex