---
base_model:
- google/flan-t5-xxl
library_name: transformers
license: apache-2.0
datasets:
- iedeveci/WillowNLtoFOL
- yuan-yang/MALLS-v0
language:
- en
pipeline_tag: translation
tags:
- NLTOFOL
- NL
- FOL
---

# Model Card for fvossel/flan-t5-xxl-nl-to-fol

This model contains **LoRA adapter weights** for the base model [`google/flan-t5-xxl`](https://huggingface.co/google/flan-t5-xxl). It was trained to translate **natural language statements into First-Order Logic (FOL)** representations.

## Model Details

### Model Description

- **Developed by:** Vossel et al. at Osnabrück University  
- **Funded by:** Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) 456666331  
- **Model type:** Encoder-decoder sequence-to-sequence model (T5 architecture)  
- **Language(s) (NLP):** English, FOL  
- **License:** This repository contains **only LoRA adapter weights**, trained using the base model [`google/flan-t5-xxl`](https://huggingface.co/google/flan-t5-xxl), which is released under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0). These adapter weights are also released under the **Apache 2.0 License**.  
- **Finetuned from model:** google/flan-t5-xxl

## Uses

### Direct Use

This model is designed to translate natural language (NL) sentences into corresponding first-order logic (FOL) expressions. Use cases include:

- Automated semantic parsing and formalization of NL statements into symbolic logic.  
- Supporting explainable AI systems that require symbolic reasoning based on language input.  
- Research in neurosymbolic AI, logic-based natural language understanding, and formal verification.  
- Integration into pipelines for natural language inference, question answering, or knowledge base population.

Users should validate the generated formulas for correctness, to the degree their application requires.
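As a first line of defense, a cheap syntactic sanity check can catch obviously malformed outputs before any deeper review. The sketch below is a minimal illustration, not part of the released model: it assumes the model emits Unicode quantifiers (`∀`, `∃`) and predicate applications such as `Dog(x)`, and it treats single lowercase letters as variables. The helper names `balanced_parentheses` and `unquantified_variables` are hypothetical. It does not check quantifier scope or semantic correctness.

```python
import re

# Assumption: quantifiers are written as "∀x" / "∃y" (optionally with a space).
QUANTIFIER = re.compile(r"[∀∃]\s*([a-z]\w*)")
# Assumption: predicate applications look like Name(arg1, arg2) with no nested calls.
PREDICATE_ARGS = re.compile(r"[A-Za-z_]\w*\(([^()]*)\)")


def balanced_parentheses(formula: str) -> bool:
    """Return True if every '(' has a matching ')' in the right order."""
    depth = 0
    for ch in formula:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def unquantified_variables(formula: str) -> set:
    """Return single-letter arguments that no quantifier in the formula names.

    Scope is ignored: a variable counts as quantified if it appears after
    any ∀ or ∃ anywhere in the formula.
    """
    quantified = set(QUANTIFIER.findall(formula))
    used = set()
    for args in PREDICATE_ARGS.findall(formula):
        for arg in args.split(","):
            arg = arg.strip()
            if re.fullmatch(r"[a-z]", arg):
                used.add(arg)
    return used - quantified
```

A formula that fails either check is almost certainly wrong; a formula that passes both still needs semantic review.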

### Downstream Use

The LoRA adapter can be further fine-tuned or combined with other models for domain-specific formalization tasks (e.g., legal, biomedical). It is also suitable for interactive systems that require formal reasoning.

### Out-of-Scope Use

- Not designed for general natural language generation.  
- May struggle with ambiguous, highly figurative, or out-of-domain input.  
- Outputs should not be used as final decisions in critical areas without expert review.

### Recommendations

- Validate outputs carefully before use in critical applications.  
- Be aware of possible biases from training data and synthetic data sources.  
- Specialized for English NL and FOL; may not generalize to other languages or logics.  
- Use human-in-the-loop workflows for sensitive tasks.  
- Intended for research and prototyping, not standalone critical systems.

## How to Get Started with the Model

```python
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

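# Load tokenizer and model. The repository holds only LoRA adapter weights,
# so the `peft` package must be installed for transformers to load them on top of the base model.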
model_path = "fvossel/flan-t5-xxl-nl-to-fol"  # or local path if not pushed to HF
tokenizer = T5Tokenizer.from_pretrained(model_path)
model = T5ForConditionalGeneration.from_pretrained(model_path, device_map="auto", torch_dtype=torch.bfloat16)

# Example NL input
nl_input = "All dogs are animals."

# Preprocess prompt
input_text = "translate English natural language statements into first-order logic (FOL): " + nl_input
inputs = tokenizer(input_text, return_tensors="pt", padding=True).to(model.device)

# Generate prediction
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_length=256,
        min_length=1,
        num_beams=5,
        length_penalty=2.0,
        early_stopping=False,
    )

# Decode and print result
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
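Because the repository contains only adapter weights, the adapter can also be attached to the base model explicitly with the `peft` library. The following is a minimal sketch under a few assumptions: the helper names (`build_prompt`, `load_model`, `translate`) are hypothetical, the tokenizer is taken from the base model, and the prompt prefix is copied from the example above. Heavy imports are kept inside the functions so that the prompt helper can be used on its own.

```python
PROMPT_PREFIX = (
    "translate English natural language statements into "
    "first-order logic (FOL): "
)


def build_prompt(sentence: str) -> str:
    """Prepend the task prefix used in the getting-started example."""
    return PROMPT_PREFIX + sentence.strip()


def load_model(
    adapter_id: str = "fvossel/flan-t5-xxl-nl-to-fol",
    base_id: str = "google/flan-t5-xxl",
):
    """Load the base model in bf16 and attach the LoRA adapter to it."""
    import torch
    from peft import PeftModel
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    # Assumption: the base model's tokenizer is the one used for training.
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForSeq2SeqLM.from_pretrained(
        base_id, torch_dtype=torch.bfloat16, device_map="auto"
    )
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return tokenizer, model


def translate(sentences, tokenizer, model, max_length=256, num_beams=5):
    """Translate a batch of sentences; returns one decoded string per input."""
    import torch

    prompts = [build_prompt(s) for s in sentences]
    inputs = tokenizer(prompts, return_tensors="pt", padding=True).to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs, max_length=max_length, num_beams=num_beams
        )
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

Calling `load_model()` once and then `translate(["All dogs are animals."], tokenizer, model)` avoids reloading the weights for every sentence. Passing the full tokenizer output to `generate` also forwards the attention mask, which matters once inputs are padded in a batch.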

## Training Details

### Training Data

The model was fine-tuned on two datasets:

- **WillowNLtoFOL:** Contains over 16,000 NL-FOL pairs. Published in:  
  Deveci, İ. E. (2024). *Transformer models for translating natural language sentences into formal logical expressions.*  
  Licensed under CC BY-NC-ND 4.0.

- **MALLS-v0:** 34,000 NL-FOL pairs generated by GPT-4, syntactically checked.  
  Licensed under CC BY-NC 4.0 (Attribution-NonCommercial 4.0), subject to OpenAI terms.

### Training Procedure

Fine-tuning was performed using LoRA adapters on the pre-trained `google/flan-t5-xxl` model with:

- Prompt-based instruction tuning  
- Single-GPU training with bf16 mixed precision
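
To make "prompt-based instruction tuning" concrete, the sketch below shows how an NL-FOL pair might be turned into a source/target text pair for a sequence-to-sequence model. It assumes the training prompt matches the inference prompt from the getting-started example; the function name `to_seq2seq_example`, the dictionary keys, and the sample pair are illustrative and not taken from the training datasets.

```python
# Assumption: training used the same task prefix as the inference example.
PROMPT_PREFIX = (
    "translate English natural language statements into "
    "first-order logic (FOL): "
)


def to_seq2seq_example(nl: str, fol: str) -> dict:
    """Build the encoder input and decoder target for one NL-FOL pair."""
    return {
        "input_text": PROMPT_PREFIX + nl.strip(),
        "target_text": fol.strip(),
    }


# Illustrative pair only; not an actual dataset row.
pairs = [("All dogs are animals.", "∀x (Dog(x) → Animal(x))")]
examples = [to_seq2seq_example(nl, fol) for nl, fol in pairs]
```

Each `input_text` would then be tokenized as the encoder input and each `target_text` as the labels.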

### Training Hyperparameters

- **Training regime:** bf16 mixed precision  
- **Batch size:** 8 (per device)  
- **Learning rate:** 1e-4  
- **Number of epochs:** 12  
- **Optimizer:** AdamW  
- **Adam epsilon:** 1e-8  
- **Scheduler:** Linear warmup with 500 steps  
- **Gradient accumulation steps:** 1  
- **Weight decay:** 0.01  
- **LoRA rank (r):** 16  
- **LoRA alpha:** 32  
- **LoRA dropout:** 0.05  
- **Target modules:** ["q", "k", "v", "o", "wi", "wo"]  
- **Bias:** none  
- **Task type:** SEQ_2_SEQ_LM  
- **Early stopping patience:** 4 epochs  
- **Evaluation strategy:** per epoch  
- **Save strategy:** per epoch  
- **DDP parameters:**  
  - `ddp_find_unused_parameters=False`  
  - `ddp_backend="nccl"`
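
The LoRA settings listed above map directly onto a `peft` `LoraConfig`. The following sketch shows one way to express them; the names `LORA_KWARGS`, `lora_scaling`, and `build_lora_config` are hypothetical, and the original training script may have been organized differently. The `peft` import is deferred so the plain dictionary can be inspected without the library installed.

```python
# Values copied from the hyperparameter list above.
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": ["q", "k", "v", "o", "wi", "wo"],
    "bias": "none",
    "task_type": "SEQ_2_SEQ_LM",
}


def lora_scaling(kwargs: dict = LORA_KWARGS) -> float:
    """Standard LoRA scaling factor applied to the low-rank update: alpha / r."""
    return kwargs["lora_alpha"] / kwargs["r"]


def build_lora_config():
    """Create the peft configuration object (requires `peft` to be installed)."""
    from peft import LoraConfig

    return LoraConfig(**LORA_KWARGS)
```

With rank 16 and alpha 32, the low-rank update is scaled by a factor of 2 under the standard LoRA formulation.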


## Citation

If you use this model for scientific purposes, **please cite the following paper**:

```
@misc{vossel2025advancingnaturallanguageformalization,
      title={Advancing Natural Language Formalization to First Order Logic with Fine-tuned LLMs}, 
      author={Felix Vossel and Till Mossakowski and Björn Gehrke},
      year={2025},
      eprint={2509.22338},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2509.22338}, 
}
```