	---
	base_model:
	- google/flan-t5-xxl
	library_name: transformers
	license: apache-2.0
	datasets:
	- iedeveci/WillowNLtoFOL
	- yuan-yang/MALLS-v0
	language:
	- en
	pipeline_tag: translation
	tags:
	- NLTOFOL
	- NL
	- FOL
	---

	# Model Card for fvossel/flan-t5-xxl-nl-to-fol

	This model contains LoRA adapter weights for the base model [`google/flan-t5-xxl`](https://huggingface.co/google/flan-t5-xxl). It was trained to translate natural language statements into First-Order Logic (FOL) representations.

	## Model Details

	### Model Description

	- Developed by: Vossel et al. at Osnabrück University
	- Funded by: Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), project number 456666331
	- Model type: Encoder-decoder sequence-to-sequence model (T5 architecture)
	- Language(s) (NLP): English, FOL
	- License: This repository contains only LoRA adapter weights, trained using the base model [`google/flan-t5-xxl`](https://huggingface.co/google/flan-t5-xxl), which is released under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0). These adapter weights are also released under the Apache 2.0 License.
	- Finetuned from model: google/flan-t5-xxl

	## Uses

	### Direct Use

	This model is designed to translate natural language (NL) sentences into corresponding first-order logic (FOL) expressions. Use cases include:

	- Automated semantic parsing and formalization of NL statements into symbolic logic.
	- Supporting explainable AI systems that require symbolic reasoning based on language input.
	- Research in neurosymbolic AI, logic-based natural language understanding, and formal verification.
	- Integration into pipelines for natural language inference, question answering, or knowledge base population.

	Users should check the formulas generated by the model for correctness, with a level of rigor appropriate to the application.
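
A purely syntactic check can catch malformed outputs before any semantic review. The sketch below is illustrative and not part of the model or its training pipeline. It assumes that formulas use the Unicode quantifiers `∀` and `∃`, that predicates are written as `Name(arg, ...)`, and that single lowercase letters are variables while longer lowercase names are constants. The helper names `balanced_parentheses`, `unbound_variables`, and `check_formula` are hypothetical.

```python
import re


def balanced_parentheses(formula):
    """Return True if every ')' closes an earlier '(' and none stay open."""
    depth = 0
    for ch in formula:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def unbound_variables(formula):
    """Return single-letter arguments that no quantifier in the formula introduces.

    Quantifier scope is ignored: a variable quantified anywhere in the
    formula counts as bound everywhere, so this is only a coarse check.
    """
    bound = set(re.findall(r"[∀∃]\s*([a-z])\b", formula))
    used = set()
    # Collect the argument lists of innermost predicate applications.
    for args in re.findall(r"\w+\(([^()]*)\)", formula):
        for arg in args.split(","):
            arg = arg.strip()
            if re.fullmatch(r"[a-z]", arg):
                used.add(arg)
    return used - bound


def check_formula(formula):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if not balanced_parentheses(formula):
        problems.append("unbalanced parentheses")
    free = unbound_variables(formula)
    if free:
        problems.append("unquantified variables: " + ", ".join(sorted(free)))
    return problems


if __name__ == "__main__":
    for candidate in ["∀x (Dog(x) → Animal(x))", "∀x (Loves(x, y)"]:
        print(candidate, check_formula(candidate))
```

Passing this check says nothing about whether a formula captures the meaning of the input sentence, so it complements expert review rather than replacing it.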

	### Downstream Use

	The LoRA adapter can be further fine-tuned or combined with other models for domain-specific formalization tasks (e.g., legal, biomedical). It is also suitable for interactive systems that require formal reasoning.

	### Out-of-Scope Use

	- Not designed for general natural language generation.
	- May struggle with ambiguous, highly figurative, or out-of-domain input.
	- Outputs should not be used as final decisions in critical areas without expert review.

	### Recommendations

	- Validate outputs carefully before use in critical applications.
	- Be aware of possible biases from training data and synthetic data sources.
	- Specialized for English NL and FOL; may not generalize to other languages or logics.
	- Use human-in-the-loop workflows for sensitive tasks.
	- Intended for research and prototyping, not standalone critical systems.

	## How to Get Started with the Model

	```python
	import torch
	from transformers import T5Tokenizer, T5ForConditionalGeneration

	# Load tokenizer and model. The repository holds only LoRA adapter weights,
	# so loading it this way relies on the `peft` integration in `transformers`
	# (the `peft` package needs to be installed).
	model_path = "fvossel/flan-t5-xxl-nl-to-fol"  # or local path if not pushed to HF
	tokenizer = T5Tokenizer.from_pretrained(model_path)
	model = T5ForConditionalGeneration.from_pretrained(model_path, device_map="auto", torch_dtype=torch.bfloat16)

	# Example NL input
	nl_input = "All dogs are animals."

	# Prepend the task prompt and move the tensors to the model's device
	input_text = "translate English natural language statements into first-order logic (FOL): " + nl_input
	inputs = tokenizer(input_text, return_tensors="pt", padding=True).to(model.device)

	# Generate prediction with beam search
	with torch.no_grad():
	    outputs = model.generate(
	        inputs["input_ids"],
	        attention_mask=inputs["attention_mask"],
	        max_length=256,
	        min_length=1,
	        num_beams=5,
	        length_penalty=2.0,
	        early_stopping=False,
	    )

	# Decode and print result
	print(tokenizer.decode(outputs[0], skip_special_tokens=True))
	```
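
The snippet above translates a single sentence. To process several sentences, the same prompt prefix can be applied to each one and the prompts grouped into batches. The sketch below covers only that pure-Python preparation step. The helpers `build_prompts` and `chunk` are hypothetical names, and stripping whitespace and skipping empty inputs are choices made here, not something the model requires.

```python
PROMPT_PREFIX = "translate English natural language statements into first-order logic (FOL): "


def build_prompts(sentences):
    """Prefix every non-empty sentence with the task instruction."""
    return [PROMPT_PREFIX + s.strip() for s in sentences if s.strip()]


def chunk(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


if __name__ == "__main__":
    sentences = ["All dogs are animals.", "Some cats sleep.", "No bird is a fish."]
    for batch in chunk(build_prompts(sentences), batch_size=2):
        print(batch)
```

Each batch can then be passed to `tokenizer(batch, return_tensors="pt", padding=True)`, and the generated sequences decoded with `tokenizer.batch_decode(outputs, skip_special_tokens=True)`.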

	## Training Details

	### Training Data

	The model was fine-tuned on two datasets:

	- WillowNLtoFOL: Contains over 16,000 NL-FOL pairs. Published in:
	  Deveci, İ. E. (2024). Transformer models for translating natural language sentences into formal logical expressions.
	  Licensed under CC BY-NC-ND 4.0.

	- MALLS-v0: 34,000 NL-FOL pairs generated by GPT-4 and syntactically checked.
	  Licensed under Creative Commons Attribution-NonCommercial 4.0 (CC BY-NC 4.0), subject to OpenAI terms.

	### Training Procedure

	Fine-tuning was performed using LoRA adapters on the pre-trained `google/flan-t5-xxl` model with:

	- Prompt-based instruction tuning
	- Single-GPU training with bf16 mixed precision

	### Training Hyperparameters

	- Training regime: bf16 mixed precision
	- Batch size: 8 (per device)
	- Learning rate: 1e-4
	- Number of epochs: 12
	- Optimizer: AdamW
	- Adam epsilon: 1e-8
	- Scheduler: Linear schedule with 500 warmup steps
	- Gradient accumulation steps: 1
	- Weight decay: 0.01
	- LoRA rank (r): 16
	- LoRA alpha: 32
	- LoRA dropout: 0.05
	- Target modules: ["q", "k", "v", "o", "wi", "wo"]
	- Bias: none
	- Task type: SEQ_2_SEQ_LM
	- Early stopping patience: 4 epochs
	- Evaluation strategy: per epoch
	- Save strategy: per epoch
	- DDP parameters:
	  - `ddp_find_unused_parameters=False`
	  - `ddp_backend="nccl"`
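
To give a feel for what the LoRA settings above imply, the following sketch works through the arithmetic for a single adapted projection. It assumes standard LoRA, in which a frozen weight of shape `d_out x d_in` receives a low-rank update `B @ A` scaled by `alpha / r`. The hidden size of 4096 is an assumption about `google/flan-t5-xxl` and should be checked against the base model's `config.json`. The function names are hypothetical.

```python
LORA_R = 16      # LoRA rank from the hyperparameter list
LORA_ALPHA = 32  # LoRA alpha from the hyperparameter list
D_MODEL = 4096   # ASSUMPTION: hidden size of the base model


def lora_scaling(r, alpha):
    """Factor applied to the low-rank update in standard LoRA."""
    return alpha / r


def lora_param_count(d_in, d_out, r):
    """Trainable parameters for one adapted layer: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)


if __name__ == "__main__":
    full = D_MODEL * D_MODEL
    adapter = lora_param_count(D_MODEL, D_MODEL, LORA_R)
    print(f"scaling factor: {lora_scaling(LORA_R, LORA_ALPHA)}")
    print(f"full projection weights: {full}")
    print(f"adapter weights: {adapter} ({adapter / full:.2%} of the projection)")
```

Under these assumptions each square projection gains 131,072 trainable parameters, under 1% of the 16,777,216 frozen weights it adapts.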


	## Citation

	If you use this model for scientific purposes, please cite the following paper:

	```bibtex
	@misc{vossel2025advancingnaturallanguageformalization,
	  title={Advancing Natural Language Formalization to First Order Logic with Fine-tuned LLMs},
	  author={Felix Vossel and Till Mossakowski and Björn Gehrke},
	  year={2025},
	  eprint={2509.22338},
	  archivePrefix={arXiv},
	  primaryClass={cs.CL},
	  url={https://arxiv.org/abs/2509.22338},
	}
	```