---
base_model:
- google/flan-t5-xxl
library_name: transformers
license: apache-2.0
datasets:
- iedeveci/WillowNLtoFOL
- yuan-yang/MALLS-v0
language:
- en
pipeline_tag: translation
tags:
- NLTOFOL
- NL
- FOL
---

# Model Card for fvossel/flan-t5-xxl-nl-to-fol

This model contains **LoRA adapter weights** for the base model [`google/flan-t5-xxl`](https://huggingface.co/google/flan-t5-xxl). It was trained to translate **natural language statements into First-Order Logic (FOL)** representations.

## Model Details

### Model Description

- **Developed by:** Vossel et al. at Osnabrück University
- **Funded by:** Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) 456666331
- **Model type:** Encoder-decoder sequence-to-sequence model (T5 architecture)
- **Language(s) (NLP):** English, FOL
- **License:** This repository contains **only LoRA adapter weights**, trained using the base model [`google/flan-t5-xxl`](https://huggingface.co/google/flan-t5-xxl), which is released under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0). These adapter weights are also released under the **Apache 2.0 License**.
- **Finetuned from model:** google/flan-t5-xxl

## Uses

### Direct Use

This model is designed to translate natural language (NL) sentences into corresponding first-order logic (FOL) expressions. Use cases include:

- Automated semantic parsing and formalization of NL statements into symbolic logic.
- Supporting explainable AI systems that require symbolic reasoning based on language input.
- Research in neurosymbolic AI, logic-based natural language understanding, and formal verification.
- Integration into pipelines for natural language inference, question answering, or knowledge base population.

Users should verify and validate symbolic formulas generated by the model for correctness depending on the application.

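As an inexpensive first step toward that validation, a surface-level check can catch malformed output before any semantic review. The sketch below is illustrative only: `check_fol_surface` is a hypothetical helper, not part of this repository, and it assumes the generated formulas use ordinary round parentheses, as in `∀x (Dog(x) → Animal(x))` (the exact output notation is an assumption here). It checks parenthesis balance and non-emptiness only, not logical correctness.

```python
def check_fol_surface(formula: str) -> list[str]:
    """Return a list of surface-level problems; an empty list means none were found.

    Hypothetical helper: it only verifies that the string is non-empty and
    that round parentheses are balanced. It does not parse the formula.
    """
    problems = []
    if not formula.strip():
        problems.append("empty output")

    depth = 0  # number of currently open '('
    for position, char in enumerate(formula):
        if char == "(":
            depth += 1
        elif char == ")":
            if depth == 0:
                problems.append(f"unmatched ')' at position {position}")
            else:
                depth -= 1
    if depth > 0:
        problems.append(f"{depth} unclosed '('")
    return problems


if __name__ == "__main__":
    # Assumed example outputs, used purely for illustration.
    for candidate in ["∀x (Dog(x) → Animal(x))", "∀x (Dog(x) → Animal(x)"]:
        print(candidate, "->", check_fol_surface(candidate) or "no surface problems")
```

A formula that passes this check can still be semantically wrong, so it complements rather than replaces a proper parser or human review.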
### Downstream Use

The LoRA adapter can be further fine-tuned or combined with other models for domain-specific formalization tasks (e.g., legal, biomedical). It is also suitable for interactive systems that require formal reasoning.

### Out-of-Scope Use

- Not designed for general natural language generation.
- May struggle with ambiguous, highly figurative, or out-of-domain input.
- Outputs should not be used as final decisions in critical areas without expert review.

### Recommendations

- Validate outputs carefully before use in critical applications.
- Be aware of possible biases from training data and synthetic data sources.
- Specialized for English NL and FOL; may not generalize to other languages or logics.
- Use human-in-the-loop workflows for sensitive tasks.
- Intended for research and prototyping, not standalone critical systems.

## How to Get Started with the Model

```python
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load tokenizer and model.
# The repository holds adapter weights only, so loading it this way relies on
# the transformers PEFT integration; the `peft` package must be installed.
model_path = "fvossel/flan-t5-xxl-nl-to-fol"  # or local path if not pushed to HF
tokenizer = T5Tokenizer.from_pretrained(model_path)
model = T5ForConditionalGeneration.from_pretrained(
    model_path,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Example NL input
nl_input = "All dogs are animals."

# Prepend the instruction prompt, then tokenize and move the tensors
# to the device that holds the model's first parameters.
input_text = "translate English natural language statements into first-order logic (FOL): " + nl_input
inputs = tokenizer(input_text, return_tensors="pt", padding=True).to(model.device)

# Generate prediction with beam search
with torch.no_grad():
    outputs = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=256,
        min_length=1,
        num_beams=5,
        length_penalty=2.0,
        early_stopping=False,
    )

# Decode and print result
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training Details

### Training Data

The model was fine-tuned on two datasets:

- **WillowNLtoFOL:** Contains over 16,000 NL-FOL pairs. Published in: Deveci, İ. E. (2024).
  *Transformer models for translating natural language sentences into formal logical expressions.* Licensed under CC BY-NC-ND 4.0.
- **MALLS-v0:** 34,000 NL-FOL pairs generated by GPT-4, syntactically checked. Licensed under Attribution-NonCommercial 4.0, subject to OpenAI terms.

### Training Procedure

Fine-tuning was performed using LoRA adapters on the pre-trained `google/flan-t5-xxl` model with:

- Prompt-based instruction tuning
- Single-GPU training with bf16 mixed precision

### Training Hyperparameters

- **Training regime:** bf16 mixed precision
- **Batch size:** 8 (per device)
- **Learning rate:** 1e-4
- **Number of epochs:** 12
- **Optimizer:** AdamW
- **Adam epsilon:** 1e-8
- **Scheduler:** Linear warmup with 500 steps
- **Gradient accumulation steps:** 1
- **Weight decay:** 0.01
- **LoRA rank (r):** 16
- **LoRA alpha:** 32
- **LoRA dropout:** 0.05
- **Target modules:** ["q", "k", "v", "o", "wi", "wo"]
- **Bias:** none
- **Task type:** SEQ_2_SEQ_LM
- **Early stopping patience:** 4 epochs
- **Evaluation strategy:** per epoch
- **Save strategy:** per epoch
- **DDP parameters:**
  - `ddp_find_unused_parameters=False`
  - `ddp_backend="nccl"`

# Citation

If you use this model for scientific purposes, **please cite the following paper**:

```
@misc{vossel2025advancingnaturallanguageformalization,
      title={Advancing Natural Language Formalization to First Order Logic with Fine-tuned LLMs},
      author={Felix Vossel and Till Mossakowski and Björn Gehrke},
      year={2025},
      eprint={2509.22338},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2509.22338},
}
```
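
For readers who want to reproduce the adapter setup, the LoRA values listed under Training Hyperparameters could be written as a `peft` configuration roughly as follows. This is a sketch under assumptions, not the authors' training script: the names `LORA_KWARGS`, `lora_scaling`, and `build_lora_config` are hypothetical, and only the numeric values and module names come from this card.

```python
# LoRA settings copied from the Training Hyperparameters list in this card.
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": ["q", "k", "v", "o", "wi", "wo"],
    "bias": "none",
    "task_type": "SEQ_2_SEQ_LM",
}


def lora_scaling(kwargs: dict) -> float:
    """Standard LoRA scales the low-rank update by lora_alpha / r."""
    return kwargs["lora_alpha"] / kwargs["r"]


def build_lora_config(kwargs: dict = LORA_KWARGS):
    """Build a peft LoraConfig from the settings above (hypothetical helper)."""
    # Imported lazily so the rest of this sketch runs without peft installed.
    from peft import LoraConfig

    return LoraConfig(**kwargs)


if __name__ == "__main__":
    print("LoRA scaling factor (alpha / r):", lora_scaling(LORA_KWARGS))
```

With rank 16 and alpha 32, the low-rank update is scaled by a factor of 2. The resulting configuration would typically be passed to `peft.get_peft_model` together with the base model before training.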