---
language:
- en
tags:
- building-energy-modeling
- bem
- energyplus
- civil-engineering
- large-language-model
- llama3
- fine-tuning
- generative-ai
license: other
base_model:
- meta-llama/Llama-3.2-11B-Vision-Instruct
new_version: xfu20/BEMGPT_1.0.1
pipeline_tag: document-question-answering
metrics:
- bleu
- bertscore
- rouge
---
<center>
<div style="text-align: center;">
  <img 
    src="https://i.postimg.cc/525CTw6T/Buildings62.png" 
    alt="BEMGPT"
    style="width: 50%; max-width: 50%; height: auto; display: inline-block; margin-bottom: 0.5em; margin-top: 0.5em;"
  />
</div>
</center>

## 1. Model Information

Model Name: BEMGPT
Finetuned From: Meta Llama 3.2 11B Vision Instruct
Developers: Xiaoqin Fu and Liang Zhang @ TensorBuild Lab, Department of Civil and Architectural Engineering and Mechanics (CAEM), University of Arizona
Release Year: 2025

### Overview

BEMGPT is a large language model (LLM) fine-tuned specifically for Building Energy Modeling (BEM). It addresses the complexity of BEM data and the fragmented nature of domain knowledge by incorporating domain-specific expertise through fine-tuning.

By leveraging the generative capabilities of large language models, BEMGPT enhances knowledge representation, reasoning, and automation within the BEM workflow—making energy modeling more intelligent, efficient, and accessible.

## How to use

Question-Answering for Building Energy

### How to Get Started with the Model

Use the code below to download and run the model.

https://github.com/fuArizona/BEMGPT/tree/main/code


## 2. Training Details

### Training Source

EnergyPlus documentation such as engineering reference and input-output reference

#### Training Methods

 - LoRA (Low-Rank Adaptation)
 - QLoRA (Quantized Low-Rank Adaptation)
 - DoRA (Weight-Decomposed Low-Rank Adaptation)

......
   
## 3. Evaluation

### Testing Data & Metrics

#### Metrics
 - BLEU: Measures token-level overlap between generated and reference texts 
 - ROUGE: Recall-oriented similarity metrics, including ROUGE-1, ROUGE-2, and ROUGE-L 
 - BERTScore: Embedding-based semantic similarity
 - Score evaluated by the OpenAI model gpt-4o (2024-11-20 version): Human-like correctness score (0.00–100.00) 

Example GPT-4o Evaluation Prompt:

    prompt = f"""
   
    You are an expert evaluator and semantic grader. Grade the correctness of the provided answer as concise as possible on a decimal ranging from 0.00 to 100.00, where:
   
    100.00 = Completely correct, perfectly matches the reference text, or provides necessary information and details
   
    90.00 = Mostly correct, very accurate and comprehensive, minor details may be missing
   
    80.00 = Moderately correct, mostly accurate but may have minor errors or omissions
   
    70.00 = Partially correct, contains some inaccuracies, or misses information
   
    60.00 = Only little correct, contains few accuracies, or misses most information
   
    0.00 = Only repeated question, no answer, or completely incorrect, useless, and meaningless answer
   
    Decimals are acceptable and encouraged to reflect partial correctness.
   
    ......
   
    Use the reference text to evaluate the answer. If the reference text is empty, incomprehensible, unsuitable, or useless, please do not use the reference text and directly evaluate the answer.

    Ignore any interrogative sentences within the answer. Return only the decimal score (e.g., 97.23), with no additional text.

    Instruction: {instruction}
   
    Question: {question}
   
    Answer: {provided_answer}
   
    Reference Text: "{text}"
    """

## 4. Technical Specifications

#### Hardware
 - CPUs: 2 × Intel Xeon Gold 6438Y+ (32 cores @ 2.0GHz, 60MB cache, 205W)
 - Memory: 16 × 64GB DDR5 ECC RDIMM (4800MHz)
 - Storage:
   - 2 × 960GB Micron 7450 PRO M.2 NVMe
   - 15.36TB Micron 7450 PRO U.3 NVMe
 - GPUs:
   - 3 × NVIDIA H100 (80GB HBM2e, PCIe 5.0)
   - 1 × NVIDIA T400 (4GB GDDR6, PCIe 3.0)
 - Network: Broadcom NetXtreme 10Gb Ethernet (2× SFP+)

#### Software
 - OS: Ubuntu 22.04 LTS (64-bit)
 - Environment: Anaconda 24.11.3
 - Python: 3.12.9
 - Libraries: torch, transformers, huggingface_hub, datasets, peft, trl, nltk, rouge, bert_score, numpy, etc.

## 5. What are new?
 - Domain-Specific Fine-Tuning for BEM: BEMGPT is the first large language model fine-tuned specifically for BEM, addressing the lack of domain expertise in general-purpose LLMs.
 - Embedded Domain Knowledge: Unlike Retrieval-Augmented Generation (RAG), BEMGPT integrates domain knowledge directly into the model parameters, making it a cost-effective alternative that reduces inference latency and dependency on external databases.
 - Foundation for Intelligent, AI-Driven BEM Tools: BEMGPT establishes a new foundation for intelligent automation in BEM workflows, supporting more efficient, scalable, and accessible applications across the building energy sector.

## 6. Who should use this model?

#### Design and construction professionals
 - Architects and designers
 - Mechanical, Electrical, and Plumbing (MEP) engineers

#### Building owners and operators
 - Building owners
 - Facility and energy managers

#### Government agencies and policymakers
 - Local, state, and federal agencies
 - Urban planners
 - Code officials
   
#### Utility companies
 - Electric and gas utilities

#### Researchers and software developers
 - Researchers
 - Software developers

#### Manufacturers
 - Equipment manufacturers

## 7. Contact

 If you have any questions, please raise an issue or contact us at fuxiaoqin@arizona.edu and liangzhang1@arizona.edu.