aitf-kpm-ugm/Qwen3-4B-KPM3

Model Description

This repository contains a finetuned version of the aitf-kpm-ugm/Qwen3-4B-CPT-Base model, specifically trained for structured issue analysis based on news articles. The model is finetuned using Supervised Finetuning (SFT) to generate JSON-formatted output containing an analysis of a given news text, including the main issue, background, recent developments, official government statements, and public figures' opinions.

Usage

To use this model, you'll need the transformers, peft, and torch libraries. You can load the model and tokenizer as follows:

import os

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Model paths
base_model_name = "aitf-kpm-ugm/Qwen3-4B-CPT-Base"
tokenizer_name = "aitf-kpm-ugm/Qwen3-4B-CPT-Base"
adapter_path = "/content/drive/MyDrive/AITF tim 3/SFT/model/qwen3_4b8k_SFT_10K" # Update with your actual saved adapter path

# Read HF_TOKEN from Colab secrets when available, otherwise from the environment
try:
    from google.colab import userdata
    hf_token = userdata.get('HF_TOKEN')
except ImportError:
    hf_token = os.environ.get('HF_TOKEN')

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(tokenizer_name, token=hf_token)

# Set the chat template (crucial for consistent formatting)
tokenizer.chat_template = (
    "{% for message in messages %}"
    "{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>\n'}}"
    "{% endfor %}"
    "{% if add_generation_prompt %}"
    "{{ '<|im_start|>assistant\n' }}"
    "{% endif %}"
)

# Load base model
model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    token=hf_token
)

# Load and merge LoRA adapter
model = PeftModel.from_pretrained(model, adapter_path)
model = model.merge_and_unload() # Merge LoRA weights into the base model
model.eval()

# Define the prompt building function (matching training format)
def build_prompt(news_text):
    messages = [
        {"role": "system", "content": "Kamu adalah AI analis isu publik. Tugasmu adalah menghasilkan analisis isu terstruktur berdasarkan berita dalam format JSON yang valid dan dapat diparsing. Jangan menghasilkan teks di luar JSON. Gunakan bahasa formal, objektif, dan hanya berdasarkan informasi pada berita dan jangan menambahkan asumsi di luar konteks.."},
        {"role": "user", "content": f"Susun analisis strategis dari berita berikut ke dalam JSON terstruktur. Sertakan penjelasan isu utama, konteks permasalahan, perkembangan terkini, pernyataan resmi pemerintah, serta opini tokoh publik yang dikutip.\n\n{news_text}"}
    ]
    return tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Example usage
news_article = "Kementerian Komunikasi dan Digital (Kemkomdigi) mengadakan rapat koordinasi untuk membahas regulasi baru terkait penggunaan teknologi AI dalam pelayanan publik. Menteri Komunikasi dan Digital, Budi Arie Setiadi, menyatakan bahwa regulasi ini penting untuk memastikan pemanfaatan AI yang etis dan aman. Diskusi melibatkan berbagai pakar dari industri teknologi dan akademisi." # Replace with your news text
prompt = build_prompt(news_article)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_tokens = model.generate(
        **inputs,
        max_new_tokens=2048, # Adjust as needed
        do_sample=True,
        temperature=0.2,
        top_p=0.9,
        repetition_penalty=1.1,
        # stopping_criteria will be needed if you want to stop generation at '}'
    )

response = tokenizer.decode(output_tokens[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
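
Because sampling can leave stray text before or after the JSON object, it may help to extract and parse the object before using it downstream. The following is a minimal sketch using only the standard library. `extract_json` is a hypothetical helper, not part of this repository, and the keys in the sample string are placeholders, since the model's actual field names are not listed here.

```python
import json


def extract_json(text):
    """Return the first JSON object found in text, or None if none parses."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at the given index
            obj, _ = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        # Try the next opening brace if this one did not start a valid object
        start = text.find("{", start + 1)
    return None


# Placeholder output; the real keys depend on the training data
sample = 'Output: {"isu_utama": "Regulasi AI", "tokoh": []} <extra>'
print(extract_json(sample))
```

In the usage code above, the same helper could be applied to `response`; a `None` result would indicate that the generation did not contain a parseable JSON object.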

Training Details

Base Model

Model: aitf-kpm-ugm/Qwen3-4B-CPT-Base
Tokenizer: aitf-kpm-ugm/Qwen3-4B-CPT-Base (with custom chat template)

Finetuning Methodology

The model was finetuned with Supervised Finetuning (SFT), using Low-Rank Adaptation (LoRA) for parameter-efficient training. Training was run with the SFTTrainer from the trl library.

Dataset

Name: dataset_sft_rag_standby_full_merged_final.jsonl
Generation Method: The synthetic dataset used for training was generated using Claude Haiku.
Format: JSONL, containing messages field structured for chat template (system, user, assistant roles).
Size: Approximately 9,670 training examples and 1,080 evaluation examples after filtering for token length.
Token Length Filtering: Examples longer than 10240 tokens were filtered out, matching the max_seq_length used for training. The Qwen3-4B architecture supports contexts of this length.
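
The record format and the length filter described above can be sketched as follows. This is an illustration under stated assumptions, not the team's preprocessing script: `filter_by_length` and `whitespace_count` are hypothetical names, and the whitespace counter is only a stand-in for the real tokenizer (which would presumably count tokens after applying the chat template).

```python
import json


def filter_by_length(lines, count_tokens, max_tokens=10240):
    """Keep JSONL records whose messages total at most max_tokens tokens."""
    kept = []
    for line in lines:
        record = json.loads(line)
        total = sum(count_tokens(m["content"]) for m in record["messages"])
        if total <= max_tokens:
            kept.append(record)
    return kept


def whitespace_count(text):
    # Stand-in for a real tokenizer; for illustration only
    return len(text.split())


# One JSONL line with the system, user, and assistant roles
example = json.dumps({
    "messages": [
        {"role": "system", "content": "system prompt"},
        {"role": "user", "content": "news text"},
        {"role": "assistant", "content": "{}"},
    ]
})
print(len(filter_by_length([example], whitespace_count)))
```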

LoRA Configuration

lora_r: 32
lora_alpha: 64
lora_dropout: 0.05
bias: "none"
task_type: "CAUSAL_LM"
target_modules: ["q_proj", "k_proj", "v_proj", "o_proj"]
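
To give a sense of what this configuration implies, the sketch below computes the LoRA scaling factor (lora_alpha / lora_r) and a rough count of trainable adapter parameters. The projection shapes and layer count are assumptions about the base architecture and should be checked against the base model's config.json; `lora_param_count` is a hypothetical helper.

```python
def lora_param_count(r, shapes, num_layers):
    """Each adapted (d_in, d_out) projection adds r * (d_in + d_out) parameters."""
    per_layer = sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * num_layers


r, alpha = 32, 64
scaling = alpha / r  # the low-rank update is scaled by lora_alpha / lora_r

# Assumed projection shapes (d_in, d_out); verify against config.json
shapes = {
    "q_proj": (2560, 4096),
    "k_proj": (2560, 1024),
    "v_proj": (2560, 1024),
    "o_proj": (4096, 2560),
}
num_layers = 36  # assumed number of transformer layers

print(scaling, lora_param_count(r, shapes, num_layers))
```

Under these assumed shapes, the adapters add roughly 23.6 million trainable parameters, a small fraction of the 4B-parameter base model.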

Training Arguments (`SFTConfig`)

output_dir: ./qwen3_4b8k_SFT
per_device_train_batch_size: 1
gradient_accumulation_steps: 16
max_seq_length: 10240
learning_rate: 1e-5
lr_scheduler_type: cosine
warmup_ratio: 0.05
num_train_epochs: 2
bf16: True (for mixed-precision training)
logging_steps: 10
eval_strategy: steps
eval_steps: 200
save_strategy: steps
save_steps: 200
gradient_checkpointing: True
report_to: wandb
packing: False
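
These arguments determine the effective batch size and the approximate length of the run. The sketch below works through the arithmetic and the shape of a linear-warmup, cosine-decay schedule. It assumes a single device (the number of GPUs is not stated here), and the exact step count can differ slightly depending on the trainer version; `schedule` and `lr_at` are hypothetical helpers.

```python
import math


def schedule(num_examples, per_device_bs, grad_accum, epochs, warmup_ratio, num_devices=1):
    """Estimate effective batch size, optimizer steps, and warmup steps."""
    effective_bs = per_device_bs * grad_accum * num_devices
    steps_per_epoch = math.ceil(num_examples / effective_bs)
    total_steps = steps_per_epoch * epochs
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return effective_bs, total_steps, warmup_steps


def lr_at(step, base_lr, total_steps, warmup_steps):
    """Linear warmup to base_lr, then cosine decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Approximately 9,670 training examples, as reported in the Dataset section
effective_bs, total_steps, warmup_steps = schedule(9670, 1, 16, 2, 0.05)
print(effective_bs, total_steps, warmup_steps)
print(lr_at(warmup_steps, 1e-5, total_steps, warmup_steps))
```

With one device this gives an effective batch size of 16 and on the order of 1,200 optimizer steps, so eval_steps and save_steps of 200 correspond to about six evaluations and checkpoints over the run.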

Completion-Only Loss

The training was configured to calculate the loss only on the assistant's response, ignoring the system and user prompts. This was achieved using DataCollatorForCompletionOnlyLM with response_template = "<|im_start|>assistant\n".
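
The idea behind completion-only loss can be illustrated with plain token ID lists: every label up to and including the response template is set to the ignore index, so only the assistant's tokens contribute to the loss. This is a simplified single-turn illustration, not trl's implementation; `mask_prompt_tokens` is a hypothetical helper and the token IDs are toy values.

```python
IGNORE_INDEX = -100  # label value ignored by PyTorch's cross-entropy loss by default


def mask_prompt_tokens(input_ids, response_template_ids):
    """Return labels with everything up to and including the template masked."""
    n = len(response_template_ids)
    for start in range(len(input_ids) - n + 1):
        if input_ids[start:start + n] == response_template_ids:
            end = start + n
            return [IGNORE_INDEX] * end + input_ids[end:]
    # Template not found: mask the whole example so it contributes no loss
    return [IGNORE_INDEX] * len(input_ids)


# Toy IDs: 90, 91 stand in for the tokens of "<|im_start|>assistant\n"
input_ids = [10, 11, 12, 90, 91, 20, 21, 22]
print(mask_prompt_tokens(input_ids, [90, 91]))
# prints [-100, -100, -100, -100, -100, 20, 21, 22]
```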

Evaluation

The model's performance was evaluated by comparing its output structure and content to the base CPT model, focusing on its ability to generate well-formed JSON analysis from news articles. Qualitative evaluation showed improved adherence to the desired JSON format and content structure after SFT.

Acknowledgements

Developed by the AITF-KPM-UGM team.
Built upon the Qwen team's Qwen3 models.
Leveraged Hugging Face transformers and trl libraries.

Safetensors

Model size: 4B params
Tensor type: F16

Model tree for aitf-kpm-ugm/Qwen3-4B-KPM3

Base model: Qwen/Qwen3-4B-Base
Finetuned: aitf-kpm-ugm/Qwen3-4B-CPT-Base