Instructions for using uisikdag/qwen3-14b-tr-wiki-monthly-qlora with libraries, inference servers, and local apps.

Libraries

PEFT

How to use uisikdag/qwen3-14b-tr-wiki-monthly-qlora with PEFT:

from peft import PeftModel
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-14B")
model = PeftModel.from_pretrained(base_model, "uisikdag/qwen3-14b-tr-wiki-monthly-qlora")

Transformers

How to use uisikdag/qwen3-14b-tr-wiki-monthly-qlora with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="uisikdag/qwen3-14b-tr-wiki-monthly-qlora")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("uisikdag/qwen3-14b-tr-wiki-monthly-qlora", dtype="auto")


vLLM

How to use uisikdag/qwen3-14b-tr-wiki-monthly-qlora with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "uisikdag/qwen3-14b-tr-wiki-monthly-qlora"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "uisikdag/qwen3-14b-tr-wiki-monthly-qlora",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
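The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library; it assumes the vLLM server from the previous step is listening on `localhost:8000`, and the helper names `build_chat_request` and `send_chat_request` are hypothetical, introduced here for illustration only.

```python
import json
import urllib.request

MODEL_ID = "uisikdag/qwen3-14b-tr-wiki-monthly-qlora"


def build_chat_request(user_message, base_url="http://localhost:8000", model=MODEL_ID):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request to a running server and return the parsed JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not need a running server; sending it does.
request = build_chat_request("What is the capital of France?")
print(request.full_url)
```

Once the server is running, `send_chat_request(request)` performs the same POST as the curl command. Pass `base_url="http://localhost:30000"` to target a server on a different port.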

Use Docker

docker model run hf.co/uisikdag/qwen3-14b-tr-wiki-monthly-qlora

SGLang

How to use uisikdag/qwen3-14b-tr-wiki-monthly-qlora with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "uisikdag/qwen3-14b-tr-wiki-monthly-qlora" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "uisikdag/qwen3-14b-tr-wiki-monthly-qlora",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "uisikdag/qwen3-14b-tr-wiki-monthly-qlora" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "uisikdag/qwen3-14b-tr-wiki-monthly-qlora",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
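The curl calls above return a JSON body. The sketch below shows one way to pull the assistant's reply out of it, assuming the server follows the OpenAI-compatible chat completions layout (`choices[0].message.content`). The function name `extract_reply` is hypothetical, and the sample body is hand-written for illustration, not real server output.

```python
import json


def extract_reply(response_body):
    """Return the assistant text from an OpenAI-style chat completions response."""
    data = json.loads(response_body)
    choices = data.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Hand-written sample body (an assumption, for illustration only).
sample_body = json.dumps(
    {
        "model": "uisikdag/qwen3-14b-tr-wiki-monthly-qlora",
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": "Paris."},
                "finish_reason": "stop",
            }
        ],
    }
)

reply = extract_reply(sample_body)
print(reply)
```

Raising on an empty `choices` list keeps a malformed or error response from surfacing later as an `IndexError`.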

Docker Model Runner

How to use uisikdag/qwen3-14b-tr-wiki-monthly-qlora with Docker Model Runner:
```
docker model run hf.co/uisikdag/qwen3-14b-tr-wiki-monthly-qlora
```

Fine-tuning Code

Trained on the first 50,000 records of the dataset:

from datasets import load_dataset

dataset = load_dataset("omarkamali/wikipedia-monthly", "20251001.tr", split="train[:50000]")

Inference Code

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

def main():
    model_name = "Qwen/Qwen3-14B"
    adapter_path = "uisikdag/qwen3-14b-tr-wiki-monthly-qlora"
    
    print(f"Loading {model_name}...")

    try:
        tokenizer = AutoTokenizer.from_pretrained(model_name)
    except Exception as e:
        print(f"Error loading tokenizer: {e}")
        return

    # Locate is_bitsandbytes_available; its import path differs between transformers versions.
    try:
        from transformers.utils import is_bitsandbytes_available
    except ImportError:
        # Fallback import path for transformers versions that lack the one above
        from transformers.integrations import is_bitsandbytes_available

    if not is_bitsandbytes_available():
        # Without bitsandbytes, load the unquantized weights in float16.
        print("bitsandbytes is not installed; falling back to float16. Install bitsandbytes for 4-bit loading.")
        bnb_config = None
        torch_dtype = torch.float16
    else:
        from transformers import BitsAndBytesConfig
        # 4-bit NF4 quantization with float16 compute
        bnb_config = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_compute_dtype=torch.float16,
        )
        torch_dtype = None

    try:
        base_model = AutoModelForCausalLM.from_pretrained(
            model_name,
            quantization_config=bnb_config,
            device_map="auto",
            torch_dtype=torch_dtype
        )
    except Exception as e:
        print(f"Error loading base model: {e}")
        return

    print(f"Loading adapter from {adapter_path}...")
    try:
        model = PeftModel.from_pretrained(base_model, adapter_path)
    except Exception as e:
        print(f"Error loading adapter (ensure training finished): {e}")
        print("Falling back to base model only for demonstration...")
        model = base_model

    print("\n--- Qwen3 14B Inference (Type 'quit' to exit) ---")
    
    messages = []
    
    while True:
        user_input = input("User: ")
        if user_input.lower() in ["quit", "exit"]:
            break
            
        messages.append({"role": "user", "content": user_input})
        
        text = tokenizer.apply_chat_template(
            messages,
            tokenize=False,
            add_generation_prompt=True
        )
        
        model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
        
        generated_ids = model.generate(
            **model_inputs,
            max_new_tokens=1024,
            do_sample=True,
            temperature=0.7
        )
        
        input_ids_length = model_inputs.input_ids.shape[1]
        
        generated_ids = [
            output_ids[input_ids_length:] for output_ids in generated_ids
        ]
        
        response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
        
        print(f"Assistant: {response}")
        messages.append({"role": "assistant", "content": response})

if __name__ == "__main__":
    main()
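The script above loads the base model in 4-bit NF4 when bitsandbytes is available and in float16 otherwise. The following back-of-the-envelope sketch compares the weight memory of the two paths. It assumes a nominal 14 billion parameters (taken from the "14B" in the model name, not an exact count) and counts weights only; quantization constants, layers kept in higher precision, the adapter, activations, and the KV cache are all ignored. The function name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Approximate memory for model weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3


# Assumption: nominal parameter count implied by "14B", not the exact figure.
NUM_PARAMS = 14e9

fp16_gib = weight_memory_gib(NUM_PARAMS, 16)  # float16 fallback path
nf4_gib = weight_memory_gib(NUM_PARAMS, 4)    # 4-bit NF4 path

print(f"float16 weights: {fp16_gib:.1f} GiB")
print(f"4-bit weights:   {nf4_gib:.1f} GiB")
```

Under these assumptions the float16 path needs roughly 26 GiB for weights against roughly 6.5 GiB for 4-bit, which suggests why the float16 fallback may not fit on a single consumer GPU.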

Model tree for uisikdag/qwen3-14b-tr-wiki-monthly-qlora

Base model: Qwen/Qwen3-14B-Base
Finetuned: Qwen/Qwen3-14B
Adapter (234): this model (uisikdag/qwen3-14b-tr-wiki-monthly-qlora)

Dataset used to train uisikdag/qwen3-14b-tr-wiki-monthly-qlora: omarkamali/wikipedia-monthly