Libraries

How to use werty1248/Qwen2-7B-Korean-1930-Novel-sft with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="werty1248/Qwen2-7B-Korean-1930-Novel-sft")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("werty1248/Qwen2-7B-Korean-1930-Novel-sft")
model = AutoModelForCausalLM.from_pretrained("werty1248/Qwen2-7B-Korean-1930-Novel-sft")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))


vLLM

How to use werty1248/Qwen2-7B-Korean-1930-Novel-sft with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "werty1248/Qwen2-7B-Korean-1930-Novel-sft"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "werty1248/Qwen2-7B-Korean-1930-Novel-sft",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
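
The same request can also be sent from Python. The block below is a minimal sketch using only the standard library; it assumes the server started above is listening on `http://localhost:8000`, and the helper names `build_chat_request` and `chat` are hypothetical, chosen for illustration.

```python
import json
import urllib.request

MODEL_ID = "werty1248/Qwen2-7B-Korean-1930-Novel-sft"


def build_chat_request(messages, base_url="http://localhost:8000"):
    """Build a POST request for the OpenAI-compatible chat completions endpoint."""
    payload = {"model": MODEL_ID, "messages": messages}
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(messages, base_url="http://localhost:8000"):
    """Send the request and return the text of the first choice."""
    with urllib.request.urlopen(build_chat_request(messages, base_url)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Example call (assumption: the server is already running):
# print(chat([{"role": "user", "content": "What is the capital of France?"}]))
```

Passing a different `base_url` points the same helper at another OpenAI-compatible server.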

SGLang

How to use werty1248/Qwen2-7B-Korean-1930-Novel-sft with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "werty1248/Qwen2-7B-Korean-1930-Novel-sft" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "werty1248/Qwen2-7B-Korean-1930-Novel-sft",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "werty1248/Qwen2-7B-Korean-1930-Novel-sft" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "werty1248/Qwen2-7B-Korean-1930-Novel-sft",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use werty1248/Qwen2-7B-Korean-1930-Novel-sft with Docker Model Runner:
```
docker model run hf.co/werty1248/Qwen2-7B-Korean-1930-Novel-sft
```

Model Card

A novel-generation model driven by summarized scenarios
This model was trained to check how effective the werty1248/Korean-1930-Novel-Scene-Summarize work is.

Training Details

Dataset

werty1248/Korean-1930-Novel-Scene-Summarize
96 works of early-20th-century modern Korean literature whose copyright has expired

Preprocessing

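The first paragraph of the novel is provided together with the system prompt.
The user then provides either a scenario (50%) or an event (50%), and the assistant generates the response.
The data was converted to a 3-shot multi-turn format for training.
A prompt example is given below.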
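
The preprocessing steps above can be sketched roughly as follows. This is not the author's actual script: the function name `build_sharegpt_sample` and the per-scene keys `scenario`, `event`, and `text` are hypothetical, and "3-shot" is assumed here to mean three user/assistant turns per sample. The output layout follows the ShareGPT convention (`conversations` with `from`/`value`), which matches the `type: sharegpt` dataset setting in the Axolotl config.

```python
import random


def build_sharegpt_sample(system_prompt, first_paragraph, scenes, rng=None, shots=3):
    """Build one multi-turn training record in ShareGPT layout.

    `scenes` is a list of dicts with the hypothetical keys
    "scenario", "event", and "text" (the original scene).
    """
    if rng is None:
        rng = random.Random()
    # The system turn carries the system prompt plus the novel's first paragraph.
    conversations = [{"from": "system", "value": system_prompt + first_paragraph}]
    for scene in scenes[:shots]:
        # Use the scenario or the event as the user's guide with equal probability.
        guide_key = "scenario" if rng.random() < 0.5 else "event"
        conversations.append({"from": "human", "value": scene[guide_key]})
        conversations.append({"from": "gpt", "value": scene["text"]})
    return {"conversations": conversations}
```

Writing one such record per line with `json.dumps(sample, ensure_ascii=False)` would produce a JSONL file of the kind the config refers to as `train_data.jsonl`.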
Training used Axolotl (the full config is given below):
- LoRA: (rank=32, alpha=128)
- NefTune_alpha: 5
- total_batch_size: 8
- num_epoch: 3
Training took about 8 hours on 1x A100.
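
As a quick check of how these numbers relate, the sketch below recomputes the total batch size from the `micro_batch_size` and `gradient_accumulation_steps` values in the Axolotl config, and derives the LoRA scaling factor. The variable names are illustrative, and the scaling formula `alpha / rank` assumes the standard (non-rank-stabilized) LoRA setup.

```python
# Values taken from the Axolotl config in this model card.
micro_batch_size = 2
gradient_accumulation_steps = 4
num_gpus = 1  # "1x A100"

# Sequences contributing to each optimizer step.
total_batch_size = micro_batch_size * gradient_accumulation_steps * num_gpus

lora_r = 32
lora_alpha = 128
# Assumption: standard LoRA scales the low-rank update by alpha / rank.
lora_scaling = lora_alpha / lora_r

print(total_batch_size, lora_scaling)  # prints: 8 4.0
```

Because `sample_packing` is enabled, each of those 8 sequences may contain several packed samples of up to `sequence_len` tokens in total.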

Template & How to use

Tends to ignore user instructions.
The mixing-in of Chinese characters and English words is reduced.
Korean language ability appears to have degraded further.

Input (opening of 눈물을 마시는 새)

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

tokenizer = AutoTokenizer.from_pretrained("werty1248/Qwen2-7B-Korean-1930-Novel-sft")
model = AutoModelForCausalLM.from_pretrained("werty1248/Qwen2-7B-Korean-1930-Novel-sft", torch_dtype=torch.bfloat16).to('cuda')

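# The prompts stay in Korean, the language the model was fine-tuned on.
# system_prompt, in English: act as a novel-writing assistant that follows the user's guide
# to write 1900s-1940s modern Korean fiction, builds the plot from the scenario summary and
# the earlier turns, reflects period vocabulary, dialect, and speech levels, and keeps each
# scene to 5-10 sentences.
system_prompt = """당신은 소설 작성 어시스턴트입니다. 당신의 임무는 유저의 가이드에 따라 1900~1940년대 근대 한국 소설을 작성하는 것입니다.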
- 주어진 시나리오 요약을 확인하고, 이전 대화를 참고하여 플롯을 구성하십시오.
- 풍부한 한국어 표현 및 대화를 사용하여 창의적으로 짧은 씬을 완성하세요.
- 씬의 대사에 근대 한국 특유의 표현, 어휘, 사투리, 존댓말과 반말을 반영하십시오.
- 씬의 주요 사건에 근대 한국의 역사적, 기술적 특성을 반영하십시오.
- 씬은 5~10문장으로 구성하세요.
"""

# first_message: the novel's opening paragraph under the "### 첫 문단" ("First paragraph") header.
first_message = """### 첫 문단
하늘을 불사르던 용의 노여움도 잊혀지고
왕자들의 석비도 사토 속에 묻혀버린
그리고 그런 것들에 누구도 신경쓰지 않는
생존이 천박한 농담이 된 시대에

한 남자가 사막을 걷고 있었다.
"""

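# scenario: "### 등장인물" lists the characters (an innkeeper and 케이건 드라카); "### 시나리오" gives
# the scene summary: the innkeeper looks out at the desert and sees a man dragging a large
# sack toward the inn.
scenario = """### 등장인물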
여관 주인, 케이건 드라카

### 시나리오
여관 주인이 문득 창밖을 바라본다. 여관 주변은 온통 사막이다. 지평선 저 멀리서 한 사내가 큼지막한 자루를 질질 끌면서 여관 방향으로 걸어온다.
"""
messages = [{'role':'system', 'content':system_prompt + first_message},
 {'role':'user', 'content':scenario},]

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=512)

prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(
    prompt,
    do_sample=True,
    temperature=0.8,
    top_k=50,
    top_p=0.95,
    add_special_tokens=True
)
print(outputs[0]["generated_text"][len(prompt):])
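
Since the training data is multi-turn and the system prompt asks the model to refer to the previous conversation, a follow-up scene can be requested by appending the generated scene and the next guide to the message list. The helper below is a minimal sketch; the name `next_turn` is hypothetical and not part of the original card.

```python
def next_turn(messages, generated_scene, next_scenario):
    """Return a new message list with the generated scene and the next guide appended."""
    return messages + [
        {"role": "assistant", "content": generated_scene},
        {"role": "user", "content": next_scenario},
    ]
```

The returned list can be passed to `apply_chat_template` in the same way as `messages` above; the original list is left unmodified.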

Result

하늘은 깨끗하였다. 바람도 쏘이지 않았다. 바위 위에 굴러져 있는 포옹이 한 덩만하고 푸른 햇빛 아래 잘 보였다.

거기는 마침 한 채의 여관이 지체로 꽂혀 있었다. 좌우 멀리 바위가 섰을 뿐이다. 나무도 수목도 없는 온 천지가 훤히 여관 옆을 에워싸고 있는 것이다. 얼른 보면 한 조각 석축 속 같기도 하지만 건축이라느니 보다는 도회에 땅을 내어놓고 단순한 간판 하나를 걸어서 그대로 벌이하는 ‘여관’이다. 몇몇 좁은 길만 들어왔다가 사라졌을 뿐 한 길도 안 뚫려진 그 자리에 그대로 지탱되어 있는 여관이다. 길에 나타나는 사람은 없다. 좌우로 바위를 끼고 늘 비어 있는 산 중에서 간간이 고기 굽는 닥이와 삯바구니며 고깃가마를 걸고 행인만 몇 마리씩 지나가고 훤하게 선바닥에 햇빛만 비치고 하는 쓸쓸한 여관이다. 그것이 룸_service라는 세계와도 달리 주인도 없고 사장도 없고 손도 없고 한 간판만 걸고 천당처럼 외롭게 지키고 있는 모양이다.

그런 여관에서 이윽고 손이 한 사람 나타났다. 창문이 열렸다. 밖을 내다보는 한 점의 검은 그림자가 창 앞에서 끝끝내 오므러진다. 그러더니 다시 한참 후에 그 검은 그림자는 웃목으로 차차 나와 가지고는 그대로 사라지는 것이다.

그로부터 여관은 새로운 일을 당하였다. 바시리오 자루를 들고 한 사나이가 걸어 들어오지 않았는가. 얼굴은 먼빛으로 똑똑히 못 보이나 벌써 사면 사오 간밖에 안 되는 곳에 큼직한 자루를

Result of the original model (Qwen/Qwen2-7B-Instruct)

"그게 누구냐? 왜 이렇게 먼 곳에서 이곳으로 와?" 여관 주인이 물어보며 창문을 벌리고 말했다. 그녀의 목소리는 사막의 건조함과 함께 거친 바람에 가라앉았다.

케이건 드라카라는 사내는 자신의 자루를 더 질질 끌며 여관 쪽으로 다가갔다. 그의 눈에는 희미하게 보이는 먼지와 먼지가 감싸고 있는 햇빛이 빛나는 듯했다. 그의 얼굴에는 땀방울이 굴러 내려가며, 그의 입은 말할 준비를 하고 있었다.

"여관 주인님, 저를 받아주실 수 있을까요? 제가 이곳에서 잠시 쉬고 싶습니다." 그의 목소리는 사막의 무디한 조용함을 깨우며 전해졌다.

여관 주인은 그의 체육복과 땀汃汃한 모습을 보며, 이 사내의 상황에 대해 걱정스러웠다. "그래, 네가 필요하다고 느꼈다면, 여기서 쉬어라. 하지만 왜 이런 곳까지 왔는지 설명해 줘야 할 것 같아." 그녀의 말투는 친근하면서도 분명한 지시였다.

Others

Axolotl config

base_model: Qwen/Qwen2-7B-Instruct
trust_remote_code: true

load_in_8bit: false
load_in_4bit: false
strict: false

datasets:
  - path: train_data.jsonl
    type: sharegpt
dataset_prepared_path:
val_set_size: 0.05
output_dir: ./outputs/out

sequence_len: 4096
sample_packing: true
eval_sample_packing: true
pad_to_sequence_len: true

adapter: lora
lora_model_dir:
lora_r: 32
lora_alpha: 128
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:

wandb_project:
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:

neftune_noise_alpha: 5
gradient_accumulation_steps: 4
micro_batch_size: 2
num_epochs: 3
optimizer: adamw_torch
lr_scheduler: cosine
learning_rate: 0.0002

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: false

gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: false
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 10
evals_per_epoch: 4
saves_per_epoch: 1
debug:
deepspeed:
weight_decay: 0.0
fsdp:
special_tokens:

Safetensors
Model size: 8B params
Tensor type: BF16
