	---
	license: other
	license_name: exaone
	license_link: https://huggingface.co/LGAI-EXAONE/EXAONE-Deep-2.4B/blob/main/LICENSE
	base_model: LGAI-EXAONE/EXAONE-4.0-1.2B
	tags:
	- exaone
	- lora
	- finetune
	- korean
	- tagger
	- text-classification
	- text-generation
	library_name: transformers
	---

	# EXAONE-4.0-1.2B Tagger (Merged)

	This repository contains a merged checkpoint of:
	- Base: `LGAI-EXAONE/EXAONE-4.0-1.2B`
	- LoRA fine-tune: a lightweight SFT adapter trained to behave as a Korean tag generator.

	The model is designed to output a JSON array of 3–10 high-level tags for a given Korean sentence.

	GGUF: https://huggingface.co/FloatDo/exaone-4.0-1.2b-float-right-tagger-GGUF

	## Intended Behavior

	Given an input sentence, the model should output ONLY a JSON array:
	- 3–10 tags
	- high-level topics (not overly detailed)
	- no underscores `_`
	- no extra text (ideally)
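The mechanical parts of these rules can be checked after parsing. The following is a minimal sketch, not part of this repository: `validate_tags` is a hypothetical helper, and it assumes each tag is a JSON string. Whether a tag is "high-level" cannot be checked mechanically and is left out.

```python
from typing import Any, List


def validate_tags(tags: Any, min_tags: int = 3, max_tags: int = 10) -> List[str]:
    """Return a list of rule violations; an empty list means the output conforms.

    Hypothetical helper. Assumes tags are plain JSON strings.
    """
    if not isinstance(tags, list):
        return ["output is not a JSON array"]

    problems: List[str] = []
    # Rule: 3-10 tags.
    if not (min_tags <= len(tags) <= max_tags):
        problems.append(f"expected {min_tags}-{max_tags} tags, got {len(tags)}")
    for i, tag in enumerate(tags):
        if not isinstance(tag, str) or not tag.strip():
            problems.append(f"tag {i} is not a non-empty string")
        elif "_" in tag:
            # Rule: no underscores.
            problems.append(f"tag {i} contains an underscore: {tag!r}")
    return problems


if __name__ == "__main__":
    print(validate_tags(["직장", "야근", "work_life"]))
```

Returning a list of problems rather than raising lets a caller decide whether to retry generation, drop the offending tags, or accept the output as-is.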

	In practice, some runs may emit extra text (e.g., reasoning markers).
	For production, parse the first JSON array from the output.
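One way to parse the first JSON array when the output carries extra text is to try a real JSON decode at each `[` instead of relying on a regular expression. This is a sketch under assumptions: `first_json_array` is a hypothetical name, and the sample strings are made up for illustration. It uses only the standard library (`json.JSONDecoder.raw_decode`), so bracketed non-JSON text and nested arrays do not confuse it.

```python
import json
from typing import Optional


def first_json_array(text: str) -> Optional[list]:
    """Return the first substring of `text` that decodes as a JSON array, or None."""
    decoder = json.JSONDecoder()
    pos = text.find("[")
    while pos != -1:
        try:
            # raw_decode parses one JSON value starting at `pos` and ignores trailing text.
            value, _end = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, list):
            return value
        # This "[" did not start a valid array; try the next one.
        pos = text.find("[", pos + 1)
    return None


if __name__ == "__main__":
    # Made-up output with a bracketed marker before the actual array.
    sample = '<think>[메모]</think>\n["직장", "야근", "스트레스"] 끝'
    print(first_json_array(sample))
```

Compared with a non-greedy regex, this skips bracketed spans that are not valid JSON and keeps nested arrays intact, at the cost of a few extra decode attempts on noisy output.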

	## Quick Start (Transformers)

	```python
	import re, json, torch
	from transformers import AutoTokenizer, AutoModelForCausalLM

	MODEL = "<this_repo_or_local_path>"

	def extract_first_json_array(s: str):
	    # Non-greedy match of the first "[...]" span in the output, across newlines.
	    m = re.search(r"\[[\s\S]*?\]", s)
	    return json.loads(m.group(0)) if m else None

	tok = AutoTokenizer.from_pretrained(MODEL, trust_remote_code=True, use_fast=True)
	if tok.pad_token is None:
	    tok.pad_token = tok.eos_token

	model = AutoModelForCausalLM.from_pretrained(
	    MODEL, trust_remote_code=True, torch_dtype="auto", device_map="cuda"
	).eval()

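	# System prompt (Korean): "You are a tag generator. Output only a JSON array. No other characters."
	# User prompt (Korean): "Rules: 3-10 tags, broad topics, no underscores, JSON array only.
	# Sentence: My boss keeps making me work overtime and I'm stressed. Thinking about quitting."
	messages = [
	    {"role": "system", "content": "너는 태그 생성기다. 반드시 JSON 배열만 출력한다. 다른 글자 금지."},
	    {"role": "user", "content": "규칙: 태그 3~10개, 큰 주제, 언더스코어 금지, JSON 배열만. 문장: 직장 상사가 계속 야근을 시켜서 스트레스 받는다. 퇴사 고민 중."},
	]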

	prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
	enc = tok(prompt, return_tensors="pt").to("cuda")

	# Greedy decoding; temperature only applies when sampling, so it is omitted.
	out = model.generate(**enc, max_new_tokens=64, do_sample=False,
	                     pad_token_id=tok.pad_token_id, eos_token_id=tok.eos_token_id)

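	# Decode only the newly generated tokens so the prompt is not parsed for tags.
	text = tok.decode(out[0][enc["input_ids"].shape[1]:], skip_special_tokens=True)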
	tags = extract_first_json_array(text)
	print("RAW:", text)
	print("TAGS:", tags)
	```

	## Training Notes

	- This is not a general-purpose chat fine-tune.
	- The objective is to improve the consistency of tag-only outputs for Korean input.
	- If you need strict JSON-only output, use a post-processor that extracts the first JSON array.

	## Quantization / GGUF

	A GGUF / quantized release is published in a separate repository: https://huggingface.co/FloatDo/exaone-4.0-1.2b-float-right-tagger-GGUF