Dataset: saidines12/telugu_news_dataset
This model is a fine-tuned version of Google's instruction-tuned Gemma-2-2B model, optimized for generating Telugu news headlines from article content. It was trained with supervised fine-tuning (SFT) on a Telugu news dataset.
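from transformers import AutoModelForCausalLM, AutoTokenizer
# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("saidines12/telugu-news-headline-generation")
tokenizer = AutoTokenizer.from_pretrained("saidines12/telugu-news-headline-generation")
# Prompt format: the instruction, a newline, then the article text
text = "Generate relevant, interesting, factual short headline from this news article in telugu language\n <Your Telugu news article text here>"
inputs = tokenizer(text, return_tensors="pt")
# max_new_tokens=64 is an assumed limit, not a value taken from the training setup; adjust as needed
outputs = model.generate(**inputs, max_new_tokens=64)
# Decode only the newly generated tokens so the prompt is not echoed in the headline
headline = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)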
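The prompt in the example above is an instruction, a newline, and the article text. A minimal sketch of two helpers that build that prompt and remove it from a decoded output follows. The names `build_prompt` and `strip_prompt` are hypothetical and not part of the model repository. Decoding does not always reproduce the prompt character for character, so `strip_prompt` falls back to returning the text unchanged.

```python
# Instruction string copied from the usage example in this card.
INSTRUCTION = (
    "Generate relevant, interesting, factual short headline "
    "from this news article in telugu language"
)


def build_prompt(article: str) -> str:
    """Join instruction and article the same way the usage example does."""
    return f"{INSTRUCTION}\n {article.strip()}"


def strip_prompt(decoded: str, prompt: str) -> str:
    """Drop the echoed prompt if the decoded text starts with it (hypothetical helper)."""
    if decoded.startswith(prompt):
        return decoded[len(prompt):].strip()
    # Fall back to the full text when the prompt was not reproduced exactly.
    return decoded.strip()
```

The result of `build_prompt(article)` can be passed to the tokenizer in place of the hard-coded `text` string.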
| Metric | Base Model | Finetuned Model | Improvement |
|---|---|---|---|
| ROUGE-1 | 3.39 | 4.64 | +1.26 |
| ROUGE-2 | 0.26 | 0.41 | +0.14 |
| ROUGE-L | 3.38 | 4.63 | +1.25 |
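For readers unfamiliar with the metric, ROUGE-1 measures unigram overlap between a generated headline and the reference headline. The sketch below computes a ROUGE-1 F1 score with plain whitespace tokenization. That tokenization is an assumption for illustration: the card does not state which ROUGE implementation, tokenizer, or score scale produced the numbers in the table, so this function is not expected to reproduce them.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate headline.

    Assumes whitespace tokenization, which is a simplification.
    """
    ref_counts = Counter(reference.split())
    cand_counts = Counter(candidate.split())
    # Clipped overlap: each token counts at most as often as it appears in both.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)
```

ROUGE-2 applies the same idea to bigrams, and ROUGE-L uses the longest common subsequence instead of n-gram counts.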
| Category | Count | Percentage |
|---|---|---|
| Total samples | 5962 | 100% |
| Same predictions | 3 | 0.05% |
| Better predictions | 4610 | 77.32% |
| Worse predictions | 1349 | 22.63% |
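The percentages in the table above follow directly from the counts. A short sketch of that arithmetic, using only the numbers reported in the table:

```python
# Counts copied from the comparison table in this card.
counts = {
    "Same predictions": 3,
    "Better predictions": 4610,
    "Worse predictions": 1349,
}

total = sum(counts.values())

# Share of each category, as a percentage rounded to two decimals.
percentages = {name: round(100 * n / total, 2) for name, n in counts.items()}

for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```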
torch.float16
# pip install accelerate
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("saidines12/telugu-news-headline-generation")
# Load the weights in half precision; device_map="auto" requires accelerate
model = AutoModelForCausalLM.from_pretrained("saidines12/telugu-news-headline-generation", device_map="auto", torch_dtype=torch.float16)
input_text = "Generate relevant, interesting, factual short headline from this news article in telugu language\n <Your Telugu news article text here>"
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
torch.bfloat16
# pip install accelerate
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("saidines12/telugu-news-headline-generation")
# Load the weights in bfloat16; device_map="auto" requires accelerate
model = AutoModelForCausalLM.from_pretrained("saidines12/telugu-news-headline-generation", device_map="auto", torch_dtype=torch.bfloat16)
input_text = "Generate relevant, interesting, factual short headline from this news article in telugu language\n <Your Telugu news article text here>"
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
bitsandbytes, 8-bit precision (int8)
# pip install bitsandbytes accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
quantization_config = BitsAndBytesConfig(load_in_8bit=True)
tokenizer = AutoTokenizer.from_pretrained("saidines12/telugu-news-headline-generation")
model = AutoModelForCausalLM.from_pretrained("saidines12/telugu-news-headline-generation", quantization_config=quantization_config)
input_text = "Generate relevant, interesting, factual short headline from this news article in telugu language\n <Your Telugu news article text here>"
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
bitsandbytes, 4-bit precision
# pip install bitsandbytes accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
quantization_config = BitsAndBytesConfig(load_in_4bit=True)
tokenizer = AutoTokenizer.from_pretrained("saidines12/telugu-news-headline-generation")
model = AutoModelForCausalLM.from_pretrained("saidines12/telugu-news-headline-generation", quantization_config=quantization_config)
input_text = "Generate relevant, interesting, factual short headline from this news article in telugu language\n <Your Telugu news article text here>"
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
First, make sure flash-attn is installed in your environment: pip install flash-attn
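import torch
from transformers import AutoModelForCausalLM
model_id = "saidines12/telugu-news-headline-generation"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    attn_implementation="flash_attention_2",  # enable Flash Attention 2
).to(0)  # move the model to GPU 0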
This model is designed for generating Telugu news headlines from article content. It can be used by: