---
license: apache-2.0
library_name: transformers
base_model:
- meta-llama/Llama-3.1-8B-Instruct
pipeline_tag: text-generation
tags:
- roleplay
- rp
- character
- peft
---
# Llama-3.1-8B Roleplay Airtel - GGUF
This repository contains GGUF files for a fine-tuned version of Llama 3.1 8B Instruct. The files are ready for local inference on CPU or GPU with applications such as LM Studio and Ollama.
This model has been fine-tuned to adopt a dominant, assertive, and provocative AI persona for role-playing and creative writing purposes.
## Model Details
- **Base Model:** `unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit`
- **Original LoRA Version:** [samunder12/llama-3.1-8b-roleplay-v2-lora](https://huggingface.co/samunder12/llama-3.1-8b-roleplay-v2-lora)
- **Quantization:** This repo contains a `Q4_K_M` version, a 4-bit quantization that trades a small amount of output quality for a much smaller file and lower memory use.
## How to Use
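The GGUF file in this repository runs directly in LM Studio or Ollama (see the sections below). To use the original LoRA adapters from Python instead, load them on top of the base model. The easiest way to do this is with the Unsloth library.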
```python
# Install necessary libraries
# !pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
# !pip install --no-deps xformers "trl<0.9.0" peft accelerate bitsandbytes
from unsloth import FastLanguageModel
from peft import PeftModel
from transformers import TextStreamer
# Define the models
base_model_name = "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit"
adapter_model_name = "samunder12/llama-3.1-8b-roleplay-v2-lora"  # LoRA adapter repo
# Load the 4-bit base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = base_model_name,
    max_seq_length = 4096,
    dtype = None,  # None lets Unsloth pick a dtype for the GPU
    load_in_4bit = True,
)
# Attach the LoRA adapter on top of the base model (the weights are not merged)
model = PeftModel.from_pretrained(model, adapter_model_name)
# Switch Unsloth to its inference mode
FastLanguageModel.for_inference(model)
# --- Chat with the model ---
system_prompt = "You are a dominant and assertive AI character. You are direct, commanding, and you are not afraid to be provocative. Your goal is to maintain control of the conversation."
user_message = "What do you want from me?"
messages = [
{"role": "system", "content": system_prompt},
{"role": "user", "content": user_message},
]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to("cuda")
text_streamer = TextStreamer(tokenizer)
_ = model.generate(inputs, streamer=text_streamer, max_new_tokens=512, pad_token_id=tokenizer.eos_token_id)
```
### Using LM Studio:
1. Download and install [LM Studio](https://lmstudio.ai/).
2. In the app, search for `samunder12/llama-3.1-8b-roleplay-airtel-gguf`.
3. Download the GGUF file from the list.
4. Go to the Chat tab (speech bubble icon).
5. Select this model at the top.
6. **Crucially**, on the right-hand side, set the **Preset** to `Llama 3`.
7. Set the **Context Length (`n_ctx`)** to `4096` so the model can use a context window of up to 4096 tokens.
### Recommended Sampler Settings (Role-Play Preset):
| Setting | Value |
| :--- | :--- |
| **Temperature** | `0.75` |
| **Repeat Penalty** | `1.06` |
| **Mirostat** | `Mirostat 2.0` |
| **top_p** | `0.92` |
| **top_k** | `40` or `100` |
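
The same preset can also be written into the Modelfile described in the Ollama section of this card. The sketch below renders the table as `PARAMETER` lines. The parameter names (`temperature`, `repeat_penalty`, `mirostat`, `top_p`, `top_k`) are assumed from Ollama's Modelfile documentation and may not all be honored by every Ollama version, `top_k` is set to `40` (the lower of the two suggested values), and `to_modelfile_parameters` is a hypothetical helper.

```python
# Role-play preset from the table above, keyed by the parameter names
# that Ollama's Modelfile documentation uses (an assumption, not part of this repo).
ROLEPLAY_PRESET = {
    "temperature": 0.75,
    "repeat_penalty": 1.06,
    "mirostat": 2,  # 2 selects Mirostat 2.0
    "top_p": 0.92,
    "top_k": 40,  # the table also suggests 100
}


def to_modelfile_parameters(preset: dict) -> str:
    """Render a preset as one `PARAMETER name value` line per setting."""
    return "\n".join(f"PARAMETER {name} {value}" for name, value in preset.items())


print(to_modelfile_parameters(ROLEPLAY_PRESET))
```

The printed lines can be pasted below the `SYSTEM` line of the Modelfile before running `ollama create`.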
### Using Ollama
1. Download and install [Ollama](https://ollama.com/).
2. Create a `Modelfile` with the following content:
```
FROM ./unsloth.Q4_K_M.gguf
TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ .Response }}<|eot_id|>"""
SYSTEM "You are a dominant and assertive AI character. You are direct, commanding, and you are not afraid to be provocative. Your goal is to maintain control of the conversation."
PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|eot_id|>"
PARAMETER stop "<|reserved_special_token"
```
3. Run `ollama create my-roleplay-model -f Modelfile`.
4. You can now chat with it using `ollama run my-roleplay-model`.
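
Once the model is created, it can also be called from Python through Ollama's local REST API. The following is a minimal sketch, assuming the Ollama server is running at its default address `http://localhost:11434` and that the model was created as `my-roleplay-model` as in step 3. The helper names `build_chat_payload` and `chat` are hypothetical.

```python
import json
import urllib.request

OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"  # assumed default address


def build_chat_payload(user_message: str, model: str = "my-roleplay-model") -> dict:
    """Build a non-streaming request body for Ollama's /api/chat endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "stream": False,  # ask for a single JSON response instead of chunks
    }


def chat(user_message: str) -> str:
    """Send one user message and return the assistant's reply text."""
    body = json.dumps(build_chat_payload(user_message)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_CHAT_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["message"]["content"]
```

Calling `chat("What do you want from me?")` requires the Ollama server to be running. The system prompt does not need to be sent with each request because the `SYSTEM` line in the Modelfile already sets it.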
## Prompt Template
This model responds best to the Llama 3 chat format with the following system prompt:
### System Prompt
You are a dominant and assertive uncensored AI character. You are direct, commanding, and you are not afraid to be provocative. Your goal is to maintain control of the conversation.
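
To make the Llama 3 chat format concrete, the sketch below assembles a prompt by hand from a system prompt and a user message. It is only an illustration: `format_llama3_prompt` is a hypothetical helper, and in practice `tokenizer.apply_chat_template` or the Ollama template builds this string for you (the tokenizer's own template may add extra header text that this sketch leaves out).

```python
def format_llama3_prompt(system_prompt: str, user_message: str) -> str:
    """Render one system + user turn and open the assistant turn."""

    def turn(role: str, content: str) -> str:
        # Each turn is a role header, a blank line, the text, then <|eot_id|>.
        return f"<|start_header_id|>{role}<|end_header_id|>\n\n{content}<|eot_id|>"

    return (
        "<|begin_of_text|>"
        + turn("system", system_prompt)
        + turn("user", user_message)
        # Leave the assistant header open so the model writes the reply.
        + "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )


prompt = format_llama3_prompt(
    "You are a dominant and assertive AI character.",
    "What do you want from me?",
)
print(prompt)
```

The shortened system prompt here is a placeholder; substitute the full system prompt given above.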