Repository: kim512/L3-70B-daybreak-storywriter-v0.4-3.0bpw-h6-exl2

Tags: Not-For-All-Audiences, conversational, text-generation-inference, 3-bit, exl2


kim512 committed on Jun 2, 2024

Commit 7141f93 (verified)

1 parent: 54bc7cf

Delete huggingface-metadata.txt

Files changed (1)

huggingface-metadata.txt +0 -34 (deleted)

@@ -1,34 +0,0 @@
-url: https://huggingface.co/tdrussell/Llama-3-70B-Instruct-Storywriter
-branch: main
-download date: 2024-05-13 14:04:15
-sha256sum:
-    90161109625bad375bbefb31773a875f105c4390d97207b2631e8db698ffc832 model-00001-of-00030.safetensors
-    8127cb7e421fa699074dbc6eb0926747759455404c3fa75a255b55276cdd56d6 model-00002-of-00030.safetensors
-    bd73e5a29f86c2a029e24a958d5ecc825e629e9a92e45f0ed2983dd9ba98b6aa model-00003-of-00030.safetensors
-    946330d760cd7895fbdd81f16405503b9eef98a9b8d7b71637558136854e33e4 model-00004-of-00030.safetensors
-    3018047978b57190fe2a8687a6bcd943d85c2281cabb1cd017b54ddb68990941 model-00005-of-00030.safetensors
-    010fb9b494f2808c2e9ca58bf24d300a487b92ba7ef0a4687c00fc9ebe710f05 model-00006-of-00030.safetensors
-    f11eacbc2940b270ea1ef20e4f26e84227a300026d3893ca9e0b4da0336c28a7 model-00007-of-00030.safetensors
-    fe20bcee479db19c8e84021523228ae64ccc9105ae189c3a499d95e05987d2db model-00008-of-00030.safetensors
-    bdaf24acabf24137f9e46c1cae7f93bca97ef8db789eec7133b4b8793f04c34d model-00009-of-00030.safetensors
-    47aa308e71bff11119db7f05826448c2f19514905f24888e5428e5fff2728978 model-00010-of-00030.safetensors
-    952e782d82ffd30b006883dfaee38adb069158d4f6af86ebb6dbc4fb968ac098 model-00011-of-00030.safetensors
-    df36fe6d9f82b008aeeabd2452758f9c61d4c2543dfc97bbc823f2082797f182 model-00012-of-00030.safetensors
-    b2ee7918e7f704bc7b1fcbd5fc153103ea7febb7563ae5e622c865d8302d547c model-00013-of-00030.safetensors
-    6f2a214c72fc5f4f7953d113ad41174d9a0697abbf3ff678785f68fa61d6331e model-00014-of-00030.safetensors
-    b74f3b554144f20a59fdce9b6181447c15704fe20bfd3fa3e7025bbabaaf99d9 model-00015-of-00030.safetensors
-    b5e69ebc3f15b7ebb95f9e7cc13be23d5a736190faa4dceb3054817c3044bdb6 model-00016-of-00030.safetensors
-    845e3a23375c138eb6a7a05045c5d41639fe6b91b6c32226b1fbc2cc54c83e54 model-00017-of-00030.safetensors
-    4f836142310d3e80fc148d3b3cb079bc1679ba7a46c3276e2bd1847205f9827a model-00018-of-00030.safetensors
-    14543489882f3f2ab0f419e15ec2018e9ef28ef1a15535768f4754176262c7ba model-00019-of-00030.safetensors
-    7e1a6b63f61a9f201b68202e6611034fd4ae9965a15e497c931ba4fc8e02a3db model-00020-of-00030.safetensors
-    2d20d1f0362669204baee88e721259dbf21693ef401bb2490cf429dba48dea13 model-00021-of-00030.safetensors
-    d095efef9989442849dedbc96691b28f0ec575f269451793e539500ac2e21abb model-00022-of-00030.safetensors
-    2820c3590bb682b75ffe9468efe94a37c6d22d905dd92aee815cd159779a7259 model-00023-of-00030.safetensors
-    4c670f806e6f28eacf11a5177b6d6b4f8c29541e6e619cd61a3db3bfb0a6b4c0 model-00024-of-00030.safetensors
-    bb4b6a23b64ec8734bf29be67771f7bbf68abf0a87354cc75dcdea9f2124e5f3 model-00025-of-00030.safetensors
-    c8137dcdeb0447280f4bc9fa42ea60346a0986b3ca250aa96ee393c35aef3843 model-00026-of-00030.safetensors
-    00086b2b873795d11c20e6c58a9b271edadfb805ca946f8018df8906cbb3b52d model-00027-of-00030.safetensors
-    3945b4083500defcf7d919bcd554e32b26b10c205a76e3a28c117c8d5c68177a model-00028-of-00030.safetensors
-    d60880dcd8d726fe36a8ada9a8d9476727192459d5aa673a548cdd08b1bcbd09 model-00029-of-00030.safetensors
-    55298e57c1dfe47357f5b367863da3ad8aa7b009249b4d7146bd4cf22f004e89 model-00030-of-00030.safetensors
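
The deleted file recorded the source repository URL, branch, download date, and one SHA-256 digest per safetensors shard. As an illustration of what such a manifest is good for, here is a minimal sketch that parses this plain-text layout and checks a local download against it. It assumes the layout shown in the diff above (key: value lines followed by digest and filename pairs). The names `parse_metadata`, `sha256_file`, and `verify` are hypothetical helpers written for this example; they are not part of any Hugging Face tooling.

```python
import hashlib
import re
from pathlib import Path

# Matches "<64 hex chars> <filename>" lines, as in the sha256sum block above.
CHECKSUM_LINE = re.compile(r"^([0-9a-f]{64})\s+(\S+)$")


def parse_metadata(text):
    """Parse a huggingface-metadata.txt body into (fields, checksums)."""
    fields, checksums = {}, {}
    for raw in text.splitlines():
        # Tolerate the leading "-" that a unified diff puts on deleted lines.
        line = raw[1:] if raw.startswith("-") else raw
        line = line.strip()
        if not line:
            continue
        match = CHECKSUM_LINE.match(line)
        if match:
            digest, filename = match.groups()
            checksums[filename] = digest
        elif ":" in line:
            # Split on the first colon only, so URLs and timestamps stay whole.
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields, checksums


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so multi-gigabyte shards never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory, checksums):
    """Return {filename: "ok" | "mismatch" | "missing"} for each listed shard."""
    results = {}
    for filename, expected in checksums.items():
        path = Path(directory) / filename
        if not path.is_file():
            results[filename] = "missing"
        elif sha256_file(path) == expected:
            results[filename] = "ok"
        else:
            results[filename] = "mismatch"
    return results
```

A caller would read the manifest text, pass it to `parse_metadata`, and then call `verify` with the directory holding the downloaded shards. Any entry not reported as "ok" points to an incomplete or corrupted download of the source model.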