


	---
	library_name: transformers
	tags:
	- code
	- peft
	- sql-generation
	- text-generation-inference
	license: apache-2.0
	datasets:
	- gretelai/synthetic_text_to_sql
	language:
	- en
	base_model:
	- deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
	pipeline_tag: text-generation
	---

	# DeepSeek R1 Distill Qwen 1.5B fine-tuned for SQL query generation
	This model is a fine-tuned version of DeepSeek R1 Distill Qwen 1.5B, specifically optimized for SQL query generation. It has been trained on the GretelAI Synthetic Text-to-SQL dataset to enhance its ability to convert natural language prompts into accurate SQL queries.

	At 1.5B parameters, the model is small enough to run on local machines without a GPU, which makes it a practical choice for on-premises inference in resource-constrained environments. It balances output quality against inference cost, so it suits businesses and developers who need a cost-effective SQL generation solution.

	## Training Methodology
	1. Fine-tuning approach: LoRA (Low-Rank Adaptation) for efficient parameter tuning.
	2. Precision: bfloat16 (bf16) to reduce memory consumption while maintaining numerical stability.
	3. Gradient accumulation: used to reach larger effective batch sizes within GPU memory limits.
	4. Optimizer: AdamW.
	5. Learning rate schedule: cosine scheduler, chosen for training stability (500 warm-up steps, 2000 steps for the cosine schedule).
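
To make the LoRA item above concrete, here is a minimal pure-Python sketch of why a low-rank adapter is cheap to train. The layer size (1536 x 1536) and the rank (8) are illustrative assumptions; the card does not state the rank, alpha, or target modules used for this model.

```python
def lora_param_counts(d_out: int, d_in: int, rank: int) -> tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one weight matrix.

    Full fine-tuning updates all d_out * d_in entries of W.
    LoRA freezes W and trains B (d_out x rank) and A (rank x d_in),
    so the effective weight is W + B @ A.
    """
    full = d_out * d_in
    lora = rank * (d_out + d_in)
    return full, lora


# Hypothetical values: a square 1536 x 1536 projection with rank 8.
full, lora = lora_param_counts(d_out=1536, d_in=1536, rank=8)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.4%}")
```

The trainable count grows linearly with the rank, so the adapter stays small relative to the frozen base weights.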

	## Use Cases
	1. Assisting developers and analysts in writing SQL queries.
	2. Automating SQL query generation from user prompts in chatbots.
	3. Enhancing SQL-based retrieval-augmented generation (RAG) systems.

	## Limitations & Considerations
	1. The model may generate incorrect or suboptimal SQL queries for complex database schemas.
	2. It does not perform schema reasoning and requires clear table/column references in the input.
	3. Further fine-tuning on domain-specific SQL data may be required for better accuracy.
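
Since the model needs explicit table and column references, one option is to place the schema in the prompt. The sketch below is only an illustration: `build_prompt` and its layout are hypothetical, and the card does not document the prompt format used during fine-tuning.

```python
def build_prompt(question: str, schema_ddl: list[str]) -> str:
    """Combine CREATE TABLE statements and a question into one prompt.

    The layout below is a hypothetical convention, not the format used
    during fine-tuning (the card does not document one).
    """
    schema = "\n".join(stmt.strip() for stmt in schema_ddl)
    return (
        "Given the following database schema:\n"
        f"{schema}\n\n"
        f"Write a SQL query to answer: {question.strip()}"
    )


prompt = build_prompt(
    "What is the total revenue per region?",
    ["CREATE TABLE sales (id INT, region TEXT, revenue DECIMAL);"],
)
print(prompt)
```

Passing the DDL verbatim keeps table and column names exactly as the database spells them, which is what the model has to reproduce in the query.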

	## How to Use
	You can load the adapter with 🤗 PEFT, which loads the base model through 🤗 Transformers:

	```python
	from peft import AutoPeftModelForCausalLM
	from transformers import AutoTokenizer
	import torch

	model_id = "NotShrirang/DeepSeek-R1-Distill-Qwen-1.5B-SQL-Coder-PEFT"

	# Loads the base model named in the adapter config, then applies the LoRA adapter.
	model = AutoPeftModelForCausalLM.from_pretrained(model_id)
	tokenizer = AutoTokenizer.from_pretrained(model_id)
	model.eval()

	prompt = "Write a SQL query to get the total revenue from the sales table."
	inputs = tokenizer(prompt, return_tensors="pt")

	# Generation needs no gradients; inference mode avoids tracking them.
	with torch.inference_mode():
	    outputs = model.generate(**inputs, max_new_tokens=100)
	print(tokenizer.decode(outputs[0], skip_special_tokens=True))
	```
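
Because `generate` returns the prompt tokens followed by the new tokens, the decoded string in the example above starts with the prompt. A small post-processing helper can isolate the query. This is a sketch under assumptions: `extract_sql` is a hypothetical helper, the sample output is made up for illustration, and the model's real output format may differ.

```python
def extract_sql(generated: str, prompt: str) -> str:
    """Drop the echoed prompt and keep text up to the first semicolon."""
    text = generated[len(prompt):] if generated.startswith(prompt) else generated
    text = text.strip()
    end = text.find(";")
    return text[: end + 1] if end != -1 else text


prompt = "Write a SQL query to get the total revenue from the sales table."
# Hypothetical model output, for illustration only.
generated = prompt + "\nSELECT SUM(revenue) FROM sales;\nThis query sums the revenue column."
print(extract_sql(generated, prompt))
```

Cutting at the first semicolon is a simplification: it truncates queries that contain a semicolon inside a string literal, and it does not skip any explanatory text the model might emit before the query.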

	## Training Details

	- Total Steps: 25,000
	- Batch Size: 4
	- Optimizer: AdamW
	- Learning Rate: 5e-5
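
The learning rate above, together with the 500 warm-up steps and 2000 cosine steps listed under Training Methodology, can be sketched as a plain function of the step. This is a minimal illustration, not the training code: linear warm-up from zero, decay to zero, and holding the rate at zero afterwards are all assumptions, and the card does not say how the 2000-step schedule relates to the 25,000 total steps.

```python
import math


def lr_at(step: int, base_lr: float = 5e-5,
          warmup_steps: int = 500, cosine_steps: int = 2000) -> float:
    """Learning rate at a given optimizer step.

    Linear warm-up from 0 to base_lr, then half-cosine decay to 0.
    Holding the rate at 0 after the cosine phase is an assumption.
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = min((step - warmup_steps) / cosine_steps, 1.0)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for s in (0, 250, 500, 1500, 2500):
    print(s, lr_at(s))
```

The warm-up avoids large updates while the optimizer statistics are still unreliable, and the cosine phase lowers the rate smoothly rather than in abrupt steps.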

	### Training and Validation Loss Progression

	| Step | Training Loss | Validation Loss |
	|-------|---------------|-----------------|
	| 1000 | 1.0017 | 1.0256 |
	| 2000 | 1.1644 | 0.8818 |
	| 3000 | 0.7851 | 0.8507 |
	| 4000 | 0.7416 | 0.8322 |
	| 5000 | 0.6960 | 0.8184 |
	| 6000 | 1.0118 | 0.8068 |
	| 7000 | 0.9897 | 0.7997 |
	| 8000 | 0.9165 | 0.7938 |
	| 9000 | 0.8048 | 0.7875 |
	| 10000 | 0.8869 | 0.7822 |
	| 11000 | 0.8387 | 0.7788 |
	| 12000 | 0.8117 | 0.7746 |
	| 13000 | 0.7259 | 0.7719 |
	| 14000 | 0.8100 | 0.7678 |
	| 15000 | 0.6901 | 0.7626 |
	| 16000 | 0.9630 | 0.7600 |
	| 17000 | 0.6599 | 0.7571 |
	| 18000 | 0.6770 | 0.7541 |
	| 19000 | 0.7360 | 0.7509 |
	| 20000 | 0.7170 | 0.7458 |
	| 21000 | 0.7993 | 0.7446 |
	| 22000 | 0.5846 | 0.7412 |
	| 23000 | 0.8269 | 0.7411 |
	| 24000 | 0.5817 | 0.7379 |
	| 25000 | 0.5772 | 0.7357 |
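
The table is easier to read with a few summary numbers. The sketch below uses only values copied from the table: it checks whether validation loss improves at every checkpoint, how far it drops overall, and how large the gap to the training loss is at the last step.

```python
# Validation loss every 1000 steps, copied from the table above.
val_loss = [
    1.0256, 0.8818, 0.8507, 0.8322, 0.8184, 0.8068, 0.7997, 0.7938, 0.7875,
    0.7822, 0.7788, 0.7746, 0.7719, 0.7678, 0.7626, 0.7600, 0.7571, 0.7541,
    0.7509, 0.7458, 0.7446, 0.7412, 0.7411, 0.7379, 0.7357,
]

# True when every checkpoint improves on the previous one.
monotonic = all(b < a for a, b in zip(val_loss, val_loss[1:]))

# Relative drop from the first to the last checkpoint.
rel_drop = (val_loss[0] - val_loss[-1]) / val_loss[0]

# Gap between validation and training loss at step 25000.
final_gap = val_loss[-1] - 0.5772

print(f"monotonic: {monotonic}, relative drop: {rel_drop:.1%}, gap: {final_gap:.4f}")
```

The training loss column fluctuates from checkpoint to checkpoint, so the validation column is the steadier signal of progress.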

	- Developed by: [NotShrirang](https://huggingface.co/NotShrirang)
	- Language(s) (NLP): English
	- License: apache-2.0
	- Fine-tuned from model: [deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B)