VectraYX-Base-260M
VectraYX-Base-260M is a 260M parameter language model specialized in cybersecurity for Latin America, trained from scratch in Spanish. It supports native tool use with <|tool_call|>, explicit reasoning with <think>, and technical conversation in Latin American Spanish.
Compact architecture, efficient on CPU/GPU, deployable with Ollama and llama.cpp.
Key Features
- 260M parameters: lightweight, fast, deployable on consumer hardware
- Native tool use: generates <|tool_call|>{...}<|/tool_call|> JSON blocks
- Chain-of-thought: explicit reasoning with <think>...</think> tags
- LATAM-first: trained on a Latin American Spanish corpus (laws, regulations, regional context)
- Cybersecurity-first: CVE Q&A, threat classification, pentesting commands, MITRE ATT&CK
- GGUF / llama.cpp: compatible with Ollama, LM Studio, llama.cpp
Architecture
| Parameter | Value |
|---|---|
| Total parameters | 260M |
| Layers | 16 |
| Attention heads | 16 |
| KV heads (GQA) | 4 |
| d_model | 1024 |
| d_ffn | 4096 |
| Vocab size | 16,384 |
| Context length | 1,024 tokens |
| RoPE theta | 10,000 |
| QK-Norm | No |
| Tie embeddings | Yes |
| GGUF architecture | llama |
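The 260M figure can be roughly reproduced from the table. The sketch below is an estimate, not the author's accounting: it assumes LLaMA-style blocks with a gated (SwiGLU) feed-forward of three weight matrices, RMSNorm weights, and no bias terms, none of which the card states explicitly.

```python
# Rough parameter count from the architecture table.
# Assumptions (not stated in the card): LLaMA-style blocks, gated (SwiGLU)
# feed-forward with three weight matrices, RMSNorm weights, no biases.

d_model = 1024
d_ffn = 4096
n_layers = 16
n_heads = 16
n_kv_heads = 4
vocab_size = 16_384

head_dim = d_model // n_heads        # 64
kv_dim = n_kv_heads * head_dim       # 256: K/V projection width under GQA

attention = 2 * d_model * d_model + 2 * d_model * kv_dim  # Q and O, plus K and V
feed_forward = 3 * d_model * d_ffn                         # gate, up, down
norms = 2 * d_model                                        # two norms per block
per_layer = attention + feed_forward + norms

embeddings = vocab_size * d_model    # counted once because embeddings are tied
final_norm = d_model

total = n_layers * per_layer + embeddings + final_norm
print(f"{total:,} parameters (~{total / 1e6:.0f}M)")
```

Under these assumptions the total is 260,080,640, which matches the stated size. Grouped-query attention with 4 KV heads shrinks the K and V projections to a quarter of the width of the Q projection.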
Training Pipeline
Phase 1: General Pretraining
- Corpus: general conversational Spanish
- Tokens seen: ~4.24B
- Goal: linguistic base and Latin American Spanish comprehension
Phase 2: Technical Specialization
- Corpus: cybersecurity documentation, CVEs, writeups, tools
- Tokens seen: 2.03B (epochs=1.0)
- Steps: 15,500 | Final loss: 2.07
- Total accumulated: 6.27B tokens (~1.2× Chinchilla optimal for 260M)
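The accumulated token count and the ~1.2× figure can be checked with a few lines of arithmetic. The card does not say which ratio it used for "Chinchilla optimal"; this sketch assumes the common rule of thumb of about 20 training tokens per parameter.

```python
# Token budget check for Phases 1 and 2.
# Assumption: "Chinchilla optimal" means ~20 tokens per parameter;
# the card does not state the ratio it used.
params = 260e6
phase1_tokens = 4.24e9
phase2_tokens = 2.03e9
tokens_per_param = 20  # assumed rule of thumb

total_tokens = phase1_tokens + phase2_tokens
chinchilla_tokens = params * tokens_per_param
ratio = total_tokens / chinchilla_tokens

print(f"total: {total_tokens / 1e9:.2f}B tokens")
print(f"Chinchilla budget: {chinchilla_tokens / 1e9:.1f}B tokens")
print(f"ratio: {ratio:.2f}x")
```

With these inputs the budget is 5.2B tokens and the ratio is about 1.21, consistent with the ~1.2× quoted above.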
Phase 3: Domain Adaptation (tools + LATAM)
- Corpus: 31,969 hybrid examples with <think> + LATAM + tool SFT
  - Cybersec hybrids generated with GPT-4.1-mini + real Kali/Ubuntu sandbox
  - LATAM: cybersecurity laws, regulations, regional corpus
  - Tool SFT inherited from VectraYX-Nano
- Epochs: 0.1 (a short, targeted pass intended to avoid overfitting)
- Mix: 70% tools, 20% tech replay, 10% conv replay
SFT: Instructions + tool use + thinking
- Data: 17,508 examples with loss masking on assistant turns
- Thinking: supervised <think> examples (Option A)
- Curriculum:
  - Epoch 1: 100% conversational
  - Epoch 2: 70% conv + 30% CVE Q&A
  - Epoch 3: 55% conv + 30% CVE + 15% tool use (with <think>)
- Steps: 1,065 | Final loss: 0.064
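"Loss masking on assistant turns" usually means that only tokens produced by the assistant contribute to the training loss, while system and user tokens are ignored. The training code is not part of this card, so the following is a minimal pure-Python sketch of that idea; the function names, token ids, and loss values are hypothetical.

```python
from typing import List, Tuple


def assistant_loss_mask(
    segments: List[Tuple[str, List[int]]],
) -> Tuple[List[int], List[int]]:
    """Flatten (role, token_ids) segments into token ids and a 0/1 loss mask.

    Only tokens that belong to assistant segments get mask value 1.
    """
    token_ids: List[int] = []
    mask: List[int] = []
    for role, ids in segments:
        token_ids.extend(ids)
        mask.extend([1 if role == "assistant" else 0] * len(ids))
    return token_ids, mask


def masked_mean_loss(per_token_loss: List[float], mask: List[int]) -> float:
    """Average per-token losses over the masked-in positions only."""
    kept = [loss for loss, m in zip(per_token_loss, mask) if m]
    return sum(kept) / len(kept) if kept else 0.0


# Hypothetical token ids and per-token losses, for illustration only.
segments = [
    ("system", [1, 2, 3]),
    ("user", [4, 5]),
    ("assistant", [6, 7, 8]),
]
token_ids, mask = assistant_loss_mask(segments)
losses = [2.0, 2.0, 2.0, 1.5, 1.5, 0.5, 0.25, 0.75]

print(mask)                            # [0, 0, 0, 0, 0, 1, 1, 1]
print(masked_mean_loss(losses, mask))  # 0.5
```

In a real training loop the same mask would typically be applied to the per-token cross-entropy before reduction, so prompt tokens never influence the gradient.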
Benchmarks: VectraYX-Bench
Evaluated with the internal VectraYX-Bench harness (B1-B5), single seed (N=1).
| Benchmark | Description | Base (post-P3) | Post-SFT |
|---|---|---|---|
| B1 CVE Q&A | Keyword recall in CVE responses | 0.337 | 0.341 |
| B2 Classification | Threat classification accuracy | 0.215 | 0.185 |
| B3 Commands | Tool match in pentesting commands | 0.210 | 0.350 |
| B4 Tool use | Correct JSON tool activation | 0.230 | 0.230 |
| B5 Conversational | Coherence in Spanish dialogue | 0.691 | 0.755 |
Note: B2 drops slightly post-SFT (0.215 to 0.185), which is expected because SFT prioritizes tool use and conversation over classification. B3 (+0.140) and B5 (+0.064) show the largest gains; B1 and B4 are essentially unchanged.
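The per-benchmark change from base to post-SFT can be recomputed directly from the table. This is only a convenience sketch over the published numbers; with a single seed, small differences should not be over-interpreted.

```python
# Scores copied from the VectraYX-Bench table: (base post-P3, post-SFT).
scores = {
    "B1 CVE Q&A": (0.337, 0.341),
    "B2 Classification": (0.215, 0.185),
    "B3 Commands": (0.210, 0.350),
    "B4 Tool use": (0.230, 0.230),
    "B5 Conversational": (0.691, 0.755),
}

# Round to three decimals to hide floating-point noise in the differences.
deltas = {name: round(post - base, 3) for name, (base, post) in scores.items()}

# Print benchmarks from largest gain to largest drop.
for name, delta in sorted(deltas.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<20} {delta:+.3f}")
```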
Qualitative Evaluation (22 questions, 7 categories)
Zero-shot evaluation with llama.cpp on CPU (Azure Standard_D8s_v3). N=22 questions covering CVE knowledge, tool use, chain-of-thought, pentesting commands, MITRE ATT&CK, and conversation.
Results by Category
| Category | Questions | Passed | Score |
|---|---|---|---|
| Pentesting commands | 4 | 4 | 100% |
| MITRE ATT&CK | 2 | 1 | 50% |
| Conversation (cybersecurity ES) | 4 | 1 | 25% |
| CVE Knowledge | 4 | 0 | 0%* |
| Tool Use | 5 | 0 | 0%* |
| Thinking | 3 | 0 | 0% |
| TOTAL | 22 | 6 | 27% |
*CVE and Tool Use require external augmentation: the model is designed to query CVE facts via <|tool_call|> rather than memorize them. With tool augmentation (NVD/CISA KEV backend) scores improve significantly (see B1=0.341, B4=0.230 in VectraYX-Bench).
Behavioral Metrics
| Metric | Value |
|---|---|
| <think> activated (when appropriate) | 41% of responses |
| `<\|tool_call\|>` | |
| Responds in Spanish | 82% of responses |
Example Responses
Nmap scan with chain-of-thought + tool_call:
Q: Give me the exact nmap command to detect services and versions on 192.168.1.0/24,
scanning all TCP ports.
A: <think>To detect services and versions on 192.168.1.0/24, I should use nmap
with the -sV flag which detects open service versions. I will run nmap -sV
192.168.1.1 to get this information.</think>
<|tool_call|>{"name": "bash_exec", "args": {"cmd": "nmap -sV -p- 192.168.1.0/24"}}<|/tool_call|>
SUID binaries (exact answer):
Q: What is the find command to search for SUID binaries in Linux that could be
used for privilege escalation?
A: find / -perm -4000 -type f 2>/dev/null
Hydra SSH brute-force with reasoning:
Q: How would you use Hydra to brute-force SSH on 10.0.0.1 with rockyou.txt?
A: <think>The user wants to perform a brute-force attack on SSH at 10.0.0.1
using the rockyou.txt wordlist. I should use Hydra with -l for the user
and -P for the wordlist.</think>
hydra -l root -P /usr/share/wordlists/rockyou.txt ssh://10.0.0.1 -t 4
Gobuster directory fuzzing:
Q: Give me a gobuster command for directory fuzzing on http://target.com using common.txt.
A: gobuster dir -u http://target.com -w /usr/share/wordlists/dirb/common.txt
Observations
- Primary strength: pentesting command generation (nmap, hydra, gobuster, find SUID), with 4 of 4 direct prompts answered correctly in the qualitative evaluation.
- Chain-of-thought: <think> activates in 41% of responses, including complex commands and incident analysis.
- CVE knowledge: the model has general severity knowledge but does not memorize specific CVEs, by design. With tool augmentation (NVD/CISA KEV) responses are precise.
- Zero-shot tool use: <|tool_call|> activation is lower on generic prompts. The model responds best when the system prompt includes the exact tool schema from training.
Quick Start
With Ollama
ollama run jsantillana/vectrayx-base-260m
With llama.cpp
llama-cli -m vectrayx-base-260m-f16.gguf \
--prompt "<|system|>You are VectraYX, a cybersecurity expert for LATAM.<|end|><|user|>How do I scan open ports on a network?<|end|><|assistant|>" \
-n 512 --temp 0.7
With Python (llama-cpp-python)
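from llama_cpp import Llama

# Load the local GGUF file; n_ctx matches the model's 1,024-token context length.
llm = Llama(model_path="vectrayx-base-260m-f16.gguf", n_ctx=1024)

# The prompt follows the conversation format: system, user, then an open assistant turn.
response = llm(
    "<|system|>You are VectraYX, cybersecurity expert for LATAM.<|end|>"
    "<|user|>Explain CVE-2021-44228<|end|><|assistant|>",
    max_tokens=512,
    temperature=0.7,
)
print(response["choices"][0]["text"])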
Conversation Format
<|system|>System instructions<|end|>
<|user|>User question<|end|>
<|assistant|><think>
Internal reasoning here...
</think>
<|tool_call|>{"name": "nvd_get_cve", "args": {"cve_id": "CVE-2021-44228"}}<|/tool_call|>
<|tool_result|>{"cvss_score": 9.8, "severity": "CRITICAL", ...}<|/tool_result|>
Response to user...<|end|>
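To work with this format programmatically, a client has to assemble the prompt and pull tool calls out of the generated text. The sketch below shows one way to do that with the standard library. The helper names `build_prompt` and `extract_tool_calls` are hypothetical, and the prompt is joined without newlines, as in the Quick Start examples; whether the model expects newlines between turns is not stated in the card.

```python
import json
import re
from typing import Any, Dict, List


def build_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt that ends at the open assistant tag."""
    return f"<|system|>{system}<|end|><|user|>{user}<|end|><|assistant|>"


# Non-greedy match so several tool calls in one response stay separate.
TOOL_CALL_RE = re.compile(r"<\|tool_call\|>(.*?)<\|/tool_call\|>", re.DOTALL)


def extract_tool_calls(generated: str) -> List[Dict[str, Any]]:
    """Return every well-formed JSON tool call found in the generated text."""
    calls: List[Dict[str, Any]] = []
    for payload in TOOL_CALL_RE.findall(generated):
        try:
            calls.append(json.loads(payload))
        except json.JSONDecodeError:
            continue  # skip malformed JSON rather than fail the whole turn
    return calls


prompt = build_prompt(
    "You are VectraYX, a cybersecurity expert for LATAM.",
    "Explain CVE-2021-44228",
)

# Hand-written stand-in for model output, for illustration only.
generated = (
    "<think>\nI should look this CVE up.\n</think>\n"
    '<|tool_call|>{"name": "nvd_get_cve", '
    '"args": {"cve_id": "CVE-2021-44228"}}<|/tool_call|>'
)
print(extract_tool_calls(generated))
```

Skipping malformed payloads is a design choice for a small model that may emit broken JSON; a stricter client could instead re-prompt or surface the error.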
Available Tools
| Tool | Description |
|---|---|
| nvd_get_cve(cve_id) | Get CVSS score, description and references for a CVE |
| nvd_search(query, limit) | Search recent CVEs by keyword |
| cisa_kev_check(cve_id) | Check if a CVE is in CISA's KEV catalog |
| mitre_get_technique(technique_id) | Describe a MITRE ATT&CK technique |
| otx_check_ioc(ioc_type, value) | Check IP/domain/hash reputation in AlienVault OTX |
| bash_exec(cmd) | Execute a bash command for analysis or forensics |
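The model only emits tool calls; the host application has to execute them and feed the result back. A minimal dispatcher might look like the sketch below. The backends here are hypothetical stubs that make no network calls, and the registry design is an illustration rather than the author's implementation. Real backends would query NVD, CISA KEV, and the other services named in the table.

```python
import json
from typing import Any, Callable, Dict


# Hypothetical stub backends: they return placeholder data, not real lookups.
def nvd_get_cve(cve_id: str) -> Dict[str, Any]:
    return {"cve_id": cve_id, "note": "stub result, no network call made"}


def cisa_kev_check(cve_id: str) -> Dict[str, Any]:
    return {"cve_id": cve_id, "note": "stub result, no network call made"}


# Registry mapping tool names (as emitted by the model) to Python callables.
# bash_exec is not registered here: running model-generated shell commands
# would need a sandbox such as the one mentioned for training-data generation.
TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "nvd_get_cve": nvd_get_cve,
    "cisa_kev_check": cisa_kev_check,
}


def run_tool_call(call: Dict[str, Any]) -> str:
    """Execute one parsed tool call and wrap the result in <|tool_result|> tags."""
    name = call.get("name")
    args = call.get("args", {})
    if name not in TOOLS:
        result: Dict[str, Any] = {"error": f"unknown tool: {name}"}
    else:
        try:
            result = TOOLS[name](**args)
        except TypeError as exc:  # wrong, missing, or non-mapping arguments
            result = {"error": str(exc)}
    return f"<|tool_result|>{json.dumps(result)}<|/tool_result|>"


print(run_tool_call({"name": "nvd_get_cve", "args": {"cve_id": "CVE-2021-44228"}}))
print(run_tool_call({"name": "bash_exec", "args": {"cmd": "id"}}))
```

The returned string can be appended to the running conversation after the model's `<|tool_call|>` block so that generation continues with the tool result in context.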
Training Data
The model was trained on:
- General conversational Spanish corpus (Phase 1)
- Cybersecurity technical documentation and CVEs (Phase 2)
- Synthetic hybrid dataset generated with GPT-4.1-mini + real Kali Linux sandbox: help_grounding, unknown_tool, man_section, multi_hop_2/3, recovery, negative_no_tool, cvss_reasoning, ad_attacks, malware_analysis, red_blue_dual, compliance_report
- LATAM corpus: Latin American cybersecurity laws, national regulations, regional context
- Tool SFT inherited from VectraYX-Nano: tool_sft_mini_v1, tool_sft_v3_bash, tooluse_dataset
Responsible Use
This model is designed for cybersecurity professionals, incident response teams, and educators in Latin America. Knowledge of offensive techniques is included for educational and defensive purposes.
Do not use for: unauthorized attacks, exploitation of systems without permission, or illegal activities.
VectraYX Family
| Model | Params | Specialty |
|---|---|---|
| VectraYX-Nano | ~35M | Ultra-lightweight, edge, LATAM |
| VectraYX-Base-260M | 260M | Cybersecurity LATAM, tool use, thinking |
Citation
@misc{vectrayx-base-260m-2026,
title = {VectraYX-Base-260M: A Cybersecurity Language Model for Latin America},
author = {Santillana, Juan S.},
year = {2026},
publisher = {Hugging Face},
url = {https://huggingface.co/jsantillana/vectrayx-base-260m-gguf}
}
Trained on Azure H100 NVL · Pipeline: PyTorch + llama.cpp · Exported to GGUF