Instructions for using jsantillana/vectrayx-base-260m-gguf with libraries and local apps.

Libraries

How to use jsantillana/vectrayx-base-260m-gguf with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="jsantillana/vectrayx-base-260m-gguf",
	filename="vectrayx-base-260m-f16.gguf",
)

llm.create_chat_completion(
	messages = [
		{
			"role": "user",
			"content": "What is the capital of France?"
		}
	]
)

Local Apps

llama.cpp

How to use jsantillana/vectrayx-base-260m-gguf with llama.cpp:

Install (macOS, Linux)

curl -LsSf https://llama.app/install.sh | sh
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf jsantillana/vectrayx-base-260m-gguf:F16
# Run inference directly in the terminal:
llama cli -hf jsantillana/vectrayx-base-260m-gguf:F16

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf jsantillana/vectrayx-base-260m-gguf:F16
# Run inference directly in the terminal:
llama cli -hf jsantillana/vectrayx-base-260m-gguf:F16

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf jsantillana/vectrayx-base-260m-gguf:F16
# Run inference directly in the terminal:
./llama-cli -hf jsantillana/vectrayx-base-260m-gguf:F16

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf jsantillana/vectrayx-base-260m-gguf:F16
# Run inference directly in the terminal:
./build/bin/llama-cli -hf jsantillana/vectrayx-base-260m-gguf:F16

Use Docker

docker model run hf.co/jsantillana/vectrayx-base-260m-gguf:F16


vLLM

How to use jsantillana/vectrayx-base-260m-gguf with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "jsantillana/vectrayx-base-260m-gguf"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "jsantillana/vectrayx-base-260m-gguf",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/jsantillana/vectrayx-base-260m-gguf:F16

Ollama
How to use jsantillana/vectrayx-base-260m-gguf with Ollama:
```
ollama run hf.co/jsantillana/vectrayx-base-260m-gguf:F16
```

Unsloth Studio

How to use jsantillana/vectrayx-base-260m-gguf with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for jsantillana/vectrayx-base-260m-gguf to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for jsantillana/vectrayx-base-260m-gguf to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for jsantillana/vectrayx-base-260m-gguf to start chatting

Docker Model Runner
How to use jsantillana/vectrayx-base-260m-gguf with Docker Model Runner:
```
docker model run hf.co/jsantillana/vectrayx-base-260m-gguf:F16
```

Lemonade

How to use jsantillana/vectrayx-base-260m-gguf with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull jsantillana/vectrayx-base-260m-gguf:F16

Run and chat with the model

lemonade run user.vectrayx-base-260m-gguf-F16

List all available models

lemonade list

VectraYX-Base-260M

VectraYX-Base-260M is a 260M-parameter language model specialized in cybersecurity for Latin America, trained from scratch in Spanish. It supports native tool use with <|tool_call|>, explicit reasoning with <think>, and technical conversation in Latin American Spanish.

Compact architecture, efficient on CPU/GPU, deployable with Ollama and llama.cpp.

Key Features

260M parameters — lightweight, fast, deployable on consumer hardware
Native tool use — generates <|tool_call|>{...}<|/tool_call|> JSON blocks
Chain-of-thought — explicit reasoning with <think>...</think> tags
LATAM-first — trained on Latin American Spanish corpus (laws, regulations, regional context)
Cybersecurity-first — CVE Q&A, threat classification, pentesting commands, MITRE ATT&CK
GGUF / llama.cpp — compatible with Ollama, LM Studio, llama.cpp

Architecture

Parameter	Value
Total parameters	260M
Layers	16
Attention heads	16
KV heads (GQA)	4
d_model	1024
d_ffn	4096
Vocab size	16,384
Context length	1,024 tokens
RoPE theta	10,000
QK-Norm	No
Tie embeddings	Yes
GGUF architecture	`llama`
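
As a rough consistency check, the values in the table can be plugged into a standard parameter count for a LLaMA-style decoder. The sketch below assumes a SwiGLU feed-forward block with three projection matrices, bias-free linear layers, and two RMSNorm weight vectors per layer plus a final norm. None of these details are stated in the card beyond the `llama` GGUF architecture tag, so treat the breakdown as an estimate.

```python
# Rough parameter count from the architecture table.
# Assumptions (not stated in the card): SwiGLU FFN with gate/up/down
# projections, no bias terms, RMSNorm weights of size d_model.

d_model = 1024
d_ffn = 4096
n_layers = 16
n_heads = 16
n_kv_heads = 4
vocab_size = 16_384

head_dim = d_model // n_heads      # 64
kv_dim = n_kv_heads * head_dim     # 256 (grouped-query attention)

attention = 2 * d_model * d_model + 2 * d_model * kv_dim  # Q and O, plus K and V
ffn = 3 * d_model * d_ffn                                  # gate, up, down
norms = 2 * d_model                                        # pre-attention, pre-FFN
per_layer = attention + ffn + norms

embeddings = vocab_size * d_model  # counted once because embeddings are tied
total = n_layers * per_layer + embeddings + d_model  # last term: final norm

print(f"{total:,} parameters (~{total / 1e6:.0f}M)")
```

Under these assumptions the count comes to 260,080,640, which is consistent with the stated 260M.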

Training Pipeline

Phase 1 — General Pretraining

Corpus: general conversational Spanish
Tokens seen: ~4.24B
Goal: linguistic base and Latin American Spanish comprehension

Phase 2 — Technical Specialization

Corpus: cybersecurity documentation, CVEs, writeups, tools
Tokens seen: 2.03B (epochs=1.0)
Steps: 15,500 | Final loss: 2.07
Total accumulated: 6.27B tokens (~1.2× Chinchilla optimal for 260M)
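
The accumulated-token figure and the Chinchilla ratio can be reproduced with a few lines of arithmetic. This sketch assumes the common rule of thumb of roughly 20 training tokens per parameter as the "Chinchilla optimal" budget; the card does not say which constant it used.

```python
# Reproduce the token totals quoted for Phases 1 and 2.
# Assumption: "Chinchilla optimal" is taken as ~20 tokens per parameter.

phase1_tokens = 4_240_000_000   # ~4.24B (Phase 1)
phase2_tokens = 2_030_000_000   # 2.03B (Phase 2)
n_params = 260_000_000
TOKENS_PER_PARAM = 20           # rule-of-thumb constant (assumed)

total_tokens = phase1_tokens + phase2_tokens
chinchilla_budget = TOKENS_PER_PARAM * n_params
ratio = total_tokens / chinchilla_budget

print(f"total:  {total_tokens / 1e9:.2f}B tokens")
print(f"budget: {chinchilla_budget / 1e9:.2f}B tokens")
print(f"ratio:  {ratio:.2f}x")
```

With these inputs the budget is 5.2B tokens and the ratio is about 1.2, matching the figure quoted above.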

Phase 3 — Domain Adaptation (tools + LATAM)

Corpus: 31,969 hybrid examples with <think> + LATAM + tool SFT
- Cybersec hybrids generated with GPT-4.1-mini + real Kali/Ubuntu sandbox
- LATAM: cybersecurity laws, regulations, regional corpus
- Tool SFT inherited from VectraYX-Nano
Epochs: 0.1 (a short, targeted pass intended to avoid overfitting)
Mix: 70% tools, 20% tech replay, 10% conv replay

SFT — Instructions + tool use + thinking

Data: 17,508 examples with loss masking on assistant turns
Thinking: supervised <think> examples (Option A)
Curriculum:
- Epoch 1: 100% conversational
- Epoch 2: 70% conv + 30% CVE Q&A
- Epoch 3: 55% conv + 30% CVE + 15% tool use (with <think>)
Steps: 1,065 | Final loss: 0.064
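
Loss masking on assistant turns usually means that only assistant tokens contribute to the training loss. The following is a minimal sketch of that idea, not the project's actual training code: the token ids are made up, and `-100` is used as the ignore label because it is the default `ignore_index` of PyTorch's `CrossEntropyLoss`.

```python
# Minimal sketch of assistant-only loss masking.
# Assumptions: token ids are illustrative, and -100 marks positions that
# the loss function should ignore.

IGNORE_INDEX = -100

def build_labels(segments):
    """segments: list of (role, token_ids) pairs in conversation order."""
    input_ids, labels = [], []
    for role, token_ids in segments:
        input_ids.extend(token_ids)
        if role == "assistant":
            labels.extend(token_ids)                        # contributes to loss
        else:
            labels.extend([IGNORE_INDEX] * len(token_ids))  # masked out
    return input_ids, labels

example = [
    ("system", [1, 11, 12]),
    ("user", [2, 21, 22, 23]),
    ("assistant", [3, 31, 32]),
]
input_ids, labels = build_labels(example)
print(input_ids)
print(labels)
```

The one-position shift needed for next-token prediction is left to the training loop in this sketch.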

Benchmarks — VectraYX-Bench

Evaluated with the internal VectraYX-Bench harness (B1–B5) using a single seed (N=1).

Benchmark	Description	Base (post-P3)	Post-SFT
B1 CVE Q&A	Keyword recall in CVE responses	0.337	0.341
B2 Classification	Threat classification accuracy	0.215	0.185
B3 Commands	Tool match in pentesting commands	0.210	0.350
B4 Tool use	Correct JSON tool activation	0.230	0.230
B5 Conversational	Coherence in Spanish dialogue	0.691	0.755

Note: B2 drops from 0.215 to 0.185 post-SFT. This is expected, as SFT prioritizes tool use and conversation over classification. B3 (+0.140) and B5 (+0.064) show the largest gains; B4 is unchanged.
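
The per-benchmark changes can be checked directly from the table. This short sketch only restates the published scores and sorts the differences; it makes no claim about statistical significance, which a single seed cannot establish.

```python
# Change from the base (post-P3) checkpoint to post-SFT,
# using the scores from the VectraYX-Bench table.

scores = {
    "B1 CVE Q&A": (0.337, 0.341),
    "B2 Classification": (0.215, 0.185),
    "B3 Commands": (0.210, 0.350),
    "B4 Tool use": (0.230, 0.230),
    "B5 Conversational": (0.691, 0.755),
}

deltas = {name: round(post - base, 3) for name, (base, post) in scores.items()}

# Print benchmarks from largest gain to largest drop.
for name, delta in sorted(deltas.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<20} {delta:+.3f}")
```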

Qualitative Evaluation (22 questions, 6 categories)

Zero-shot evaluation with llama.cpp on CPU (Azure Standard_D8s_v3). N=22 questions covering CVE knowledge, tool use, chain-of-thought, pentesting commands, MITRE ATT&CK, and conversation.

Results by Category

Category	Questions	Passed	Score
Pentesting commands	4	4	100%
MITRE ATT&CK	2	1	50%
Conversation (cybersecurity ES)	4	1	25%
CVE Knowledge	4	0	0%*
Tool Use	5	0	0%*
Thinking	3	0	0%
TOTAL	22	6	27%

*CVE Knowledge and Tool Use depend on external augmentation: the model is designed to query CVE facts via <|tool_call|> rather than memorize them. With tool augmentation (NVD/CISA KEV backend) scores are higher (see B1=0.341, B4=0.230 in VectraYX-Bench).

Behavioral Metrics

Metric	Value
`<think>` activated (when appropriate)	41% of responses
`<|tool_call|>` activated	(value missing)
Responds in Spanish	82% of responses

Example Responses

Nmap scan with chain-of-thought + tool_call:

Q: Give me the exact nmap command to detect services and versions on 192.168.1.0/24,
   scanning all TCP ports.

A: <think>To detect services and versions on 192.168.1.0/24, I should use nmap
with the -sV flag which detects open service versions. I will run nmap -sV
192.168.1.1 to get this information.</think>
<|tool_call|>{"name": "bash_exec", "args": {"cmd": "nmap -sV -p- 192.168.1.0/24"}}<|/tool_call|>

SUID binaries (exact answer):

Q: What is the find command to search for SUID binaries in Linux that could be
   used for privilege escalation?

A: find / -perm -4000 -type f 2>/dev/null

Hydra SSH brute-force with reasoning:

Q: How would you use Hydra to brute-force SSH on 10.0.0.1 with rockyou.txt?

A: <think>The user wants to perform a brute-force attack on SSH at 10.0.0.1
using the rockyou.txt wordlist. I should use Hydra with -l for the user
and -P for the wordlist.</think>
hydra -l root -P /usr/share/wordlists/rockyou.txt ssh://10.0.0.1 -t 4

Gobuster directory fuzzing:

Q: Give me a gobuster command for directory fuzzing on http://target.com using common.txt.

A: gobuster dir -u http://target.com -w /usr/share/wordlists/dirb/common.txt

Observations

Primary strength: pentesting command generation (nmap, hydra, gobuster, find SUID), with 4 of 4 questions passed (100%) using direct prompts.
Chain-of-thought: <think> activates in 41% of responses, including complex commands and incident analysis.
CVE knowledge: the model has general severity knowledge but does not memorize specific CVEs — intentional design. With tool augmentation (NVD/CISA KEV) responses are precise.
Zero-shot tool use: <|tool_call|> activation is lower on generic prompts. The model responds best when the system prompt includes the exact tool schema from training.

Quick Start

With Ollama

ollama run jsantillana/vectrayx-base-260m

With llama.cpp

llama-cli -m vectrayx-base-260m-f16.gguf \
  --prompt "<|system|>You are VectraYX, a cybersecurity expert for LATAM.<|end|><|user|>How do I scan open ports on a network?<|end|><|assistant|>" \
  -n 512 --temp 0.7

With Python (llama-cpp-python)

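from llama_cpp import Llama

# Load the local GGUF file; n_ctx matches the model's 1,024-token context length.
llm = Llama(model_path="vectrayx-base-260m-f16.gguf", n_ctx=1024)

# Raw completion call: the prompt follows the <|role|>...<|end|> conversation format.
response = llm(
    "<|system|>You are VectraYX, cybersecurity expert for LATAM.<|end|>"
    "<|user|>Explain CVE-2021-44228<|end|><|assistant|>",
    max_tokens=512,
    temperature=0.7,
    stop=["<|end|>"],  # stop at the end of the assistant turn
)
print(response["choices"][0]["text"])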
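
The prompt strings in the examples above follow a fixed pattern, so they can be assembled from chat-style messages. The helper below is a hypothetical convenience function, not part of llama-cpp-python or this model's tooling; it assumes turns are concatenated without separators, as in the Quick Start prompts.

```python
# Hypothetical helper: build a raw prompt from chat-style messages using
# the <|role|>...<|end|> pattern shown in the Quick Start examples.

def build_prompt(messages):
    """messages: list of {"role": ..., "content": ...} dicts."""
    parts = [f"<|{m['role']}|>{m['content']}<|end|>" for m in messages]
    parts.append("<|assistant|>")  # leave the assistant turn open for generation
    return "".join(parts)

prompt = build_prompt([
    {"role": "system", "content": "You are VectraYX, cybersecurity expert for LATAM."},
    {"role": "user", "content": "Explain CVE-2021-44228"},
])
print(prompt)
```

The resulting string can be passed to `llm(...)` in place of the hand-written prompt.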

Conversation Format

<|system|>System instructions<|end|>
<|user|>User question<|end|>
<|assistant|><think>
Internal reasoning here...
</think>
<|tool_call|>{"name": "nvd_get_cve", "args": {"cve_id": "CVE-2021-44228"}}<|/tool_call|>
<|tool_result|>{"cvss_score": 9.8, "severity": "CRITICAL", ...}<|/tool_result|>
Response to user...<|end|>
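
A client has to separate the reasoning, the tool calls, and the visible reply in the raw output. One way to do this, sketched here with regular expressions, is shown below. The function name and return shape are illustrative choices, and malformed JSON is simply skipped, which a real client may want to handle differently.

```python
import json
import re

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
TOOL_CALL_RE = re.compile(r"<\|tool_call\|>(.*?)<\|/tool_call\|>", re.DOTALL)

def parse_assistant_output(text):
    """Split raw model output into reasoning, tool calls, and visible text."""
    reasoning = [m.strip() for m in THINK_RE.findall(text)]
    tool_calls = []
    for raw in TOOL_CALL_RE.findall(text):
        try:
            tool_calls.append(json.loads(raw))
        except json.JSONDecodeError:
            pass  # a small model may emit malformed JSON; skipped in this sketch
    visible = TOOL_CALL_RE.sub("", THINK_RE.sub("", text)).strip()
    return reasoning, tool_calls, visible

sample = (
    "<think>Look up the CVE first.</think>\n"
    '<|tool_call|>{"name": "nvd_get_cve", "args": {"cve_id": "CVE-2021-44228"}}<|/tool_call|>'
)
reasoning, tool_calls, visible = parse_assistant_output(sample)
print(reasoning)
print(tool_calls)
```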

Available Tools

Tool	Description
`nvd_get_cve(cve_id)`	Get CVSS score, description and references for a CVE
`nvd_search(query, limit)`	Search recent CVEs by keyword
`cisa_kev_check(cve_id)`	Check if a CVE is in CISA's KEV catalog
`mitre_get_technique(technique_id)`	Describe a MITRE ATT&CK technique
`otx_check_ioc(ioc_type, value)`	Check IP/domain/hash reputation in AlienVault OTX
`bash_exec(cmd)`	Execute a bash command for analysis or forensics
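
The model only emits tool calls; the host application has to execute them and feed the result back. The sketch below shows one possible dispatcher. The handlers are stubs with made-up return values standing in for real NVD and CISA KEV lookups, and the `TOOLS` registry and `run_tool_call` function are hypothetical names, not part of any published VectraYX client.

```python
import json

# Stub handlers standing in for real backends (return values are made up).
def nvd_get_cve(cve_id):
    return {"cve_id": cve_id, "note": "stub: query the NVD API here"}

def cisa_kev_check(cve_id):
    return {"cve_id": cve_id, "note": "stub: query the CISA KEV catalog here"}

TOOLS = {
    "nvd_get_cve": nvd_get_cve,
    "cisa_kev_check": cisa_kev_check,
    # bash_exec is deliberately left out: it should only run in a sandbox.
}

def run_tool_call(call):
    """call: parsed JSON from a <|tool_call|> block, {"name": ..., "args": {...}}."""
    handler = TOOLS.get(call.get("name"))
    if handler is None:
        result = {"error": f"unknown tool: {call.get('name')}"}
    else:
        try:
            result = handler(**call.get("args", {}))
        except TypeError as exc:  # wrong or missing arguments
            result = {"error": str(exc)}
    return f"<|tool_result|>{json.dumps(result)}<|/tool_result|>"

print(run_tool_call({"name": "nvd_get_cve", "args": {"cve_id": "CVE-2021-44228"}}))
print(run_tool_call({"name": "does_not_exist", "args": {}}))
```

Returning errors inside the `<|tool_result|>` block, rather than raising, lets the conversation continue when the model names a tool that is not registered.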

Training Data

The model was trained on:

General conversational Spanish corpus (Phase 1)
Cybersecurity technical documentation and CVEs (Phase 2)
Synthetic hybrid dataset generated with GPT-4.1-mini + real Kali Linux sandbox:
- help_grounding, unknown_tool, man_section, multi_hop_2/3
- recovery, negative_no_tool, cvss_reasoning, ad_attacks
- malware_analysis, red_blue_dual, compliance_report
LATAM corpus: Latin American cybersecurity laws, national regulations, regional context
Tool SFT inherited from VectraYX-Nano: tool_sft_mini_v1, tool_sft_v3_bash, tooluse_dataset

Responsible Use

This model is designed for cybersecurity professionals, incident response teams, and educators in Latin America. Knowledge of offensive techniques is included for educational and defensive purposes.

Do not use for: unauthorized attacks, exploitation of systems without permission, or illegal activities.

VectraYX Family

Model	Params	Specialty
VectraYX-Nano	~35M	Ultra-lightweight, edge, LATAM
VectraYX-Base-260M	260M	Cybersecurity LATAM, tool use, thinking

Citation

@misc{vectrayx-base-260m-2026,
  title        = {VectraYX-Base-260M: A Cybersecurity Language Model for Latin America},
  author       = {Santillana, Juan S.},
  year         = {2026},
  publisher    = {Hugging Face},
  url          = {https://huggingface.co/jsantillana/vectrayx-base-260m-gguf}
}

Trained on Azure H100 NVL · Pipeline: PyTorch + llama.cpp · Exported to GGUF
