---
datasets:
- nvidia/Nemotron-CC-v2
- nvidia/Nemotron-Post-Training-Dataset-v2
- nvidia/Nemotron-Instruction-Following-Chat-v1
- nvidia/Nemotron-Science-v1
- nvidia/Nemotron-Agentic-v1
- nvidia/Nemotron-Competitive-Programming-v1
- nvidia/Nemotron-Math-Proofs-v1
- nvidia/Nemotron-RL-Agentic-Conversational-Tool-Use-Pivot-v1
- nvidia/Nemotron-RL-instruction_following
- nvidia/Nemotron-RL-agent-calendar_scheduling
- nvidia/Nemotron-RL-instruction_following-structured_outputs
---

Nvidia.Agentic.Coder-4B-GGUF

📌 Model Overview

* Model Name: WithinUsAI/Nvidia.Agentic.Coder-4B-GGUF
* Organization: Within Us AI
* Model Type: Code LLM (Agentic, Instruction-Following)
* Parameter Size: 4B
* Format: GGUF (quantized for local inference)
* Primary Use: Agentic coding, tool-using workflows, software engineering reasoning

This model is part of the Within Us AI ecosystem focused on building agentic, reasoning-driven coding systems designed to think, act, and verify like real engineers.

⸻

🧬 Architecture & Lineage

* Base Family: NVIDIA Nemotron-style 4B class models (inferred lineage from naming + ecosystem alignment)
* Format Conversion: GGUF quantization for efficient local inference
* Training Approach:
  * Instruction-tuned for coding tasks
  * Agentic workflow emphasis (multi-step reasoning, tool usage)
  * Likely merged / fine-tuned using Within Us AI proprietary pipelines

Related ecosystem models include:

* NVIDIA-Nemotron-3-Nano-4B
* Other 4B agentic coders and merges in the same class

⸻

⚙️ Key Capabilities

🧑‍💻 Code Intelligence

* Multi-language code generation
* Bug fixing and refactoring
* Structured output generation

🤖 Agentic Behavior

* Step-by-step reasoning
* Task decomposition
* Tool-calling alignment (design goal)

🧠 Reasoning Focus

* Instruction-following with logical chaining
* Designed for evaluation-style datasets (tests-as-truth philosophy)

⸻

📦 GGUF Quantization

GGUF allows efficient local inference with tools like:

* llama.cpp
* LM Studio
* Ollama (GGUF-compatible builds)
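
Before pointing one of these tools at a downloaded file, it can help to confirm the file really is GGUF. The sketch below is illustrative only: it relies on the GGUF convention that a file starts with the four ASCII bytes `GGUF` followed by a 32-bit format version, and it assumes a little-endian file (the common case). The function name `read_gguf_version` is hypothetical and not part of any tool listed above.

```python
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of a GGUF file


def read_gguf_version(path):
    """Return the GGUF format version, or raise ValueError if the file is not GGUF.

    Assumption: the file is little-endian, which is the common case.
    """
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    # The magic is followed by an unsigned 32-bit format version.
    (version,) = struct.unpack("<I", header[4:8])
    return version
```

Calling `read_gguf_version` on a local `.gguf` path returns the format version as an integer. A `ValueError` usually means the download was truncated or the file is in another format.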

Typical quantizations for 4B GGUF models include:

* Q2_K (~1.8GB)
* Q3_K (~2.0–2.3GB)
* Q4_K (~2.5GB, recommended balance)

⸻

🚀 Intended Use

✅ Ideal Use Cases

* Local AI coding assistants
* Autonomous coding agents
* SWE-bench style evaluation
* Tool-augmented workflows
* Offline developer copilots

⚠️ Limitations

* The 4B parameter size limits deep reasoning compared with larger models
* Performance depends heavily on prompt structure
* Tool use requires external orchestration (there is no built-in runtime)

⸻

🛠️ Usage Example (llama.cpp)

    ./main -m Nvidia.Agentic.Coder-4B.Q4_K.gguf \
      -p "Write a Python function to parse JSON logs and extract errors." \
      -n 512

Recent llama.cpp builds ship this binary as `llama-cli`; older builds use `./main` as shown.

⸻

🧪 Training Philosophy (Within Us AI)

Within Us AI focuses on:

* Agentic AI systems
* Test-driven training (tests-as-truth)
* Diff-first patching workflows
* Secure and auditable code generation
* Evaluation-first development pipelines

⸻

📊 Evaluation

No formal benchmark results have been published yet.

Expected strengths (not yet measured):

* Strong instruction adherence
* Lightweight agentic reasoning
* Efficient local deployment

⸻

📚 Datasets & Training Sources

This model follows the Within Us AI methodology:

* Proprietary datasets created by Within Us AI
* May include third-party datasets for training (no ownership claimed)
* Emphasis on:
  * Code reasoning traces
  * Agentic workflows
  * Evaluation-driven samples

⸻

📜 License

License Type: Custom / Other (Within Us AI License)

Terms:

* Within Us AI created the fine-tuning, merging, and training methodology
* Base model architecture originates from third-party LLM ecosystems (e.g., NVIDIA / Nemotron class)
* Third-party datasets may be used without claiming ownership
* Full credit and acknowledgment belong to original dataset and base model creators

⸻

🙏 Acknowledgements

Special thanks to:

* NVIDIA Nemotron ecosystem contributors
* Open-source GGUF tooling community
* Dataset creators across Hugging Face
* The broader open-source AI research community

⸻

🔗 Links

* Model: https://huggingface.co/WithinUsAI/Nvidia.Agentic.Coder-4B-GGUF
* Organization: https://huggingface.co/WithinUsAI
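
⸻

🧩 Appendix: Orchestration Sketch

The Limitations section notes that tool use requires external orchestration. Below is a minimal sketch of what such a loop might look like, under loudly stated assumptions. `generate` is a hypothetical callable wrapping whichever runtime serves the model. The JSON tool-call shape (`{"tool": ..., "args": ...}`) is invented for illustration, because this card does not document the model's actual tool-call format.

```python
import json
from typing import Callable, Dict


def run_agent(generate: Callable[[str], str],
              tools: Dict[str, Callable[..., str]],
              task: str,
              max_steps: int = 5) -> str:
    """Alternate between model output and tool execution until a final answer.

    Assumption: the model requests a tool by replying with a JSON object
    of the form {"tool": name, "args": {...}}. Any other reply is treated
    as the final answer.
    """
    transcript = f"Task: {task}\n"
    for _ in range(max_steps):
        reply = generate(transcript)
        try:
            call = json.loads(reply)
        except json.JSONDecodeError:
            return reply  # plain text: treat as the final answer
        if not isinstance(call, dict) or "tool" not in call:
            return reply  # valid JSON, but not a tool request
        name = call["tool"]
        if name not in tools:
            result = f"error: unknown tool {name!r}"
        else:
            result = tools[name](**call.get("args", {}))
        # Feed the tool result back so the next generation can use it.
        transcript += f"Tool call: {reply}\nTool result: {result}\n"
    return "error: step limit reached"
```

The step limit is a deliberate design choice: a small model can loop on the same tool call, so the orchestrator, not the model, decides when to stop.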