GitHub Actions committed on
Commit 238ea73 · 2 Parent(s): 737d710 6be20f5

Merge branch 'main' of https://huggingface.co/spaces/build-small-hackathon/HearthNet-Nemotron

Files changed (2)
  1. README.md +18 -16
  2. app_nemotron.py +67 -43
README.md CHANGED
@@ -1,28 +1,30 @@
1
  ---
2
- title: HearthNet
3
- emoji: 🔥
4
  colorFrom: purple
5
- colorTo: pink
6
  sdk: gradio
7
  sdk_version: 6.18.0
8
  python_version: '3.10'
9
- app_file: app.py
10
  pinned: true
11
- short_description: Community-Owned AI Mesh That Works When The Internet Doesn't
12
  tags:
13
- - backyard-ai
14
- - tiny-titan
15
- - best-agent
16
  - nemotron
17
- - minicpm
18
- - modal
19
- - codex
 
20
  license: apache-2.0
21
  ---
22
 
23
- # 🔥 HearthNet
 
 
 
 
24
 
25
- ### Community-Owned AI Mesh · Works When The Internet Doesn't
26
 
27
  <p align="center">
28
  <strong>Local-First &nbsp;·&nbsp; Peer-to-Peer &nbsp;·&nbsp; Offline-Capable &nbsp;·&nbsp; Emergency-Ready</strong>
@@ -326,7 +328,7 @@ job = await bus.call("model.pull", (1, 0), {
326
 
327
  ## What Makes This "Tiny"
328
 
329
- The HF Space demo uses **SmolLM2-135M** — 135 million parameters, ~270 MB RAM.
330
 
331
  For local installs, any GGUF model works (1B–8B for significantly better quality).
332
  The architecture is model-agnostic; the routing layer handles the rest.
@@ -356,7 +358,7 @@ HearthNet prioritizes local, private models. Cloud backends are **opt-in only**
356
  |---------|-----------|-------|
357
  | **llama.cpp** (recommended) | Start server on port 8080 + auto-detect | Any GGUF model; fastest on CPU |
358
  | **Ollama** | `ollama pull llama3.2:3b` + auto-detect | 70+ models, easy management |
359
- | **HF Transformers** | Default on HF Space (no config needed) | SmolLM2-135M, CPU-friendly |
360
  | **OpenBMB / MiniCPM** | `MINICPM_URL` env var (local server) | Local-first, OpenAI-compatible API |
361
 
362
  ### Optional Cloud Backends (Opt-In via Env Vars)
@@ -567,7 +569,7 @@ python -m pytest tests/ --ignore=tests/test_e2e_user_stories.py -v
567
 
568
  **Why HearthNet wins:**
569
 
570
- 🐜 **Tiny Titan:** Runs on SmolLM2-135M (135M params). Full mesh on Raspberry Pi 4. Every device runs real inference locally.
571
 
572
  🤖 **Best Agent:** Capability bus + intelligent routing = distributed agentic system. Nodes score, select, and failover to the best provider autonomously. MOE expert routing means each specialist node attracts the right queries.
573
 
 
1
  ---
2
+ title: HearthNet-Nemotron
3
+ emoji: 🔬
4
  colorFrom: purple
5
+ colorTo: yellow
6
  sdk: gradio
7
  sdk_version: 6.18.0
8
  python_version: '3.10'
9
+ app_file: app_nemotron.py
10
  pinned: true
11
+ short_description: Nemotron document intelligence HearthNet companion
12
  tags:
 
 
 
13
  - nemotron
14
+ - nvidia
15
+ - document-intelligence
16
+ - off-brand
17
+ - tiny-titan
18
  license: apache-2.0
19
  ---
20
 
21
+ # 🔬 HearthNet · Document Intelligence
22
+
23
+ > **Companion Space** to [🔥 HearthNet](https://huggingface.co/spaces/build-small-hackathon/HearthNet) — the main community AI mesh.
24
+ > This Space extends the mesh with NVIDIA Nemotron-powered document intelligence: structured extraction, Q&A, summarisation, and one-click RAG ingest into any mesh node.
25
+ > When no `NVIDIA_API_KEY` is set, falls back to **SmolLM2-135M** locally (no API key needed).
26
 
27
+ ### NVIDIA Nemotron Document Intelligence · Part of the HearthNet Mesh
28
 
29
  <p align="center">
30
  <strong>Local-First &nbsp;·&nbsp; Peer-to-Peer &nbsp;·&nbsp; Offline-Capable &nbsp;·&nbsp; Emergency-Ready</strong>
 
328
 
329
  ## What Makes This "Tiny"
330
 
331
+ The HF Space demo uses **MiniCPM3-4B** — 4B params, strong instruction following, under the 32B Tiny Titan limit. Set `MODEL_ID=HuggingFaceTB/SmolLM2-135M-Instruct` to run 135M ultra-light mode on Pi-class devices.
332
 
333
  For local installs, any GGUF model works (1B–8B for significantly better quality).
334
  The architecture is model-agnostic; the routing layer handles the rest.
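
The `MODEL_ID` override described in this hunk could be resolved along the following lines. This is a minimal sketch, not the Space's actual code: `resolve_model_id` and `DEFAULT_MODEL_ID` are hypothetical names, and `openbmb/MiniCPM3-4B` is assumed to be the Hub repository behind the "MiniCPM3-4B" default.

```python
import os
from typing import Mapping, Optional

# Assumption: the Hub repo id for the README's "MiniCPM3-4B" default.
DEFAULT_MODEL_ID = "openbmb/MiniCPM3-4B"


def resolve_model_id(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the MODEL_ID override if set and non-blank, else the default."""
    env = os.environ if env is None else env
    return env.get("MODEL_ID", "").strip() or DEFAULT_MODEL_ID


# Ultra-light mode, as the README suggests for Pi-class devices.
light = resolve_model_id({"MODEL_ID": "HuggingFaceTB/SmolLM2-135M-Instruct"})
# No override: fall back to the default model.
default = resolve_model_id({})
```

Treating a blank `MODEL_ID` the same as an unset one avoids passing an empty string to the model loader.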
 
358
  |---------|-----------|-------|
359
  | **llama.cpp** (recommended) | Start server on port 8080 + auto-detect | Any GGUF model; fastest on CPU |
360
  | **Ollama** | `ollama pull llama3.2:3b` + auto-detect | 70+ models, easy management |
361
+ | **HF Transformers** | Default on HF Space (no config needed) | MiniCPM3-4B (override with `MODEL_ID`) |
362
  | **OpenBMB / MiniCPM** | `MINICPM_URL` env var (local server) | Local-first, OpenAI-compatible API |
363
 
364
  ### Optional Cloud Backends (Opt-In via Env Vars)
 
569
 
570
  **Why HearthNet wins:**
571
 
572
+ 🐜 **Tiny Titan:** Runs on MiniCPM3-4B (4B params, under 32B limit). Ultra-light mode with SmolLM2-135M (135M) via `MODEL_ID` env var for Raspberry Pi and edge devices.
573
 
574
  🤖 **Best Agent:** Capability bus + intelligent routing = distributed agentic system. Nodes score, select, and failover to the best provider autonomously. MOE expert routing means each specialist node attracts the right queries.
575
 
app_nemotron.py CHANGED
@@ -21,6 +21,7 @@ Environment:
21
 
22
  from __future__ import annotations
23
 
 
24
  import os
25
 
26
  import gradio as gr
@@ -90,6 +91,43 @@ def _get_endpoint(api_key: str) -> str:
90
  return _NEMOTRON_URL.rstrip("/") + "/v1" if _NEMOTRON_URL else "https://integrate.api.nvidia.com/v1"
91
 
92
 
93
  async def _nemotron_chat(messages: list, model: str, api_key: str, temperature: float = 0.1) -> str:
94
  import httpx
95
 
@@ -117,18 +155,12 @@ def extract_structured(
117
  model_label: str,
118
  api_key: str,
119
  ) -> tuple[str, str]:
120
- import asyncio, json
121
 
122
  if not doc_text.strip():
123
  return '{"error": "No document text provided"}', "⚠ Provide document text"
124
 
125
  key = api_key.strip() or _NVIDIA_KEY
126
- if not key and not _NEMOTRON_URL:
127
- return (
128
- '{"error": "No API key or local endpoint configured"}',
129
- "⚠ Set NVIDIA_API_KEY or NEMOTRON_URL",
130
- )
131
-
132
  schema = custom_schema.strip() if schema_preset == "Custom (edit below)" else _SCHEMAS[schema_preset]
133
  model = _MODELS.get(model_label, list(_MODELS.values())[0])
134
 
@@ -144,13 +176,15 @@ def extract_structured(
144
  ]
145
 
146
  try:
147
- raw = asyncio.get_event_loop().run_until_complete(
148
- _nemotron_chat(messages, model, key, temperature=0.05)
149
- )
150
- # Try to parse to validate it's real JSON
 
 
151
  try:
152
  parsed = json.loads(raw)
153
- return json.dumps(parsed, indent=2), f"✓ Extracted with {model_label}"
154
  except json.JSONDecodeError:
155
  return raw, f"⚠ Model returned non-JSON (shown as-is)"
156
  except Exception as exc:
@@ -158,17 +192,12 @@ def extract_structured(
158
 
159
 
160
  def ask_document(doc_text: str, question: str, model_label: str, api_key: str) -> str:
161
- import asyncio
162
-
163
  if not doc_text.strip():
164
  return "Provide a document first."
165
  if not question.strip():
166
  return "Ask a question."
167
 
168
  key = api_key.strip() or _NVIDIA_KEY
169
- if not key and not _NEMOTRON_URL:
170
- return "Set NVIDIA_API_KEY or NEMOTRON_URL to use Nemotron."
171
-
172
  model = _MODELS.get(model_label, list(_MODELS.values())[0])
173
  messages = [
174
  {
@@ -182,23 +211,18 @@ def ask_document(doc_text: str, question: str, model_label: str, api_key: str) -
182
  },
183
  ]
184
  try:
185
- return asyncio.get_event_loop().run_until_complete(
186
- _nemotron_chat(messages, model, key, temperature=0.3)
187
- )
188
  except Exception as exc:
189
  return f"Error: {exc}"
190
 
191
 
192
  def summarise_document(doc_text: str, style: str, model_label: str, api_key: str) -> str:
193
- import asyncio
194
-
195
  if not doc_text.strip():
196
  return "Provide a document first."
197
 
198
  key = api_key.strip() or _NVIDIA_KEY
199
- if not key and not _NEMOTRON_URL:
200
- return "Set NVIDIA_API_KEY or NEMOTRON_URL."
201
-
202
  model = _MODELS.get(model_label, list(_MODELS.values())[0])
203
  style_prompts = {
204
  "Executive (3 bullets)": "Summarise in exactly 3 bullet points for an executive audience.",
@@ -212,15 +236,15 @@ def summarise_document(doc_text: str, style: str, model_label: str, api_key: str
212
  {"role": "user", "content": f"Document:\n\n{doc_text[:5000]}"},
213
  ]
214
  try:
215
- return asyncio.get_event_loop().run_until_complete(
216
- _nemotron_chat(messages, model, key, temperature=0.4)
217
- )
218
  except Exception as exc:
219
  return f"Error: {exc}"
220
 
221
 
222
  def push_to_mesh(doc_text: str, doc_title: str, corpus: str, mesh_url: str) -> str:
223
- import asyncio, httpx
224
 
225
  url = (mesh_url.strip() or _MESH_NODE).rstrip("/")
226
  if not url:
@@ -230,26 +254,26 @@ def push_to_mesh(doc_text: str, doc_title: str, corpus: str, mesh_url: str) -> s
230
 
231
  async def _push():
232
  payload = {
233
- "body": {
234
- "params": {"corpus": corpus or "documents"},
235
- "input": {
236
- "documents": [
237
- {
238
- "id": f"doc-{hash(doc_text) % 100000}",
239
- "title": doc_title or "Untitled",
240
- "text": doc_text,
241
- }
242
- ]
243
- },
244
- }
245
  }
246
  async with httpx.AsyncClient(timeout=15.0) as c:
247
- r = await c.post(f"{url}/capabilities/rag.ingest/call", json=payload)
248
  r.raise_for_status()
249
  return r.json()
250
 
251
  try:
252
- asyncio.get_event_loop().run_until_complete(_push())
253
  return f"✓ Document pushed to mesh at {url}\nCorpus: {corpus}\nNow searchable via Ask tab on any mesh node."
254
  except Exception as exc:
255
  return f"⚠ Push failed: {exc}"
 
21
 
22
  from __future__ import annotations
23
 
24
+ import asyncio
25
  import os
26
 
27
  import gradio as gr
 
91
  return _NEMOTRON_URL.rstrip("/") + "/v1" if _NEMOTRON_URL else "https://integrate.api.nvidia.com/v1"
92
 
93
 
94
+ def _run_async(coro):
95
+ """Run a coroutine safely whether or not a loop is already running."""
96
+ try:
97
+ loop = asyncio.get_running_loop()
98
+ except RuntimeError:
99
+ loop = None
100
+ if loop and loop.is_running():
101
+ import concurrent.futures
102
+ with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
103
+ fut = pool.submit(asyncio.run, coro)
104
+ return fut.result()
105
+ return asyncio.run(coro)
106
+
107
+
108
+ def _local_smol_chat(messages: list, max_tokens: int = 512) -> str:
109
+ """SmolLM2-135M local fallback — no API key required."""
110
+ try:
111
+ from transformers import pipeline as _pipeline # type: ignore[import-untyped]
112
+
113
+ _smol_id = "HuggingFaceTB/SmolLM2-135M-Instruct"
114
+ pipe = _pipeline("text-generation", model=_smol_id, device_map="auto", torch_dtype="auto")
115
+ prompt = ""
116
+ for m in messages:
117
+ role, content = m.get("role", "user"), m.get("content", "")
118
+ if role == "system":
119
+ prompt += f"<|im_start|>system\n{content}<|im_end|>\n"
120
+ elif role == "user":
121
+ prompt += f"<|im_start|>user\n{content}<|im_end|>\n"
122
+ elif role == "assistant":
123
+ prompt += f"<|im_start|>assistant\n{content}<|im_end|>\n"
124
+ prompt += "<|im_start|>assistant\n"
125
+ result = pipe(prompt, max_new_tokens=max_tokens, return_full_text=False, do_sample=False)
126
+ return result[0]["generated_text"].strip()
127
+ except Exception as exc:
128
+ return f"[SmolLM2 unavailable: {exc}]"
129
+
130
+
131
  async def _nemotron_chat(messages: list, model: str, api_key: str, temperature: float = 0.1) -> str:
132
  import httpx
133
 
 
155
  model_label: str,
156
  api_key: str,
157
  ) -> tuple[str, str]:
158
+ import json
159
 
160
  if not doc_text.strip():
161
  return '{"error": "No document text provided"}', "⚠ Provide document text"
162
 
163
  key = api_key.strip() or _NVIDIA_KEY
 
 
 
 
 
 
164
  schema = custom_schema.strip() if schema_preset == "Custom (edit below)" else _SCHEMAS[schema_preset]
165
  model = _MODELS.get(model_label, list(_MODELS.values())[0])
166
 
 
176
  ]
177
 
178
  try:
179
+ if key or _NEMOTRON_URL:
180
+ raw = _run_async(_nemotron_chat(messages, model, key, temperature=0.05))
181
+ label = f"✓ Extracted with {model_label}"
182
+ else:
183
+ raw = _local_smol_chat(messages, max_tokens=512)
184
+ label = "✓ Extracted with SmolLM2-135M (local fallback)"
185
  try:
186
  parsed = json.loads(raw)
187
+ return json.dumps(parsed, indent=2), label
188
  except json.JSONDecodeError:
189
  return raw, f"⚠ Model returned non-JSON (shown as-is)"
190
  except Exception as exc:
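
This hunk swaps `asyncio.get_event_loop().run_until_complete(...)` for `_run_async(...)`. `run_until_complete` raises `RuntimeError` when its loop is already running, which is presumably what the helper works around. The standalone sketch below reproduces the same pattern so both call paths can be exercised; `run_async`, `double`, and `caller` are illustrative names, not part of the commit.

```python
import asyncio
import concurrent.futures


def run_async(coro):
    """Run a coroutine to completion from synchronous code."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread, so asyncio.run can own one.
        return asyncio.run(coro)
    # A loop is already running here and asyncio.run would raise, so hand
    # the coroutine to a worker thread that starts its own loop.
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()


async def double(x: int) -> int:
    await asyncio.sleep(0)
    return 2 * x


async def caller() -> int:
    # A synchronous call made while this thread's loop is running.
    return run_async(double(21))


sync_result = run_async(double(4))      # no running loop
nested_result = asyncio.run(caller())   # inside a running loop
```

Note that `.result()` blocks the calling thread, so in the nested case the outer loop makes no progress until the coroutine finishes.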
 
192
 
193
 
194
  def ask_document(doc_text: str, question: str, model_label: str, api_key: str) -> str:
 
 
195
  if not doc_text.strip():
196
  return "Provide a document first."
197
  if not question.strip():
198
  return "Ask a question."
199
 
200
  key = api_key.strip() or _NVIDIA_KEY
 
 
 
201
  model = _MODELS.get(model_label, list(_MODELS.values())[0])
202
  messages = [
203
  {
 
211
  },
212
  ]
213
  try:
214
+ if key or _NEMOTRON_URL:
215
+ return _run_async(_nemotron_chat(messages, model, key, temperature=0.3))
216
+ return _local_smol_chat(messages, max_tokens=512)
217
  except Exception as exc:
218
  return f"Error: {exc}"
219
 
220
 
221
  def summarise_document(doc_text: str, style: str, model_label: str, api_key: str) -> str:
 
 
222
  if not doc_text.strip():
223
  return "Provide a document first."
224
 
225
  key = api_key.strip() or _NVIDIA_KEY
 
 
 
226
  model = _MODELS.get(model_label, list(_MODELS.values())[0])
227
  style_prompts = {
228
  "Executive (3 bullets)": "Summarise in exactly 3 bullet points for an executive audience.",
 
236
  {"role": "user", "content": f"Document:\n\n{doc_text[:5000]}"},
237
  ]
238
  try:
239
+ if key or _NEMOTRON_URL:
240
+ return _run_async(_nemotron_chat(messages, model, key, temperature=0.4))
241
+ return _local_smol_chat(messages, max_tokens=512)
242
  except Exception as exc:
243
  return f"Error: {exc}"
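
After this commit, `extract_structured`, `ask_document`, and `summarise_document` all apply the same routing rule: call the remote Nemotron endpoint when an API key or `NEMOTRON_URL` is available, otherwise use the local SmolLM2 fallback. A sketch of that rule factored into one helper follows. `route_chat`, `fake_remote`, and `fake_local` are hypothetical stand-ins; the real code calls `_nemotron_chat` (via `_run_async`) and `_local_smol_chat` inline.

```python
from typing import Callable, Dict, List, Tuple

Message = Dict[str, str]
ChatFn = Callable[[List[Message]], str]


def route_chat(
    messages: List[Message],
    api_key: str,
    endpoint_url: str,
    remote_chat: ChatFn,
    local_chat: ChatFn,
) -> Tuple[str, str]:
    """Return (reply, backend) using the remote backend only when configured."""
    if api_key.strip() or endpoint_url.strip():
        return remote_chat(messages), "remote"
    return local_chat(messages), "local"


def fake_remote(messages: List[Message]) -> str:
    return "remote reply"


def fake_local(messages: List[Message]) -> str:
    return "local reply"


msgs = [{"role": "user", "content": "Summarise this."}]
with_key = route_chat(msgs, "nvapi-example", "", fake_remote, fake_local)
without_key = route_chat(msgs, "  ", "", fake_remote, fake_local)
```

Returning the backend name alongside the reply would let each handler build its status label (as `extract_structured` does) without repeating the condition.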
244
 
245
 
246
  def push_to_mesh(doc_text: str, doc_title: str, corpus: str, mesh_url: str) -> str:
247
+ import httpx
248
 
249
  url = (mesh_url.strip() or _MESH_NODE).rstrip("/")
250
  if not url:
 
254
 
255
  async def _push():
256
  payload = {
257
+ "capability": "rag.ingest",
258
+ "version": "1.0",
259
+ "params": {"corpus": corpus or "documents"},
260
+ "input": {
261
+ "documents": [
262
+ {
263
+ "id": f"doc-{hash(doc_text) % 100000}",
264
+ "title": doc_title or "Untitled",
265
+ "text": doc_text,
266
+ }
267
+ ]
268
+ },
269
  }
270
  async with httpx.AsyncClient(timeout=15.0) as c:
271
+ r = await c.post(f"{url}/bus/v1/call", json=payload)
272
  r.raise_for_status()
273
  return r.json()
274
 
275
  try:
276
+ _run_async(_push())
277
  return f"✓ Document pushed to mesh at {url}\nCorpus: {corpus}\nNow searchable via Ask tab on any mesh node."
278
  except Exception as exc:
279
  return f"⚠ Push failed: {exc}"
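
One detail of the payload above is worth noting: Python salts `hash()` for strings per process unless `PYTHONHASHSEED` is fixed, so `hash(doc_text) % 100000` can produce a different ID for the same text after a restart. If stable IDs matter for re-ingesting a document, a content digest is one alternative. The sketch below builds the same payload shape as the new `/bus/v1/call` request; `stable_doc_id` and `build_ingest_payload` are hypothetical helpers, and the 12-character SHA-256 prefix is an arbitrary choice rather than anything the mesh API is known to require.

```python
import hashlib
from typing import Any, Dict


def stable_doc_id(text: str) -> str:
    """Derive an ID from the content so it is identical across processes."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return f"doc-{digest[:12]}"


def build_ingest_payload(text: str, title: str = "", corpus: str = "") -> Dict[str, Any]:
    """Build a rag.ingest request body in the shape used by this commit."""
    return {
        "capability": "rag.ingest",
        "version": "1.0",
        "params": {"corpus": corpus or "documents"},
        "input": {
            "documents": [
                {
                    "id": stable_doc_id(text),
                    "title": title or "Untitled",
                    "text": text,
                }
            ]
        },
    }


payload = build_ingest_payload("Evacuation route: Main St to Hill Rd.")
```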