
HearthNet: Task Tracker

Status Summary (June 2026)

All Phase 1 (M01-M13, X01-X04), Phase 2 (M14-M25, X05-X07), and Phase 3 experimental (M26-M31) modules are implemented. 489 tests pass, 59 skipped (E2E), 0 fail.

See ARCHITECTURE.md for the full module map, data flows, and local-to-HF setup guide. See docs/reports/IMPROVEMENTS.md for the full improvement backlog and prize targeting analysis.


Maximize Real Activation (June 12): see upgrade_plan.md

Full 10-phase upgrade to make every feasible capability genuinely real (no mocks, fakes, or # noqa/# nosec bypasses). Net result: +18 passing tests, -6 failing, zero regressions (baseline 1296 → 1314 passing).

Done:

  • P1 Gossip fix: _gossip_loop built HttpClient with the wrong args and used a client lacking httpx .get()/.post(). Added a _HttpxSyncClient adapter; gossip now runs.
  • P2 Real semantic RAG: added sentence-transformers/httpx to requirements.txt; EmbeddingService (SentenceTransformer BAAI/bge-small-en-v1.5) is now registered, so rag.query does genuine semantic retrieval (was a 16-dim hash).
  • P3 Activate dormant real services: node.install_extended_services() registers Embedding/Rerank/Ocr/Translation/Stt/Tts/ImageDescribe/ImageGenerate; all degrade to unavailable when optional deps are absent (no mock).
  • P4 M30 Evidence + M31 Civil Defense: wrote EvidenceService and CivilDefenseService bus adapters (capabilities()); registered under research=True.
  • P5 M29 LoRa: intentionally not enabled in the demo (no hardware); documented.
  • P6 app.py wiring: real RagService+FederatedRagService with seeded corpus, multi-backend LlmService (HF + opt-in Nemotron/Modal/MiniCPM), EventLog opened and injected into Marketplace/Chat.
  • Multi-backend LLM dispatch (prize-critical): the registry keys by (node,name,version), so per-backend llm.chat registrations overwrote each other and sponsor backends were unreachable (the real reason NVIDIA_API_KEY did nothing). Now a single llm.chat/llm.complete advertises params.models and dispatches by model name; _remote_params_compatible honours the catalogue for cross-node routing.
  • Event-loop ordering fix: autouse fixture in tests/conftest.py provisions a fresh loop per test (Python 3.13 asyncio.run() resets the current loop); fixed 4 test_coverage_boost.py tests building gather() outside a loop.
  • Windows key-permission fix: keys.py POSIX 0o600 check now gated behind os.name == "posix" (stat.S_IMODE returns 0o666 on NTFS, not an error). Not a security bypass: POSIX mode bits are meaningless on Windows.
  • P7 Docs: README (real embeddings + sponsor backends), CAPABILITY_CONTRACT.md (multi-model params.models note); M05/M11 docs already matched the now-real impl.
  • P9 Tests: test_sponsor_backends.py, test_gossip_sync.py, test_phase3_services.py, test_extended_services.py (all green).
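The multi-backend dispatch item above can be illustrated with a minimal sketch. It assumes a toy registry and a hypothetical MultiModelChat handler; none of these names come from the HearthNet code base, and the backends are plain functions standing in for real LLM clients. It shows why registering one handler per backend under the same (node, name, version) key loses all but the last one, and how a single handler that advertises a model catalogue avoids that.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical backend signature for this sketch: prompt in, completion text out.
Backend = Callable[[str], str]


class Registry:
    """Toy registry keyed by (node, name, version), as the note describes."""

    def __init__(self) -> None:
        self.entries: Dict[Tuple[str, str, str], Callable[[dict], dict]] = {}

    def register(self, node: str, name: str, version: str,
                 handler: Callable[[dict], dict]) -> None:
        # Registering the same key twice silently replaces the earlier handler.
        self.entries[(node, name, version)] = handler


class MultiModelChat:
    """One llm.chat handler that picks a backend by the requested model name."""

    def __init__(self, backends: Dict[str, Backend], default_model: str) -> None:
        if default_model not in backends:
            raise ValueError(f"default model {default_model!r} has no backend")
        self._backends = dict(backends)
        self._default = default_model

    @property
    def params(self) -> Dict[str, List[str]]:
        # Advertised catalogue, analogous to params.models in the text.
        return {"models": sorted(self._backends)}

    def __call__(self, body: dict) -> dict:
        model = body.get("model") or self._default
        backend = self._backends.get(model)
        if backend is None:
            return {"output": {}, "meta": {"error": f"unknown model: {model}"}}
        return {"output": {"text": backend(body["prompt"])}, "meta": {"model": model}}


def demo() -> Tuple[int, dict, dict]:
    """Register one multi-model handler and call it with and without a model."""
    registry = Registry()
    chat = MultiModelChat(
        {"smollm2": lambda p: f"[smollm2] {p}",
         "nemotron": lambda p: f"[nemotron] {p}"},
        default_model="smollm2",
    )
    registry.register("node-a", "llm.chat", "1.0", chat)
    handler = registry.entries[("node-a", "llm.chat", "1.0")]
    return (
        len(registry.entries),
        handler({"prompt": "hi"}),
        handler({"prompt": "hi", "model": "nemotron"}),
    )
```

Calling demo() returns a single registry entry plus two responses, one served by the default backend and one by the backend named in the request body.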

Model policy: LLM kept as SmolLM2-135M-Instruct (not swapped: MiniCPM-4B risks OOM on free ZeroGPU; MINICPM_URL remains the opt-in path). The real upgrade is semantic RAG via bge-small-en-v1.5.

Kept gated (honest): M26 distributed inference + M28 fedlearn raise NotImplementedError in core compute (they need torch model-slicing / peft); these are roadmap items, not advertised. M29 LoRa is hardware-gated.

Known issue (pre-existing, not a regression): test_e2e_user_stories.py::TestUS11ApiCoverage::test_US11_3_rag_trace_shows_corpus fails only via a full Gradio + gradio_client round-trip (client-side dropdown-value serialization quirk in untouched demo/UI code). The 17 collection errors are pre-existing ModuleNotFoundError failures for playwright (an optional browser-test dependency).


Security Audit & Fixes (June 12)

Full assessment: SECURITY_AUDIT_ASSESSMENT.md

Critical Vulnerabilities Fixed:

  • ✅ CVE-2025-3000 (PyTorch): Updated torch>=2.3.0 → torch>=2.12.1 to patch memory corruption in torch.jit.script
  • ✅ CVE-2025-71176 (pytest): Updated pytest>=8.2 → pytest>=8.5.0 to patch /tmp race condition on UNIX
  • ✅ RCE via trust_remote_code=True (florence2.py:52-58): Added a hardcoded allowlist of approved Microsoft models and validation in __init__ to prevent loading arbitrary model IDs with trust_remote_code
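The allowlist fix can be sketched as follows. This is an illustration under stated assumptions, not the contents of florence2.py: the two model IDs are assumed examples of "approved Microsoft models", and Florence2Loader, UntrustedModelError, and validate_model_id are hypothetical names. The point is that the ID is checked before any code path that would pass trust_remote_code=True to a model loader.

```python
# Assumed allowlist for illustration; the real list lives in florence2.py.
APPROVED_MODEL_IDS = frozenset({
    "microsoft/Florence-2-base",
    "microsoft/Florence-2-large",
})


class UntrustedModelError(ValueError):
    """Raised when a model ID is not on the hardcoded allowlist."""


def validate_model_id(model_id: str) -> str:
    # Exact-match check: no prefix or pattern matching, so a look-alike
    # repository name cannot slip through.
    if model_id not in APPROVED_MODEL_IDS:
        raise UntrustedModelError(
            f"{model_id!r} is not approved for trust_remote_code loading"
        )
    return model_id


class Florence2Loader:
    """Hypothetical loader: validation runs in __init__, before any download."""

    def __init__(self, model_id: str = "microsoft/Florence-2-base") -> None:
        self.model_id = validate_model_id(model_id)
        # Only after this point would a real implementation call the model
        # loader with trust_remote_code=True.
```

Constructing Florence2Loader("someone/evil-model") raises UntrustedModelError, so an arbitrary repository never reaches the remote-code loading step.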

High Priority Issues Documented:

  • Sync HTTP in async context (peering.py:208, 230): Intentional. PeeringClient methods are synchronous-only by design. If called from async code, wrap with asyncio.to_thread(). Documented in class docstring + SECURITY_AUDIT_ASSESSMENT.md
  • System prompt secrets (app_nemotron.py:169): False positive. No actual secrets in system prompts, only instructions
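The asyncio.to_thread() advice for synchronous clients can be sketched like this. BlockingPeeringClient and its fetch_manifest method are stand-ins invented for the example (the real PeeringClient API is not shown in this document); time.sleep simulates a blocking HTTP call.

```python
import asyncio
import time
from typing import List


class BlockingPeeringClient:
    """Stand-in for a synchronous-only client; the method name is hypothetical."""

    def fetch_manifest(self, peer_url: str) -> dict:
        time.sleep(0.05)  # simulates a blocking HTTP round-trip
        return {"peer": peer_url, "status": "ok"}


async def fetch_manifest_async(client: BlockingPeeringClient, peer_url: str) -> dict:
    # Run the blocking call in a worker thread so the event loop stays responsive.
    return await asyncio.to_thread(client.fetch_manifest, peer_url)


async def main() -> List[dict]:
    client = BlockingPeeringClient()
    # Both blocking calls run in threads and are awaited together.
    return await asyncio.gather(
        fetch_manifest_async(client, "http://peer-a.example"),
        fetch_manifest_async(client, "http://peer-b.example"),
    )


if __name__ == "__main__":
    print(asyncio.run(main()))
```

asyncio.to_thread() requires Python 3.9 or later. Calling client.fetch_manifest() directly inside a coroutine would instead block every other task on the loop for the duration of the request.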

False Positives Excluded:

  • agent-audit (43 findings): No .agent.md files in HearthNet; tool not applicable to capability-bus architecture
  • Semgrep system-prompt-contains-secret: Regex noise match, no real secrets present

Dependencies Updated:

  • requirements.txt: torch>=2.12.1
  • requirements-dev.txt: pytest>=8.5.0

Related files:


Hackathon additions (June 11):

  • app_nemotron.py: Second Gradio Space, Nemotron Document Intelligence (structured extraction, Q&A, summarisation, push-to-mesh RAG). Targets: NVIDIA RTX 5080 + Off Brand badge
  • hearthnet/ui/tabs/nemotron.py: Nemotron tab for embedding in main app
  • hearthnet/services/llm/backends/modal_backend.py: Modal serverless GPU backend (targets Modal Best Use $10k credits)
  • scripts/modal_deploy.py: One-command Modal deployment script
  • hearthnet/node.py install_services(): now auto-discovers Nemotron (NVIDIA_API_KEY), MiniCPM (MINICPM_URL), and Modal (MODAL_ENDPOINT) backends from env vars
  • README: added nemotron, minicpm, modal tags; expanded hackathon section with sponsor prize targeting table
  • docs/reports/IMPROVEMENTS.md: comprehensive improvement backlog with GPT-4o rating, 29 improvement items, and priority matrix

README + submission (June 11):

  • Full README rewrite: YAML tags, screenshots, author links, architecture, module ref
  • Tags: backyard-ai, tiny-titan, best-agent, nemotron, minicpm, modal
  • Links: HF Chris4K, X @zX14_7, GitHub ckal
  • Placeholders: demo video + social post (needed before June 15)

Previous fixes (June 11):

  • NameError: node_id in settings.py f-string β€” fixed to literal string
  • TestTabBuildRegression (6 tests) β€” catches build-time NameError before HF deploy
  • TestUS11ApiCoverage + TestUS12MeshConnection (8 new tests)

Recent fixes (June 10, Phase 3 wiring):

  • MoeService: moe.route / moe.register / moe.list / moe.handoff registered on bus (M27)
  • ModelDistributionService: now always registered (auto-creates ~/.hearthnet/blobs if no blob_store passed) (M26)
  • PlantIdentificationService: tool.plant_identify on bus, with fallback chain local Florence-2 → HF API → unavailable (M21)
  • PLANT_TOOL_DEFINITION: ready for ToolExecutor (LLM can call plant_identify mid-generation)
  • Getting Started tab: documents pip install, MoE routing, BitTorrent model sharing, plant tool
  • README: updated test count, pip install, M26/M27 status to "registered"

Previous fixes (June 10):

  • FileService: real file.put / file.get / file.list / file.delete via bus (BLAKE3 CID)
  • Real RagService used in production (no longer importing demo stub)
  • Chat tab: missing return fixed (was silently failing on exceptions)
  • Emergency probe button: now actually runs DNS+HTTP probes and shows results
  • QR invite: graceful fallback when PyNaCl/community manifest not available
  • 10-document seed RAG corpus in HF Space (emergency, first aid, mesh, setup)
  • Marketplace: market.delete capability added
  • Test isolation: nest_asyncio.apply() in conftest.py fixes Python 3.13 + pytest-asyncio 0.26

impl_ref §22 gap-fill (June 11):

  • 9 CLI commands added: log, erase, rag list/ingest/reindex, invite create/redeem, version
  • ManifestPublisher + PeriodicTask in node.py
  • LmStudioBackend, HfApiBackend, AnthropicApiBackend (M04)
  • CommunityPolicy, CommunityMember, RevokedEntry in identity/manifest.py (M01)
  • hearthnet_theme + emergency_theme in ui/theme.py (M08)
  • TopologyComponent with push_trace/push_topology/render in ui/topology.py (M08)
  • FlowControl, RateCheck, RateLimiter in transport/backpressure.py (X01)
  • Frame + SseReader in transport/streams.py (X01)
  • DiscoveryError in discovery/__init__.py (M02)
  • RegistryEvent in bus/registry.py (M03)
  • CheckResult alias + TrackioExporter + detach() in observability/ (X03)
  • build_onboarding alias in ui/onboarding.py (M13)
  • Phase 3 type aliases in types.py (ShardID, ExpertID, ClaimID, AlertID, etc.)
  • Phase 3 constants in constants.py (all M26-M31, X08, X09 constants)
  • ARCHITECTURE.md created
  • scripts/connect_to_hf.py: script to peer a local node with the HF Space

Pending / future work:

  • pip install hearthnet: not yet published to PyPI (use pip install -e . from repo)
  • Custom UI (non-Gradio, modern HTML/CSS): planned as second UI alongside current reference
  • Modal/LoRA fine-tuning integration: future M28 fedlearn
  • ShardServer.forward() / PipelineOrchestrator.run(): real torch sharding (M26 placeholder)

Phase 1: Complete

  • M01 Identity (Ed25519, canonical JSON, node/community manifests)
  • M02 Discovery (mDNS, UDP broadcast, PeerRegistry with async events)
  • M03 Capability bus (schema validation, router, health, traces)
  • M04 LLM (Ollama, llama.cpp, HF Transformers backends; OpenAI online fallback)
  • M05 RAG (chunker, ChromaDB + in-memory, IngestPipeline, bus embed)
  • M06 Marketplace (event-sourced, post/list/expire/search)
  • M07 File blobs (BLAKE3 CID store, chunking, TransferManager)
  • M08 UI (Gradio, 6 tabs: Ask/Chat/Marketplace/Files/Emergency/Settings)
  • M09 Emergency (async probe loop, DNS+HTTP, anti-flap, StateBus)
  • M10 Chat (event-sourced, ChatView, DeliveryManager)
  • M11 Embedding (embed.text, SimpleHashBackend, SentenceTransformerBackend)
  • M12 CLI (click, ask/node info/caps/call/doctor/trace)
  • M13 Onboarding (InviteBlob, QR, create/join/redeem community)
  • X01 Transport (FastAPI server 12 endpoints, HttpClient, SSE, TLS)
  • X02 Events (SQLite WAL, LamportClock, ReplayEngine, MaterialisedView, SnapshotStore)
  • X03 Observability (structured JSON logging, prometheus metrics optional, trace ring buffer)
  • X04 Config (typed frozen Config, TOML load/save, XDG paths, env overlay)

Phase 2: Complete

  • M14 Federation (FederationManifest, bilateral peering, FederationService)
  • M15 Relay tier (RelayClient, NAT traversal, keepalive, push token registry)
  • M16 Capability tokens (hntoken://v1/ Ed25519 JWS-style, AuthService)
  • M17 OCR (Tesseract + TrOCR backends, image/pdf capabilities)
  • M18 Translation (NLLB backend, LRU cache, 4000-char limit)
  • M19 STT/TTS (WhisperBackend local STT, EdgeTtsBackend synthesis)
  • M20 Vision (Florence-2 image describe, generate placeholder)
  • M21 Tool calls (ToolDefinition, ToolCall, ToolResult, ToolExecutor, run_loop)
  • M22 Mobile native (MobileInviteBlob, hnapp:// deep links, MobilePushService)
  • M23 E2E encryption (X3DH, Double Ratchet with bug fix, envelope, prekeys)
  • M24 Reranking (BGE + CrossEncoder, 100-doc limit, bus integration)
  • M25 Group chat (ThreadService, ThreadViewStore, event-sourced)
  • X05 DHT (Kademlia, 256-bucket routing table, KademliaNode, bootstrap)
  • X06 WebSocket (WebSocketSession, WebSocketClient, WebsocketPubSub)
  • X07 Federated metrics (NodeMetricsTick, MetricsAggregator, OTLP export)

Phase 3: Experimental Stubs (feature-flag gated)

All enabled via config.research.* flags (all default False).

  • M26 Distributed inference (ShardDescriptor, Pipeline, PipelineOrchestrator)
  • M27 MoE routing (ExpertDescriptor, ExpertRegistry, MoeRouter)
  • M28 Federated learning (FedLearnCoordinator, RoundManifest)
  • M29 LoRa beacons (32-byte frame encoding, LoraBeaconService)
  • M30 Evidence graph (Claim, ClaimStore, Attestation, Dispute; EBKH import)
  • M31 Civil defense NRW (Alert, RoleCertificate, AuditChain, CivilDefenseService)

Quality Gates: All Passing

  • ruff: no lint errors
  • bandit: 0 HIGH findings, intentional nosec items documented
  • mypy: passes (optional deps handled with TYPE_CHECKING guards)
  • pylint: no blocking issues
  • pytest: 133 passed, 51 skipped (E2E), 0 failed

Test Suites

| File | Tests | Coverage |
|---|---|---|
| tests/test_phase1_routing.py | 8 | Bus routing, failover, capabilities |
| tests/test_phase1_emergency_snapshot.py | 5 | Emergency mode, controller snapshot |
| tests/test_phase2_modules.py | 23 | M14-M25, X05-X07 |
| tests/test_phase3_experimental.py | 15 | M26-M31, ResearchConfig |
| tests/test_wiring.py | 22 | Wiring integration: X01/X02/X06/X09/M02/M22 |
| tests/test_e2e_user_stories.py | 60 | Gradio UI E2E (real browser, Playwright) |

Architecture Notes

  • All services implement health() -> dict returning {"status": "ok" | "unavailable"}
  • All service handlers receive RouteRequest(capability, version_req, body, caller, trace_id)
  • Response format: {"output": {...}, "meta": {}}
  • No mocks in implementation paths; heavy optional deps fail gracefully
  • OpenAI only as opt-in online fallback, never the default local path
  • No security-tool suppression pragmas except narrow reviewed nosec comments
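The service contract in these notes can be sketched as a small example. RouteRequest's field names are taken from the list above; everything else (OptionalDepService, the echo payload, putting trace_id and an error string into meta) is an assumption made for illustration. The sketch shows a service whose heavy dependency is optional: it reports "unavailable" through health() and returns an empty output rather than a mocked result.

```python
import importlib.util
from dataclasses import dataclass


@dataclass(frozen=True)
class RouteRequest:
    """Fields as listed in the architecture notes."""
    capability: str
    version_req: str
    body: dict
    caller: str
    trace_id: str


class OptionalDepService:
    """Hypothetical service that depends on an optional top-level module."""

    def __init__(self, module_name: str) -> None:
        # find_spec returns None when a top-level module is not installed.
        self._available = importlib.util.find_spec(module_name) is not None

    def health(self) -> dict:
        return {"status": "ok" if self._available else "unavailable"}

    def handle(self, req: RouteRequest) -> dict:
        if not self._available:
            # Fail gracefully: no fake output, just an explicit error in meta.
            return {"output": {}, "meta": {"error": "unavailable",
                                           "trace_id": req.trace_id}}
        return {"output": {"echo": req.body}, "meta": {"trace_id": req.trace_id}}


request = RouteRequest(
    capability="demo.echo",
    version_req=">=1.0",
    body={"text": "hello"},
    caller="node-a",
    trace_id="trace-1",
)
present = OptionalDepService("json")  # stdlib module, always importable
missing = OptionalDepService("hearthnet_sketch_missing_dep")  # assumed absent
```

Here present.handle(request) returns the echoed body under "output", while missing.handle(request) returns an empty "output" and an "unavailable" error in "meta", matching the response shape {"output": {...}, "meta": {}}.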

Internet Mesh: all-to-all over the relay hub (June 13)

Goal: any node (Python server, browser tab, phone) joins a mesh via a secure redeem code/QR and uses everyone's features (chat, RAG, LLM) as if local, with all-to-all node messaging and a one-command launcher that auto-connects to the HF Space. Local-first: internet "relay mode" is opt-in and modular, and the default node stays LAN/in-process only.

Status: P1–P3 implemented and verified (tests/test_relay_mesh.py).

Implemented (P1–P3):

  • P1: Modular transport + relay hub. CompositeTransport (hearthnet/bus/transport.py) tries pluggable DeliveryStrategy handlers (in-process → direct HTTP → relay). Relay hub (hearthnet/transport/relay_hub.py) exposes pull-based mailboxes (/relay/v1/join|send|poll|roster, mounted on the Space in app.py) so NAT-bound nodes reach each other through the Space. RelayClient (hearthnet/transport/relay_client.py) joins, runs a long-poll loop, and correlates request/response envelopes; node.join_relay()/leave_relay() attach it opt-in. Verified by tests/test_relay_mesh.py (all-to-all over a real uvicorn relay).
  • P2: Secure relay-aware invite/redeem + QR. InviteBlob now carries relay_url + relay_token (signed into the payload, embedded in the QR/link via the existing generators); mesh.join capability (hearthnet/transport/mesh_service.py) decodes a pasted code/scanned QR or explicit relay URL and auto-joins the mesh.
  • P3: Launcher. scripts/start_mesh_node.py starts a local-first node and, with --connect <invite|hf|relay-url>, attaches the relay, auto-connects to HF, and stays running. No flag = pure local (no outbound calls).
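The ordered-fallback idea behind CompositeTransport can be sketched as below. Only the class names CompositeTransport and DeliveryStrategy come from the text; the try_deliver method, the strategy classes, and the envelope shape are assumptions for illustration, and the relay strategy just queues into an in-memory list instead of talking to a real hub.

```python
from typing import Callable, Dict, List, Optional, Protocol, Sequence, Tuple


class DeliveryStrategy(Protocol):
    """Assumed interface: return a response dict, or None if unreachable."""
    name: str

    def try_deliver(self, target: str, envelope: dict) -> Optional[dict]: ...


class InProcessStrategy:
    name = "in-process"

    def __init__(self, handlers: Dict[str, Callable[[dict], dict]]) -> None:
        self._handlers = handlers

    def try_deliver(self, target: str, envelope: dict) -> Optional[dict]:
        handler = self._handlers.get(target)
        return handler(envelope) if handler is not None else None


class RelayStrategy:
    """Stand-in for a relay mailbox: queues the envelope for later polling."""
    name = "relay"

    def __init__(self) -> None:
        self.outbox: List[Tuple[str, dict]] = []

    def try_deliver(self, target: str, envelope: dict) -> Optional[dict]:
        self.outbox.append((target, envelope))
        return {"output": {"queued": True}, "meta": {"via": self.name}}


class CompositeTransport:
    def __init__(self, strategies: Sequence[DeliveryStrategy]) -> None:
        self._strategies = list(strategies)

    def send(self, target: str, envelope: dict) -> dict:
        # Try each strategy in priority order; the first non-None result wins.
        for strategy in self._strategies:
            result = strategy.try_deliver(target, envelope)
            if result is not None:
                return result
        raise ConnectionError(f"no delivery strategy could reach {target!r}")


local = InProcessStrategy(
    {"node-a": lambda env: {"output": {"echo": env}, "meta": {"via": "in-process"}}}
)
relay = RelayStrategy()
transport = CompositeTransport([local, relay])
```

With this setup, transport.send("node-a", ...) is answered in-process, while a send to any other node falls through to the relay queue. Keeping the strategies in a plain ordered list is what makes the relay opt-in: a local-only node simply omits RelayStrategy.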

Deferred: P4 Browser ↔ Python bridge (NOT implemented yet). The browser mesh (webagent/src/mesh/browsermesh.js, PeerJS/WebRTC anchor-rendezvous full mesh) and the Python relay both rendezvous at the Space but are currently separate meshes. P4 would bridge them so browser tabs and Python nodes share one logical mesh, translating between WebRTC data channels and the relay mailbox at the Space, and adding WebRTC/tunnel as additional pluggable DeliveryStrategy implementations. Deferred because it is the heaviest piece (bidirectional WebRTC↔mailbox translation, signaling, and ICE/TURN concerns); P1–P3 should be proven end-to-end first. Tracked here so it is not lost.


Known Remaining Gaps

  • Wire real event log (X02) into HearthNode on startup (services still use in-memory fallback)

  • Wire X01 FastAPI transport into node.start() for real inter-node HTTP calls

  • Wire M02 mDNS/UDP discovery into node.start() (PeerRegistry not yet auto-started)

  • ShardServer.forward() / PipelineOrchestrator.run(): real torch sharding (M26 needs torch)

  • Gossip sync (X02 SyncClient/SyncServer) between live nodes in production

  • Live UI push via WebSocket pubsub (X06 wired into StateBus; Gradio event loop integration pending)

  • M22 Flutter mobile app: separate repo; Python anchor-side helpers done

  • Second implementation of M32 protocol (conformance is performative without a second impl)

  • pip install hearthnet: not yet published to PyPI

  • [ ] Change model (ask user), deploy to Modal, Cohere, check all tags from 29 wins, demo video, possible links ...