
# HearthNet: Task Tracker

Status Summary (June 2026)

All Phase 1 (M01-M13, X01-X04), Phase 2 (M14-M25, X05-X07), and Phase 3 experimental (M26-M31) modules are implemented. 489 tests pass, 59 skipped (E2E), 0 fail.

See ARCHITECTURE.md for the full module map, data flows, and local-to-HF setup guide. See docs/IMPROVEMENTS.md for the full improvement backlog and prize targeting analysis.


Maximize Real Activation (June 12): see upgrade_plan.md

Full 10-phase upgrade to make every feasible capability genuinely real (no mocks, fakes, or # noqa/# nosec bypasses). Net result: +18 passing tests, −6 failing, zero regressions (baseline 1296→1314 passing).

Done:

  • P1 Gossip fix: _gossip_loop built HttpClient with wrong args + a client lacking httpx .get()/.post(). Added _HttpxSyncClient adapter; gossip now runs.
  • P2 Real semantic RAG: added sentence-transformers/httpx to requirements.txt; EmbeddingService (SentenceTransformer BAAI/bge-small-en-v1.5) now registered, so rag.query does genuine semantic retrieval (was 16-dim hash).
  • P3 Activate dormant real services: node.install_extended_services() registers Embedding/Rerank/Ocr/Translation/Stt/Tts/ImageDescribe/ImageGenerate; all degrade to unavailable when optional deps absent (no mock).
  • P4 M30 Evidence + M31 Civil Defense: wrote EvidenceService and CivilDefenseService bus adapters (capabilities()); registered under research=True.
  • P5 M29 LoRa: intentionally not enabled in the demo (no hardware); documented.
  • P6 app.py wiring: real RagService+FederatedRagService with seeded corpus, multi-backend LlmService (HF + opt-in Nemotron/Modal/MiniCPM), EventLog opened and injected into Marketplace/Chat.
  • Multi-backend LLM dispatch (prize-critical): registry keys by (node,name,version), so per-backend llm.chat registrations overwrote each other and sponsor backends were unreachable (the real reason NVIDIA_API_KEY did nothing). Now a single llm.chat/llm.complete advertises params.models and dispatches by model name; _remote_params_compatible honours the catalogue for cross-node routing.
  • Event-loop ordering fix: autouse fixture in tests/conftest.py provisions a fresh loop per test (Python 3.13 asyncio.run() resets the current loop); fixed 4 test_coverage_boost.py tests building gather() outside a loop.
  • Windows key-permission fix: keys.py POSIX 0o600 check now gated behind os.name == "posix" (stat.S_IMODE returns 0o666 on NTFS, not an error). Not a security bypass: POSIX mode bits are meaningless on Windows.
  • P7 Docs: README (real embeddings + sponsor backends), CAPABILITY_CONTRACT.md (multi-model params.models note); M05/M11 docs already matched the now-real impl.
  • P9 Tests: test_sponsor_backends.py, test_gossip_sync.py, test_phase3_services.py, test_extended_services.py (all green).
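
The multi-backend dispatch item in the list above can be illustrated with a minimal sketch. It assumes only what the item states: a registry keyed by (node, name, version), and a single llm.chat handler that advertises params.models and dispatches by model name. The `Registry` class, `make_llm_chat`, the node name, and the model labels are hypothetical stand-ins, not HearthNet's actual API.

```python
from typing import Callable, Dict, Tuple

Handler = Callable[[dict], dict]
Key = Tuple[str, str, str]  # (node, name, version)


class Registry:
    """Toy registry holding one handler per (node, name, version) key."""

    def __init__(self) -> None:
        self._entries: Dict[Key, Tuple[Handler, dict]] = {}

    def register(self, node: str, name: str, version: str,
                 handler: Handler, params: dict) -> None:
        # A second registration under the same key silently replaces the first.
        self._entries[(node, name, version)] = (handler, params)

    def call(self, node: str, name: str, version: str, body: dict) -> dict:
        handler, _params = self._entries[(node, name, version)]
        return handler(body)

    def params(self, node: str, name: str, version: str) -> dict:
        return self._entries[(node, name, version)][1]


def make_backend(label: str) -> Handler:
    """Hypothetical backend that tags its reply so the route taken is visible."""
    def backend(body: dict) -> dict:
        return {"output": {"text": f"[{label}] {body['prompt']}"}, "meta": {}}
    return backend


def make_llm_chat(backends: Dict[str, Handler], default: str) -> Handler:
    """Build one llm.chat handler that dispatches on body['model']."""
    def llm_chat(body: dict) -> dict:
        model = body.get("model", default)
        if model not in backends:
            return {"output": {"error": f"unknown model: {model}"}, "meta": {}}
        return backends[model](body)
    return llm_chat


backends = {"smollm2": make_backend("hf"), "nemotron": make_backend("nvidia")}

# Failure mode: one registration per backend under the same key,
# so only the last backend registered stays reachable.
broken = Registry()
for model_name, handler in backends.items():
    broken.register("node-a", "llm.chat", "1.0", handler, {"model": model_name})

# Fix: a single registration that advertises every model and dispatches by name.
fixed = Registry()
fixed.register("node-a", "llm.chat", "1.0",
               make_llm_chat(backends, default="smollm2"),
               {"models": sorted(backends)})
```

In this sketch `broken` can only ever reach the last backend registered, while `fixed` reaches either backend through the `model` field of the request body.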

Model policy: LLM kept as SmolLM2-135M-Instruct (not swapped: MiniCPM-4B risks OOM on free ZeroGPU; MINICPM_URL remains the opt-in path). The real upgrade is semantic RAG via bge-small-en-v1.5.

Kept gated (honest): M26 distributed inference + M28 fedlearn raise NotImplementedError in core compute (need torch model-slicing / peft); these are roadmap items, not advertised. M29 LoRa is hardware-gated.

Known issue (pre-existing, not a regression): test_e2e_user_stories.py::TestUS11ApiCoverage::test_US11_3_rag_trace_shows_corpus fails only via a full Gradio + gradio_client round-trip (client-side dropdown-value serialization quirk in untouched demo/UI code). The 17 collection errors are pre-existing playwright ModuleNotFound (optional browser-test dep).


Security Audit & Fixes (June 12)

Full assessment: SECURITY_AUDIT_ASSESSMENT.md

Critical Vulnerabilities Fixed:

  • ✅ CVE-2025-3000 (PyTorch): Updated torch>=2.3.0 → torch>=2.12.1 to patch memory corruption in torch.jit.script
  • ✅ CVE-2025-71176 (pytest): Updated pytest>=8.2 → pytest>=8.5.0 to patch /tmp race condition on UNIX
  • ✅ RCE via trust_remote_code=True (florence2.py:52-58): Added a hardcoded allowlist of approved Microsoft models and validation in __init__ to prevent loading arbitrary model IDs with trust_remote_code
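
The trust_remote_code fix above amounts to validating the model ID against a fixed allowlist before anything is loaded. A minimal sketch of that pattern follows. The allowlist contents, `validate_model_id`, and the `Florence2Loader` class are assumptions for illustration; the actual list and class in florence2.py may differ.

```python
# Hypothetical allowlist; the real one in florence2.py may contain other IDs.
APPROVED_REMOTE_CODE_MODELS = frozenset({
    "microsoft/Florence-2-base",
    "microsoft/Florence-2-large",
})


def validate_model_id(model_id: str) -> str:
    """Return model_id unchanged if approved, otherwise raise ValueError."""
    if model_id not in APPROVED_REMOTE_CODE_MODELS:
        raise ValueError(
            f"model {model_id!r} is not approved for trust_remote_code=True"
        )
    return model_id


class Florence2Loader:
    """Stand-in for the backend class: validation happens in __init__,
    before any model download or remote code execution could occur."""

    def __init__(self, model_id: str = "microsoft/Florence-2-base") -> None:
        self.model_id = validate_model_id(model_id)
```

Exact string matching is used deliberately: a prefix check such as `startswith("microsoft/")` would trust every repository under that namespace rather than the specific reviewed models.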

High Priority Issues Documented:

  • Sync HTTP in async context (peering.py:208, 230): Intentional. PeeringClient methods are synchronous-only by design. If called from async code, wrap with asyncio.to_thread(). Documented in class docstring + SECURITY_AUDIT_ASSESSMENT.md
  • System prompt secrets (app_nemotron.py:169): False positive. No actual secrets in system prompts, only instructions
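
The asyncio.to_thread() advice above can be sketched as follows. The `PeeringClient` here is a stand-in with a hypothetical `fetch_manifest` method that sleeps to simulate a blocking HTTP call; the real class in peering.py has its own methods and signatures.

```python
import asyncio
import time


class PeeringClient:
    """Stand-in for a synchronous-only client (method name is hypothetical)."""

    def fetch_manifest(self, peer_url: str) -> dict:
        time.sleep(0.05)  # simulates a blocking HTTP round-trip
        return {"peer": peer_url, "status": "ok"}


async def fetch_manifest_async(client: PeeringClient, peer_url: str) -> dict:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(client.fetch_manifest, peer_url)


async def main() -> list:
    client = PeeringClient()
    # Both blocking calls run in worker threads and overlap in time.
    return await asyncio.gather(
        fetch_manifest_async(client, "http://peer-a.local"),
        fetch_manifest_async(client, "http://peer-b.local"),
    )


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Calling `client.fetch_manifest(...)` directly inside a coroutine would stall every other task on the loop for the duration of the request; the wrapper avoids that without changing the synchronous client.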

False Positives Excluded:

  • agent-audit (43 findings): No .agent.md files in HearthNet; tool not applicable to capability-bus architecture
  • Semgrep system-prompt-contains-secret: Regex noise match, no real secrets present

Dependencies Updated:

  • requirements.txt: torch>=2.12.1
  • requirements-dev.txt: pytest>=8.5.0

Related files:


Hackathon additions (June 11):

  • app_nemotron.py: Second Gradio Space, Nemotron Document Intelligence (structured extraction, Q&A, summarisation, push-to-mesh RAG). Targets: NVIDIA RTX 5080 + Off Brand badge
  • hearthnet/ui/tabs/nemotron.py: Nemotron tab for embedding in main app
  • hearthnet/services/llm/backends/modal_backend.py: Modal serverless GPU backend (targets Modal Best Use $10k credits)
  • scripts/modal_deploy.py: One-command Modal deployment script
  • hearthnet/node.py install_services(): now auto-discovers Nemotron (NVIDIA_API_KEY), MiniCPM (MINICPM_URL), and Modal (MODAL_ENDPOINT) backends from env vars
  • README: added nemotron, minicpm, modal tags; expanded hackathon section with sponsor prize targeting table
  • docs/IMPROVEMENTS.md: comprehensive improvement backlog with GPT-4o rating, 29 improvement items, and priority matrix
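
The env-var auto-discovery described for install_services() in the list above could look roughly like the sketch below. The three environment variable names come from that item; the function name, the backend labels, and the always-present "hf" default are assumptions.

```python
import os
from typing import List, Mapping, Optional

# Env var -> backend label. Variable names are from the notes above;
# the labels are illustrative.
OPTIONAL_BACKENDS = {
    "NVIDIA_API_KEY": "nemotron",
    "MINICPM_URL": "minicpm",
    "MODAL_ENDPOINT": "modal",
}


def discover_backends(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the backend labels to register, local default first."""
    if env is None:
        env = os.environ
    found = ["hf"]  # assumed default local backend, always present
    for var, label in OPTIONAL_BACKENDS.items():
        # Treat unset and blank values alike: the backend stays disabled.
        if env.get(var, "").strip():
            found.append(label)
    return found
```

Passing the environment as a parameter keeps the function easy to test without mutating `os.environ`.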

README + submission (June 11):

  • Full README rewrite: YAML tags, screenshots, author links, architecture, module ref
  • Tags: backyard-ai, tiny-titan, best-agent, nemotron, minicpm, modal
  • Links: HF Chris4K, X @zX14_7, GitHub ckal
  • Placeholders: demo video + social post (needed before June 15)

Previous fixes (June 11):

  • NameError: node_id in settings.py f-string β€” fixed to literal string
  • TestTabBuildRegression (6 tests) β€” catches build-time NameError before HF deploy
  • TestUS11ApiCoverage + TestUS12MeshConnection (8 new tests)

Recent fixes (June 10, Phase 3 wiring):

  • MoeService: moe.route / moe.register / moe.list / moe.handoff registered on bus (M27)
  • ModelDistributionService: now always registered (auto-creates ~/.hearthnet/blobs if no blob_store passed) (M26)
  • PlantIdentificationService: tool.plant_identify on bus, falling back local Florence-2 → HF API → unavailable (M21)
  • PLANT_TOOL_DEFINITION: ready for ToolExecutor (LLM can call plant_identify mid-generation)
  • Getting Started tab: documents pip install, MoE routing, BitTorrent model sharing, plant tool
  • README: updated test count, pip install, M26/M27 status to "registered"

Previous fixes (June 10):

  • FileService: real file.put / file.get / file.list / file.delete via bus (BLAKE3 CID)
  • Real RagService used in production (no longer importing demo stub)
  • Chat tab: missing return fixed (was silently failing on exceptions)
  • Emergency probe button: now actually runs DNS+HTTP probes and shows results
  • QR invite: graceful fallback when PyNaCl/community manifest not available
  • 10-document seed RAG corpus in HF Space (emergency, first aid, mesh, setup)
  • Marketplace: market.delete capability added
  • Test isolation: nest_asyncio.apply() in conftest.py fixes Python 3.13 + pytest-asyncio 0.26

impl_ref §22 gap-fill (June 11):

  • 9 CLI commands added: log, erase, rag list/ingest/reindex, invite create/redeem, version
  • ManifestPublisher + PeriodicTask in node.py
  • LmStudioBackend, HfApiBackend, AnthropicApiBackend (M04)
  • CommunityPolicy, CommunityMember, RevokedEntry in identity/manifest.py (M01)
  • hearthnet_theme + emergency_theme in ui/theme.py (M08)
  • TopologyComponent with push_trace/push_topology/render in ui/topology.py (M08)
  • FlowControl, RateCheck, RateLimiter in transport/backpressure.py (X01)
  • Frame + SseReader in transport/streams.py (X01)
  • DiscoveryError in discovery/__init__.py (M02)
  • RegistryEvent in bus/registry.py (M03)
  • CheckResult alias + TrackioExporter + detach() in observability/ (X03)
  • build_onboarding alias in ui/onboarding.py (M13)
  • Phase 3 type aliases in types.py (ShardID, ExpertID, ClaimID, AlertID, etc.)
  • Phase 3 constants in constants.py (all M26-M31, X08, X09 constants)
  • ARCHITECTURE.md created
  • scripts/connect_to_hf.py: script to peer local node with HF Space

Pending / future work:

  • pip install hearthnet: not yet published to PyPI (use pip install -e . from repo)
  • Custom UI (non-Gradio, modern HTML/CSS): planned as second UI alongside current reference
  • Modal/LoRA fine-tuning integration: future M28 fedlearn
  • ShardServer.forward() / PipelineOrchestrator.run(): real torch sharding (M26 placeholder)

Phase 1 β€” Complete

  • M01 Identity (Ed25519, canonical JSON, node/community manifests)
  • M02 Discovery (mDNS, UDP broadcast, PeerRegistry with async events)
  • M03 Capability bus (schema validation, router, health, traces)
  • M04 LLM (Ollama, llama.cpp, HF Transformers backends; OpenAI online fallback)
  • M05 RAG (chunker, ChromaDB + in-memory, IngestPipeline, bus embed)
  • M06 Marketplace (event-sourced, post/list/expire/search)
  • M07 File blobs (BLAKE3 CID store, chunking, TransferManager)
  • M08 UI (Gradio, 6 tabs: Ask/Chat/Marketplace/Files/Emergency/Settings)
  • M09 Emergency (async probe loop, DNS+HTTP, anti-flap, StateBus)
  • M10 Chat (event-sourced, ChatView, DeliveryManager)
  • M11 Embedding (embed.text, SimpleHashBackend, SentenceTransformerBackend)
  • M12 CLI (click, ask/node info/caps/call/doctor/trace)
  • M13 Onboarding (InviteBlob, QR, create/join/redeem community)
  • X01 Transport (FastAPI server 12 endpoints, HttpClient, SSE, TLS)
  • X02 Events (SQLite WAL, LamportClock, ReplayEngine, MaterialisedView, SnapshotStore)
  • X03 Observability (structured JSON logging, prometheus metrics optional, trace ring buffer)
  • X04 Config (typed frozen Config, TOML load/save, XDG paths, env overlay)

Phase 2 β€” Complete

  • M14 Federation (FederationManifest, bilateral peering, FederationService)
  • M15 Relay tier (RelayClient, NAT traversal, keepalive, push token registry)
  • M16 Capability tokens (hntoken://v1/ Ed25519 JWS-style, AuthService)
  • M17 OCR (Tesseract + TrOCR backends, image/pdf capabilities)
  • M18 Translation (NLLB backend, LRU cache, 4000-char limit)
  • M19 STT/TTS (WhisperBackend local STT, EdgeTtsBackend synthesis)
  • M20 Vision (Florence-2 image describe, generate placeholder)
  • M21 Tool calls (ToolDefinition, ToolCall, ToolResult, ToolExecutor, run_loop)
  • M22 Mobile native (MobileInviteBlob, hnapp:// deep links, MobilePushService)
  • M23 E2E encryption (X3DH, Double Ratchet fixed bug, envelope, prekeys)
  • M24 Reranking (BGE + CrossEncoder, 100-doc limit, bus integration)
  • M25 Group chat (ThreadService, ThreadViewStore, event-sourced)
  • X05 DHT (Kademlia, 256-bucket routing table, KademliaNode, bootstrap)
  • X06 WebSocket (WebSocketSession, WebSocketClient, WebsocketPubSub)
  • X07 Federated metrics (NodeMetricsTick, MetricsAggregator, OTLP export)

Phase 3 β€” Experimental Stubs (feature-flag gated)

All enabled via config.research.* flags (all default False).

  • M26 Distributed inference (ShardDescriptor, Pipeline, PipelineOrchestrator)
  • M27 MoE routing (ExpertDescriptor, ExpertRegistry, MoeRouter)
  • M28 Federated learning (FedLearnCoordinator, RoundManifest)
  • M29 LoRa beacons (32-byte frame encoding, LoraBeaconService)
  • M30 Evidence graph (Claim, ClaimStore, Attestation, Dispute; EBKH import)
  • M31 Civil defense NRW (Alert, RoleCertificate, AuditChain, CivilDefenseService)
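
One way to picture the default-off research flags is a frozen dataclass with one boolean per Phase 3 module. This is a sketch only: ResearchConfig is named in the test suite notes, but the field names below and the `enabled_modules` helper are hypothetical.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class ResearchConfig:
    """Hypothetical field names; every flag defaults to False."""
    distributed_inference: bool = False  # M26
    moe_routing: bool = False            # M27
    fedlearn: bool = False               # M28
    lora_beacons: bool = False           # M29
    evidence_graph: bool = False         # M30
    civil_defense: bool = False          # M31


def enabled_modules(cfg: ResearchConfig) -> List[str]:
    """List the flag names that are switched on, in declaration order."""
    return [f.name for f in fields(cfg) if getattr(cfg, f.name)]
```

With a frozen dataclass, flags are fixed at construction, so a service registered at startup cannot be toggled by later mutation of the config object.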

Quality Gates β€” All Passing

  • ruff: no lint errors
  • bandit: 0 HIGH findings, intentional nosec items documented
  • mypy: passes (optional deps handled with TYPE_CHECKING guards)
  • pylint: no blocking issues
  • pytest: 133 passed, 51 skipped (E2E), 0 failed

Test Suites

| File | Tests | Coverage |
| --- | --- | --- |
| tests/test_phase1_routing.py | 8 | Bus routing, failover, capabilities |
| tests/test_phase1_emergency_snapshot.py | 5 | Emergency mode, controller snapshot |
| tests/test_phase2_modules.py | 23 | M14-M25, X05-X07 |
| tests/test_phase3_experimental.py | 15 | M26-M31, ResearchConfig |
| tests/test_wiring.py | 22 | Wiring integration: X01/X02/X06/X09/M02/M22 |
| tests/test_e2e_user_stories.py | 60 | Gradio UI E2E (real browser, Playwright) |

Architecture Notes

  • All services implement health() -> dict returning {"status": "ok" | "unavailable"}
  • All service handlers receive RouteRequest(capability, version_req, body, caller, trace_id)
  • Response format: {"output": {...}, "meta": {}}
  • No mocks in implementation paths; heavy optional deps fail gracefully
  • OpenAI only as opt-in online fallback, never the default local path
  • No security-tool suppression pragmas except narrow reviewed nosec comments
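
The service contract in the notes above (a health() method, handlers receiving a RouteRequest, and the {"output": ..., "meta": {}} response shape) can be sketched as below. The RouteRequest field names come from those notes; the field types, the `EchoService` class, and its `handle` method name are assumptions.

```python
from dataclasses import dataclass


@dataclass
class RouteRequest:
    """Field names as listed in the notes; the types are assumed."""
    capability: str
    version_req: str
    body: dict
    caller: str
    trace_id: str


class EchoService:
    """Hypothetical service following the documented contract."""

    def __init__(self, available: bool = True) -> None:
        self._available = available

    def health(self) -> dict:
        # Only the two documented status values are ever reported.
        return {"status": "ok" if self._available else "unavailable"}

    def handle(self, request: RouteRequest) -> dict:
        # Documented response envelope: payload under "output", plus "meta".
        return {"output": {"echo": request.body}, "meta": {}}


request = RouteRequest(
    capability="demo.echo",
    version_req=">=1.0",
    body={"text": "hello"},
    caller="node-a",
    trace_id="trace-0001",
)
response = EchoService().handle(request)
```

A service whose optional dependency is missing would report `{"status": "unavailable"}` from health() rather than substituting a mock, matching the "fail gracefully" note.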

Known Remaining Gaps

  • Wire real event log (X02) into HearthNode on startup (services still use in-memory fallback)

  • Wire X01 FastAPI transport into node.start() for real inter-node HTTP calls

  • Wire M02 mDNS/UDP discovery into node.start() (PeerRegistry not yet auto-started)

  • ShardServer.forward() / PipelineOrchestrator.run(): real torch sharding (M26 needs torch)

  • Gossip sync (X02 SyncClient/SyncServer) between live nodes in production

  • Live UI push via WebSocket pubsub (X06 wired into StateBus; Gradio event loop integration pending)

  • M22 Flutter mobile app: separate repo; Python anchor-side helpers done

  • Second implementation of M32 protocol (conformance is performative without a second impl)

  • pip install hearthnet: not yet published to PyPI

  • [ ] Change model (ask user), deploy to Modal, Cohere, check all tags from 29 wins, demo video, poss links ...