# M02 - Discovery
**Spec version:** v1.0
**Depends on:** M01 (identity), X04 (config), X03 (observability), X01 (transport, for the manifest fetch URL), `python-zeroconf`
**Depended on by:** M03 (bus, for peer enumeration), M09 (emergency mode increases discovery cadence)
---
## 1. Responsibility
Find peers on the local network. Maintain a live in-memory registry of known peers with their manifests, last-seen timestamps, and latencies. Republish our own presence.
Out of scope:
- DHT (Phase 2)
- LoRa beacons (Phase 3)
- Internet relay (Phase 2)
---
## 2. File layout
```
hearthnet/discovery/
├── __init__.py
├── mdns.py      # zeroconf-based service browser + announcer
├── udp.py       # UDP multicast announcer + listener
├── peers.py     # PeerRegistry: in-memory state
└── relay.py     # Phase 2 stub
```
---
## 3. Public API
### 3.1 `peers.py`
```python
# hearthnet/discovery/peers.py
from __future__ import annotations

from collections.abc import AsyncIterator
from dataclasses import dataclass

# Endpoint and NodeManifest are defined elsewhere in the spec (presumably M01);
# their import path is not given in this document.


@dataclass
class PeerRecord:
    node_id: str                    # short form
    node_id_full: str
    display_name: str
    community_id: str
    profile: str
    endpoints: list[Endpoint]
    manifest: NodeManifest | None   # None until fetched
    last_seen: float                # monotonic time
    rtt_ms: float | None            # measured by health probe
    source: str                     # "mdns" | "udp" | "relay"


class PeerRegistry:
    """In-memory map of NodeID → PeerRecord.

    Guarded by an asyncio.Lock (coroutine-safe, not thread-safe).
    """

    def __init__(self, our_node_id_full: str, community_id: str):
        ...

    def upsert(self, record: PeerRecord) -> bool:
        """Add or update; returns True if new peer."""

    def remove(self, node_id_full: str) -> bool: ...
    def get(self, node_id_full: str) -> PeerRecord | None: ...
    def all(self) -> list[PeerRecord]: ...
    def for_community(self, community_id: str) -> list[PeerRecord]: ...

    def prune_stale(self, max_age_seconds: int = 90) -> int:
        """Remove peers not seen recently. Returns count removed."""

    # subscribers (called when peer added / removed / updated):
    def subscribe(self) -> AsyncIterator[PeerEvent]: ...


@dataclass(frozen=True)
class PeerEvent:
    kind: str          # "added" | "removed" | "updated"
    peer: PeerRecord
```
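The `upsert` and `prune_stale` contracts above could be realised roughly as follows. This is a minimal sketch, not the module's implementation: `MiniPeer` and `MiniRegistry` are hypothetical names that keep only the fields the example needs, and locking and event emission are left out.

```python
from __future__ import annotations

import time
from dataclasses import dataclass, field


@dataclass
class MiniPeer:
    """Hypothetical, trimmed-down stand-in for PeerRecord."""
    node_id_full: str
    community_id: str
    last_seen: float = field(default_factory=time.monotonic)


class MiniRegistry:
    """Illustrative registry keyed by full NodeID (no locking, no events)."""

    def __init__(self) -> None:
        self._peers: dict[str, MiniPeer] = {}

    def upsert(self, record: MiniPeer) -> bool:
        # True only the first time a node_id_full is seen.
        is_new = record.node_id_full not in self._peers
        self._peers[record.node_id_full] = record
        return is_new

    def for_community(self, community_id: str) -> list[MiniPeer]:
        return [p for p in self._peers.values() if p.community_id == community_id]

    def prune_stale(self, max_age_seconds: int = 90, now: float | None = None) -> int:
        # Ages are measured on the monotonic clock, matching PeerRecord.last_seen.
        now = time.monotonic() if now is None else now
        stale = [k for k, p in self._peers.items() if now - p.last_seen > max_age_seconds]
        for key in stale:
            del self._peers[key]
        return len(stale)
```

The optional `now` argument is an addition for testability; it lets a test drive the clock instead of sleeping.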
### 3.2 `mdns.py`
```python
# hearthnet/discovery/mdns.py
from __future__ import annotations

from hearthnet.discovery.peers import PeerRegistry

# KeyPair comes from M01 (identity); its import path is not given in this document.


class MdnsAnnouncer:
    """Publishes our own service via mDNS."""

    def __init__(
        self,
        kp: KeyPair,
        node_id_short: str,
        display_name: str,
        community_id_short: str,
        profile: str,
        port: int,
        capabilities_names: list[str],
        manifest_url: str,
    ):
        ...

    async def start(self) -> None: ...
    async def stop(self) -> None: ...

    def update(self, *, capabilities_names: list[str] | None = None) -> None:
        """Refresh TXT records (e.g. when capabilities change)."""


class MdnsBrowser:
    """Listens for other nodes via mDNS, populates the registry."""

    def __init__(self, registry: PeerRegistry, our_community_id: str):
        ...

    async def start(self) -> None: ...
    async def stop(self) -> None: ...
```
### 3.3 Service definition
- Service type: `_hearthnet._tcp.local.`
- Instance name: `<display_name>-<short_node_id_4chars>`
- Port: from manifest's first endpoint
- TXT records:
  - `v=1`
  - `node=<short_node_id>`
  - `community=<short_community_id>`
  - `profile=<anchor|hearth|spark|bridge>`
  - `caps=<comma-separated cap names>` (max 200 bytes; truncate if needed)
  - `manifest_url=https://<host>:<port>/manifest`
  - `contract_version=1.0`
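To make the TXT layout concrete, the sketch below builds the record set as a plain dictionary. The helper names `truncate_caps` and `build_txt_records` are hypothetical, and truncating `caps` at a whole-name boundary is an assumption: the spec only says "truncate if needed".

```python
def truncate_caps(names: list[str], max_bytes: int = 200) -> str:
    """Join capability names with commas, dropping trailing names that would
    push the value past max_bytes (assumed policy: never cut a name in half)."""
    kept: list[str] = []
    size = 0
    for name in names:
        extra = len(name.encode("utf-8")) + (1 if kept else 0)  # +1 for the comma
        if size + extra > max_bytes:
            break
        kept.append(name)
        size += extra
    return ",".join(kept)


def build_txt_records(
    node: str,
    community: str,
    profile: str,
    caps: list[str],
    host: str,
    port: int,
) -> dict[str, str]:
    """Return the TXT key/value pairs listed in the service definition."""
    return {
        "v": "1",
        "node": node,
        "community": community,
        "profile": profile,
        "caps": truncate_caps(caps),
        "manifest_url": f"https://{host}:{port}/manifest",
        "contract_version": "1.0",
    }
```

A dictionary of this shape is what a zeroconf announcer would typically hand over as the service's properties; how `MdnsAnnouncer` actually does so is not specified here.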
### 3.4 `udp.py`
```python
# hearthnet/discovery/udp.py
from __future__ import annotations

from hearthnet.discovery.peers import PeerRegistry

# KeyPair comes from M01 (identity); its import path is not given in this document.


class UdpAnnouncer:
    """Periodic UDP multicast of node presence."""

    def __init__(
        self,
        kp: KeyPair,
        registry: PeerRegistry,
        node_id_short: str,
        community_id_short: str,
        port: int,
        capabilities_names: list[str],
        multicast_group: str = "239.255.42.42",
        multicast_port: int = 42424,
    ):
        ...

    async def run(self) -> None:
        """Loop: emit announcement every DISCOVERY_UDP_INTERVAL_SECONDS.

        Active interval when fewer than 2 peers; stable interval otherwise.
        """


class UdpListener:
    """Receives multicast announcements, populates registry."""

    def __init__(self, registry: PeerRegistry, our_community_id: str): ...

    async def run(self) -> None: ...
```
### 3.5 UDP payload
```json
{"v":1,"node":"7H4G-Y9KL","community":"NIED-...","port":7080,"caps":["llm.chat","rag.query"]}
```
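Maximum size is 1 KB. The announcement itself is unsigned; the receiver re-fetches and verifies the full manifest from `manifest_url` before keeping the peer. (The UDP payload carries only `port`, so for UDP the URL presumably follows the `https://<host>:<port>/manifest` pattern from §3.3, using the sender's address as the host.)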
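The payload rules can be sketched as a pair of encode/decode helpers. The function names and the choice to silently drop malformed datagrams are assumptions for illustration; only the field set and the 1 KB limit come from the spec.

```python
from __future__ import annotations

import json

MAX_PAYLOAD_BYTES = 1024  # the 1 KB limit stated in the spec


def encode_announcement(node: str, community: str, port: int, caps: list[str]) -> bytes:
    """Serialise an announcement compactly and enforce the size limit."""
    payload = {"v": 1, "node": node, "community": community, "port": port, "caps": caps}
    data = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    if len(data) > MAX_PAYLOAD_BYTES:
        raise ValueError(f"announcement is {len(data)} bytes; limit is {MAX_PAYLOAD_BYTES}")
    return data


def decode_announcement(data: bytes) -> dict | None:
    """Return the parsed announcement, or None if it is oversized or malformed."""
    if len(data) > MAX_PAYLOAD_BYTES:
        return None
    try:
        msg = json.loads(data.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None
    if not isinstance(msg, dict) or msg.get("v") != 1:
        return None
    required = {"node": str, "community": str, "port": int, "caps": list}
    for key, expected_type in required.items():
        if not isinstance(msg.get(key), expected_type):
            return None
    return msg
```

Because the datagram is unsigned, the decoder only checks shape; trust decisions still wait for the verified manifest.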
---
## 4. Behaviour
### 4.1 First contact flow
```
mDNS or UDP discovers a peer at <host:port> for community X (matches ours)
        ↓
PeerRegistry.upsert(stub PeerRecord with manifest=None)
        ↓
asyncio task: HTTP GET https://<host>:<port>/manifest (via X01 client)
        ↓
parse + verify_node_manifest (M01)
        ↓
if community matches AND author is a member (community manifest): keep
otherwise: remove
        ↓
PeerEvent("added") emitted
```
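The flow above can be sketched as a single coroutine with the fetch and verify steps injected. Everything named here is hypothetical: `first_contact`, the `Manifest` alias, the `community` and `author` keys, and the plain-dict registry stand in for the X01 client, M01's `verify_node_manifest`, and `PeerRegistry`. Leaving the stub in place after a failed fetch is also an assumption, since the spec does not say what happens in that case.

```python
from __future__ import annotations

import asyncio
from typing import Awaitable, Callable

Manifest = dict  # hypothetical stand-in for M01's NodeManifest


async def first_contact(
    node_id: str,
    manifest_url: str,
    our_community: str,
    members: set[str],
    registry: dict[str, Manifest | None],
    fetch: Callable[[str], Awaitable[Manifest]],
    verify: Callable[[Manifest], bool],
) -> bool:
    """Return True if the peer was kept (the point where "added" would be emitted)."""
    registry[node_id] = None  # stub record: manifest not fetched yet
    try:
        manifest = await fetch(manifest_url)
    except Exception:
        # Assumed policy: leave the stub so stale-peer pruning cleans it up.
        return False
    keep = (
        verify(manifest)
        and manifest.get("community") == our_community
        and manifest.get("author") in members
    )
    if not keep:
        registry.pop(node_id, None)
        return False
    registry[node_id] = manifest
    return True


if __name__ == "__main__":
    async def _demo_fetch(url: str) -> Manifest:
        # Canned manifest instead of a real HTTPS request.
        return {"community": "c1", "author": "alice"}

    peers: dict[str, Manifest | None] = {}
    kept = asyncio.run(
        first_contact("n1", "https://192.0.2.10:7080/manifest", "c1",
                      {"alice"}, peers, _demo_fetch, lambda m: True)
    )
    print(kept, peers)
```

Injecting `fetch` and `verify` keeps the ordering of the steps testable without a network or real keys.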
### 4.2 Refresh
- mDNS TXT updates trigger re-fetch of `/manifest`
- Every 30 seconds, we attempt to refresh peers whose manifests are within 10 seconds of expiry
- Peers whose manifests expired and could not be refetched are pruned after 90 seconds
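A rough sketch of the periodic refresh follows. It assumes each manifest exposes an expiry timestamp (the spec does not name the field), treats already-expired manifests as due, and uses hypothetical names throughout (`due_for_refresh`, `refresh_loop`, `max_cycles`).

```python
from __future__ import annotations

import asyncio
from typing import Awaitable, Callable


def due_for_refresh(
    expiries: dict[str, float], now: float, window_seconds: float = 10.0
) -> list[str]:
    """Node IDs whose manifest expires within window_seconds of now,
    including manifests that have already expired."""
    return sorted(
        node_id
        for node_id, expires_at in expiries.items()
        if expires_at - now <= window_seconds
    )


async def refresh_loop(
    get_expiries: Callable[[], dict[str, float]],
    refetch: Callable[[str], Awaitable[None]],
    clock: Callable[[], float],
    interval_seconds: float = 30.0,
    max_cycles: int | None = None,
) -> None:
    """Every interval_seconds, re-fetch manifests that are close to expiry.

    max_cycles exists only so tests can stop the loop; None means run forever.
    """
    cycle = 0
    while max_cycles is None or cycle < max_cycles:
        for node_id in due_for_refresh(get_expiries(), clock()):
            await refetch(node_id)
        await asyncio.sleep(interval_seconds)
        cycle += 1
```

Note that the clock used for expiry must match whatever clock the manifest's expiry field uses, which is likely wall-clock time rather than the monotonic clock used for `last_seen`.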
### 4.3 Mode behaviour
When [M09](M09-emergency.md) reports offline:
- `UdpAnnouncer` switches to fast interval
- `MdnsAnnouncer` doesn't change (already low-overhead)
- Stale peer pruning becomes more aggressive (30s instead of 90s), because we want fresh data quickly
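The mode rules can be condensed into one pure function. In this sketch the 30 s and 90 s prune ages come from the spec, while the interval values (5 s and 30 s) are placeholders, and treating the "fast" interval as the same thing as the "active" interval from `UdpAnnouncer.run` is an assumption. `DiscoveryTuning` and `tuning_for` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiscoveryTuning:
    udp_interval_seconds: float
    prune_max_age_seconds: int


def tuning_for(
    offline: bool,
    peer_count: int,
    active_interval: float = 5.0,   # placeholder value, not from the spec
    stable_interval: float = 30.0,  # placeholder value, not from the spec
) -> DiscoveryTuning:
    """Pick the UDP announce interval and prune age for the current mode."""
    # Announce quickly when offline (per M09) or when fewer than 2 peers are known.
    fast = offline or peer_count < 2
    return DiscoveryTuning(
        udp_interval_seconds=active_interval if fast else stable_interval,
        prune_max_age_seconds=30 if offline else 90,
    )
```

Keeping this decision in one place means the announcer and the pruner cannot disagree about which mode the node is in.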
### 4.4 Multi-interface handling
- mDNS uses `zeroconf` defaults (all interfaces)
- UDP listener binds to `INADDR_ANY` and joins the multicast group; `SO_REUSEPORT` is set so multiple processes can coexist on the same host
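The listener socket setup described above might look like the following sketch using the standard `socket` module. The function names are hypothetical, setting `SO_REUSEADDR` as well is an addition not mentioned in the spec, and the default group and port are taken from the `UdpAnnouncer` signature.

```python
import socket
import struct


def build_mreq(group: str) -> bytes:
    """Build the ip_mreq structure for IP_ADD_MEMBERSHIP:
    the group address followed by INADDR_ANY as the local interface."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton("0.0.0.0"))


def open_multicast_listener(group: str = "239.255.42.42", port: int = 42424) -> socket.socket:
    """Open a non-blocking UDP socket subscribed to the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # SO_REUSEPORT is not defined on every platform (for example Windows), so guard it.
    if hasattr(socket, "SO_REUSEPORT"):
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind(("", port))  # empty host means INADDR_ANY
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, build_mreq(group))
    sock.setblocking(False)
    return sock
```

Passing `INADDR_ANY` as the membership interface lets the OS pick its default multicast interface; joining on every interface would need one `IP_ADD_MEMBERSHIP` call per local interface address.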
### 4.5 Privacy
mDNS announces the short NodeID, profile, and a list of capability names. This is visible to any device on the LAN. We accept this; it is the price of zero-config.
Devices NOT in our community still see our presence but cannot make calls (rejected at the bus signature check).
---
## 5. Errors
`DiscoveryError` codes:
- `socket_in_use`: UDP port already bound
- `mdns_unavailable`: zeroconf fails to start (Linux without avahi, etc.)
- `manifest_fetch_failed`: HTTP error fetching `/manifest`
- `manifest_invalid`: propagated from M01 verification
Errors are logged but not fatal; the node continues with whichever discovery transport works.
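One possible shape for the error type and the "log and carry on" startup policy is sketched below. The `DiscoveryError` constructor signature, the logger name, and the `start_transports` helper are assumptions; only the four codes and the non-fatal policy come from the spec.

```python
import asyncio
import logging
from typing import Awaitable, Callable

log = logging.getLogger("hearthnet.discovery")  # assumed logger name


class DiscoveryError(Exception):
    """Discovery failure carrying one of the documented codes."""

    CODES = frozenset(
        {"socket_in_use", "mdns_unavailable", "manifest_fetch_failed", "manifest_invalid"}
    )

    def __init__(self, code: str, detail: str = "") -> None:
        if code not in self.CODES:
            raise ValueError(f"unknown DiscoveryError code: {code}")
        super().__init__(f"{code}: {detail}" if detail else code)
        self.code = code


async def start_transports(starters: dict[str, Callable[[], Awaitable[None]]]) -> list[str]:
    """Start each transport in turn; log failures and return the names that came up."""
    running: list[str] = []
    for name, start in starters.items():
        try:
            await start()
        except DiscoveryError as exc:
            log.warning("discovery transport %s unavailable: %s", name, exc)
        else:
            running.append(name)
    return running


if __name__ == "__main__":
    async def _udp_ok() -> None:
        return None

    async def _mdns_down() -> None:
        raise DiscoveryError("mdns_unavailable", "zeroconf failed to start")

    print(asyncio.run(start_transports({"mdns": _mdns_down, "udp": _udp_ok})))
```

Returning the list of running transports lets the caller decide what to do if it comes back empty, a case the spec does not address.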
---
## 6. Configuration
From [X04](X04-config.md):
```python
config.discovery.mdns_enabled
config.discovery.udp_enabled
config.discovery.udp_multicast_group
config.discovery.udp_port
config.discovery.relay_urls # Phase 2
```
Constants: `DISCOVERY_UDP_INTERVAL_SECONDS`.
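For orientation, the discovery section of the config could be modelled as a small dataclass. This is a sketch only: X04 owns the real schema, the `DiscoveryConfig` and `enabled_transports` names are hypothetical, the boolean defaults are assumed, and the group and port defaults are copied from the `UdpAnnouncer` signature on the assumption that `udp_port` is the multicast port.

```python
from dataclasses import dataclass, field


@dataclass
class DiscoveryConfig:
    mdns_enabled: bool = True                   # assumed default
    udp_enabled: bool = True                    # assumed default
    udp_multicast_group: str = "239.255.42.42"  # UdpAnnouncer's default group
    udp_port: int = 42424                       # UdpAnnouncer's default multicast_port
    relay_urls: list[str] = field(default_factory=list)  # Phase 2, unused for now


def enabled_transports(cfg: DiscoveryConfig) -> list[str]:
    """Names of the discovery transports that should be started."""
    names: list[str] = []
    if cfg.mdns_enabled:
        names.append("mdns")
    if cfg.udp_enabled:
        names.append("udp")
    return names
```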
---
## 7. Tests
### Unit
- `test_peer_registry_upsert_returns_true_first_time`
- `test_peer_registry_prune_stale`
- `test_udp_payload_under_1kb`
- `test_mdns_txt_records_parse`
### Integration
- `test_two_nodes_find_each_other_via_mdns` (in-process zeroconf)
- `test_udp_fallback_when_mdns_disabled`
- `test_foreign_community_peer_filtered_out`
---
## 8. Cross-references
| What | Where |
|------|-------|
| Manifest fetch + verify | [M01 §3.2](M01-identity.md) |
| Service definition | [CONTRACT §6.1](../CAPABILITY_CONTRACT.md) (manifest schema) |
| Bus consumes peer events | [M03 §5.2](M03-bus.md) |
| Emergency mode influence | [M09 §5](M09-emergency.md) |
| Phase 2 internet relay | this module's `relay.py` (stub) |