# M02: Discovery

**Spec version:** v1.0
**Depends on:** M01 (identity), X04 (config), X03 (observability), X01 (transport, for the manifest fetch URL), `python-zeroconf`
**Depended on by:** M03 (bus, for peer enumeration), M09 (emergency mode increases discovery cadence)

---

## 1. Responsibility

Find peers on the local network. Maintain a live in-memory registry of known peers with their manifests, last-seen timestamps, and latencies. Republish our own presence.

Out of scope:
- DHT (Phase 2)
- LoRa beacons (Phase 3)
- Internet relay (Phase 2)

---

## 2. File layout

```
hearthnet/discovery/
├── __init__.py
├── mdns.py              # zeroconf-based service browser + announcer
├── udp.py               # UDP multicast announcer + listener
├── peers.py             # PeerRegistry: in-memory state
└── relay.py             # Phase 2 stub
```

---

## 3. Public API

### 3.1 `peers.py`

```python
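# hearthnet/discovery/peers.py
from __future__ import annotations  # PeerEvent is referenced before it is defined

from collections.abc import AsyncIterator
from dataclasses import dataclass

# Endpoint and NodeManifest are defined outside this module; their import path is not specified here.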

@dataclass
class PeerRecord:
    node_id:        str             # short form
    node_id_full:   str
    display_name:   str
    community_id:   str
    profile:        str
    endpoints:      list[Endpoint]
    manifest:       NodeManifest | None  # None until fetched
    last_seen:      float           # monotonic time
    rtt_ms:         float | None    # measured by health probe
    source:         str             # "mdns" | "udp" | "relay"

class PeerRegistry:
    """In-memory map of NodeID → PeerRecord. Safe for concurrent coroutines via asyncio.Lock (not safe across OS threads)."""

    def __init__(self, our_node_id_full: str, community_id: str):
        ...

    def upsert(self, record: PeerRecord) -> bool:
        """Add or update; returns True if new peer."""

    def remove(self, node_id_full: str) -> bool: ...

    def get(self, node_id_full: str) -> PeerRecord | None: ...

    def all(self) -> list[PeerRecord]: ...

    def for_community(self, community_id: str) -> list[PeerRecord]: ...

    def prune_stale(self, max_age_seconds: int = 90) -> int:
        """Remove peers not seen recently. Returns count removed."""

    # subscribers (called when peer added / removed / updated):
    def subscribe(self) -> AsyncIterator[PeerEvent]: ...

@dataclass(frozen=True)
class PeerEvent:
    kind:   str        # "added" | "removed" | "updated"
    peer:   PeerRecord
```
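
The `upsert` and `prune_stale` contracts above can be illustrated with a minimal sketch. It assumes a plain dict keyed by `node_id_full` and `time.monotonic()` timestamps. `MiniRecord` and `MiniRegistry` are hypothetical stand-ins trimmed to the fields needed here; locking and `PeerEvent` emission are left out, and the `now` parameter is an addition for testability.

```python
from __future__ import annotations

import time
from dataclasses import dataclass, field


@dataclass
class MiniRecord:
    """Hypothetical cut-down PeerRecord: only the fields this sketch needs."""
    node_id_full: str
    community_id: str
    last_seen: float = field(default_factory=time.monotonic)


class MiniRegistry:
    """Hypothetical stand-in for PeerRegistry (no locking, no events)."""

    def __init__(self) -> None:
        self._peers: dict[str, MiniRecord] = {}

    def upsert(self, record: MiniRecord) -> bool:
        """Add or update; True only when the peer was not known before."""
        is_new = record.node_id_full not in self._peers
        self._peers[record.node_id_full] = record
        return is_new

    def for_community(self, community_id: str) -> list[MiniRecord]:
        """Return the peers that announced the given community."""
        return [p for p in self._peers.values() if p.community_id == community_id]

    def prune_stale(self, max_age_seconds: int = 90, now: float | None = None) -> int:
        """Drop peers whose last_seen is older than max_age_seconds."""
        now = time.monotonic() if now is None else now
        stale = [k for k, p in self._peers.items() if now - p.last_seen > max_age_seconds]
        for key in stale:
            del self._peers[key]
        return len(stale)
```

Using monotonic time for `last_seen`, as the record definition specifies, keeps pruning unaffected by wall-clock adjustments.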

### 3.2 `mdns.py`

```python
# hearthnet/discovery/mdns.py
class MdnsAnnouncer:
    """Publishes our own service via mDNS."""
    def __init__(
        self,
        kp: KeyPair,
        node_id_short: str,
        display_name: str,
        community_id_short: str,
        profile: str,
        port: int,
        capabilities_names: list[str],
        manifest_url: str,
    ):
        ...
    async def start(self) -> None: ...
    async def stop(self) -> None: ...
    def update(self, *, capabilities_names: list[str] | None = None) -> None:
        """Refresh TXT records (e.g. when capabilities change)."""

class MdnsBrowser:
    """Listens for other nodes via mDNS, populates the registry."""
    def __init__(self, registry: PeerRegistry, our_community_id: str):
        ...
    async def start(self) -> None: ...
    async def stop(self) -> None: ...
```

### 3.3 Service definition

- Service type: `_hearthnet._tcp.local.`
- Instance name: `<display_name>-<short_node_id_4chars>`
- Port: from manifest's first endpoint
- TXT records:
  - `v=1`
  - `node=<short_node_id>`
  - `community=<short_community_id>`
  - `profile=<anchor|hearth|spark|bridge>`
  - `caps=<comma-separated cap names>` (max 200 bytes; truncate if needed)
  - `manifest_url=https://<host>:<port>/manifest`
  - `contract_version=1.0`
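
As an illustration of the TXT record set listed above, the following sketch builds the key/value pairs as a plain dict. The function names are hypothetical, and the truncation strategy (dropping whole trailing capability names so the value never ends mid-name) is an assumption, since the spec only says "truncate if needed".

```python
from __future__ import annotations


def truncate_caps(caps: list[str], max_bytes: int = 200) -> str:
    """Join capability names with commas, keeping the result within max_bytes.

    Assumption: whole names are dropped from the end rather than cutting
    a name in the middle.
    """
    kept: list[str] = []
    size = 0
    for name in caps:
        extra = len(name.encode("utf-8")) + (1 if kept else 0)  # +1 for the comma
        if size + extra > max_bytes:
            break
        kept.append(name)
        size += extra
    return ",".join(kept)


def build_txt_records(
    node_short: str,
    community_short: str,
    profile: str,
    caps: list[str],
    host: str,
    port: int,
) -> dict[str, str]:
    """Return the TXT key/value pairs from the service definition."""
    return {
        "v": "1",
        "node": node_short,
        "community": community_short,
        "profile": profile,
        "caps": truncate_caps(caps),
        "manifest_url": f"https://{host}:{port}/manifest",
        "contract_version": "1.0",
    }
```

A dict of this shape could then be handed to whatever mDNS library call publishes the service properties.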

### 3.4 `udp.py`

```python
# hearthnet/discovery/udp.py
class UdpAnnouncer:
    """Periodic UDP multicast of node presence."""
    def __init__(
        self,
        kp: KeyPair,
        registry: PeerRegistry,
        node_id_short: str,
        community_id_short: str,
        port: int,
        capabilities_names: list[str],
        multicast_group: str = "239.255.42.42",
        multicast_port: int = 42424,
    ):
        ...
    async def run(self) -> None:
        """Loop: emit announcement every DISCOVERY_UDP_INTERVAL_SECONDS.
        Active interval when fewer than 2 peers; stable interval otherwise."""

class UdpListener:
    """Receives multicast announcements, populates registry."""
    def __init__(self, registry: PeerRegistry, our_community_id: str): ...
    async def run(self) -> None: ...
```

### 3.5 UDP payload

```json
{"v":1,"node":"7H4G-Y9KL","community":"NIED-...","port":7080,"caps":["llm.chat","rag.query"]}
```

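Max 1 KB. The announcement itself is unsigned; the receiver fetches and verifies the full manifest instead. The payload carries no `manifest_url`, so the receiver builds the URL from the sender's address and the announced `port` (`https://<host>:<port>/manifest`).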
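
A minimal sketch of encoding and validating this payload, assuming compact JSON and a 1024-byte limit. The function names and the choice to return `None` for malformed input are assumptions, not part of the spec.

```python
from __future__ import annotations

import json

MAX_PAYLOAD_BYTES = 1024  # the 1 KB limit stated in the spec


def encode_announcement(node: str, community: str, port: int, caps: list[str]) -> bytes:
    """Serialise an announcement as compact JSON, enforcing the size limit."""
    payload = {"v": 1, "node": node, "community": community, "port": port, "caps": caps}
    data = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    if len(data) > MAX_PAYLOAD_BYTES:
        raise ValueError(f"announcement is {len(data)} bytes, limit is {MAX_PAYLOAD_BYTES}")
    return data


def decode_announcement(data: bytes) -> dict | None:
    """Parse a datagram; return None for oversized or malformed input."""
    if len(data) > MAX_PAYLOAD_BYTES:
        return None
    try:
        msg = json.loads(data.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None
    if not isinstance(msg, dict) or msg.get("v") != 1:
        return None
    required = {"node": str, "community": str, "port": int, "caps": list}
    for key, expected_type in required.items():
        if not isinstance(msg.get(key), expected_type):
            return None
    return msg
```

Because the datagram is unsigned, a decoder like this should treat every field as untrusted input and use it only to decide where to fetch the manifest.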

---

## 4. Behaviour

### 4.1 First contact flow

```
mDNS or UDP discovers a peer at <host:port> for community X (matches ours)
  ↓
PeerRegistry.upsert(stub PeerRecord with manifest=None)
  ↓
asyncio task: HTTP GET https://<host>:<port>/manifest (via X01 client)
  ↓
parse + verify_node_manifest (M01)
  ↓
if community matches AND author is a member (community manifest): keep
otherwise: remove
  ↓
PeerEvent("added") emitted
```
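
The flow above could be sketched as a single coroutine. Everything here is hypothetical scaffolding: `fetch` stands in for the X01 client, `verify` for M01's manifest verification, `is_member` for the community-manifest check, and the manifest is modelled as a dict with `community_id` and `node_id` keys.

```python
from __future__ import annotations

from typing import Any, Awaitable, Callable


async def first_contact(
    peers: dict[str, dict[str, Any]],
    host: str,
    port: int,
    node_id: str,
    our_community: str,
    fetch: Callable[[str], Awaitable[Any]],  # hypothetical stand-in for the X01 client
    verify: Callable[[Any], dict[str, Any]],  # hypothetical stand-in for M01 verification
    is_member: Callable[[str], bool],  # hypothetical community membership check
) -> bool:
    """Return True if the peer was kept, False if it was removed."""
    # Step 1: register a stub record with no manifest yet.
    peers[node_id] = {"host": host, "port": port, "manifest": None}
    url = f"https://{host}:{port}/manifest"
    try:
        # Steps 2 and 3: fetch, then parse and verify.
        manifest = verify(await fetch(url))
    except Exception:  # a sketch; real code would catch specific error types
        peers.pop(node_id, None)
        return False
    # Step 4: keep only verified members of our own community.
    if manifest["community_id"] != our_community or not is_member(manifest["node_id"]):
        peers.pop(node_id, None)
        return False
    peers[node_id]["manifest"] = manifest
    return True  # the caller would emit PeerEvent("added") at this point
```

Injecting the three callables keeps the flow testable without a network or real keys.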

### 4.2 Refresh

- mDNS TXT updates trigger re-fetch of `/manifest`
- Every 30 seconds, we attempt to refresh peers whose manifests are within 10 seconds of expiry
- Peers whose manifests expired and could not be refetched are pruned after 90 seconds
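
The refresh selection might look like the sketch below. It assumes each manifest exposes an expiry timestamp on the same clock as `now`; the `expiries` mapping and the function name are hypothetical, and already expired manifests are treated as due.

```python
from __future__ import annotations


def due_for_refresh(
    expiries: dict[str, float],
    now: float,
    window_seconds: float = 10.0,
) -> list[str]:
    """Return node IDs whose manifest expires within window_seconds of now.

    Manifests that have already expired are included, since they still
    need a refetch before the peer is pruned.
    """
    return sorted(node for node, expires_at in expiries.items() if expires_at - now <= window_seconds)
```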

### 4.3 Mode behaviour

When [M09](M09-emergency.md) reports offline:

- `UdpAnnouncer` switches to fast interval
- `MdnsAnnouncer` doesn't change (already low-overhead)
- Stale peer pruning becomes more aggressive (30s instead of 90s), because we want fresh data quickly
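
One way to express the mode switch is a small pure function. The interval values are passed in because this spec does not state them; `DiscoveryTuning` and `tuning_for_mode` are hypothetical names. Only the 30 s and 90 s prune ages come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiscoveryTuning:
    """Hypothetical bundle of the two settings that depend on the mode."""
    udp_interval_seconds: float
    prune_max_age_seconds: int


def tuning_for_mode(offline: bool, stable_interval: float, fast_interval: float) -> DiscoveryTuning:
    """Pick the UDP announce interval and prune age for the current mode."""
    if offline:
        return DiscoveryTuning(fast_interval, 30)
    return DiscoveryTuning(stable_interval, 90)
```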

### 4.4 Multi-interface handling

- mDNS uses `zeroconf` defaults (all interfaces)
- UDP listener binds to `INADDR_ANY` and joins the multicast group; `SO_REUSEPORT` is set so that multiple processes can coexist on the same host
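
A minimal sketch of the listener socket setup using the standard `socket` module. The function names are hypothetical. Setting `SO_REUSEADDR` as well and guarding `SO_REUSEPORT` with `hasattr` are additions made here because `SO_REUSEPORT` is not available on every platform.

```python
import socket
import struct


def build_membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Pack an ip_mreq value: multicast group address, then local interface address."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def open_listener_socket(group: str = "239.255.42.42", port: int = 42424) -> socket.socket:
    """Create a non-blocking UDP socket bound to INADDR_ANY and joined to the group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        if hasattr(socket, "SO_REUSEPORT"):  # missing on some platforms
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        sock.bind(("", port))  # empty host means INADDR_ANY
        sock.setsockopt(
            socket.IPPROTO_IP,
            socket.IP_ADD_MEMBERSHIP,
            build_membership_request(group),
        )
        sock.setblocking(False)
    except OSError:
        sock.close()  # do not leak the descriptor if setup fails
        raise
    return sock
```

Joining with interface `0.0.0.0` lets the operating system choose the interface; a multi-homed host may need one membership request per interface address.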

### 4.5 Privacy

mDNS announces the short NodeID, profile, and a list of capability names. This is visible to any device on the LAN. We accept this: it is the price of zero-config.

Devices NOT in our community still see our presence but cannot make calls (rejected at the bus signature check).

---

## 5. Errors

`DiscoveryError` codes:

- `socket_in_use`: UDP port already bound
- `mdns_unavailable`: zeroconf fails to start (Linux without avahi, etc.)
- `manifest_fetch_failed`: HTTP error fetching `/manifest`
- `manifest_invalid`: propagated from M01 verification

Errors are logged but not fatal; the node continues with whichever discovery transport works.
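
The "logged but not fatal" policy could be sketched as follows. The shape of `DiscoveryError` (a `code` attribute validated against the four codes above) and the `start_transports` helper are assumptions for illustration, as is the logger name.

```python
from __future__ import annotations

import logging
from typing import Awaitable, Callable

log = logging.getLogger("hearthnet.discovery")  # logger name is an assumption


class DiscoveryError(Exception):
    """Hypothetical shape: an exception carrying one of the spec's error codes."""

    CODES = frozenset(
        {"socket_in_use", "mdns_unavailable", "manifest_fetch_failed", "manifest_invalid"}
    )

    def __init__(self, code: str, detail: str = "") -> None:
        if code not in self.CODES:
            raise ValueError(f"unknown discovery error code: {code}")
        super().__init__(f"{code}: {detail}" if detail else code)
        self.code = code


async def start_transports(starters: dict[str, Callable[[], Awaitable[None]]]) -> list[str]:
    """Start each transport; log failures and return the names that started."""
    started: list[str] = []
    for name, start in starters.items():
        try:
            await start()
        except DiscoveryError as exc:
            log.warning("discovery transport %s unavailable: %s", name, exc)
        else:
            started.append(name)
    return started
```

With this shape, a node whose mDNS start raises `mdns_unavailable` would still come up with UDP discovery alone.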

---

## 6. Configuration

From [X04](X04-config.md):

```python
config.discovery.mdns_enabled
config.discovery.udp_enabled
config.discovery.udp_multicast_group
config.discovery.udp_port
config.discovery.relay_urls       # Phase 2
```

Constants: `DISCOVERY_UDP_INTERVAL_SECONDS`.

---

## 7. Tests

### Unit
- `test_peer_registry_upsert_returns_true_first_time`
- `test_peer_registry_prune_stale`
- `test_udp_payload_under_1kb`
- `test_mdns_txt_records_parse`

### Integration
- `test_two_nodes_find_each_other_via_mdns` (in-process zeroconf)
- `test_udp_fallback_when_mdns_disabled`
- `test_foreign_community_peer_filtered_out`

---

## 8. Cross-references

| What | Where |
|------|-------|
| Manifest fetch + verify | [M01 §3.2](M01-identity.md) |
| Service definition | [CONTRACT §6.1](../CAPABILITY_CONTRACT.md) (manifest schema) |
| Bus consumes peer events | [M03 §5.2](M03-bus.md) |
| Emergency mode influence | [M09 §5](M09-emergency.md) |
| Phase 2 internet relay | this module's `relay.py` (stub) |