Claim Log — Labelwatch

Generated by labelwatch

Analytical findings, their methodology, and their current status. Labelwatch publishes its reasoning, not just its conclusions.

Hosting Locus Investigation

Investigated 2026-04-10. Status: conclusion revised.

The original hosting-locus hypothesis asked: do labeled targets cluster on specific PDS hosts in ways that reveal coordinated or concentrated adversarial behavior?

The answer turned out to be more interesting than yes or no. The pipeline was measuring something real, just not the thing it was built to surface.

What we tested

Labelwatch seeds driftwatch’s resolver with labeled-target DIDs so their PDS hosts can be resolved. We compared the host distribution of these “seed” DIDs (accounts that have been labeled) against “live” DIDs (accounts observed posting in real time) to look for hosting-locus signals.
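The comparison above can be sketched as a per-host tally. This is a minimal illustration, not the Labelwatch pipeline itself: the function name, the toy host names, and in particular the definition seed ratio = seed / (seed + live) are assumptions, chosen because that definition yields 100% when a host has labeled accounts and no observed posters.

```python
from collections import Counter

def seed_ratio_by_host(seed_hosts, live_hosts):
    """Return {host: (seed_count, live_count, seed_ratio)}.

    seed_hosts / live_hosts are iterables with one PDS host per DID.
    seed_ratio = seed / (seed + live) is an ASSUMED definition.
    """
    seed = Counter(seed_hosts)
    live = Counter(live_hosts)
    table = {}
    # Every host appears in at least one list, so seed + live is never 0.
    for host in seed.keys() | live.keys():
        s, l = seed[host], live[host]
        table[host] = (s, l, s / (s + l))
    return table

# Toy data with hypothetical host names.
seed_hosts = ["old.example"] * 3 + ["tail.example"] * 2
live_hosts = ["old.example"] * 2 + ["new.example"] * 5

table = seed_ratio_by_host(seed_hosts, live_hosts)
for host, (s, l, ratio) in sorted(table.items(), key=lambda kv: -kv[1][2]):
    print(f"{host:15s} seed={s} live={l} ratio={ratio:.2f}")
```

A tally like this only says where labeled and live accounts sit. It carries no information about why a host ends up seed-heavy or live-heavy, which is what the findings below turn on.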

Finding 1: Head divergence is an age confounder

Status: explained.

The top of the distribution showed a dramatic divergence: two newer Bluesky PDS shards (jellybaby, stropharia) held ~131,000 live accounts each with near-zero seed presence, while older “mushroom” shards (lionsmane, amanita, oyster, etc.) held ~2,000 each with a seed ratio of ~60%.

This is explained by Bluesky’s PDS shard rotation. New accounts go to the newest shards. Labels take time to accumulate. Therefore older shards are seed-heavy (more labeled accounts) and newer shards are live-heavy (more currently-active posters). The divergence reflects account age, not any property of the hosts themselves.

Finding 2: Stale pds_host hypothesis falsified

Status: tested, 100/100 matched.

We tested whether the stored pds_host field might be stale, that is, whether seed DIDs had migrated to different PDS hosts since resolution, which would invalidate the comparison.

100 seed DIDs from the top-20 mushroom-head hosts were re-resolved fresh against plc.directory using the same extraction logic as the driftwatch resolver. All 100 matched their stored host exactly. The stale-field explanation is ruled out for the mushroom-head population.

Scope limitation: This check was bounded to the mushroom-head (top-20 seed hosts, all did:plc, all *.host.bsky.network shards). It does not certify pds_host freshness for the long tail, for did:web, or for any other population.
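A re-resolution check of this kind can be sketched as follows. This is a hedged illustration, not the driftwatch resolver's actual extraction logic: the function names are hypothetical, and the sketch assumes that plc.directory serves a DID document at https://plc.directory/<did> whose service list contains an entry with an id ending in #atproto_pds.

```python
import json
import urllib.request
from urllib.parse import urlparse

PLC_DIRECTORY = "https://plc.directory"

def pds_host_from_did_doc(doc):
    """Extract the PDS hostname from a DID document, or None if absent."""
    for svc in doc.get("service", []):
        if svc.get("id", "").endswith("#atproto_pds"):
            return urlparse(svc.get("serviceEndpoint", "")).hostname
    return None

def fetch_did_doc(did):
    """Fetch the current DID document for a did:plc identifier."""
    with urllib.request.urlopen(f"{PLC_DIRECTORY}/{did}", timeout=10) as resp:
        return json.load(resp)

def find_stale(stored, fetch=fetch_did_doc):
    """stored maps DID -> stored pds_host.

    Returns {did: (stored_host, fresh_host)} for every mismatch.
    An empty result means no stored host was stale.
    """
    mismatches = {}
    for did, host in stored.items():
        fresh = pds_host_from_did_doc(fetch(did))
        if fresh != host:
            mismatches[did] = (host, fresh)
    return mismatches
```

Passing the fetch function as a parameter keeps the comparison testable offline. Under these assumptions, a 100/100 result corresponds to find_stale returning an empty dict for the sampled DIDs.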

Finding 3: Seed:live ratio measures labeler coverage, not host behavior

Status: conclusion revised.

The long-tail analysis found hosts with 100% seed ratios: every account labeled, zero observed posting. The leading example was skystack.xyz: 276 accounts, all carrying a substack label from a single labeler (did:plc:uxjwly6emtgik7juvxxdpl3c, 29,620 label events). That labeler had applied a content-type label to every account on the host.

The 100% seed ratio was not a behavioral signal. It was a labeler coverage artifact: one labeler chose to enumerate all accounts on a specific PDS, which is a governance decision about labeler tactics, not evidence of adversarial account clustering.
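One way to tell a coverage artifact from a behavioral signal is to ask, for a given host, what fraction of its accounts each (labeler, label) pair covers. The sketch below is illustrative only: the function name, the event tuple layout, and the toy DIDs are assumptions, not Labelwatch's schema.

```python
from collections import defaultdict

def labeler_coverage(host_dids, label_events):
    """Fraction of a host's accounts covered by each (labeler, label) pair.

    host_dids: set of account DIDs on one PDS host.
    label_events: iterable of (labeler_did, target_did, label_value).
    """
    covered = defaultdict(set)
    for labeler, target, value in label_events:
        if target in host_dids:
            covered[(labeler, value)].add(target)
    n = len(host_dids)
    if n == 0:
        return {}
    return {key: len(targets) / n for key, targets in covered.items()}

# Toy data: hypothetical DIDs, with one labeler enumerating the whole host.
host_dids = {"did:plc:a", "did:plc:b", "did:plc:c"}
events = [
    ("did:plc:labeler1", "did:plc:a", "substack"),
    ("did:plc:labeler1", "did:plc:b", "substack"),
    ("did:plc:labeler1", "did:plc:c", "substack"),
    ("did:plc:labeler2", "did:plc:a", "spam"),
    ("did:plc:labeler1", "did:plc:zzz", "substack"),  # account on another host
]

coverage = labeler_coverage(host_dids, events)
enumerated = [key for key, frac in coverage.items() if frac == 1.0]
print(enumerated)
```

A single pair at full coverage, as in the skystack.xyz case, suggests the labeler enumerated the host rather than responding to individual account behavior.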

The same pattern likely holds for other high-seed-ratio hosts: pds.1440.news (100%, 24 accounts), northsky.social (83%), bsky.bestofmodels.blog (72.7%), and atproto.brid.gy (49.8%, the Bridgy Fed fediverse bridge).

In short, the seed:live ratio is a proxy for labeler coverage density (which labelers chose to enumerate which PDSs), not for behavioral concentration of adversarial accounts on hosts.

Finding 4: Prior pds.rip claim not supported

Status: unresolved.

A prior internal note claimed that a pds.rip cluster was “the first real coordinated inauthentic behavior pattern via PDS data” and “validated the hosting-locus thesis.”

This claim has no support in the persisted state examined so far. Driftwatch sees 32 live accounts on pds.rip with zero seed presence. Labelwatch has zero label events for those 32 DIDs, zero alerts referencing pds.rip, and no hosting-locus findings table. The only persistent reference is a provider registry suffix-match rule classifying pds.rip as known_alt.

The original observation may have been a transient CLI output that was never persisted. “Validated” is not supportable for a one-off result that left no persistent trace. This is a narrow finding about one specific prior claim, not a general assessment of data reliability.

What this means going forward

Provider registry gaps

The investigation also found that Labelwatch’s provider registry is sparse for long-tail hosts. Of 7 hosts with elevated seed:live ratios, only 2 (atproto.brid.gy, pds.rip) had provider registry entries. skystack.xyz, pds.1440.news, northsky.social, and bsky.bestofmodels.blog were unclassified. Any downstream analysis relying on provider classification would miss them.
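To illustrate how a suffix-match registry leaves long-tail hosts unclassified, here is a minimal sketch. The registry structure, the function name, and the longest-suffix tie-break are assumptions. Only the pds.rip entry and its known_alt class come from the findings above; the real registry's contents and matching rules may differ.

```python
def classify_host(host, registry):
    """Classify a host by the longest matching registry suffix.

    A suffix matches only on a label boundary, so "notpds.rip"
    does not match the "pds.rip" rule.
    """
    host = host.lower().rstrip(".")
    best = None
    for suffix, label in registry.items():
        if host == suffix or host.endswith("." + suffix):
            if best is None or len(suffix) > len(best[0]):
                best = (suffix, label)
    return best[1] if best else "unclassified"

# Assumed registry shape; only this one entry is taken from the text.
REGISTRY = {"pds.rip": "known_alt"}

hosts = ["pds.rip", "skystack.xyz", "pds.1440.news",
         "northsky.social", "bsky.bestofmodels.blog"]
gaps = [h for h in hosts if classify_host(h, REGISTRY) == "unclassified"]
print(gaps)
```

Listing the unclassified hosts explicitly, rather than silently dropping them, makes the registry gap visible to any downstream analysis.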
