GLOSSARY

Stylometric Fingerprint

A quantitative profile of a writer's or brand's language across cadence, vocabulary, structure, and rhetorical patterns. Reproducible and enforceable in code. The technical layer underneath a Foundrkit.

The technique

Stylometry is the statistical analysis of writing style. Forensic linguists use it to identify authors. Brand teams can use it to identify and enforce their own language.

A stylometric fingerprint captures:

Sentence length distribution
Vocabulary clusters (which word families recur)
Cadence patterns (long sentences followed by short ones, etc.)
Forbidden words (vs a baseline corpus)
Exemplar sentences (the most representative samples)

These dimensions are deterministic and reproducible: run the analysis twice on the same content and you get the same fingerprint.
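As a rough illustration of the dimensions listed above, here is a minimal Python sketch that computes a few of them from raw text. It is not the Foundrkit or media-tsunami implementation; the function names, the naive sentence splitter, and the returned field names are all assumptions made for this example.

```python
import re
from collections import Counter
from statistics import mean, pstdev


def split_sentences(text):
    # Naive splitter (an assumption): break on whitespace after ., !, or ?
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def fingerprint(text, top_n=5):
    """Return a small, hypothetical fingerprint dict for `text`."""
    sentences = split_sentences(text)
    lengths = [len(re.findall(r"[A-Za-z']+", s)) for s in sentences]
    if not lengths:
        raise ValueError("text contains no sentences")

    # Vocabulary: most frequent words, with ties broken alphabetically
    # so the ordering never depends on input order.
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    top_words = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

    # Cadence: count adjacent pairs where an above-average sentence
    # is followed by one at or below the average length.
    avg = mean(lengths)
    long_then_short = sum(
        1 for a, b in zip(lengths, lengths[1:]) if a > avg and b <= avg
    )

    return {
        "sentence_count": len(sentences),
        "mean_sentence_length": round(avg, 2),
        "stdev_sentence_length": round(pstdev(lengths), 2),
        "long_then_short_pairs": long_then_short,
        "top_words": top_words,
    }


sample = (
    "We build tools. They are small, sharp, and meant to be used "
    "every single day. Ship it."
)
# No randomness is involved, so two runs on the same text agree.
assert fingerprint(sample) == fingerprint(sample)
```

Because every step is a pure function of the input text, the reproducibility claim holds for this sketch by construction. A production analysis would likely need a more robust sentence splitter and a baseline corpus for comparison.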

Why this matters for AI

Style guides written in prose ("our voice is confident but approachable") can't be enforced mechanically: AI tools interpret them differently every time.

A stylometric fingerprint is machine-readable. AI tools can verify their output matches it. The fingerprint becomes a constraint layer over generation. The Foundrkit bundles this fingerprint together with the rules engine and forbidden-word list that operationalize it.
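To make the idea of a constraint layer concrete, the sketch below checks a piece of generated text against a small fingerprint and reports violations. The fingerprint fields, the threshold, and the forbidden words are hypothetical values chosen for illustration, not the format any particular tool uses.

```python
import re

# Hypothetical fingerprint; field names and values are illustrative only.
FINGERPRINT = {
    "max_mean_sentence_length": 14.0,
    "forbidden_words": {"leverage", "synergy", "utilize"},
}


def check(text, fp=FINGERPRINT):
    """Return a list of human-readable violations (empty if text passes)."""
    violations = []
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    words = re.findall(r"[a-z']+", text.lower())

    # Cadence constraint: average words per sentence must stay under the cap.
    if sentences:
        mean_len = len(words) / len(sentences)
        limit = fp["max_mean_sentence_length"]
        if mean_len > limit:
            violations.append(
                f"mean sentence length {mean_len:.1f} exceeds {limit}"
            )

    # Vocabulary constraint: report each forbidden word once, sorted.
    for word in sorted(set(words) & fp["forbidden_words"]):
        violations.append(f"forbidden word: {word}")

    return violations


print(check("We ship small tools. They work."))
print(check("We leverage synergy."))
```

In a generation loop, a non-empty result could be fed back to the model as a revision request, or used to reject the draft outright. Which of the two is appropriate depends on the workflow.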

Skills That Address This

GROUND TRUTH

media-tsunami

The empirical layer. Extracts brand voice as executable code — cadence, vocabulary, forbidden words, exemplar sentences — serialized as a CLAUDE.md any LLM can load.

VOICE

whystrohm-voice-extract

Extract a 6-dimension voice profile from any URL. Generate 15-20 enforceable guardrails. Outputs as CLAUDE.md.

