Stylometric Fingerprint
A quantitative profile of a writer's or brand's language across cadence, vocabulary, structure, and rhetorical patterns. Reproducible and enforceable in code. The technical layer underneath a Foundrkit.
The technique
Stylometry is the statistical analysis of writing style. Forensic linguists use it to identify authors. Brand teams can use it to identify and enforce their own language.
A stylometric fingerprint captures:
- Sentence length distribution
- Vocabulary clusters (which word families recur)
- Cadence patterns (long sentences followed by short ones, etc.)
- Forbidden words (identified by comparison with a baseline corpus)
- Exemplar sentences (the most representative samples)
These measurements are deterministic and reproducible: run the analysis twice on the same content and you get the same fingerprint.
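As a rough illustration, the dimensions above can be computed with the Python standard library alone. This is a minimal sketch under stated assumptions, not the Foundrkit implementation: the function names, the naive sentence splitter, the use of top word counts as a stand-in for vocabulary clusters, and the reading of "forbidden words" as baseline words the writer never uses are all hypothetical choices made for the example.

```python
import re
from collections import Counter
from statistics import mean, pstdev


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(text):
    # Lowercased alphabetic tokens; apostrophes are kept inside words.
    return re.findall(r"[a-z']+", text.lower())


def fingerprint(text, baseline_text, top_n=5):
    """Build a small, deterministic style profile for `text`."""
    sentences = split_sentences(text)
    if not sentences:
        raise ValueError("text contains no sentences")
    lengths = [len(tokenize(s)) for s in sentences]
    avg = mean(lengths)

    # Vocabulary: most frequent words, ties broken alphabetically so the
    # result does not depend on input order.
    vocab = Counter(tokenize(text))
    ranked = sorted(vocab.items(), key=lambda kv: (-kv[1], kv[0]))

    # Cadence: share of adjacent sentence pairs where an above-average
    # sentence is followed by one at or below the average length.
    pairs = list(zip(lengths, lengths[1:]))
    long_short = sum(1 for a, b in pairs if a > avg and b <= avg)

    # Forbidden-word candidates (assumed definition): words present in the
    # baseline corpus that never appear in the analysed text.
    forbidden = sorted(set(tokenize(baseline_text)) - set(vocab))

    # Exemplars: the sentences whose length is closest to the mean.
    exemplars = sorted(
        sentences, key=lambda s: (abs(len(tokenize(s)) - avg), s)
    )[:2]

    return {
        "sentence_length": {
            "mean": round(avg, 2),
            "stdev": round(pstdev(lengths), 2),
        },
        "top_words": [w for w, _ in ranked[:top_n]],
        "long_then_short_rate": round(long_short / len(pairs), 2) if pairs else 0.0,
        "forbidden_candidates": forbidden,
        "exemplars": exemplars,
    }
```

Because every step is a plain count or a sort with an explicit tie-break, calling `fingerprint` twice on the same inputs returns identical dictionaries, which is the reproducibility property described above.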
Why this matters for AI
Style guides written in prose ("our voice is confident but approachable") can't be enforced mechanically. They give an AI tool nothing to measure a draft against, so each tool interprets them differently every time.
A stylometric fingerprint is machine-readable. AI tools can verify their output matches it. The fingerprint becomes a constraint layer over generation. The Foundrkit bundles this fingerprint together with the rules engine and forbidden-word list that operationalize it.
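To make the "constraint layer" idea concrete, here is a hedged sketch of a post-generation check. The `PROFILE` values, the `check_draft` function, and the tolerance rule are hypothetical and chosen only for illustration; a real fingerprint would carry more dimensions and would come from an analysis step rather than being hard-coded.

```python
import re

# Hypothetical fingerprint values, hard-coded for the example.
PROFILE = {
    "mean_sentence_length": 9.0,   # target words per sentence
    "length_tolerance": 4.0,       # allowed deviation from the target
    "forbidden_words": {"leverage", "synergy"},
}


def check_draft(draft, profile=PROFILE):
    """Return a list of violations; an empty list means the draft passes."""
    violations = []
    # Naive sentence split (assumption): ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", draft.strip()) if s]
    tokens = re.findall(r"[a-z']+", draft.lower())

    # Rule 1: no forbidden word may appear anywhere in the draft.
    for word in sorted(set(tokens) & profile["forbidden_words"]):
        violations.append(f"forbidden word: {word}")

    # Rule 2: mean sentence length must stay within the tolerance band.
    if sentences:
        mean_len = len(tokens) / len(sentences)
        target = profile["mean_sentence_length"]
        tolerance = profile["length_tolerance"]
        if abs(mean_len - target) > tolerance:
            violations.append(
                f"mean sentence length {mean_len:.1f} "
                f"outside target {target} +/- {tolerance}"
            )
    return violations
```

A generation loop could call `check_draft` on each candidate and regenerate or reject any output that returns a non-empty list, so the check acts as a gate on the model's output and does not depend on the model following a prose instruction.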
Related Terms
- Brand Language Drift: The measurable divergence between a brand's stated language and the content that actually ships, especially across channels and contributors.
- Deterministic Content: Content produced by systems that yield the same brand-aligned output for the same input, every time. Opposite of probabilistic AI generation.
Skills That Address This
media-tsunami
The empirical layer. Extracts brand voice as executable code — cadence, vocabulary, forbidden words, exemplar sentences — serialized as a CLAUDE.md any LLM can load.
whystrohm-voice-extract
Extract a 6-dimension voice profile from any URL. Generate 15-20 enforceable guardrails. Outputs as CLAUDE.md.