02 · Translation Corpus Sanitation
Enterprise-grade AI sanitation and corpus alignment — without hallucinations.
Stop feeding legacy errors to your LLMs and your translation
pipelines. Revix isolates exactly which segments need correction
with deterministic rule-based filters, applies targeted AI
execution only on the filtered subset, and surfaces the
precise delta for human validation. The model edits only the
designated terminology; surrounding context and
inline tags stay byte-identical to source. Linguists no longer
edit line by line; they validate a curated sample of the
delta.
- Zero-hallucination guardrails: targeted AI execution is scoped to the rule-filtered subset. Inline tags (<bpt>, <g>, <x/>) are preserved as Unicode placeholders the model cannot break; segments outside the filter stay byte-identical to source.
- Full translation-memory lifecycle: global glossary overhauls, brand renames, compliance updates, morphological corrections (inflection / case / agreement). Multi-condition filters (AND / OR / contains / does-not-contain) define precisely which segments are in scope.
- The human moat: linguists validate a curated sample of the delta rather than editing from scratch. Review time drops ~90 %. Concurrent reviewers are protected by 30 s heartbeat segment locks and a 90 s sweep, with live exclusion as you type.
- System-agnostic infrastructure: XLIFF, SDLXLIFF, MQXLIFF, TMX, and two-column XLSX in; same format out, inline tags re-inflated. Revix doesn't replace your CAT ecosystem; it cleans the data that feeds Trados, Phrase, BureauWorks, Catmint, or anything else.
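To make the first two points concrete, here is a minimal Python sketch of how a composed filter and tag placeholders could fit together. It is an illustration under stated assumptions, not Revix's implementation: every name in it (contains, not_contains, all_of, any_of, mask_tags, unmask_tags, sanitize) is hypothetical, the use of Private Use Area code points as placeholders is one possible choice, and the edit callback merely stands in for the AI step.

```python
import re
from typing import Callable, List, Tuple

# All names here are hypothetical illustrations, not a real Revix API.
Predicate = Callable[[str], bool]

def contains(term: str) -> Predicate:
    return lambda text: term in text

def not_contains(term: str) -> Predicate:
    return lambda text: term not in text

def all_of(*preds: Predicate) -> Predicate:
    """AND: every condition must hold."""
    return lambda text: all(p(text) for p in preds)

def any_of(*preds: Predicate) -> Predicate:
    """OR: at least one condition must hold."""
    return lambda text: any(p(text) for p in preds)

# Matches inline tags such as <bpt id="1">, </g>, or <x/>.
TAG_RE = re.compile(r"</?[A-Za-z][^<>]*?>")
# Assumption: placeholders come from the Unicode Private Use Area and the
# source text does not already contain these code points.
PUA_START = 0xE000

def mask_tags(text: str) -> Tuple[str, List[str]]:
    """Swap each inline tag for a single placeholder character."""
    tags: List[str] = []
    def repl(match: "re.Match[str]") -> str:
        tags.append(match.group(0))
        return chr(PUA_START + len(tags) - 1)
    return TAG_RE.sub(repl, text), tags

def unmask_tags(text: str, tags: List[str]) -> str:
    """Restore the original tags in place of their placeholders."""
    for i, tag in enumerate(tags):
        text = text.replace(chr(PUA_START + i), tag)
    return text

def sanitize(segments: List[str], in_scope: Predicate,
             edit: Callable[[str], str]) -> List[str]:
    """Apply `edit` only to in-scope segments, keeping tags intact."""
    result = []
    for seg in segments:
        if not in_scope(seg):
            result.append(seg)  # out of scope: passed through untouched
            continue
        masked, tags = mask_tags(seg)
        edited = edit(masked)
        # Reject the edit if any placeholder was dropped or duplicated.
        intact = all(edited.count(chr(PUA_START + i)) == 1
                     for i in range(len(tags)))
        result.append(unmask_tags(edited, tags) if intact else seg)
    return result
```

For example, with in_scope = all_of(contains("Acme"), not_contains("legacy")) and an edit that replaces "Acme" with "Apex", a segment such as 'Open the <g id="1">Acme</g> dashboard.' would have only the brand term changed, while a segment containing "legacy" would be returned exactly as it came in. Rejecting any edit that loses a placeholder is one conservative way to keep tags from being broken; a real system may handle that case differently.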
The bottleneck has always been review, not edit.
For a senior reviewer, normalising a brand term across a
50,000-segment memory used to mean reading every flagged
segment and editing by hand: roughly 5 days of work. With
Revix the edits are already made; the linguist validates a
curated sample of the delta in a 4-hour exercise. Review time
drops by ~90 %, and a multi-week archive update collapses
into a single working day.