Bilingual content does not fail only when translation is poor. It fails when one language carries the current product truth and the other carries an older, softer, or braver version.
A French product page and an English documentation page can sit on the same domain like two witnesses who did not prepare before court. Both are sincere. Both describe the same company. Then an answer engine asks what the product does, and the testimony comes out uneven: one language says the product supports a workflow, the other says the workflow is available only for selected integrations; one says enterprise, the other says mid-market; one says automated, the other says configurable.
A composite scenario makes the pattern easy to see. A 70-person French SaaS company sells finance workflow software to mid-market industrial groups. The English docs were updated by the product team after an ERP integration release. The French homepage still used broader language from an earlier positioning round, written for procurement confidence and investor neatness. In an AI answer, the company was described as an invoice automation suite for enterprise finance teams. That sounded flattering. It was also too wide. The model had stitched together the English docs, the French hero, and a category assumption. The strange detail: it named the right ERP connector, but gave the wrong buyer segment.
Translation is not the same as claim alignment
Many teams treat bilingual work as a language problem. The English page says something; the French page should say it elegantly in French. Or the French page carries the brand story; the English docs should support international users. Translation matters, of course. But the extraction risk starts before translation. It starts when the two versions do not agree on what is true.
A language model does not politely respect the internal hierarchy of “French site for market positioning, English docs for technical users.” It may pull from both. If the English docs are current and precise, they may supply the feature facts. If the French marketing page is broader, it may supply the category and buyer frame. The final answer becomes a hybrid that nobody in the company wrote.
That hybrid can be difficult to correct because each ingredient has a source. The model did not invent everything from nothing. It used the company’s own material. The team looks at the answer and says, “Well, that phrase is from the homepage, and that connector is from the docs, and that segment is implied in the pricing note.” The error lives between pages and languages.
I call this bilingual claim drift. Bilingual claim drift is the separation of product truth across language versions: updates, tone choices, or audience assumptions make one language more current, narrower, or wider than the other. It is not mistranslation at root. It is a truth mismatch wearing the clothes of translation.
One language often becomes the product’s current memory
In many French technology firms, English documentation moves faster than the French marketing site. Developers write in English. API references use English. Release notes may be written first for international partners. Product managers update docs when a feature changes because customers need the information. The French homepage waits for a larger rewrite.
Over time, the English source becomes the product’s current memory. It knows what is live, deprecated, limited, renamed, or still in beta. The French site becomes the product’s public posture. It knows what the company wants to sound like. Both are useful. Together, without reconciliation, they are unstable.
The reverse can happen too. A French page may contain a carefully approved compliance boundary, while English marketing simplifies it for international readability. Or the French case study may name a real operational outcome, while the English page turns it into broad impact language. The point is not that English is always more precise. The point is that one language often gets closer to operational truth, and the other language often gets closer to market performance.
Answer engines are sensitive to that gap. They may translate, summarize, compress, and cross-fill. A phrase from one language can become the implied boundary for the other. If the English docs say “invoice approval routes can be configured by role,” and the French page says “automatisation complète du cycle fournisseur,” the model may state full supplier-cycle automation. That is not the same claim.
A buyer may never see the source conflict. They see only the confident summary.
Reconcile capability before you localize tone
The order of work matters. Teams often write the French page with one intention and the English page with another, then ask a translator or content person to smooth the difference. That is late-stage cleaning. It cannot repair a claim conflict that nobody has named.
I prefer a small bilingual claim table before any prose work. Not a huge localization spreadsheet. Just the load-bearing claims: product category, buyer, main actions, supported objects, integrations, availability status, compliance scope, exclusions, and proof. Each claim gets a current source shelf. Which page is the authority? Which language is current? Which sentence can be repeated without repair?
This table is not meant to become public. It keeps the public pages from arguing. Once the claims agree, French and English can differ in rhythm, examples, and market emphasis. A French page may speak more directly to local procurement norms. An English docs page may use tighter technical labels. Those differences are healthy when the underlying facts stay still.
The key phrase is “capabilities before tone.” A company should reconcile what the product does before deciding how each language should sound. If tone comes first, the French page may become braver than the docs, or the English page may become narrower than the product story. Either way, an answer engine gets a split signal.
For the finance SaaS composite, the first claim table would catch the buyer mismatch. The docs imply mid-market industrial finance teams through setup examples and ERP context. The French homepage says “grandes organisations financières,” a phrase that suggests a larger and more general market. The answer engine does not know the phrase is aspirational. It treats it as source text.
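A claim table of this kind can be sketched in a few lines of Python. This is a minimal illustration, not a tool I am prescribing. The `Claim` record, its field names, and the `conflicts` helper are hypothetical. The values are assumed to be normalized by an editor into one working language before comparison (so “grandes organisations financières” is glossed here, roughly, as “large finance organisations”), and the agreeing category row is invented to show what a quiet row looks like.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    name: str         # the load-bearing claim, e.g. "buyer"
    en: str           # what the English source supports (editor-normalized)
    fr: str           # what the French source supports (editor-normalized)
    lead_source: str  # the page treated as the authority for this claim


def conflicts(table):
    """Return the claims whose two language versions disagree."""
    return [c for c in table if c.en != c.fr]


# Illustrative rows only; every value is an assumption based on the composite.
table = [
    Claim("product category", "finance workflow software",
          "finance workflow software", "English docs"),
    Claim("buyer", "mid-market industrial finance teams",
          "large finance organisations", "pricing note"),
    Claim("main action", "invoice approval routing by role",
          "full supplier-cycle automation", "English docs"),
]

for c in conflicts(table):
    print(f"{c.name}: EN={c.en!r} / FR={c.fr!r} -> follow {c.lead_source}")
```

The exact-match comparison is deliberately crude. Its only job is to make someone name the disagreement before any prose work begins.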
The three fractures I look for
Bilingual conflicts tend to appear in three places. I call them the three fractures of source alignment: scope fracture, status fracture, and buyer fracture. The names are plain because the problem is plain once you see it.
Scope fracture happens when one language says a capability applies broadly while the other language limits it. A French page may say the platform automates supplier finance workflows, while English docs describe invoice approval routing. Those are related, but they are not identical. The model may choose the wider version and attach technical proof from the narrower one.
Status fracture happens when a feature is live in one language and softer or older in another. The English changelog says a connector is in beta. The French feature page says the product connects with the system. The model may flatten beta into availability. In a sales conversation, that flattening becomes a support problem with better grammar.
Buyer fracture happens when the product is described for different audiences across languages. French pages might speak to directors, procurement teams and enterprise committees. English docs might be written for administrators, developers or operations teams. If neither version states the actual buying and user roles cleanly, the model blends them.
These fractures are small enough to escape a normal brand review. The wording sounds plausible in each language. The error appears only when an answer system is asked to produce one summary from both. That is why I do not audit bilingual pages only for fluency. Fluency can hide disagreement.
A page can be beautifully translated and still be unsafe to quote.
The source shelf must say which language leads
A company needs a hierarchy for bilingual truth. Without one, the newest sentence wins by accident. Or the most technical sentence wins because it is specific. Or the strongest marketing sentence wins because it is repeated. None of those are reliable editorial decisions.
For each claim, there should be a lead source. The lead source might be English docs for connector status, French product pages for local buyer definitions, a pricing note for market segment, or a compliance page for RGPD scope. The important part is that every other page follows the lead source when it carries the same fact.
This does not mean a public page must announce its own hierarchy. The reader does not need to see the machinery. But the writing should show the discipline. If the English docs say “available for selected ERP integrations,” the French feature page should not say “connects to your ERP ecosystem” without a boundary. If the French pricing note makes clear that deployment is sales-led and mid-market, the English overview should not imply self-serve availability.
The hierarchy also helps with archived facts. Old docs, old release notes, and old French pages have long shadows. A model may encounter a page that humans no longer treat as central. If that page is still public, it needs status wording. “Archived,” “deprecated,” “planned,” “beta,” “available for selected customers” — these words are not bureaucratic clutter. They are extraction brakes.
In the composite finance SaaS case, a docs page carried the newer ERP fact, but the French page carried the older category promise. The model did exactly what a tired analyst might do at midnight: it combined them and moved on.
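The lead-source rule can be written down as a small lookup plus a check. The sketch below is one possible shape, under stated assumptions: the page keys (`en_docs`, `fr_feature_page`, and so on), the claim names, and the `followers_out_of_line` function are all hypothetical, and the recorded wordings are editor summaries rather than scraped text.

```python
# Hypothetical mapping: claim type -> page that leads for that fact.
LEAD_SOURCE = {
    "connector_status": "en_docs",
    "buyer": "fr_product_page",
    "market_segment": "pricing_note",
    "rgpd_scope": "compliance_page",
}

# Hypothetical snapshot of what each public page currently says.
PAGES = {
    "en_docs": {"connector_status": "available for selected ERP integrations"},
    "fr_feature_page": {"connector_status": "connects to your ERP ecosystem"},
    "pricing_note": {"market_segment": "mid-market, sales-led"},
    "en_overview": {"market_segment": "self-serve"},
}


def followers_out_of_line(lead_source, pages):
    """List (claim, page, lead) where a page contradicts the lead source."""
    problems = []
    for claim, lead in lead_source.items():
        authoritative = pages.get(lead, {}).get(claim)
        if authoritative is None:
            continue  # the lead page has no recorded wording for this claim yet
        for page, claims in pages.items():
            if page != lead and claim in claims and claims[claim] != authoritative:
                problems.append((claim, page, lead))
    return problems


for claim, page, lead in followers_out_of_line(LEAD_SOURCE, PAGES):
    print(f"{page} disagrees with {lead} on {claim}")
```

With these sample entries, the check flags the French feature page on connector status and the English overview on market segment. The value is less in the code than in the mapping: someone has decided, claim by claim, which page leads.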
Write parallel sentences that can survive crossing languages
The practical test is simple. Choose the five sentences a model is most likely to lift about the product. Translate them both ways. Then ask whether each sentence still names the same buyer, action, object, limit and status. If a word becomes broader in one language, stop. If a role changes, stop. If a live feature becomes a general promise, stop.
Parallel does not mean wooden. The French sentence can use French business language. The English sentence can use English product terms. But the load-bearing nouns and verbs should carry the same weight. “Approval routing” cannot quietly become “supplier-cycle automation.” “Mid-market industrial finance teams” cannot quietly become “enterprise finance leaders.” “Selected ERP connectors” cannot quietly become “ERP ecosystem.”
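The crossing test can also be sketched as a field-by-field comparison. This assumes an editor has already translated each sentence both ways and written down, by hand, what it names for buyer, action, object, limit and status. The `Annotated` record, the `drift` function, and the mapping from fields to the three fractures are my own illustrative choices, and the "live" status is assumed for the example.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Annotated:
    """What one liftable sentence names, as judged by a human editor."""
    buyer: str
    action: str
    obj: str
    limit: str
    status: str


# Assumed mapping from each checked field to one of the three fractures.
FRACTURE = {
    "buyer": "buyer",
    "status": "status",
    "action": "scope",
    "obj": "scope",
    "limit": "scope",
}


def drift(en, fr):
    """Return {field: fracture} for every field that differs across languages."""
    return {
        f.name: FRACTURE[f.name]
        for f in fields(Annotated)
        if getattr(en, f.name) != getattr(fr, f.name)
    }


en = Annotated("mid-market industrial finance teams", "approval routing",
               "invoices", "selected ERP connectors", "live")
fr = Annotated("enterprise finance leaders", "supplier-cycle automation",
               "invoices", "ERP ecosystem", "live")

print(drift(en, fr))
```

An empty result means the sentence survived the crossing. Anything else is a reason to stop and decide which version is allowed to be true.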
This is slow work. It feels less glamorous than rewriting a hero section. It is also the work that prevents a model from building a summary out of mismatched parts. A bilingual site is not two separate brochures. For extraction, it is one source system with two voices.
I often end these audits with fewer rewritten sentences than the team expected. The big gain comes from choosing which sentence is allowed to be true everywhere. Once that sentence exists, the rest of the page can breathe. The French version can have its cadence. The English docs can keep their precision. The model gets a clean seam instead of a crack.
The Quotation Slip
Liftable line: “The SaaS routes invoice approvals for mid-market industrial finance teams through selected ERP-connected workflows.”
Loose thread: English docs describe the current feature boundary, while the French page widens buyer and scope.
Source shelf: French product page, English documentation, pricing note, changelog.
Quiet test: Could an LLM translate the claim both ways without changing buyer, status or capability?