Designing a scalable AI-assisted UX writing system
AI was already being used in design, but without clear structure or guardrails. This caused inconsistent copy quality, tone, and brand alignment, and left UX writers at risk of reacting to AI outputs after the fact instead of guiding how AI was used. Our core challenge was controlling variance while keeping the speed and efficiency benefits of AI.
The core problem:
How do we scale AI safely without compromising UX quality?
We needed to:
Protect brand voice consistency
Reduce revision cycles
Clarify ownership and accountability
Create predictable review pathways
Allow AI usage without increasing risk
This was a systems design problem, not a copy problem.
Our hypothesis:
Treat the LLM as a conversational product, not a writing shortcut
If we:
Design behavioural guardrails
Structure knowledge routing
Gate usage by risk and complexity
Embed human review checkpoints
Then we can safely scale AI across Product Design (PD) and beyond.
AI workflow: Governance
Our AI-copy workflow maps out when and how AI can be used safely, defining ownership, review checkpoints, and escalation paths.
Projects begin with a quick assessment of complexity and product risk, guided by UX writers. This ensures AI is used where it is safe, while high-risk product experiences remain human-led.
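The gating step described above can be sketched in a few lines of Python. This is an illustration only: the Level scale, the function name choose_copy_path, and the path labels are hypothetical, and the middle path (AI draft with early writer involvement) is an assumption. The source only confirms that high-risk work stays human-led and that designers handle copy for low-risk, low-complexity work.

```python
from enum import Enum


class Level(Enum):
    """Hypothetical three-point scale for the initial assessment."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def choose_copy_path(complexity: Level, product_risk: Level) -> str:
    """Return who leads copy creation for a project (illustrative rules)."""
    # High-risk product experiences stay human-led, whatever the complexity.
    if product_risk is Level.HIGH:
        return "human-led"
    # Low-risk, low-complexity work: the designer creates copy with AI.
    if product_risk is Level.LOW and complexity is Level.LOW:
        return "designer-with-ai"
    # Assumption: everything in between gets an AI draft plus early
    # UX writer involvement.
    return "ai-draft-with-writer"
```

Keeping the rule as a small pure function makes the decision easy to audit and to change when the risk criteria evolve.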
AI workflow: Design and validation
Once the project path is defined, designers proceed with initial design. If the task is assessed as low-risk and low-complexity, they will also handle copy creation.
Stakeholder feedback is incorporated before the design moves to high-fidelity review. This allows teams to move quickly in early stages while maintaining alignment across product, engineering, and design.
AI workflow: Review and delivery
Once designs reach high-fidelity readiness, any AI-generated copy also enters the review stage. Designers submit review requests via our Slack workflow, with approved copy handed off to engineers for localisation and production delivery.
Slack workflow
Through Slack, designers can submit structured AI-copy review requests, providing the necessary context for UX writers to evaluate and approve outputs efficiently.
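As a rough sketch of what a structured request might carry, the Python below models one as a plain data class. Every field name here is hypothetical, and the code does not call any Slack API. It only shows how a request could be checked for missing context before it reaches a UX writer.

```python
from dataclasses import dataclass


@dataclass
class CopyReviewRequest:
    """One AI-copy review request. All field names are hypothetical."""
    designer: str
    design_link: str
    component: str       # e.g. "modal", "tooltip", "button"
    proposed_copy: str
    user_context: str    # what the user is doing when they see the copy
    ai_generated: bool = True

    def missing_fields(self) -> list[str]:
        """Return the names of required text fields left empty or blank."""
        required = [
            "designer",
            "design_link",
            "component",
            "proposed_copy",
            "user_context",
        ]
        return [name for name in required if not getattr(self, name).strip()]
```

A request with an empty user_context would return ["user_context"] from missing_fields(), so the form can prompt the designer before a reviewer spends time on it.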
Behaviour architecture: Designing LLM behaviour, not just prompts
In our LLM, we embedded:
Voice and tone constraints aligned to brand
Context-sensitive tone modulation logic
Component-level formatting rules
Sentence case and proper noun enforcement
Routing to function-level content guides
Escalation rules for high-risk scenarios
We defined how the assistant should behave across content types, not just what it should generate.
By creating multiple conversation paths, we helped guide designers with structured prompts so our LLM could gather the context needed to generate accurate UX copy.
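To make one of these rules concrete, here is a minimal Python sketch of sentence case with proper noun protection, written as a post-generation check. It is an assumption about how such a rule could be enforced outside the prompt, not a description of our setup. The function name and the proper noun list are hypothetical, and it handles a single sentence with single-word proper nouns only.

```python
def to_sentence_case(text: str, proper_nouns: set[str]) -> str:
    """Lowercase every word except the first and any listed proper noun."""
    # Map lowercase forms back to the approved spelling, e.g. "slack" -> "Slack".
    lookup = {noun.lower(): noun for noun in proper_nouns}
    result = []
    for i, word in enumerate(text.split()):
        # Split off trailing punctuation so "Slack." still matches "Slack".
        core = word.rstrip(".,!?:;")
        tail = word[len(core):]
        if core.lower() in lookup:
            core = lookup[core.lower()]
        elif i == 0:
            core = core[:1].upper() + core[1:].lower()
        else:
            core = core.lower()
        result.append(core + tail)
    return " ".join(result)
```

For example, to_sentence_case("Submit Your Review In Slack.", {"Slack"}) gives "Submit your review in Slack.". Acronyms and multi-word product names would need to be added to the lookup or handled separately.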
Reducing hallucinations through structured knowledge grounding
We created:
● A routing index to detect content type
● Function-level master guides (errors, warnings, good news, etc.)
● Form-factor constraints (modals, tags, tooltips, buttons)
● Domain-specific rules (Vouchers, Browser extension)
● Embedded style guide enforcement
These changes improved structural consistency and reduced stylistic drift.
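The routing idea can be illustrated with a simple keyword index in Python. This is a sketch under stated assumptions: the keywords, guide names, and the function route_request are hypothetical, and the real routing is defined in the assistant's instructions rather than in code like this.

```python
from typing import Optional

# Hypothetical routing index: content type -> trigger phrases.
ROUTING_INDEX = {
    "error": ["error", "failed", "couldn't", "invalid"],
    "warning": ["warning", "are you sure", "can't be undone"],
    "good_news": ["success", "saved", "all set"],
}

# Hypothetical names for the function-level master guides.
GUIDES = {
    "error": "Errors master guide",
    "warning": "Warnings master guide",
    "good_news": "Good news master guide",
}


def route_request(request: str) -> Optional[str]:
    """Return the guide for the best-matching content type, or None."""
    text = request.lower()
    # Score each content type by how many of its phrases appear.
    scores = {
        ctype: sum(phrase in text for phrase in phrases)
        for ctype, phrases in ROUTING_INDEX.items()
    }
    best = max(scores, key=scores.get)
    # No match: the caller should ask a follow-up question instead of guessing.
    if scores[best] == 0:
        return None
    return GUIDES[best]
```

Returning None when nothing matches keeps the assistant from grounding its answer in the wrong guide, which is one way unsupported copy can slip through.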
Building AI literacy across Product Design
Using Confluence as our AI library, we created:
● Prompt follow-up framework
● Decision criteria for choosing between AI options
● Evaluation checklist (clarity, tone, guidelines, context)
● Review time expectation guide
● Feedback logging mechanism
This allowed us to enable usage at scale without creating long-term dependency on UX writers.
Measuring output quality: Tracking AI performance over time
Insight
Most issues were clarity-related, not structural or compliance failures. This pointed to gaps in prompt refinement and context alignment, not a fundamental model weakness.
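A breakdown like this can be produced from a feedback log with a short tally. The Python below is a sketch only: the log format (a list of entries with a "category" key) and the function name issue_breakdown are assumptions, and no real figures are implied.

```python
from collections import Counter


def issue_breakdown(feedback_log: list[dict]) -> dict[str, float]:
    """Return each issue category's share of all logged issues."""
    # Assumes every entry carries a "category" key such as "clarity" or "tone".
    counts = Counter(entry["category"] for entry in feedback_log)
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() orders categories from most to least frequent.
    return {category: n / total for category, n in counts.most_common()}
```

Running this at regular intervals over the same log would show whether the share of clarity issues falls as prompts and context gathering improve.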
1. Guardrails improve consistency more than prompt tuning
AI outputs became reliable only after we embedded behavioural rules, formatting constraints, and escalation paths.
2. Ambiguous briefs create more variance than model limitations
Most revision issues were clarity-related. Structured prompts and context gathering reduced this significantly.
3. Risk-gated workflows increase team adoption
Clear rules around when AI is appropriate built trust and made teams more comfortable using it.
4. AI without operational workflows creates friction, not speed
AI tools alone didn't improve delivery speed. Structured review pathways and ownership models did.
Key takeaway
AI systems succeed when behaviour, governance, and workflows are designed together. Treating the LLM as a product with defined behaviour and operational guardrails allowed us to scale AI usage safely across Product Design.