{"id":16956,"date":"2026-04-12T14:26:46","date_gmt":"2026-04-12T14:26:46","guid":{"rendered":"https:\/\/dmsretail.com\/RetailNews\/non-obvious-patterns-in-building-enterprise-ai-assistants\/"},"modified":"2026-04-12T14:26:46","modified_gmt":"2026-04-12T14:26:46","slug":"non-obvious-patterns-in-building-enterprise-ai-assistants","status":"publish","type":"post","link":"https:\/\/dmsretail.com\/RetailNews\/non-obvious-patterns-in-building-enterprise-ai-assistants\/","title":{"rendered":"Non-Obvious Patterns in Building Enterprise AI Assistants"},"content":{"rendered":"<p> <p><a href=\"https:\/\/dmsretail.com\/online-workshops-list\/\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-496\" src=\"https:\/\/dmsretail.com\/RetailNews\/wp-content\/uploads\/2022\/05\/RETAIL-ONLINE-TRAINING-728-X-90.png\" alt=\"Retail Online Training\" width=\"729\" height=\"91\" srcset=\"https:\/\/dmsretail.com\/RetailNews\/wp-content\/uploads\/2022\/05\/RETAIL-ONLINE-TRAINING-728-X-90.png 729w, https:\/\/dmsretail.com\/RetailNews\/wp-content\/uploads\/2022\/05\/RETAIL-ONLINE-TRAINING-728-X-90-300x37.png 300w\" sizes=\"auto, (max-width: 729px) 100vw, 729px\" \/><\/a><\/p><br \/>\n<\/p>\n<div>\n<p><strong><em>Lessons from building production AI systems that nobody talks about.<\/em><\/strong><\/p>\n<p>The conversation around AI agents has moved fast. A year ago, everyone was optimizing RAG pipelines. Now the discourse centers on context engineering, MCP\/A2A protocols, agentic coding tools that read\/manage entire codebases, and multi-agent orchestration patterns. The frameworks keep advancing.<\/p>\n<p>After 18 months building the AI Assistant at Cisco Customer Experience (CX), we\u2019ve learned that the challenges determining real-world success are rarely the ones getting attention. Our system uses multi-agent design patterns over structured enterprise data (mostly SQL, like most enterprises). 
The patterns that follow emerged from making that system actually useful to the business.<\/p>\n<p>This post isn\u2019t about the obvious. It\u2019s about some of the unglamorous patterns that determine whether your system gets used or abandoned.<\/p>\n<p><strong>1. The Acronym Problem<br \/><\/strong><\/p>\n<p style=\"text-align: left;\">Enterprise environments are dense with internal terminology. A single conversation might include ATR, MRR, and NPS, each carrying specific internal meaning that differs from common usage.<\/p>\n<p>To a foundation model, ATR might mean Average True Range or Annual Taxable Revenue. To our business users, it means Available to Renew. The same acronym can also mean completely different things within the company, depending on the context:<\/p>\n<p style=\"text-align: center;\">User: \u201cSet up a meeting with our CSM to discuss the renewal strategy\u201d<br \/>AI: CSM \u2192 Customer Success Manager (context: renewal)<\/p>\n<p style=\"text-align: center;\">User: \u201cCheck the CSM logs for that firewall issue\u201d<br \/>AI: CSM \u2192 Cisco Security Manager (context: firewall)<\/p>\n<p>NPS could be Net Promoter Score or Network Protection Solutions, both completely valid depending on context. Without disambiguation, the model guesses. It guesses confidently. It guesses wrong.<\/p>\n<p>The naive solution is to expand acronyms in your prompt. But this creates two problems: first, you need to know which acronyms need expansion (and LLMs hallucinate expansions confidently). Second, enterprise acronyms are often ambiguous even within the same organization.<\/p>\n<p>We maintain a curated company-wide collection of over 8,000 acronyms with domain-specific definitions. 
Early in the workflow, before queries reach our domain agents, we extract potential acronyms, capture surrounding context for disambiguation, and look up the correct expansion.<\/p>\n<p>50% of all queries asked by CX users to the AI Assistant contain one or more acronyms and receive disambiguation before reaching our domain agents.<\/p>\n<p>The key detail: we inject definitions as context while preserving the user\u2019s original terminology. By the time domain agents execute, acronyms are already resolved.<\/p>\n<p><strong>2. The Clarification Paradox<\/strong><\/p>\n<p>Early in development, we built what seemed like a responsible system: when a user\u2019s query lacked sufficient context, we asked for clarification. \u201cWhich customer are you asking about?\u201d \u201cWhat time period?\u201d \u201cCan you be more specific?\u201d<\/p>\n<p>Users did not like it, and a clarification question would often get downvoted.<\/p>\n<p>The problem wasn\u2019t the questions themselves. It was the repetition. A user would ask about \u201ccustomer sentiment,\u201d receive a clarification request, provide a customer name, and then get asked about time period. Three interactions to answer one question.<\/p>\n<p><em>Research on multi-turn conversations<\/em> shows a 39% performance degradation compared to single-turn interactions. When models take a wrong turn early, they rarely recover. Every clarification question is another turn where things can derail.<\/p>\n<p>The fix was counterintuitive: classify clarification requests as a last resort, not a first instinct.<\/p>\n<p>We implemented a precedence system where \u201cproceed with reasonable defaults\u201d outranks \u201cask for more information.\u201d If a user provides any useful qualifier (a customer name, a time period, a region), assume \u201call\u201d for missing dimensions. Missing time period? Default to the next two fiscal quarters. Missing customer filter? 
Assume all customers within the user\u2019s access scope.<\/p>\n<p>This is where intelligent reflection also helps tremendously: when an agent\u2019s initial attempt returns limited results but a close alternative exists (say, a product name matching a slightly different variation), the system can automatically retry with the corrected input rather than bouncing a clarification question back to the user. The goal is resolving ambiguity behind the scenes whenever possible, and being transparent to users about what filters the agents used.<\/p>\n<p>Early versions asked for clarification on 30%+ of queries. After tuning the decision flow with intelligent reflection, that dropped below 10%.<\/p>\n<p><img fetchpriority=\"high\" decoding=\"async\" class=\"lazy lazy-hidden aligncenter wp-image-489685 \" data-lazy-type=\"image\" src=\"https:\/\/blogs.cisco.com\/gcs\/ciscoblogs\/1\/2026\/04\/mermaid-diagram-2026-04-09-092319.png\" alt=\"\" width=\"1211\" height=\"261\"\/><noscript><img fetchpriority=\"high\" decoding=\"async\" class=\"aligncenter wp-image-489685 \" src=\"https:\/\/blogs.cisco.com\/gcs\/ciscoblogs\/1\/2026\/04\/mermaid-diagram-2026-04-09-092319.png\" alt=\"\" width=\"1211\" height=\"261\"\/><\/noscript><\/p>\n<p style=\"text-align: center;\"><strong><em>Figure: Decision flow for clarification, with intelligent reflection<\/em><\/strong><\/p>\n<p>The key insight: users would rather receive a broader result set they can filter mentally than endure a clarification dialogue. The cost of showing slightly more data is lower than the cost of friction.<\/p>\n<p><strong>3. Guided Discovery Over Open-Ended Conversation<\/strong><\/p>\n<p>We added a feature called \u201cCompass\u201d that suggests a logical next question after each response. \u201cWould you like me to break down customer sentiment by product line?\u201d<\/p>\n<p>Why not just ask the LLM to suggest follow-ups? 
Because a foundation model that doesn\u2019t understand your business will suggest queries your system can\u2019t actually handle. It will hallucinate capabilities. It will propose analysis that sounds reasonable but leads nowhere.<\/p>\n<p>Compass grounds suggestions in actual system capabilities. Rather than generating open-ended suggestions (\u201cIs there anything else you\u2019d like to know?\u201d), it proposes specific queries the system can definitely fulfill, aligned to business workflows the user cares about.<\/p>\n<p>This serves two purposes. First, it helps users who don\u2019t know what to ask next. Enterprise data systems are complex; business users often don\u2019t know what data is available. Guided suggestions teach them the system\u2019s capabilities through example. Second, it keeps conversations productive and on rails.<\/p>\n<p>Approximately 40% of multi-turn conversations within the AI Assistant include an affirmative follow-up, demonstrating how contextually relevant follow-up suggestions can improve user retention and conversation continuity and guide discovery.<\/p>\n<p>We found this pattern valuable enough that we open-sourced a standalone implementation:\u00a0langgraph-compass. The core insight is that follow-up generation should be decoupled from your main agent so it can be configured, constrained, and grounded independently.<\/p>\n<p><strong>4. Deterministic Security in Probabilistic Systems<\/strong><\/p>\n<p>Role-based access control cannot be delegated to an LLM.<\/p>\n<p>The intuition might be to inject the user\u2019s permissions into the prompt: \u201cThis user has access to accounts A, B, and C. Only return data from those accounts.\u201d This does not work. The model might follow the instruction. It might not. It might follow it for the first query and forget by the third. It can be jailbroken. It can be confused by adversarial input. 
Prompt-based identity is not identity enforcement.<\/p>\n<p>The risk is subtle but severe: a user crafts a query that tricks the model into revealing data outside their scope, or the model simply drifts from the access rules mid-conversation. Compliance and audit requirements make this untenable. You cannot explain to an auditor that access control \u201cusually works.\u201d<\/p>\n<p>Our RBAC implementation is entirely deterministic and completely opaque to the LLM. Before any query executes, we parse it and inject access control predicates in code. The model never sees these predicates being added; it never makes access decisions. It formulates queries; deterministic code enforces boundaries.<\/p>\n<p>When access filtering produces empty results, we detect it and tell the user: \u201cNo records are visible with your current access permissions.\u201d They know they\u2019re seeing a filtered view, not a complete absence.<\/p>\n<p>Liz Centoni, Cisco\u2019s EVP of Customer Experience, has written about\u00a0the broader framework for building trust in agentic AI, including governance by design and RBAC as foundational principles. These aren\u2019t afterthoughts. They\u2019re prerequisites.<\/p>\n<p><strong>5. Empty Results Need Explanations<\/strong><\/p>\n<p>When a database query returns no rows, your first instinct might be to tell the user \u201cno data found.\u201d This is almost always the wrong answer.<\/p>\n<p>\u201cNo data found\u201d is ambiguous. Does it mean the entity doesn\u2019t exist? The entity exists but has no data for this time period? The query was malformed? The user doesn\u2019t have permission to see the data?<\/p>\n<p>Each scenario requires a different response. The third is a bug. The fourth is a policy that needs transparency (see section above).<\/p>\n<p><em>System-enforced filters (RBAC):<\/em> The data exists, but the user doesn\u2019t have permission to see it. 
The right response: \u201cNo records are visible with your current access permissions. Records matching your criteria exist in the system.\u201d This is transparency, not an error.<\/p>\n<p><em>User-applied filters:<\/em> The user asked for something specific that doesn\u2019t exist. \u201cShow me upcoming subscription renewals for ACME Corp in Q3\u201d returns empty because there are no renewals scheduled for that customer in that period. The right response explains what was searched: \u201cI couldn\u2019t find any subscriptions up for renewal for ACME Corp in Q3. This could mean there are no active subscriptions, or the data hasn\u2019t been loaded yet.\u201d<\/p>\n<p><em>Query errors:<\/em> The filter values don\u2019t exist in the database at all. The user misspelled a customer name or used an invalid ID. The right response suggests corrections.<\/p>\n<p>We handle this at multiple layers. When queries return empty, we analyze what filters eliminated records and whether filter values exist in the database. When access control filtering produces zero results, we check whether results would exist without the filter. The synthesis layer is instructed to never say \u201cthe SQL query returned no results.\u201d<\/p>\n<p>This transparency builds trust. Users understand the system\u2019s boundaries rather than suspecting it\u2019s broken.<\/p>\n<p><strong>6. Personalization is Not Optional<\/strong><\/p>\n<p>Most enterprise AI is designed as a one-size-fits-all interface. But people expect an \u201cassistant\u201d to adapt to their unique needs and support their way of working. Pushing a rigid system without primitives for customization causes friction. Users try it, find it doesn\u2019t fit their workflow, and abandon it.<\/p>\n<p>We addressed this on multiple fronts.<\/p>\n<p><em>Shortcuts<\/em>\u00a0allow users to define command aliases that expand into full prompts. 
Instead of typing out \u201cSummarize renewal risk for ACME Corp, provide a two paragraph summary highlighting key risk factors that may influence likelihood of non-renewal of Meraki subscriptions\u201d, a user can simply type\u00a0<em>\/risk ACME Corp<\/em>. We took inspiration from agentic coding tools like Claude Code that support slash commands, but built it for business users to help them get more done quickly. Power users create shortcuts for their weekly reporting queries. Managers create shortcuts for their team review patterns. The same underlying system serves different workflows without modification.<\/p>\n<p>Based on production traffic, we\u2019ve noticed the most active shortcut users average 4+ uses per shortcut per day. Power users who create 5+ shortcuts generate 2-3x the query volume of casual users.<\/p>\n<p><em>Scheduled prompts<\/em>\u00a0enable automated, asynchronous delivery of information. Instead of synchronous chat where users must remember to ask, tasks deliver insights on a schedule: \u201cEvery Monday morning, send me a summary of at-risk renewals for my territory.\u201d This shifts the assistant from reactive to proactive.<\/p>\n<p><em>Long-term memory<\/em>\u00a0remembers usage patterns and user behaviors across conversation threads. If a user always follows renewal risk queries with product adoption metrics, the system learns that pattern and recommends it. The goal is making AI feel truly personal, like it knows the user and what they care about, rather than starting fresh every session.<\/p>\n<p>We track usage patterns across all these features. Heavily-used shortcuts indicate workflows that are worth optimizing and generalizing across the user community.<\/p>\n<p><strong>7. Carrying Context from the UI<\/strong><\/p>\n<p>Most AI assistants treat context as chat history. In dashboards with AI assistants, one of the challenges is context <em>mismatch<\/em>. 
Users may ask about a specific view, chart, or table they are viewing, but the assistant usually sees only chat text and broad metadata, or performs queries that are outside the scope the user switched from. The assistant does not reliably know the exact live\u00a0<em>view<\/em>\u00a0behind the question. As filters, aggregations, and user focus change, responses become disconnected from what the user actually sees. For example, a user may apply a filter for assets that have reached end-of-support for one or more architectures or product types, but the assistant may still answer from a broader prior context.<\/p>\n<p>We enabled an option in which UI context is explicit and <em>continuous<\/em>. Each AI turn is grounded in the actual view state of the selected dashboard content or even objects, not just conversation history. This gives the assistant precise situational awareness and keeps answers aligned with the user\u2019s current screen. Users are made aware that they are within their view context when they switch to the assistant window.<\/p>\n<p>For users, the biggest gain is accuracy they can verify quickly. Answers are tied to the exact view they are looking at, so responses feel relevant instead of generic. It also reduces friction: fewer clarification loops and smoother transitions when switching between dashboard views and objects. The assistant feels less like a separate chat tool and more like an extension of the interface.<\/p>\n<p><strong>8. Building AI with AI<\/strong><\/p>\n<p>We develop these agentic systems using AI-assisted workflows. It\u2019s about encoding a senior software engineer\u2019s knowledge into machine-readable patterns that any new team member, human or AI, can follow.<\/p>\n<p>We maintain rules that define code conventions, architectural patterns, and domain-specific requirements. These rules are always active during development, ensuring consistency regardless of who writes the code. 
For complex tasks, we maintain command files that break multi-step operations into structured sequences. These are shared across the team, so a new developer can pick things up quickly and contribute effectively from day one.<\/p>\n<p>Features that previously required multi-week sprint cycles now ship in days.<\/p>\n<p>The key insight: the value isn\u2019t necessarily in AI\u2019s general intelligence or in which state-of-the-art model you use. It\u2019s in the encoded constraints that channel that intelligence toward useful outputs. A general-purpose model with no context writes generic code. The same model with access to project conventions and example patterns writes code that fits the codebase.<\/p>\n<p>There\u2019s a moat in building a project as AI-native from the start. Teams that treat AI assistance as infrastructure, that invest in making their codebase legible to AI tools, move faster than teams that bolt AI on as an afterthought.<\/p>\n<p><strong>Conclusion<\/strong><\/p>\n<p>None of these patterns are technically sophisticated. They\u2019re obvious in hindsight. The challenge isn\u2019t knowing them; it\u2019s prioritizing them over more exciting work.<\/p>\n<p>It\u2019s tempting to chase the latest protocol or orchestration framework. But users don\u2019t care about your architecture. They care whether the system helps them do their job and whether it evolves quickly enough to bring efficiency to more of their workflow.<\/p>\n<p>The gap between \u201ctechnically impressive demo\u201d and \u201cactually useful tool\u201d is filled with many of these unglamorous patterns. The teams that build lasting AI products are the ones willing to do the boring work well.<\/p>\n<p><em>These patterns emerged from building a production AI Assistant at Cisco\u2019s Customer Experience organization. 
None of this would exist without the team of architects, engineers and designers who argued about the right abstractions, debugged the edge cases, and kept pushing until the system actually worked for real users.<\/em><\/p>\n<\/p><\/div>\n","protected":false},"excerpt":{"rendered":"<p>Lessons from building production AI systems that nobody talks about. The conversation around AI agents has moved fast. 
A year ago, everyone was optimizing RAG [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":16957,"comment_status":"","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[5],"tags":[],"class_list":["post-16956","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technology"],"_links":{"self":[{"href":"https:\/\/dmsretail.com\/RetailNews\/wp-json\/wp\/v2\/posts\/16956","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dmsretail.com\/RetailNews\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dmsretail.com\/RetailNews\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dmsretail.com\/RetailNews\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/dmsretail.com\/RetailNews\/wp-json\/wp\/v2\/comments?post=16956"}],"version-history":[{"count":0,"href":"https:\/\/dmsretail.com\/RetailNews\/wp-json\/wp\/v2\/posts\/16956\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/dmsretail.com\/RetailNews\/wp-json\/wp\/v2\/media\/16957"}],"wp:attachment":[{"href":"https:\/\/dmsretail.com\/RetailNews\/wp-json\/wp\/v2\/media?parent=16956"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dmsretail.com\/RetailNews\/wp-json\/wp\/v2\/categories?post=16956"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dmsretail.com\/RetailNews\/wp-json\/wp\/v2\/tags?post=16956"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}