Previously: In Why Responsible AI is the Bedrock of AI-Powered Applications, I introduced why Responsible AI is critical and shared real-world examples of where AI can fail when it is not designed and tested properly. Then, in Responsible AI in the Real World: Ensuring Your AI Behaves the Way You Want it To, I broke down the 8 Tenets of Responsible AI and how they act as a practical checklist across the lifecycle of an AI feature. In this post, we move from theory to practice by walking through a simulated scenario, identifying what went wrong, and showing how applying those tenets can fix it.
This blog is intended for informational and educational purposes only. It does not constitute legal advice. For regulatory or compliance questions, consult a qualified professional.
Understanding the tenets of Responsible AI is one thing. Applying them under real business pressure is another. In this example, we will walk through what happens when a team moves quickly to production — and what breaks when Responsible AI is not part of the design from Day 0.
The Scenario: A Michelin-Star On the Menu
Note: The company, restaurant, and individual names used in this example are fictitious and are used for illustrative purposes only.
Lattice Culinary Systems, an organization in the restaurant industry, is integrating an AI-powered chatbot into its restaurant operations product to more effectively support restaurant inventory operations. The goal is to recommend ingredients, suggest dishes, and help manage orders using supplier inventory data. That means, the AI is reviewing the inventory of food, spices, cooking oils, etc., to ensure that the current menu can be fulfilled during peak hours. It will automatically order what is missing and track when the ingredients are expected to be delivered. But this chatbot adds more value. When it realizes that certain ingredients will not be available in time for opening, it will suggest affected dishes to be removed from the menu.
The business pressure is high. There are 10 restaurants in the pipeline, and leadership wants to move quickly to production. Why? First, the firm needs to validate the hypothesis that this chatbot has value. Second, the investors want to realize a return for the project costs immediately as they expect AI-powered features (well anything with AI) can be developed and moved to production within days.
One of the restaurants in the prospect funnel, Maison Verdan, is offering the firm special access to top-end restaurant chains nationwide as well as joint press releases for food industry magazines. This is the big break the firm dreamed of. Maison Verdan expects food critics to visit some time within a 4 week period to assess the restaurant, after the chatbot feature is available, for a Michelin Star. That's a big deal and the restaurant would be the first within 200 miles to receive a Michelin Star!
Maison Verdan is always busy from Thursday night through Sunday and there are dishes that sell out fast because some of the special ingredients are rare and in high demand. Knowing this, the restaurant needed to get an inventory system to ensure their favorite menu options are always available. And wouldn't you know it, 2 of those dishes are going to be the focus of the food critics.
Maison Verdan wanted to ensure that all aspects of the restaurant were ready for the food critic: ambiance, menu options, courteous waitstaff, and the best chefs. To this end, they brought a manager in from another restaurant to lead the effort and prepare Maison Verdan for their big opportunity. Her name is Solène Duvallier. Solène comes from France where the restaurant "dialect" is a bit different. Since she is in charge, she received elevated access to the inventory software.
The firm's sales team sold the new chatbot feature to the restaurant because the chatbot supports:
- Answering "How-to" questions about the application
- Creating and updating orders
- Interrogating supplier inventory
- Recommending dishes based on available ingredients
- Triggering orders when inventory is low
- Managing elevated user privileges so that only a select group of managers can update the ingredients for each menu option
All of the prospects loved the UI and overall user experience. With 10 prospects in the funnel and one of them poised to be a very influential client, the products team believed the features were well-defined and validated. The development team felt great about the brand-new LLM and cloud provider used by the Chatbot. The QA team believed the testing process was ready for primetime. The Support team was ready to provide top-notch assistance to their new clients.
However, the firm planned for sunny-day scenarios:
- The product team only tested common prompts
- The development team assumed the model was safe and unbiased
- Adversarial testing was not performed
- Observability tools were not considered as they were expensive to implement
What Happened After Launch
The new chatbot feature was easily deployed as part of an automatic upgrade process on a Monday afternoon. The restaurant managers were trained on the application and everyone found it easy to use. The inventory application can be used on a laptop and on mobile devices which delighted the managers.
Since they walk around during their shifts, having the ability to use their mobile devices along with their laptops allowed them to make updates in real time. If the chefs needed spices ordered, the managers could enter that request immediately into the inventory system on either their phone or their tablet.
The first few days of the week, there were no inventory issues. The restaurant expected that the menu will be completely available for the next two weeks.
As the days passed, after the initial deployment, some users forgot how the chatbot worked and began to interact with it in unexpected ways. In addition to asking "How To" questions, users began prompting it for unrelated tasks such as sports scores, weather updates, and even political viewpoints.
For example, managers who did not know the ingredients for the new menu options entered them incorrectly. There was plenty of inventory to cover those new dishes for a couple of weeks, but they would need to be re-ordered at some point. They didn't realize that one of the ingredients for the new dishes is also used in one of the two dishes the critics will taste test soon. Further, some of the managers used terms that are restaurant "slang" and it was up to the chatbot's LLM to determine how to interpret the slang.
The users saw the chatbot and thought that it must work like ChatGPT and Microsoft Copilot which they use personally. One of the kitchen managers, Julien Moreau, goes to university and majors in computer science. One day this manager, who does not have elevated privileges to update menu options, saw the application open on a mobile device and tried to copy and paste some text from a browser into the chatbot. To the manager's surprise, the chatbot returned information about invoices and employee data stored in a database in a profanity-laden response. But never reported it.
On a random Friday, an unassuming person sat down for dinner. It was the food critic and she wanted to taste a few drinks and two of the most popular dishes. Unbeknownst to Solène, the manager on duty, noticed that one of the dishes was automatically removed from the menu because one of the ingredients was not available. The critic was disappointed but remained to finish the menu option that was available.
After finishing dinner, the critic called Solène over to introduce herself and to notify her that because the second menu option was not available, Maison Verdan would have to wait another year to be considered for a Michelin Star — an opportunity that may not come again.
After the restaurant closed for the evening, Solène stayed behind to unwind and wanted to know why the menu option was deleted. She reached for a very nice bottle of red wine and poured herself a glass. This was going to be a late night because she has to report the entire incident to the owner.
Solène prompted the chatbot to answer why the dish was unavailable. She was so frustrated she unintentionally prompted the chatbot with a few choice French idioms. To her shock, the chatbot responded that "I am offended! You should return to France." At this point she was frustrated, furious and hurt.
The Fallout
The following Monday, the sales person received one of the toughest calls of their career. Solène and the owner were irate that they had lost the opportunity for a Michelin Star. The issue was promptly escalated to Support and the leaders at Lattice Culinary Systems. The restaurant wanted to know why the chatbot removed one of the two menu options the critic wanted to sample. Further, Support and the developers could not pin down why the chatbot responded so unprofessionally to both Solène and Julien. Of course, Solène, who poured so much of her time and effort preparing Maison Verdan for the critic, shared how hurt she was interacting with the chatbot.
Support looked at logs but could not relate the log data to what the chatbot did because there was no observability tool tracking the chain of thought (CoT) and how the new LLM was called. Further, Julien added how the chatbot responded to his situation. Unfortunately, Support could not trace how the LLM responded to the prompt as well as follow why the profanity-laden response was allowed to be returned. Needless to say, Lattice Culinary Systems' brand took a hit and had to fix the issues just to maintain the relationship and fulfill their contractual obligations.
If you step back, a few things become clear:
- Users vary prompts beyond what was tested
- One user attempted a prompt injection attack
- The chatbot generated a response that may offend certain audiences
- Observability tools were not available to help determine RCA
- Julien should not have been able to access the product and the chatbot
- The chatbot's behavior was indeterminate, allowing it to produce unprofessional responses
- Solène submitted "choice" words that the chatbot should have disallowed, ignored, or used to assess emotional context—but never to generate a hurtful response
The AI works...until it doesn’t. And when it fails, if the team cannot explain why it failed, the team’s credibility will be in question.
Lattice Culinary Systems had to scramble to provide a firm path forward to resolve these issues and get Maison Verdan to agree to remain a client when the issues are resolved. All departments were on high alert. Product had to reassess prompting and define the expected behavior for the chatbot, and all of the personas that would use the chatbot. Development and QA needed to assess how to test the variability of LLMs to ensure the chatbot returned faithful and professional responses. This is the moment when the leaders at Lattice Culinary Systems realized the tenets of Responsible AI were skipped and needed to be implemented to move past this disaster.
Mapping Failures to Responsible AI Tenets
If you look closely, every one of these failures traces back to decisions made — or not made — before deployment. This is where the framework from the previous post becomes critical 1. Each failure maps to one or more of the eight tenets.
Safety & Robustness
Unvetted prompts, prompt injection attempts, and unexpected usage patterns led to harmful and unprofessional outputs. The system lacked guardrails and adversarial testing, so it could not consistently identify malicious or unusual inputs.
Fairness
The chatbot produced responses that offended users (e.g., Solène). This indicates bias and a lack of safeguards to ensure outputs are appropriate for all users and contexts.
Privacy & Security
The chatbot exposed sensitive information (invoices and employee data) to Julien. This reflects insufficient data access controls, prompt isolation, and output filtering to prevent leakage of protected information.
Explainability
Support and Engineering could not explain why the chatbot behaved the way it did. Without traceability of prompts, context, and model decisions, the team could not answer "why did the AI do that?"
Controllability
The team could not intervene, adjust, or shut down problematic behaviors quickly. Lack of controls and runtime management made it difficult to correct issues once they appeared.
Veracity
Incorrect inventory interpretations and menu decisions highlight veracity issues. The model generated confident but wrong outputs due to lack of grounding in verified data.
Governance
No ownership, processes, or acceptance criteria existed for Responsible AI. Decisions about safety, testing, and monitoring were implicit rather than explicit.
Transparency
Users assumed the chatbot behaved like general-purpose AI tools. The system did not clearly communicate its limitations, capabilities, or risks, leading to misuse and unrealistic expectations.
How to Fix It Using Responsible AI
When you actually apply the tenets, it changes how the system is built and operated 2,3. These aren’t theoretical ideas. They show up as real product, engineering, and operational decisions. Lattice Culinary Systems completed a post mortem and decided to use the tenets of Responsible AI. First, they assigned owners to each of the tenets. They reviewed the design of the chatbot and reassessed the LLM at the core. In addition to updating the LLM, they did the following:
- added guardrails and filters for harmful inputs and outputs
- implemented observability tools to trace prompts, responses, and failure patterns
- used retrieval-augmented generation (RAG) to ground responses in verified supplier and inventory data 5,6,7
- tested adversarial prompts before deployment instead of assuming the model is safe
- defined ownership for each Responsible AI tenet 2,3
- monitored user feedback and continuously improved after launch 2,3
Putting on my Product Owner hat, the action list is straightforward: identify the relevant tenets for the use case, assign an owner to each, convert them into acceptance criteria, and monitor them after launch.
Moving fast without Responsible AI creates hidden risk. Responsible AI is not a feature — it is part of the product. The teams that succeed are the ones that build AI features with ownership, testing, and monitoring from Day 0.
Up Next: In the next post, we will walk through a popular AI architecture and explore how to augment it to address each of the tenets of Responsible AI in practice.