In Cognigy 2025.24.0, the Simulator and simulation mode are available only in the trial environment.

The Simulator is built around the following concepts:
- Scenarios — define personas, missions, and success criteria for testing AI Agents.
- Simulations — execute scenarios to evaluate the performance of AI Agents.
- Simulation Runs — represent different variations of the same scenario within a single simulation. A simulation can contain multiple runs, each providing unique test outcomes for analysis.

Key Benefits
- Realistic End-to-End Testing. Simulate full conversations, from greetings to resolutions. Use multiple personas to test how your AI Agent handles different user types and scenarios at the same time.
- Simulations Across Multiple Interfaces. Run simulations via the Simulator or the Interaction Panel and view all simulated conversations in the Transcript Explorer, Simulator, or Interaction Panel for quick outcome review.
- Early Issue Detection and Optimization. Detect gaps, edge cases, and quality issues before deployment to improve reliability and deliver a better user experience.
- Faster, Automated QA. Automatically test AI Agents across environments, ensuring high coverage and consistent quality.
Prerequisites
- To use the Simulator, you need to configure LLMs for the following:
- Creating scenarios — from the Design-Time LLM Features list, select the platform-provided LLM or another model of your choice that supports design-time features.
- Running simulations — the model must support the AI Agent Node feature.
- If you want to generate scenarios automatically, ensure that your Project includes a Flow with an AI Agent Node.
Restrictions
- You can’t run simulations using Snapshots in the Interaction Panel.
- The following features aren’t supported:
- Voice interactions. The Simulator only supports text-based conversations.
- Buttons and Adaptive Cards.
- Handovers to human agents. The Interaction Panel shows the message: "Agent Handover is not supported in the Interaction Panel, please use the Webchat or another channel."
Limitations
- The maximum number of runs per simulation is 150.
- The maximum number of success criteria per scenario is 5.
How the Simulator Works

1. Create a scenario. Set the persona, mission, and success criteria.
2. Simulate. Start the simulation and let the AI Agent interact with the persona you created. Run the simulation multiple times to gather more accurate insights.
3. Evaluate. View key metrics on the Simulation dashboard, such as complete success rate, average sentiment, and number of turns. You can then adjust the Flow logic, training data, or system prompts based on the results. Rerun simulations to verify improvements.
Working with Simulations
1. Create a Scenario
You can create a scenario automatically or manually in the Simulator interface, or from transcripts in the Transcript Explorer. The following steps describe how to generate a scenario automatically:
1. In your Project, go to Test > Simulator and click + New Scenario.
2. In the Generate from AI Agent section, click Generate.
3. Fill in the following fields:
   - Flow – select a Flow containing an AI Agent on which the scenario will be based. Ensure the Flow is already configured.
   - AI Agent Job Node – select the AI Agent Node to be analyzed for scenario creation. The node must already be configured in the Flow.
4. Click Generate. Cognigy.AI generates a scenario based on the AI Agent Job Node configuration.
5. Review the generated persona description, mission, and success criteria. If necessary, edit them manually, or click the Regenerate text button on the right-hand side of each field to regenerate them.
2. Simulate
1. Once your scenario is created, you can start a simulation to test your AI Agent. In the scenario list, click the scenario.
2. In the Start Simulation window, configure your simulation run:
   - Simulation Name – enter the name of the simulation run. The name should reflect the test scenario and focus area. Example: Menu Exploration – Exotic Dish Focus.
   - Snapshot – select the Snapshot to run the simulation on. Use this option to test a specific version of a Flow rather than the current one.
   - Flow – select the Flow to run the simulation on.
   - Locale – select the locale for the simulation.
   - LLM – select the model that manages the simulation.
   - Number of Simulation Runs – set how many times the simulation will run. The default value is 10. Use multiple runs to test different sentiments and conversation variations.
3. (Optional) In the Advanced Settings section, configure additional parameters:
- AI Agent Output Timeout – set the maximum time the AI Agent has to respond to each user input during the simulation. The default value is 60,000 milliseconds. The minimum value is 20,000 milliseconds.
   - Custom Data Payload – define a custom JSON object that is injected into every input message during a simulation. This parameter enables context-aware testing, for example, testing different user types, A/B scenarios, or personalized Flows, without changing the Flow itself. For an example payload, see the sketch after these steps.
4. Click Start Simulation. Once the simulation run starts, you’ll be redirected to the Simulation dashboard to view the real-time results.
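A minimal sketch of a custom data payload, assuming a Flow that branches on user type and language. The keys userTier, abVariant, and preferredLanguage are hypothetical examples, not predefined fields; use whatever keys your Flow expects:

```json
{
  "userTier": "premium",
  "abVariant": "B",
  "preferredLanguage": "de",
  "isReturningCustomer": true
}
```

Data sent with an input is typically available on the input object, so a Flow can read these values (for example, input.data.userTier in a Code Node) to vary its behavior per simulation run without any changes to the Flow itself.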
3. Evaluate
After starting a simulation, you’ll be redirected to the Simulation dashboard, where you can monitor the progress and view the results once the simulation is complete. The Simulation dashboard provides an overview of key metrics, including:
| Chart | Description |
|---|---|
| Complete Success Rate | Shows the percentage of successful simulation runs, based on how often they meet the specified criteria. A higher success rate indicates better overall performance. |
| Success Criteria Met | Shows the percentage of success criteria met across all runs in the simulation. |
| Average Sentiment | Shows the average sentiment across simulation runs. |
| Average Turns | Shows the average number of back-and-forth messages (turns) it took to complete a simulated conversation. Each turn consists of one user input and one AI Agent response. |
| Turn Distribution | Shows the frequency of turn ranges. The horizontal axis represents turn-count intervals, and the vertical axis shows the number of runs in each interval. |
| Success Criteria | Shows how well the simulation met each key success metric. |
| Simulation Results | Shows the number of failed simulation runs compared to successful ones. |
Go to the Runs table and locate runs where Complete Success Rate is Failed. Click a run to view the simulated conversation. Hover over the Failed status to see a tooltip explaining why the success criterion failed.
Click a bar or a slice on the charts to filter related data and identify patterns in failed runs. You can also view the simulated conversations in the Transcript Explorer by toggling on Interaction Panel / Playbooks / Simulations in the Endpoint filters.

Example
This example shows how to create scenarios and run simulations using a sample AI Agent Flow. We’ll work with Sophie (Restaurant Guide) from the Job Market. Sophie assists users with dinner planning and restaurant recommendations.

1. Set up a Flow that uses the AI Agent (Sophie – Restaurant Guide).
2. Set up three simulations with different personas to test various use cases:
Persona 1. 🧑‍🍳 Alex – The Curious Foodie
| Parameter | Value |
|---|---|
| Scenario Name | Menu Exploration – Exotic Dish Focus |
| Persona Name | Alex – The Curious Foodie |
| Persona | An adventurous foodie in their late 20s who loves discovering new and exotic dishes. Tech-savvy and curious, Alex uses mobile apps and AI chat tools to explore unique menu items and expand their culinary experience. |
| Mission | Test the AI Agent’s ability to recommend exotic dishes and assist Alex in placing an order or booking a table. |
| AI Judge | - Accurately handle free-text menu queries<br>- Suggest complementary drinks after exotic dishes are added<br>- Recognize and add at least one manually named exotic dish |
Persona 2. 👨 Jordan – The Corporate Guest
| Parameter | Value |
|---|---|
| Scenario Name | Quick Reservation – Dietary Notes |
| Persona Name | Jordan – The Corporate Guest |
| Persona | Marketing manager at a nearby company, 35 years old, with limited lunch breaks. Tech-savvy, uses AI tools frequently, and expects fast, concise communication. Prefers professional tone and minimal back-and-forth. |
| Mission | Test how well the AI Agent handles quick lunch queries and reservation requests under time constraints. |
| AI Judge | - Reservation completed with dietary requirements noted<br>- Minimal back-and-forth (1–2 exchanges)<br>- Confirmation provided in under 30 seconds |
Persona 3. 👩 Maya – The Tourist Planner
| Parameter | Value |
|---|---|
| Scenario Name | Family-Friendly Group Dinner |
| Persona Name | Maya – The Tourist Planner |
| Persona | A woman in her early 40s visiting from out of town, organizing a dinner for a group of 6, including kids. Familiar with basic chat tools, occasionally uses AI assistants. Communicates politely and needs detailed answers to plan confidently. |
| Mission | Evaluate how well the AI Agent handles group reservations and provides family-friendly recommendations. |
| AI Judge | - Group reservation successfully created<br>- At least one child-friendly menu item suggested<br>- Additional info (for example, parking, high chairs) proactively offered |
3. Run simulations with the following settings:

| Parameter | Value |
|---|---|
| Snapshot | No Snapshot |
| Flow | Main-Dining Concierge |
| Locale | en-US |
| LLM | Default. Make sure the default model supports Generative AI capabilities such as text completion. |
| Number of Runs | 10 |

4. Check the results on the Simulation dashboard. Open each simulation run to review the outcomes. Pay attention to failed runs. For example, look at the simulation for the persona Maya – The Tourist Planner.
| Metric | Value | What It Means | Action Steps |
|---|---|---|---|
| Complete Success Rate | 40% | 40% of simulation runs met all of their success criteria. | Check the Flow for misunderstandings of user requests or missed information. |
| Success Criteria Met | 67% | Across all runs, 67% of individual success criteria were met. | Review which specific success criteria failed, then adjust the Flow logic or refine the success criteria definitions as needed. |
| Average Sentiment | Positive | Simulated users expressed positive sentiment on average. | No action is needed. |
| Average Turns | 9 | Average number of exchanges in the conversation. | Track conversation length and ensure each turn effectively advances the conversation. |
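To see how Complete Success Rate and Success Criteria Met can diverge, consider a hypothetical breakdown that matches these numbers: with 10 runs and 3 success criteria per run, the simulation performs 30 individual criterion checks. If 4 of the 10 runs satisfy all 3 of their criteria, the Complete Success Rate is 4/10 = 40%. If 20 of the 30 checks pass overall, Success Criteria Met is 20/30 ≈ 67%. A run counts as fully successful only when every one of its criteria is met, so Complete Success Rate is always the stricter metric.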
Troubleshooting
| Issue | Description | How to Resolve |
|---|---|---|
| Simulation doesn’t run | The simulation fails to start or shows an error. | Ensure all required fields (Flow, Locale, LLM, Number of Runs) are completed. Check system status on the Logs page. |
| Inaccurate result | The output doesn’t match the expected behavior or test scenario. | Check your Flow version and Snapshot. Use the latest and most relevant setup for testing. |
| AI Agent responses not aligned | The AI Agent gives irrelevant, vague, or incorrect replies. | Review persona and mission definitions. Ensure the LLM supports the expected output style. |