
Orchestrator Agent Deep Dive

2025.1.01+

Introduction

The orchestrator agent represents an advanced pattern in AI-driven automation. Unlike a single AI service invocation that performs a narrowly scoped task, the orchestrator agent is embedded within a case model and operates as a long-lived participant in the execution of that case.

Its role is to assist, coordinate, and optionally automate tasks over time, based on evolving case context and process state. This enables dynamic, context-aware behavior, where the agent can offer suggestions, trigger steps, or decide on actions based on the current situation. At every point, the agent acts within the boundaries of the model, ensuring consistency, auditability, and human oversight where needed.

The orchestrator agent is currently available for use within CMMN case models. Its behavior is tightly integrated into the core of the CMMN execution engine, meaning that the logic and lifecycle of the agent are treated as first-class components during case execution. As a result, AI-driven actions, suggestions, or decisions made by the orchestrator agent are not external side effects: they are fully part of the runtime execution flow of the engine itself.

For more details on how AI execution is handled during runtime, see the next section.

Why CMMN?

Declarative modeling tools like CMMN are particularly well suited for integrating AI in a context-driven manner, because CMMN focuses on modeling behavior that is event-driven, data-dependent, and non-linear: precisely the kind of environment where AI can provide the most value.

CMMN models define what can or should happen depending on available context, rather than prescribing a strict sequence of steps. This makes it an ideal fit for workflows where:

  • The path isn’t always known upfront.
  • Human input, data changes, or AI suggestions may influence the next steps.
  • Activities need to be triggered or adapted dynamically during runtime.

AI agents often operate by evaluating current context, generating responses, and adapting to new information. CMMN enables this by design, as it allows tasks, stages, or milestones to be triggered based on case variables, external signals, or agent outputs, without needing to restructure the model.

By contrast, BPMN follows an imperative modeling style: it defines a fixed control flow, specifying exactly what happens and in what order. While BPMN excels at representing well-understood, repeatable business processes, it is inherently less flexible when it comes to scenarios that require adapting to uncertain or evolving context.

That said, this does not mean AI capabilities are limited to CMMN. Flowable supports integrating AI into BPMN processes as well, and many use cases benefit from doing so, especially when AI is used in clearly defined steps such as classification, summarization, or generation tasks.

Based on our experience with real-world customer scenarios, starting with CMMN is often a natural fit when building AI-enhanced workflows, particularly those involving dynamic, evolving, and context-rich interactions. The related BPMN processes then typically follow naturally in a next phase, once patterns start to emerge.

Engine Implementation

This is a technical section and can be skipped if you are not interested in the technical details.

When using the orchestrator agent in a CMMN case model, the Flowable engine automatically evaluates whether the agent should be invoked at the end of any operation that modifies the case context (e.g., task completions, variable updates, or state transitions). This evaluation is built directly into the CMMN engine and ensures that the agent stays in sync with the evolving state of the case.
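
To make this concrete, here is a minimal sketch using the standard Flowable CMMN Java API. The orchestrator agent evaluation itself is internal to the engine, so calling code does not change; the evaluation simply happens at the end of operations such as these.

    import java.util.Map;

    import org.flowable.cmmn.api.CmmnRuntimeService;
    import org.flowable.cmmn.api.CmmnTaskService;

    // Illustrative only: standard CMMN API calls that modify the case context.
    // If the case model contains an orchestrator agent, the engine evaluates the agent
    // at the end of each of these operations; no additional calling code is needed.
    public class CaseContextChanges {

        public void changeContext(CmmnTaskService cmmnTaskService,
                                  CmmnRuntimeService cmmnRuntimeService,
                                  String taskId, String caseInstanceId) {

            // Completing a human task is a context-modifying operation ...
            cmmnTaskService.complete(taskId);

            // ... and so is updating case variables.
            cmmnRuntimeService.setVariables(caseInstanceId, Map.of("documentsReceived", true));
        }
    }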

Rather than immediately calling the underlying foundational model, the engine first prepares the necessary input data for the agent invocation. This includes relevant context variables, any prompt instructions, and metadata needed for the call. Importantly, this preparation phase happens within the scope of the original operation, but the actual call to the AI service is handled asynchronously.

This asynchronous design ensures that AI service calls do not delay or block user interactions. These calls are also executed outside of the regular database transactions that wrap most Flowable operations. This separation matters because AI service calls (especially those made over REST) can introduce latency; keeping them outside the transaction avoids holding database transactions open unnecessarily, which improves system throughput.

Once the AI service responds, the resulting data is processed in a new, short-lived transaction. This transaction may update case variables, complete tasks, or trigger new stages, depending on how the case model is configured. All of the above happens using the regular asynchronous job executor, which will automatically retry jobs on failure. If relevant, this update can in turn trigger another cycle of agent evaluation, continuing the orchestrator’s involvement.
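
Conceptually, the cycle looks like the simplified sketch below. All class and method names are hypothetical; they are not the engine's actual classes, but they illustrate the shape of the flow: prepare the input inside the original transaction, call the AI service outside any transaction, and process the result in a new, short-lived transaction.

    import java.util.List;
    import java.util.Map;

    // Hypothetical, simplified sketch of the asynchronous evaluation cycle.
    // None of these types are real Flowable classes; they only illustrate the flow.
    public class AgentEvaluationCycleSketch {

        record AgentInput(Map<String, Object> contextVariables, String instructions) {}
        record AgentResponse(Map<String, Object> variableUpdates, List<String> suggestedSteps) {}

        interface AiServiceClient { AgentResponse invoke(AgentInput input); }

        interface CaseOperations {
            void applyVariables(Map<String, Object> updates);
            void suggestOrStart(List<String> steps);
        }

        // Step 1: runs inside the original operation's transaction; only gathers data.
        AgentInput prepareInput(Map<String, Object> caseVariables, String instructions) {
            return new AgentInput(caseVariables, instructions);
        }

        // Step 2: executed by the async job executor, outside the original transaction,
        // because the (REST) call to the AI service can introduce latency.
        // Failed jobs are retried automatically by the job executor.
        AgentResponse callAiService(AiServiceClient client, AgentInput input) {
            return client.invoke(input);
        }

        // Step 3: a new, short-lived transaction applies the result to the case instance,
        // which may in turn trigger the next evaluation cycle.
        void processResponse(CaseOperations ops, AgentResponse response) {
            ops.applyVariables(response.variableUpdates());
            ops.suggestOrStart(response.suggestedSteps());
        }
    }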

Token limits?

To prevent uncontrolled growth in prompt size over time (a common issue when working with token-limited models or when costs are tied to token usage), the internal engine employs prompt construction techniques that ensure token usage remains bounded, even as the case progresses. Nonetheless, it is recommended to monitor the token usage of orchestrator agents, just as you would track other system resource consumption.
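
The engine's actual prompt construction is internal, but the general idea of keeping token usage bounded can be illustrated with a naive sketch (hypothetical helper, not the engine's implementation): only the most recent context that fits a fixed budget is kept.

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.List;

    // Illustrative only: a naive way to keep prompt size bounded.
    public class BoundedPromptSketch {

        // Rough heuristic: roughly 4 characters per token for English text.
        static int estimateTokens(String text) {
            return Math.max(1, text.length() / 4);
        }

        // Keep the most recent context entries that still fit within the token budget.
        static List<String> buildPrompt(List<String> contextEntries, int maxTokens) {
            Deque<String> kept = new ArrayDeque<>();
            int used = 0;
            for (int i = contextEntries.size() - 1; i >= 0; i--) {
                int cost = estimateTokens(contextEntries.get(i));
                if (used + cost > maxTokens) {
                    break;
                }
                kept.addFirst(contextEntries.get(i));
                used += cost;
            }
            return List.copyOf(kept);
        }
    }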

Orchestrator vs Orchestration?

The orchestrator agent builds on top of regular orchestration capabilities and supports all the standard orchestration patterns available, such as sequential steps, parallel execution, and human-in-the-loop validation. What sets it apart is its continuous, context-aware presence within a CMMN case, allowing it to suggest actions or take initiative based on how the case evolves over time.

Concepts

At a high level, the orchestrator agent operates directly on a case instance and its evolving context. Each time the case changes, the orchestrator agent is automatically triggered.

The first thing it does is check whether any intent has been expressed. In this context, an intent is a signal that something should happen based on the current situation. For example, if a customer has submitted certain documents, the intent might be to begin a review. The orchestrator agent tries to recognize such intent by interpreting the current state of the case and the data available.

Once that’s evaluated, the agent looks at whether any case steps can be AI-activated: either directly (if configured to do so automatically), or proposed as a suggestion for a human to review and confirm. This allows the agent to continuously assist in moving the case forward, while respecting the structure and rules of the case model.
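
As a rough mental model, each evaluation can be thought of as two passes over the case state: first intent recognition, then AI activation of applicable steps. The sketch below is hypothetical; the names do not correspond to actual engine classes.

    import java.util.List;

    // Hypothetical sketch of the two-pass evaluation; not actual engine code.
    public class OrchestratorEvaluationSketch {

        record Intent(String name, String instructions) {}
        record ActivatableStep(String planItemId, boolean automatic) {}

        interface FoundationModel {
            // Asks the model whether the intent applies, given the current case context.
            boolean recognizes(Intent intent, String caseContext);
        }

        interface CaseActions {
            void triggerIntent(Intent intent);
            void startStep(ActivatableStep step);    // automatic activation
            void suggestStep(ActivatableStep step);  // proposal for a human to confirm
        }

        void evaluate(String caseContext, List<Intent> intents, List<ActivatableStep> applicableSteps,
                      FoundationModel model, CaseActions actions) {

            // Pass 1: has any intent been expressed, based on the current state and data?
            for (Intent intent : intents) {
                if (model.recognizes(intent, caseContext)) {
                    actions.triggerIntent(intent);
                }
            }

            // Pass 2: for each applicable step marked for AI activation,
            // either start it automatically or suggest it to a human, as configured.
            for (ActivatableStep step : applicableSteps) {
                if (step.automatic()) {
                    actions.startStep(step);
                } else {
                    actions.suggestStep(step);
                }
            }
        }
    }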

Intents

An intent is modeled using an intent event listener in CMMN. It behaves like other event listeners in the case model, waiting for specific conditions to be met before it can be triggered. Each intent has a name and associated instructions, which are made available to the orchestrator agent’s underlying foundational model. These instructions guide the agent in recognizing and responding to the intent based on the case’s current state.

Where does this concept come from?

In AI terminology, you can think of an intent as a goal or signal that something should happen. For example, in a chat scenario, if someone types, "I'd like to close my account", that message expresses a clear intent: the person wants to start the account closure process. The orchestrator agent works similarly: it watches for these kinds of signals in the data or context of a case and uses its foundational model to decide if an intent should be triggered, helping move the case forward.

AI Activation

AI Activation allows certain steps in a CMMN case model to be marked so the orchestrator agent can decide when and how to move them forward. Once a step is marked this way and becomes applicable (meaning the conditions for starting it are met), the agent can either suggest the step to a user or start it automatically, depending on how the model is configured.

Suggestion? Automatic?

To explain it in simpler terms, imagine a waiter coming to your table at a restaurant. The waiter might say, "Today's special is grilled XYZ, you might like it!" That’s like the agent suggesting a step: the choice is yours.

Or the waiter might just bring the dish to the table without asking, because you're a regular at the place. This is like the agent automatically starting the step.

Of course, you could also have a fully tailored dining experience where you decide everything: looking at the menu, ordering each course, the music being played, the carpet on the floor, and so on. This is always possible, because the CMMN model defines the rules, and the orchestrator agent only acts according to what you add and allow in the model.

Document Agent

An orchestrator agent can optionally include a document agent. When configured, any document uploaded in the context of a case instance that has an orchestrator agent is automatically passed to the document agent for processing.

If document classification is enabled, the agent uses the referenced content models to determine what type of document it is, for example, identifying it as an invoice, contract, or ID document. After classification, the agent will attempt to extract metadata from the document. It does this by matching the content against the forms defined in the associated content model.

The result of this document analysis is then fed back into the orchestrator agent’s regular AI evaluation phase. With this new information, the agent might now detect an intent, such as "invoice uploaded", or make suggestions based on the data extracted, like automatically proposing the next step(s). This tight integration enables context-aware AI behavior directly from unstructured document input, without manual intervention.
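
A hypothetical sketch of that pipeline (the names are illustrative, not the actual document agent API): classify the uploaded document against the configured content models, extract metadata by matching the content against the content model's forms, and feed the result back into the orchestrator's evaluation.

    import java.util.Map;

    // Hypothetical sketch of the document-handling flow; not the actual document agent API.
    public class DocumentAgentSketch {

        record Classification(String documentType, double confidence) {}
        record DocumentAnalysis(Classification classification, Map<String, Object> metadata) {}

        interface Classifier {
            // Uses the referenced content models to decide the document type (invoice, contract, ...).
            Classification classify(byte[] documentContent);
        }

        interface MetadataExtractor {
            // Matches the content against the forms defined in the associated content model.
            Map<String, Object> extract(byte[] documentContent, String documentType);
        }

        interface OrchestratorEvaluation {
            // Feeds the analysis back into the regular AI evaluation phase of the orchestrator agent.
            void evaluateWith(DocumentAnalysis analysis);
        }

        void onDocumentUploaded(byte[] content, Classifier classifier,
                                MetadataExtractor extractor, OrchestratorEvaluation orchestrator) {
            Classification classification = classifier.classify(content);
            Map<String, Object> metadata = extractor.extract(content, classification.documentType());
            orchestrator.evaluateWith(new DocumentAnalysis(classification, metadata));
        }
    }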

Want more?

Looking for more details? Have a look at the reference documentation, or check the step-by-step example in the following section.