The Trust Gap: Earning the Right to Optimise a Container Terminal

You have built an optimisation engine. It runs in simulation, it beats every baseline, and the numbers say it should save the terminal fifteen percent on reshuffles. You deploy it. The planner looks at its first recommendation, says "that's not how we do it," and overrides it.

This is not a failure of the algorithm. It is the beginning of the real problem.

Why operators resist

Terminal planners are not being irrational when they reject algorithmic recommendations. They are being perfectly rational.

A planner who follows established procedures has a defensible position when something goes wrong: they did what has always been done. A planner who follows an algorithm's recommendation has no such cover and is personally exposed. The downside is asymmetric. No one gets promoted for trusting an algorithm, but people do get blamed when a new system causes a vessel delay.

There is also a deeper issue. Experienced planners have built intuition over years of operational exposure. They have seen the situations that never make it into a simulator: the crane that always jams in rain, the truck gate that backs up at shift change, the chief who insists on a particular loading sequence. When an algorithm makes a recommendation that contradicts this intuition, the planner is not just being conservative. They are pattern-matching against a richer dataset than your model has ever seen.

Respecting this is not a concession. It is a prerequisite for building anything that actually gets used.

Safety is the ground truth, not just a constraint

Every terminal has its own safety culture, shaped by its layout, its equipment, its workforce, and its history of incidents. That culture is a set of lived practices, the result of near-misses, investigations, and hard lessons learned over years of operation.

An optimisation engine sees a feasible solution. It checks the physical constraints: weight limits, stack heights, crane reach. The numbers work. But feasibility on paper and safety in practice are not the same thing.

A solution might route a straddle carrier along a path that is technically passable but runs adjacent to a pedestrian crossing used during shift changes. It might stack a container in a position that satisfies weight limits but places hazardous cargo upwind of the crew break area, something no constraint matrix captures but every experienced planner knows to avoid. A "keep-out zone" around a particular yard block might exist not for any structural reason, but because there was an incident there three years ago. That rule lives in people's heads, not in your data model.

Your optimiser does not know what it does not know. The most critical safety practices are often the hardest to formalise: contextual, situational, and built on human judgment refined through experience. Until your system has earned the trust to account for these realities, the human in the loop is not a bottleneck. They are the safety system.

What constrained trust looks like

In practice, earning trust means the optimisation engine does not start with full authority over decisions. It starts constrained, sometimes severely. The operator sets boundaries: "you can optimise within this bay, but do not move containers across bays." Or: "you can suggest a stacking order, but it has to use the same zones we use today." Or simply: "show me what you would do, but I will decide."

These constraints are not technical requirements. They are trust boundaries. And they have real consequences for the quality of solutions your engine can produce.

Consider yard stacking. An unconstrained optimiser might redistribute containers across the entire yard to minimise future reshuffles. But if the operator insists that containers stay within their traditional zoning (reefer cargo here, export cargo there, empties in the back), you are optimising within a partition of the solution space. The global optimum almost certainly lives outside these boundaries. Your engine is finding the best solution within an artificially restricted region, not the best solution for the yard.

This is the core tension: the constraints that build trust are the same constraints that prevent your system from demonstrating its full value.

The sub-optimal regime

Let me be precise about what happens mathematically. When operators impose procedural constraints on your optimisation engine, you are effectively adding hard constraints to your problem formulation that do not come from the physics of the terminal or the requirements of the operation. They come from organisational comfort.

The feasible region shrinks. In some cases, it shrinks dramatically. If your engine was designed to jointly optimise across crane scheduling, yard placement, and equipment routing, but the operator only allows it to adjust yard placement within fixed zones while everything else follows existing procedures, you have reduced a rich combinatorial problem to a narrow slice of it.
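
The shrinkage can be stated compactly. The notation here is introduced purely for illustration: f is the cost being minimised (reshuffles, say), X_phys is the set of plans that satisfy the physical and operational constraints, X_trust is the subset the operator currently permits, and x_0 is the status quo plan.

```latex
\[
X_{\mathrm{trust}} \subseteq X_{\mathrm{phys}}
\quad\Longrightarrow\quad
\min_{x \in X_{\mathrm{phys}}} f(x)
\;\le\;
\min_{x \in X_{\mathrm{trust}}} f(x)
\;\le\;
f(x_0)
\qquad \text{whenever } x_0 \in X_{\mathrm{trust}}.
\]
```

The gap between the first two terms is what the trust constraints cost. The gap between the last two is all the operator ever gets to see.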

The result is predictable. Your engine produces solutions that are better than the status quo, but only marginally. The operator sees a three percent improvement and thinks: "is it really worth changing our workflow for three percent?" The answer, rationally, is often no. And so the system gets shelved or sidelined to advisory mode, where its recommendations are politely ignored.

This is the trap. You cannot demonstrate transformative value because trust constraints prevent it. You cannot earn deeper trust because you have not demonstrated transformative value. It is a chicken-and-egg problem, and it kills more optimisation deployments than any technical limitation.

How the engine adapts

Building an optimisation engine that can operate effectively under trust constraints requires deliberate architectural choices. This is not something you bolt on after the fact.

Hierarchical decision boundaries. Structure the engine so that different decision layers can be independently enabled or constrained. If the operator trusts your container sequencing but not your zone allocation, the engine should be able to optimise sequencing while respecting fixed zones, without degrading into a trivial problem. This means the decomposition of your optimisation model matters as much as the model itself.
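
One way this layering might look in code is sketched below. It is a minimal illustration, not a reference design: the names Mode, DecisionLayer, plan and solve_sequencing, the three-container data, and the two layers are all assumptions made for the example. The point it tries to show is that a free layer is solved conditional on whatever the locked layers dictate.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    LOCKED = "locked"          # keep the existing procedure's decision
    ADVISORY = "advisory"      # compute a proposal, but execute the baseline
    AUTONOMOUS = "autonomous"  # execute the engine's proposal


@dataclass(frozen=True)
class DecisionLayer:
    name: str
    mode: Mode


def plan(layers, baseline, solvers):
    """Resolve layers in order, from upstream to downstream.

    Each solver receives the decisions already fixed upstream, so a free
    layer is optimised conditional on what the locked layers dictate.
    Returns (decisions to execute, suggestions to show the operator).
    """
    decisions, suggestions = {}, {}
    for layer in layers:
        if layer.mode is Mode.LOCKED:
            decisions[layer.name] = baseline[layer.name]
            continue
        proposal = solvers[layer.name](decisions)
        if layer.mode is Mode.AUTONOMOUS:
            decisions[layer.name] = proposal
        else:
            decisions[layer.name] = baseline[layer.name]
            suggestions[layer.name] = proposal
    return decisions, suggestions


# Toy data (hypothetical): pickup day per container, and today's procedures.
PICKUP_DAY = {"c1": 1, "c2": 3, "c3": 2}
BASELINE = {
    "zone_allocation": {"c1": "export", "c2": "export", "c3": "reefer"},
    "sequencing": ["c1", "c2", "c3"],
}


def solve_sequencing(upstream):
    # Within each zone, place late pickups first so early pickups end up on top.
    zones = upstream["zone_allocation"]
    return sorted(zones, key=lambda c: (zones[c], -PICKUP_DAY[c]))


layers = [
    DecisionLayer("zone_allocation", Mode.LOCKED),
    DecisionLayer("sequencing", Mode.AUTONOMOUS),
]
decisions, suggestions = plan(layers, BASELINE, {"sequencing": solve_sequencing})
```

Here the zones stay exactly as the operator has them, and only the sequence changes. Switching the sequencing layer to ADVISORY would execute the baseline sequence and return the proposal as a suggestion instead.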

Graceful degradation. When constraints tighten, the engine should still produce meaningful improvements within the allowed space, not collapse to the status quo. This requires understanding which degrees of freedom matter most for solution quality. If you have ten decision variables and the operator locks seven of them, can you still move the needle with the remaining three? You need to know this in advance, not discover it in production.
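
The "know this in advance" part can be checked offline on small instances. The sketch below is a toy, with assumptions throughout: six containers, three stacks, no height limit, a made-up baseline, and a count of blocking placements (a container put on top of one that leaves earlier) standing in for reshuffles. It brute-forces the best plan while an increasing number of placements are locked to their baseline values.

```python
from itertools import product

# Hypothetical toy yard: containers arrive in list order; value = pickup day.
PICKUP_DAY = [1, 5, 2, 4, 3, 6]
N_STACKS = 3
BASELINE = [0, 0, 1, 1, 2, 2]  # status quo: stack index chosen for each container


def blocking(assignment):
    """Count containers placed on top of a container that leaves earlier."""
    stacks = [[] for _ in range(N_STACKS)]
    count = 0
    for day, s in zip(PICKUP_DAY, assignment):
        if stacks[s] and day > min(stacks[s]):
            count += 1
        stacks[s].append(day)
    return count


def best_with_locked(locked):
    """Best blocking count when the indices in `locked` keep baseline values."""
    free = [i for i in range(len(BASELINE)) if i not in locked]
    best = blocking(BASELINE)
    for values in product(range(N_STACKS), repeat=len(free)):
        candidate = list(BASELINE)
        for i, v in zip(free, values):
            candidate[i] = v
        best = min(best, blocking(candidate))
    return best


# Improvement over the status quo as the first k placements are locked.
improvement = [
    blocking(BASELINE) - best_with_locked(set(range(k)))
    for k in range(len(BASELINE) + 1)
]
```

In this toy instance the baseline has three blocking placements and the unconstrained optimum has one. Locking the first two placements already cuts the achievable improvement from two to one, and locking four leaves nothing to gain. A real engine would need a smarter search than enumeration, but the shape of the question is the same.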

Transparent counterfactuals. The engine should be able to answer the question: "what would I recommend if this constraint were relaxed?" Not as a sales pitch, but as an honest comparison. "Within your current zoning rules, I can reduce reshuffles by four percent. If you allowed cross-zone moves for export containers only, the improvement would be eleven percent." This gives operators concrete information to make trust decisions, rather than asking them to take a leap of faith.
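
A counterfactual report of this kind amounts to solving the same problem twice, once under the current rules and once with a single rule relaxed. The sketch below assumes a toy yard with one export stack and two import stacks, made-up pickup days, and a count of blocking placements as a proxy for reshuffles. The function and variable names are illustrative only.

```python
from itertools import product

# Hypothetical toy data, in arrival order: (pickup day, cargo type).
CONTAINERS = [
    (1, "export"), (5, "export"), (2, "import"),
    (4, "export"), (3, "import"), (6, "export"),
]
STACK_ZONE = ["export", "import", "import"]  # zone that each stack belongs to


def blocking(assignment):
    """Count containers placed on top of a container that leaves earlier."""
    stacks = [[] for _ in STACK_ZONE]
    count = 0
    for (day, _), s in zip(CONTAINERS, assignment):
        if stacks[s] and day > min(stacks[s]):
            count += 1
        stacks[s].append(day)
    return count


def allowed_stacks(cargo, relaxed_types):
    """Stacks a container may use: its own zone, or any zone if relaxed."""
    if cargo in relaxed_types:
        return list(range(len(STACK_ZONE)))
    return [s for s, zone in enumerate(STACK_ZONE) if zone == cargo]


def best_blocking(relaxed_types=frozenset()):
    """Exhaustively find the fewest blocking placements under the zoning rules."""
    options = [allowed_stacks(cargo, relaxed_types) for _, cargo in CONTAINERS]
    return min(blocking(a) for a in product(*options))


def counterfactuals(relaxations):
    """Report the result under current rules next to each proposed relaxation."""
    report = {"current rules": best_blocking()}
    for label, types in relaxations.items():
        report[label] = best_blocking(frozenset(types))
    return report


report = counterfactuals({"export may cross zones": {"export"}})
```

With this data, strict zoning leaves three blocking placements and letting export containers cross zones brings that down to one. The operator sees both numbers side by side and decides whether the relaxation is worth it.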

Constraint relaxation as a dial, not a switch. Trust does not go from zero to one. It moves in increments. Your engine should support this. Allow operators to relax constraints gradually: first within a single block, then across adjacent blocks, then yard-wide. Each relaxation should produce a measurable, attributable improvement that the operator can verify against their own experience.
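
A dial of this kind can be as simple as an ordered scope plus a record of who moved it. The sketch below is illustrative: Scope, TrustDial, allowed_blocks and the four-block adjacency map are assumptions for the example, not a description of any real terminal system.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Scope(IntEnum):
    SAME_BLOCK = 0
    ADJACENT_BLOCKS = 1
    YARD_WIDE = 2


# Hypothetical yard layout: which blocks sit next to which.
ADJACENT = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}


def allowed_blocks(origin, scope):
    """Blocks the engine may move a container to, given the current scope."""
    if scope == Scope.SAME_BLOCK:
        return {origin}
    if scope == Scope.ADJACENT_BLOCKS:
        return {origin} | ADJACENT[origin]
    return set(ADJACENT)


@dataclass
class TrustDial:
    scope: Scope = Scope.SAME_BLOCK
    history: list = field(default_factory=list)

    def relax(self, approved_by):
        """Move one notch wider. The operator, not the engine, calls this."""
        if self.scope == Scope.YARD_WIDE:
            raise ValueError("already at the widest scope")
        self.scope = Scope(self.scope + 1)
        self.history.append((self.scope.name, approved_by))

    def tighten(self, approved_by):
        """Move one notch narrower, for example after an incident."""
        if self.scope == Scope.SAME_BLOCK:
            raise ValueError("already at the narrowest scope")
        self.scope = Scope(self.scope - 1)
        self.history.append((self.scope.name, approved_by))
```

Moving one notch at a time, with a name attached to each move, keeps every change in authority attributable. The same history gives you the boundaries between which to measure the effect of each relaxation.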

The incremental path

The deployments that have worked, the ones where the system eventually earned enough trust to operate with real authority, followed a recognisable pattern.

Phase one: shadow mode. The engine runs alongside existing operations but makes no decisions. It generates recommendations and logs them. At the end of each shift, someone compares what the engine suggested against what actually happened. This builds familiarity and identifies cases where the engine's recommendations conflict with operational reality. It also surfaces simulator gaps, places where the engine's model of the terminal diverges from how it actually works.
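
The end-of-shift comparison needs very little machinery. A minimal sketch follows, assuming each decision has an identifier and that the recommended and actual choices can be compared directly. The record fields and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShadowRecord:
    decision_id: str
    recommended: str        # what the engine would have done
    actual: str             # what the planner actually did
    planner_note: str = ""  # free text: why the planner chose differently


def shift_report(records):
    """Return the agreement rate and the disagreements to review after a shift."""
    if not records:
        return None, []
    disagreements = [r for r in records if r.recommended != r.actual]
    agreement = 1 - len(disagreements) / len(records)
    return agreement, disagreements


shift = [
    ShadowRecord("move-001", "block A, row 3", "block A, row 3"),
    ShadowRecord("move-002", "block B, row 1", "block B, row 1"),
    ShadowRecord("move-003", "block C, row 2", "block A, row 5",
                 "crane in block C was down for maintenance"),
    ShadowRecord("move-004", "block A, row 4", "block A, row 4"),
]
agreement, to_review = shift_report(shift)
```

The agreement rate is the less interesting output. The planner notes on the disagreements are where the simulator gaps show up.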

Phase two: advisory on low-stakes decisions. The engine starts making recommendations for decisions where the cost of a mistake is low. Stacking positions for containers that will not be retrieved for days. Equipment pre-positioning during off-peak hours. Sequencing choices where multiple options are roughly equivalent. The operator still decides, but now they have a suggestion to evaluate rather than a blank slate.

Phase three: constrained authority. The engine begins making some decisions directly, but within tight boundaries set by the operator. This is where the trust constraints are most visible and where the sub-optimal regime is most acute. The key is rigorous measurement: tracking the engine's performance against what would have happened under the old process, and sharing those results honestly with the operations team.

Phase four: earned expansion. As the engine demonstrates reliability within its constraints, operators gradually relax the boundaries. This is not something you push for. It happens naturally when a planner starts saying "the system was right about that" more often than "I would have done it differently." The constraint relaxation is the operator's decision, not yours.

Each phase can take weeks or months. Trying to skip phases, by deploying with full authority because the simulation results are compelling, almost always fails. Not because the engine cannot handle it technically, but because the organisation cannot absorb it.

What this means for how you build

If you accept that every optimisation engine will spend its first months (sometimes its first year) operating in a constrained regime, it changes how you build the system.

You stop optimising for peak theoretical performance and start optimising for performance under partial constraint. Your benchmarks should include scenarios where half the decision variables are locked to baseline values. Your training environments, if you are using RL, should include episodes where the agent can only act on a subset of decisions. Your test suite should verify that the system degrades gracefully as constraints tighten, not that it achieves the best possible score in an unconstrained setting.
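
A degradation check of this kind can be written as an ordinary property test. The harness below is a sketch under stated assumptions: solve is any callable that takes the set of locked variable indices and returns a cost, and toy_solve is a deliberately trivial stand-in with made-up numbers. For an exact solver, locking more variables can never improve the cost. A heuristic solver may violate that, and this test is meant to catch it when it does.

```python
import random


def check_graceful_degradation(solve, baseline_cost, n_vars, trials=50, seed=0):
    """Lock a growing random subset of variables and check two properties:
    the cost never exceeds the baseline, and it never improves when more
    variables are locked."""
    rng = random.Random(seed)
    for _ in range(trials):
        order = rng.sample(range(n_vars), n_vars)  # random locking order
        previous = solve(frozenset())
        for k in range(1, n_vars + 1):
            cost = solve(frozenset(order[:k]))
            if cost > baseline_cost:
                raise AssertionError(f"worse than baseline with {k} locked")
            if cost < previous:
                raise AssertionError(f"cost improved when locking variable {k}")
            previous = cost
    return True


# Toy stand-in for an engine (hypothetical numbers): each variable adds a
# squared penalty while it is stuck at its baseline value instead of its target.
BASELINE = [4, 0, 3, 5, 1, 2]
TARGET = [1, 0, 3, 2, 1, 0]


def toy_solve(locked):
    return sum((BASELINE[i] - TARGET[i]) ** 2 for i in locked)


baseline_cost = toy_solve(frozenset(range(len(BASELINE))))
passed = check_graceful_degradation(toy_solve, baseline_cost, len(BASELINE))
```

The same harness can drive benchmark scenarios: fix the fraction of locked variables at a half and record the remaining improvement, rather than only reporting the unconstrained score.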

You need to build the organisational process around the deployment, not just the software. Regular reviews with the operations team. Clear escalation paths when the engine makes a recommendation that conflicts with operational judgment. A mechanism for operators to flag recommendations as wrong, with those flags feeding back into the system. The engine is not a product you deliver. It is a relationship you maintain.

About timelines

In an ideal world, you would deploy your optimisation engine, it would immediately demonstrate its value, and operators would quickly grant it full authority. In reality, the path from first deployment to meaningful operational impact takes far longer than the technical work suggests.

Building the algorithm might take months. Earning the trust to let it actually run takes longer. And that is not a failure of your sales process or your change management. It reflects the fact that container terminals are high-stakes environments where the cost of getting it wrong (a delayed vessel, a safety incident, a missed sailing window) can run to hundreds of thousands of euros.

Operators are right to be cautious. The question is not how to overcome their caution. It is how to build systems that earn their confidence, one decision at a time, one shift at a time, one small expansion of authority at a time.

The optimisation engines that eventually transform operations are not the ones with the best objective function values. They are the ones that survive the trust-building process long enough to demonstrate what they can really do.