Why Production AI Needs a Session Layer, Not Just a Stream @ Ably

Over on the Ably blog, I wrote about what happens when you take an HTTP-streamed AI experience from demo to production, and the challenges of building AI apps on Server-Sent Events (SSE).

The post digs into three failure patterns:

Resilient delivery: when the connection drops, tokens are not delivered to the user; supporting resumable streaming means buffering events outside the agent and implementing complex replay logic.
Cross-device continuity: SSE is a private pipe between one client and the agent, so different clients (tabs/devices) aren’t in sync.
Live control: SSE is one-way, so the only way to cancel an agent is to close the connection. The agent can’t tell a deliberate cancel from a flaky network, which breaks resumption and cancellation at the same time.

None of these are impossible to solve with SSE, but you won’t like what it takes. My colleague Zak Knill worked through the details in How to make SSE token streams resumable, cancellable, and multi-device: storing every token in a database to support Last-Event-ID replays, routing cancellations through shared state, and polling for prompts to sync devices. You end up rebuilding half a message broker on top of your database.
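To give a feel for the bookkeeping involved, here is a minimal Python sketch of the replay and cancellation pieces. It is an illustration under stated assumptions, not code from either post: `SessionLog` and `format_sse` are hypothetical names, an in-memory list stands in for the database, and event IDs are simply positions in that list.

```python
from dataclasses import dataclass, field


@dataclass
class SessionLog:
    """Hypothetical in-memory stand-in for the database that stores every token."""

    events: list[str] = field(default_factory=list)
    # Shared state the agent would check, so a deliberate cancel is
    # distinguishable from a connection that merely dropped.
    cancelled: bool = False

    def append(self, token: str) -> int:
        """Store a token and return its event ID (1-based position in the log)."""
        self.events.append(token)
        return len(self.events)

    def replay(self, last_event_id: int = 0) -> list[tuple[int, str]]:
        """Return every (id, token) pair after the ID the client last saw."""
        return [
            (i + 1, token)
            for i, token in enumerate(self.events)
            if i + 1 > last_event_id
        ]

    def cancel(self) -> None:
        """Record an explicit cancel request from any client."""
        self.cancelled = True


def format_sse(event_id: int, data: str) -> str:
    """Serialize one event in the SSE wire format (assumes data has no newlines)."""
    return f"id: {event_id}\ndata: {data}\n\n"


log = SessionLog()
for token in ["Hello", ",", " world"]:
    log.append(token)

# A client that saw event 1 reconnects and sends "Last-Event-ID: 1".
missed = log.replay(last_event_id=1)
frames = [format_sse(event_id, token) for event_id, token in missed]
```

Even this toy version has to own storage, ID assignment, replay, and a side channel for cancellation. A production version would also need persistence, expiry, and a way to fan the same log out to every tab and device.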

The short version: backend durability is largely a solved problem, but the transport between agent and user isn’t. What’s missing is a stateful session layer between users and agents that supports resumability, routing, cross-device sync, and bidirectional control.

Where a durable session layer sits between users and agents

Read the full post here: Why production AI needs a session layer, not just a stream.