Resilience in AI agents has two halves, and most teams only solve one of them.

Durable execution platforms like Temporal let you build long-running agents that automatically retry failed LLM calls and recover from process crashes. Durable sessions, provided by tools like Ably AI Transport, let you build clients that automatically recover from dropped connections and resume where they left off. Put them together and you get an agent that’s resilient end to end: the backend keeps working through failures, and the client keeps a durable, unified view of all the agent’s activity, even when that work is executed across many separate Temporal workflow runs.
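To make the two halves concrete, here is a toy sketch in plain Python. It is not Temporal's or Ably's API: `run_with_retries`, `SessionLog`, `make_flaky_step` and `demo` are hypothetical names invented for illustration, and everything lives in memory, so it shows only the shape of the idea. One function retries a flaky step on the backend, and one log lets a client that dropped its connection catch up on what it missed.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


def run_with_retries(step: Callable[[], str], max_attempts: int = 3) -> str:
    """Backend half: re-run a failing step until it succeeds or attempts run out."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        try:
            return step()
        except RuntimeError as err:  # stand-in for a transient LLM call failure
            last_error = err
    raise RuntimeError(f"step failed after {max_attempts} attempts") from last_error


@dataclass
class SessionLog:
    """Client half: an ordered log of agent events that a client can replay."""

    events: List[str] = field(default_factory=list)

    def publish(self, event: str) -> int:
        """Append an event and return its 1-based position in the log."""
        self.events.append(event)
        return len(self.events)

    def resume_from(self, last_seen: int) -> List[str]:
        """Return every event published after the last one the client saw."""
        return self.events[last_seen:]


def make_flaky_step(failures: int) -> Callable[[], str]:
    """Build a step that fails `failures` times, then succeeds (simulation only)."""
    remaining = {"n": failures}

    def step() -> str:
        if remaining["n"] > 0:
            remaining["n"] -= 1
            raise RuntimeError("simulated LLM timeout")
        return "answer"

    return step


def demo() -> List[str]:
    log = SessionLog()
    # The client sees the first event, then its connection drops.
    last_seen = log.publish("run-1: started")
    # The backend keeps going: two simulated failures, then success.
    result = run_with_retries(make_flaky_step(failures=2))
    log.publish(f"run-1: {result}")
    # A second, separate run publishes into the same session log.
    log.publish("run-2: started")
    # On reconnect, the client asks for everything after what it last saw.
    return log.resume_from(last_seen)


if __name__ == "__main__":
    print(demo())
```

The real systems do far more than this (persisted workflow state, network transport, ordering guarantees), but the split is the same: the retry logic never needs to know whether a client is connected, and the client never needs to know how many attempts or runs it took to produce an event.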

This explainer and demo walk through how the two layers fit together and why you want both.

For more on the durable sessions side of this, I’ve written about why production AI needs a session layer and shown the patterns running in a live demo.

If you’ve got comments, questions or feedback, you can reach me on Twitter or email mike (at) christensen.codes ✌️