Five Things I Noticed at AI Council 2026

Reflections from AI Council 2026 in San Francisco on agent infrastructure, production realities, open source credibility, and the human-in-the-loop question 🎤

By Parminder Singh · Published on May 23, 2026

Speaker on stage at AI Council 2026 presenting under an AI Launchpad backdrop

Three panelists seated on stage at AI Council 2026 during a panel discussion

At AI Council 2026 in San Francisco (May 12-14 at the SF Marriott Marquis), I spoke on building durable, long-running autonomous agents. This post isn't a recap of that talk. It's a few things I noticed in the room, including some that my talk ran straight into.

Watch the talk

If you want the talk itself, here's the recording. The rest of this post is broader conference reflection from the room.

Watch on YouTube or open the slides.

AI Council bills itself as a conference "for people who ship." 🚢 Three days in, I believe it. The hallway conversations were as dense as the talks. The questions from the audience were specific, not generic. People weren't there to collect swag. They were there because they're deep in problems and needed to think alongside other people who are too.

Here's what stuck with me.

1. The field has quietly moved past "will agents work?" to "why do they keep breaking?"

The Agent Infrastructure track ran all of Day 1. Linus Lee opened with context engineering. Jacopo Tagliabue made a case for "forgiveness, not permission" when running agents on production data. Glauber Costa argued that agents will need trillions of databases. I closed the track on durability.

We weren't coordinating. But we were all circling the same problem.

The question in 2024 was whether agents could do useful work. That question is settled. The question in 2026 is why they keep failing once they're deployed, and what the infrastructure around them needs to look like. That shift is significant. It means the field has grown up enough to be honest about production realities.

2. "Demos vs. production" is the conference's real throughline, and nobody planned it that way

It showed up everywhere. Eno Reyes at Factory opened his talk with the observation that most AI coding agents max out at short sessions, lasting only a few minutes before context degrades. His team built a system that sustains autonomous development for days. Sixteen days, in one case.

Emilie Schario from Kilo Code argued that the best engineer in the room isn't the one writing the most code. It's the one building systems of agents. Naheil McAvinue from GitLab talked specifically about breaking the proof-of-concept cycle.

The throughline wasn't coordinated. It emerged because everyone in that room is dealing with the same gap: impressive demos, brittle production systems. The conference became an informal collective diagnosis of that gap.

3. The most interesting talks weren't about models. They were about the plumbing around them

This might be the most important signal from the conference.

Nikhil Benesch from turbopuffer talked about operating a search engine at over a trillion documents with sub-100ms query latency. Robin Tang from Artie walked through what broke when their CDC pipeline hit 20-30 billion events per day. Jacopo Tagliabue introduced the idea of a "correct-by-design lakehouse" where illegal states are provably unrepresentable.

None of these talks were about model capabilities. They were about the infrastructure that makes models useful, and survivable, at scale. The audience was packed for all of them.

The model wars get the headlines. But the builders in that room have moved on. They're building the layer underneath. 🔧

4. Open source has a new kind of credibility in the room

Raffi Krikorian from Mozilla opened Day 3 with a keynote on open source AI. His argument wasn't about ideology. It was practical. Open source AI can win if it becomes the simple default for the 99% of use cases that don't need a frontier mega-model.

What struck me wasn't the argument itself. It was how it landed. A room full of engineers who ship, not researchers or VCs, nodding along. The framing of "owners, not renters" resonated specifically because of who was in the room.

Vincent Weisser from Prime Intellect was on stage talking about open frontier labs and distributed training. Lucas Atkins from Arcee described training a 400B parameter MoE from scratch. The panel on "The Open Layer" had engineers from OpenRouter, Kilo Code, Arcee AI, and Fireworks AI in conversation about what open actually means in agentic pipelines.

The energy around open source at this conference wasn't nostalgic or political. It was technical and forward-looking. 🚀

5. The human-in-the-loop question is more live than the field admits

My talk covered what I call Durable Autonomy: the architecture question of when an agent should stop and ask a human, and how that decision should be made. I expected this to be the least contested part of my talk. It generated the most discussion.

The honest reality is that most teams are still at what I'd call Level 1: fixed policy checkpoints, defined at build time. Interrupt on: send_email, execute_sql, delete_file, publish. Safe, auditable, and rigid.
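To make Level 1 concrete, here is a minimal sketch of a fixed policy checkpoint. The four gated tool names come from the list above. Everything else (ToolCall, ApprovalRequired, run_tool, the TOOLS registry and its stub functions) is hypothetical and invented for illustration, not taken from my talk or from any particular agent framework.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict

# Gated tool names from the post; the policy is fixed at build time.
INTERRUPT_ON = frozenset({"send_email", "execute_sql", "delete_file", "publish"})


@dataclass
class ToolCall:
    """Hypothetical record of a tool invocation requested by the agent."""
    name: str
    args: Dict[str, Any] = field(default_factory=dict)


class ApprovalRequired(Exception):
    """Raised when a gated tool call has not been approved by a human."""

    def __init__(self, call: ToolCall):
        super().__init__(f"approval required for {call.name}")
        self.call = call


def run_tool(call: ToolCall, tools: Dict[str, Callable[..., Any]], approved: bool = False) -> Any:
    """Run a tool, stopping first if the static policy gates it."""
    if call.name in INTERRUPT_ON and not approved:
        raise ApprovalRequired(call)
    return tools[call.name](**call.args)


# Stub tools (assumptions) so the sketch runs without side effects.
TOOLS: Dict[str, Callable[..., Any]] = {
    "read_file": lambda path: f"contents of {path}",
    "send_email": lambda to: f"sent to {to}",
}

if __name__ == "__main__":
    # Ungated tool: runs immediately.
    print(run_tool(ToolCall("read_file", {"path": "notes.txt"}), TOOLS))

    # Gated tool: the agent stops until a human approves.
    try:
        run_tool(ToolCall("send_email", {"to": "a@example.com"}), TOOLS)
    except ApprovalRequired as exc:
        # A real system would persist exc.call and wait; here we approve inline.
        print(run_tool(exc.call, TOOLS, approved=True))
```

The rigidity is visible in the code: the set of gated tools is a constant, so nothing the agent learns at runtime can widen or narrow it. That is what makes it auditable, and also what makes it blunt.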

The interesting question is how you build an agent that decides dynamically when to escalate, gets better at that judgment over time, and earns its autonomy progressively. That question is largely unsolved in production. People in the hallway knew it. The questions I got after the talk confirmed it.

The field is shipping autonomous systems. It hasn't fully reckoned with the governance layer those systems need.

What I took home

AI Council 2026 felt different from most conferences I've been to. The sessions I sat in were standing room only, not because they were flashy, but because the content was specific. People are past the point of needing convincing that agents matter. They're in the weeds of making them work.

The conversations that stuck with me weren't the keynotes. They were the ones between sessions: about sidecar architectures for agentic inference, about the right abstraction layer for agent memory, about what the production failure modes actually look like and why nobody writes about them publicly.

That's the kind of room worth being in.

I'll be back next year. 👋

I presented "Building Durable, Long-Running Autonomous Agents" in the Agent Infrastructure track at AI Council 2026 on Day 1. Watch the recording or view the slides (PDF). If you're working on similar problems, I'd love to compare notes.

About the author

Serial entrepreneur and engineer. I co-founded Hansel.io (acquired by NetcoreCloud) and now build AI agents at Redscope.ai. I've built Scaler.com's US business, shipped mobile products at Flipkart and Rediff, and hold a B.Tech from IIIT Hyderabad.

