
AI Platform Engineering — AI Engineer Summit

Talks · 2 min read

Patrick draws on his 15 years of DevOps experience — starting from organizing the first DevOpsDays in 2009 — to frame how organizations should approach scaling generative AI. He traces a familiar pattern: a pilot team experiments with new technology, learnings are extracted, and eventually an abstraction layer (a platform) emerges so that other teams can move faster without having to understand every underlying detail. The same progression that played out with cloud infrastructure and DevOps is now repeating with AI engineering.

The talk provides a detailed breakdown of what goes into an AI platform. Patrick walks through the key services a platform team should provide: centralized model access and registries, vector databases for RAG, data source connectors, model version control, provider-agnostic proxy layers with access control, prompt observability and tracing, production data quality monitoring with continuous evals, caching services, and feedback mechanisms. He emphasizes that this is a genuinely new layer of infrastructure — comparable in complexity to the Kubernetes ecosystem — and that organizations should not expect every feature team to build these capabilities independently.
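To make the proxy-and-cache idea concrete, here is a minimal sketch of a provider-agnostic proxy with a model allow-list and a response cache. The class names, the `provider/model` naming convention, and the in-memory cache are illustrative assumptions, not anything shown in the talk or any specific product's API:

```python
import hashlib
import json

class ModelProvider:
    """Adapter interface that each underlying provider implements."""
    def complete(self, model: str, prompt: str) -> str:
        raise NotImplementedError

class ModelProxy:
    """Routes requests to providers, enforces an allow-list, caches responses."""
    def __init__(self, providers: dict[str, ModelProvider], allowed_models: set[str]):
        self.providers = providers            # e.g. {"providerA": <adapter>, ...}
        self.allowed_models = allowed_models  # crude per-team access control
        self.cache: dict[str, str] = {}       # keyed by a hash of the request

    def complete(self, model: str, prompt: str) -> str:
        if model not in self.allowed_models:
            raise PermissionError(f"model {model!r} is not enabled for this team")
        key = hashlib.sha256(json.dumps([model, prompt]).encode()).hexdigest()
        if key in self.cache:                 # serve repeated requests from cache
            return self.cache[key]
        provider = model.split("/", 1)[0]     # assumed "provider/model" convention
        response = self.providers[provider].complete(model, prompt)
        self.cache[key] = response
        return response
```

Centralizing access behind one interface like this is what lets application teams switch providers or models without touching their own code.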

Beyond infrastructure, the talk covers enablement — helping application teams actually use the platform effectively. This includes providing prototyping tools for experimentation (including for product owners), frameworks for learning, and local development environments. Patrick shares hard-won lessons from two years of building generative AI applications: the real use case matters more than blanket mandates to add AI everywhere; switching models is costly without proper eval suites; developers struggle with non-deterministic testing; and end-user feedback often lands with engineers who lack the domain context to act on it.
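The non-deterministic testing lesson lends itself to a small sketch: rather than asserting an exact output string, sample the model several times and require a property to hold for a minimum pass rate. `call_model`, the sample count, and the threshold are hypothetical placeholders, not a method prescribed in the talk:

```python
from typing import Callable

def call_model(prompt: str) -> str:
    """Hypothetical stand-in for whatever client the platform proxy exposes."""
    raise NotImplementedError

def eval_property(prompt: str, check: Callable[[str], bool],
                  samples: int = 10, min_pass_rate: float = 0.9) -> bool:
    """Sample the model several times; pass if `check` holds often enough.

    Exact-string assertions break on non-deterministic output, so this
    asserts a property with a tolerance instead.
    """
    passes = sum(check(call_model(prompt)) for _ in range(samples))
    return passes / samples >= min_pass_rate

# Example: require that at least 90% of sampled answers mention "refund".
# ok = eval_property("Summarize our refund policy",
#                    lambda out: "refund" in out.lower())
```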

Patrick also addresses the human side of adoption. He discusses how coding copilots serve a dual purpose — increasing productivity while also evangelizing AI to hesitant engineers. He references the “Ironies of Automation” paper to explain the shift from producer to manager/reviewer, the risk of lost situational awareness, and the DevOps-inspired progression from automation through testing, monitoring, resilience, and chaos engineering. The talk concludes with a governance framework covering awareness programs, license management, the EU AI Act risk levels, prompt injection defenses, and guardrails-as-a-service — and positions the AI platform team alongside cloud ops and developer experience teams in a unified organizational structure.
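As a loose illustration of one small slice of what a guardrails-as-a-service layer might do, here is a pattern-based pre-screen for obvious prompt injection attempts. The patterns and the `GuardrailViolation` name are assumptions made for the sketch; real defenses are layered (classifiers, output filtering, least-privilege tool access) well beyond this:

```python
import re

# Illustrative patterns only; a production service would use far more than
# keyword matching to catch injection attempts.
INJECTION_PATTERNS = [
    r"ignore (all|previous|prior) instructions",
    r"reveal (your|the) system prompt",
]

class GuardrailViolation(Exception):
    pass

def screen_user_input(text: str) -> str:
    """Reject input matching known injection phrasings; pass the rest through."""
    for pattern in INJECTION_PATTERNS:
        if re.search(pattern, text, re.IGNORECASE):
            raise GuardrailViolation(f"blocked input matching {pattern!r}")
    return text
```

The shape of the call (user input in, sanitized input or an error out) is the relevant part: offering it as a shared service means each team does not have to reinvent its own defenses.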

Watch on YouTube — available on the jedi4ever channel

This summary was generated using AI based on the auto-generated transcript.
