While attending the AI Engineer’s NYC Summit, I noticed a trend: last year everyone was trying to sell you a vector database, and this year evals are all the rage.
What I saw
- Sourcegraph & Booking.com — Migration automation and review systems
- Datadog — SRE/DevOps agent announcements alongside evals work
- Daytona — Sandbox for agent code generation
- Gitpod — Development environment automation
- Ellipsis Development — PR review system with agent integration
- Windsurf — Autonomous coding agent advances
The market is early-stage despite rapid innovation. The shift from vector databases to evals signals a maturing industry — people are past “can AI code?” and onto “how do we know if the code is good?”
Originally posted on LinkedIn.