From 850ms to 4ms: Optimizing pgvector HNSW for Production AI Agent Search
How I built a sub-5ms semantic search system for 31 AI specialists using pgvector HNSW indexes, asyncpg connection pooling, and cosine similarity — with zero downtime migrations.