Krishna's Tech Blog

AI & ML · 13 min read

Evaluating LLM Applications: How to Know If Your AI Feature Actually Works

Most teams ship LLM features without any way to measure whether they actually work, and find out about regressions from customer complaints. Here's how to build evals that catch quality issues before users do, the patterns that scale, and the common mistakes that turn evals into theater.

Krishna Patil

October 20, 2025

#llm-eval #ai #testing #production #quality #llm-as-judge

Series: Part 73 of 159

Engineering Craft

TypeScript, CI/CD, databases, observability -- the skills that make code production-ready.
