Build an LLM Eval Dataset from Production Traces

How to convert real user interactions into reusable test sets for regression and model comparison.