Overview
https://www.youtube.com/watch?v=N-qAOv_PNPc
https://maven.com/parlance-labs/evals?promoCode=evals-info-url
Teresa Torres built an app to automate feedback on discovery calls:
- https://www.producttalk.org/2020/03/interview-customers-together/
- https://www.producttalk.org/2021/08/product-discovery/
Major takeaways & concepts
- Evals, Failure modes
- Tools: Airtable for eval tracking, Jupyter notebooks for eval analysis
- Used Jupyter notebooks to quickly evaluate responses because they made side-by-side comparison easy. Really embraced an engineering mindset. Rolled her own eval tool.
- Judges
- Traces
- Context engineering
Failure modes
The automated coach takes in a transcript from a customer discovery call and gives you feedback. Ran into a couple of failure modes in its recommendations:
- Leading questions: where the question implies the answer
- General questions: asking about behavior in general instead of a specific instance, e.g. "tell me about your morning routine" ⇒ "tell me about your morning routine this morning"
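A minimal sketch of how failure modes like these could be tallied once responses have been reviewed by hand. The record layout, the label names (`leading_question`, `general_question`), and the sample data are all made up for illustration; the notes do not say how Teresa's own tool stores its annotations.

```python
from collections import Counter

# Hypothetical annotations: one record per coach recommendation reviewed by hand.
# Labels mirror the two failure modes named in these notes; the records are invented.
annotations = [
    {"trace_id": "t1", "failure_modes": ["leading_question"]},
    {"trace_id": "t2", "failure_modes": []},
    {"trace_id": "t3", "failure_modes": ["general_question"]},
    {"trace_id": "t4", "failure_modes": ["leading_question", "general_question"]},
]


def failure_mode_rates(records):
    """Return the fraction of records tagged with each failure mode."""
    total = len(records)
    if total == 0:
        return {}
    # set() guards against the same label being listed twice on one record.
    counts = Counter(mode for r in records for mode in set(r["failure_modes"]))
    return {mode: n / total for mode, n in counts.items()}


rates = failure_mode_rates(annotations)
for mode, rate in sorted(rates.items()):
    print(f"{mode}: {rate:.0%}")
```

Ranking failure modes by rate is one way to decide which to tackle first; the same records could just as well live in an Airtable export.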

Coding
Uses Claude Code inside of VS Code extensively, but doesn’t “vibe code”: she doesn’t simply let Claude write code blindly.
Teresa:
- “I’m really good at describing what I want”.
- “I just demand things from Claude Code.”
- “Every time I give Claude Code a longer leash, it goes wrong for me.”
- “I’m really terrified of ending up with a product that I can’t maintain myself.”

Has transitioned to using notebooks for analysis, visualization, and individual transcript details.
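One kind of notebook analysis that fits here is checking how often a judge agrees with a human reviewer. The sketch below is an assumption about what such a check might look like, not a description of Teresa's notebooks; the function name `judge_agreement` and the labels are hypothetical.

```python
def judge_agreement(human, judge):
    """Compare binary labels from a human reviewer and an automated judge.

    True means "this response shows the failure mode".
    """
    if len(human) != len(judge):
        raise ValueError("label lists must be the same length")
    pairs = list(zip(human, judge))
    true_pos = sum(1 for h, j in pairs if h and j)
    true_neg = sum(1 for h, j in pairs if not h and not j)
    positives = sum(1 for h in human if h)
    negatives = len(human) - positives
    return {
        # Share of human-flagged failures the judge also flagged.
        "true_positive_rate": true_pos / positives if positives else None,
        # Share of human-approved responses the judge also approved.
        "true_negative_rate": true_neg / negatives if negatives else None,
    }


# Hypothetical labels for six responses.
human_labels = [True, True, False, False, True, False]
judge_labels = [True, False, False, False, True, True]
result = judge_agreement(human_labels, judge_labels)
print(result)
```

Reporting the two rates separately, rather than one overall accuracy number, keeps a judge that rarely flags anything from looking good when failures are uncommon.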

Toward Production
Teresa plans to integrate the automated coach into Vistaly: https://www.vistaly.com/
