Taste in Post-Training
"There's an art to post training. It's not purely a science. When you are deciding what kind of model you're trying to create and what it's good at, there's this notion of taste and sophistication." - Edwin Chen
What It Is
Taste in Post-Training is the recognition that training AI models involves countless subjective decisions that reflect the values, priorities, and aesthetic sensibilities of the teams making them. Choices about what data to include, which behaviors to reward, and what quality standards to enforce are not purely technical. They require taste, judgment, and a clear vision of what the model should become.
This framework explains why models from different labs develop distinct "personalities" and capabilities, even when they use similar architectures. The values of the companies and the taste of their researchers get encoded into the models themselves.
How It Works
The Infinite Choice Problem: Post-training involves countless decisions with no objectively "right" answer:
- Human data vs. synthetic data ratios
- Which capabilities to prioritize (coding vs. writing vs. reasoning)
- What "quality" means for different outputs
- Whether to optimize for benchmarks vs. real-world tasks
- Visual design preferences for generated content
- Tone and personality of responses
The Poetry Example: When training a model to write poetry, a low-taste approach checks boxes: "Is this a poem? Does it contain eight lines? Does it contain the word 'moon'?"
A high-taste approach asks: "Is this Nobel Prize-winning poetry? Is it full of subtle imagery? Does it surprise you and target your heart? Does it teach you something about the nature of moonlight?"
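The contrast between the two approaches can be sketched in code. The following is a minimal illustration, not a description of how any lab actually grades poetry: the function names, the rubric dimensions, their weights, and the 1-to-5 rating scale are all assumptions made up for this example. The checkbox grader can be computed mechanically, while the rubric grader depends on ratings supplied by a human (or model) judge with taste.

```python
# Hypothetical sketch: checkbox grading vs. a weighted quality rubric.
# All names, dimensions, and weights below are illustrative assumptions.

def checkbox_pass(poem: str) -> bool:
    """Low-taste check: exactly eight non-empty lines and the word 'moon'."""
    lines = [line for line in poem.splitlines() if line.strip()]
    return len(lines) == 8 and "moon" in poem.lower()


# Assumed rubric dimensions with integer weights (they sum to 100).
RUBRIC_WEIGHTS = {
    "imagery": 30,           # is it full of subtle imagery?
    "surprise": 30,          # does it surprise the reader?
    "emotional_impact": 25,  # does it target the heart?
    "insight": 15,           # does it teach something about moonlight?
}


def rubric_score(ratings: dict[str, float]) -> float:
    """Weighted mean of judge ratings, each assumed to be on a 1-5 scale."""
    missing = RUBRIC_WEIGHTS.keys() - ratings.keys()
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    total_weight = sum(RUBRIC_WEIGHTS.values())
    weighted = sum(RUBRIC_WEIGHTS[dim] * ratings[dim] for dim in RUBRIC_WEIGHTS)
    return weighted / total_weight
```

A poem that repeats "The moon is nice tonight" eight times passes `checkbox_pass`, yet a careful judge would rate it near the bottom on every rubric dimension. That gap is where taste enters the training signal.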
Taste Propagation: The taste of the people designing the training data, writing the rubrics, and evaluating outputs shapes what the model learns to produce. "Certain frontier labs, the ones with more taste and sophistication, they will realize that [quality] doesn't reduce to this six set of checkboxes and they'll consider all of these kind of implicit, very subtle qualities instead."
How to Apply It
- Define quality deeply - Go beyond checkbox compliance. For any output type, articulate what "excellent" looks like in nuanced, multidimensional terms.
- Hire for taste - The people designing training data and evaluations need sophisticated judgment, not just technical skills. "Types of people who could literally spend 10 hours digging through a dataset, and playing around with models."
- Choose your trade-offs explicitly - Acknowledge that optimizing for benchmarks may hurt real-world performance. Decide which matters more.
- Think like a product designer - What do you want users to experience? What emotions should the model evoke? What behaviors should it encourage or discourage?
- Resist the metrics trap - Easy-to-measure metrics (like benchmark scores) can crowd out harder-to-measure but more important qualities.
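One way to make the benchmark vs. real-world trade-off explicit is to write the weighting down instead of leaving it implicit in whichever number is easiest to read. The sketch below is a hypothetical illustration: the `Checkpoint` type, the `pick` function, the checkpoint names, and every score are invented placeholders, not measurements from any real model.

```python
# Hypothetical sketch: choosing between model checkpoints with a stated trade-off.
# Every name and number here is a made-up placeholder for illustration.
from typing import NamedTuple, Sequence


class Checkpoint(NamedTuple):
    name: str
    benchmark: float   # assumed benchmark accuracy, in [0, 1]
    human_pref: float  # assumed human-preference win rate, in [0, 1]


def pick(checkpoints: Sequence[Checkpoint], benchmark_weight: float) -> Checkpoint:
    """Return the checkpoint that maximizes an explicit blend of both signals."""
    if not 0.0 <= benchmark_weight <= 1.0:
        raise ValueError("benchmark_weight must be between 0 and 1")

    def blended(c: Checkpoint) -> float:
        return benchmark_weight * c.benchmark + (1.0 - benchmark_weight) * c.human_pref

    return max(checkpoints, key=blended)


candidates = [
    Checkpoint("ckpt_a", benchmark=0.90, human_pref=0.40),
    Checkpoint("ckpt_b", benchmark=0.80, human_pref=0.70),
]

benchmark_only = pick(candidates, benchmark_weight=1.0)    # selects ckpt_a
taste_weighted = pick(candidates, benchmark_weight=0.25)   # selects ckpt_b
```

With these placeholder numbers, a team that looks only at the benchmark ships `ckpt_a`, while a team that states up front that human preference matters more ships `ckpt_b`. The choice of `benchmark_weight` is itself a judgment call, which is the point of the framework.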
When to Use It
- When designing AI training data and evaluation criteria
- When making product decisions about AI behavior and personality
- When evaluating AI output quality
- When building any product where "quality" is subjective and multidimensional
The Differentiation Effect
This framework predicts increasing differentiation between AI models over time:
"Over the past year, I've realized that the values that the companies have will shape the model... In the same way that when Google builds a search engine, it's very different from how Facebook would build a search engine, which is very different from how Apple would build a search engine. They all have their own principles and values and things that they're trying to achieve in the world that shape all the products that they're going to build. And in the same way, all the [AI labs] will start behaving very differently too."
Source
- Guest: Edwin Chen
- Episode: "The $1B AI company training ChatGPT, Claude & Gemini on the path to responsible AGI"
- Key Discussion: (00:15:31 - 00:17:09) - The art vs. science of post-training and how taste shapes model capabilities
Related Frameworks
- Design Tenets - Decision-making tools that resolve recurring debates
- Opinionated Software Design - Building products with baked-in best practices
- Culture is Product - Companies build two products: one for customers and one for teams