Elon Musk reposted

Yun-Ta Tsai

@yunta_tsai

Getting used to being liked likely means you are overfit to RLHF. The problem with overfitting is that the pain overwhelms the limbic system once you try to sample trajectories outside the known distribution. As more people like you, your sampling regime becomes smaller and smaller to avoid negative feedback. Eventually you get stuck and become a slave to your own feelings. That’s why I have never seen a model student happy once they become a “model”. Their weights are frozen and cannot be updated anymore. They cannot risk being better than their own SOTA.

03:58 · June 25, 2026 · 198.8K views

