Discussion about this post

Daniel Popescu

Thanks for writing this; it clarifies a lot. The point about RLHF not modeling diverse human values really resonates. It makes me wonder: how do we build feedback loops that genuinely represent global, pluralistic views, not just those of whoever pushes a button? Insightful analysis, and I appreciate you tackling these complex problems.

UMET ALE

The case of the compromised Microsoft servers points to a real threat.

