Thanks for writing this; it clarifies a lot. The point about RLHF not modeling diverse human values really resonates. It makes me wonder: how do we build feedback loops that genuinely represent global, pluralistic views, not just those of whoever pushes a button? Insightful analysis. I appreciate you tackling these complex problems.
The compromised Microsoft servers show that this is a real threat.