Explaining AI Alignment as an NLPer and Why I am Working on It
ruiqizhong.substack.com
I started working on NLP in 2018 and have worked on topics including algorithmic bias, interpretability, and semantic parsing. In 2021, I pivoted to scalable oversight, a sub-area of AI alignment. Since the success of chatbots such as GPT-4 and Claude, more NLP researchers have started to discuss AI alignment. However, people use the term “AI alignment” to mean different things: some think it means “preventing AI systems from taking over the world”, while others think it specifically means “optimizing for human ratings”.