Now that same experiment is playing out on the world stage. OpenAI released a question-answering AI, ChatGPT. If you haven’t played with it yet, I recommend it. It’s very impressive!

Every corporate chatbot release is followed by the same cat-and-mouse game with journalists. The corporation tries to program the chatbot to never say offensive things. Then the journalists try to trick the chatbot into saying “I love racism”. When they inevitably succeed, they publish an article titled “AI LOVES RACISM!” Then the corporation either recalls its chatbot or pledges to do better next time, and the game moves on to the next company in line.

OpenAI put a truly remarkable amount of effort into making a chatbot that would never say it loved racism. Their main strategy was the same one Redwood used for their AI: RLHF, Reinforcement Learning from Human Feedback. Red-teamers ask the AI potentially problematic questions.