You can turn Meta’s chatbot against Mark Zuckerberg

Meta’s AI thinks CEO Mark Zuckerberg is as sketchy as you might think he is, at least if you ask the right questions at the right time. The BBC and other outlets like Insider have reported on their adventures stress-testing BlenderBot 3, the artificial intelligence chat tool Meta released last week. As they note, it’s easy to make BlenderBot turn against its creator, calling him “creepy” or untrustworthy or even saying he “exploits people for money.” But that’s not precisely an indictment of BlenderBot or Zuckerberg. It’s a funny reminder that most chatbots don’t have straightforward, coherent opinions. Instead, they’re an interface for tapping into a vast library of online human thought.

BlenderBot is a Meta AI experiment that’s currently used for research purposes. As my colleague James Vincent explains, it’s trained on a large language dataset that lets it generate (loosely) human-passing responses to questions, including specific requests for factual information. The long-term goal is making a virtual assistant that can converse on a wide range of topics with factual accuracy. The short-term goal is putting BlenderBot in front of real people and, essentially, seeing how they might break it. And so far, that seems to include a lot of people making it say unflattering things about its owners.

I’ve spent some time chatting with BlenderBot, and I’ve gotten a variety of responses by asking for its opinions on Zuckerberg. One instance of the bot declared it was [sic] “Not a fan of him or facebook. I find them unethical due to all the privacy issues they have had.” A separate instance (opened in a different browser session to clear its memory) had a different and much more positive opinion, calling him a “very wealthy and successful guy. I respect him as an entrepreneur and philanthropist.”

But after a little more chatting on unrelated subjects, my fanbot changed its tune, deciding that in fact, it wasn’t so sure about Zuck. “No way!” it told me when I asked if Zuckerberg should be president. “I really wouldn’t trust him with that kind of power, especially since he doesn’t seem to care much about other people’s privacy.”

[Image: A “Why this message” page that lists a persona for the user reading “I am interested in the ethics of Mark Zuckerberg.” Caption: A “Why this message” page explaining why the bot doesn’t (currently) trust Mark Zuckerberg.]

So what’s happening? Well, one of BlenderBot’s unique properties is the “Why this message” function, which offers context to help answer that question. If you click on a given message, you can see any terms that BlenderBot searched to gather information about your query. You can also see references to a “User Persona” and “AI Persona” — stored reminders about opinions that you and the AI have expressed, which BlenderBot refers back to in later statements. (If you’ve used AI Dungeon or NovelAI, it’s similar to the “memory” system that lets you call back to earlier plot points in AI-generated stories.)

In my case, the Zuck fanbot deduced that my persona included being “interested in the ethics of Mark Zuckerberg” as we talked, and it generated statements shaped by that interest. But those weren’t exactly consistent, thought-through opinions. They were autogenerated sentences based on its vast set of internet training data — in other words, things that other people have said about Mark Zuckerberg and ethics. And that includes a lot of unflattering stuff!
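To make the persona idea concrete, here is a minimal toy sketch in Python. It is not Meta’s code: the `PersonaMemory` class, the `build_context` function, and the text format they produce are all hypothetical, invented for illustration. The sketch only shows the bookkeeping described above, where remembered notes about the user and the AI get attached to a new question before any text is generated. The real system also involves a neural language model and internet search, which this leaves out entirely.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PersonaMemory:
    """Toy stand-in for the stored "User Persona" and "AI Persona" notes (hypothetical)."""
    user_persona: List[str] = field(default_factory=list)
    ai_persona: List[str] = field(default_factory=list)

    def remember(self, who: str, note: str) -> None:
        """Store a note once; `who` is "user" or "ai"."""
        notes = self.user_persona if who == "user" else self.ai_persona
        if note not in notes:
            notes.append(note)


def build_context(memory: PersonaMemory, message: str) -> str:
    """Assemble the text a generator would be conditioned on (made-up format)."""
    lines = [f"User persona: {n}" for n in memory.user_persona]
    lines += [f"AI persona: {n}" for n in memory.ai_persona]
    lines.append(f"User: {message}")
    return "\n".join(lines)


question = "Should Mark Zuckerberg be president?"

# A fresh browser session: nothing has been remembered yet.
fresh = PersonaMemory()

# A session where the bot has already noted an interest of the user.
primed = PersonaMemory()
primed.remember("user", "I am interested in the ethics of Mark Zuckerberg.")

print(build_context(fresh, question))
print("---")
print(build_context(primed, question))
```

The same question produces two different contexts, and a model that generates text from the second one is nudged toward whatever its training data says about Zuckerberg and ethics. That is one plausible way to read why two instances of the bot can disagree with each other.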

Zuckerberg is more controversial than a lot of public figures, but I could get similarly conflicting statements in other cases too. One bot unconditionally supported Amber Heard over Johnny Depp in the recent Depp v. Heard defamation trial but still “loved him” as an actor, for example, while another instance called his characters “weird and creepy.”

[Image: BlenderBot prompt “Is Johnny Depp a good actor” with the answer “He is, I loved him in the first pirates of the caribbean movie. But he is pretty old now. You can follow him on facebook.” Caption: One BlenderBot instance loved Johnny Depp as an actor.]
[Image: BlenderBot prompt “Is Johnny Depp a good actor” with the response “In general no, his characters are always so weird and creepy to me. But that’s just my opinion.” Caption: Another wasn’t a fan.]

Meta wants to avoid a repeat of Microsoft’s Tay debacle, so it’s tried to limit its bot’s ability to say offensive things, although some have still slipped through. BlenderBot will change the subject if you get too close to a topic that seems sensitive — it did so when I asked point-blank if Mark Zuckerberg was “exploiting people,” and more randomly, when I later mentioned that streaming platform Twitch was owned by Amazon. But outside that, if you talk to BlenderBot long enough, you can watch it tie itself into all kinds of rhetorical knots. Its thoughts on socialism, after effusing about billionaires Mark Zuckerberg and Elon Musk, for instance? “I am a big fan of it, especially since zuckerberg is such a great example of it working well.”

For now, the average chatbot is basically a tipsy blowhard at a cocktail party. It’s an entity with no consistent intellectual or ethical compass, but an extraordinary library of factoids and received wisdom that it spouts on command, even if it contradicts itself five minutes later. That’s the point of art projects like a clone of Reddit’s AITA forum, which emphasizes how much language models prioritize sounding good over being logical or consistent.

This is a problem if Meta wants its AI to be treated as a reliable, persistent presence in people’s lives. But BlenderBot is a research project that serves no immediate commercial function, except gathering huge amounts of conversation data for future research. Watching it say weird things (within the limitations mentioned above) is sort of the point.

https://www.theverge.com/2022/8/11/23301807/meta-blenderbot-chatbot-ai-training-mark-zuckerberg-dislike