It’s poised to “change our world.” That’s according to Bill Gates, referencing an advanced AI chatbot called ChatGPT, which seems to be all the rage. The tool, developed by OpenAI and backed by Microsoft, the company Gates co-founded, takes questions from users and produces human-like responses. The “GPT” stands for “Generative Pre-trained Transformer,” which describes the design of the underlying model and the way it is trained. And yet despite the chatbot’s swelling popularity, it’s also not without controversy. Everything from privacy and ethical questions to growing concerns about the data it utilizes has some worried about the effects it will ultimately have on society. Its detractors fear job loss, a rise in disinformation, and even long-term erosion of humans’ capacity for reasoning and writing. Its advocates tout the advantages ChatGPT will inevitably lend organizations, its versatility and iterative ability, and the depth and diversity of the data from which it pulls. Against this backdrop, we debate the following question: Will ChatGPT do more harm than good?
John Donvan:
Welcome everybody to another debate from Intelligence Squared. I’m John Donvan and this one is about the impact of a new technology, a technology that everybody is just talking about all the time now. And it’s one that as I get to know it better is actually taking me back to an old TV show called Lost in Space, which was about a family in a spaceship that was knocking around the cosmos trying to get back home. It was in the 1960s that this was on TV. Thank goodness there was a Netflix reboot a few years back, so it’s not so outdated. Anyway, the family has this robot that was this amazing imagining of future technology. And by future, I mean the show is set in the ’60s but the robot and the whole scene was taking place in the far off future year of 1997. And this robot could understand English spoken to it and respond thoughtfully in full sentences and helped that family out again and again because it had access to so much knowledge when they needed it.
And back in the 1960s, it seemed completely unbelievable that the world would have something like this by 1997. Well it didn’t. But here in 2023, we have ChatGPT. This marvel of artificial intelligence released a few months ago that actually seems to understand real language that you type in, even if your grammar’s not that great, it gets your point. And then it answers your question in a way that seems so complete and articulate and human-like, just like that robot, that it’s causing kind of a tech earthquake. Talk about disruptive. Well there is good disruptive and there is bad disruptive and we wanna know which one ChatGPT promises to be mostly. That’s our question. Will ChatGPT do more harm than good?
So let’s meet our debaters. Here to answer yes to the question that it will do more harm than good, the author of Rebooting AI: Building Artificial Intelligence We Can Trust, professor emeritus of psychology and neuroscience at New York University, founder of the AI company Geometric Intelligence, which he then sold to Uber, Gary Marcus. Thanks so much for joining us.
Gary Marcus:
Thanks for having me.
John Donvan:
And arguing no, that ChatGPT will not do more harm than good, which means he’s arguing the reverse, more good than harm: founder and CEO at SignalRank Corp, and co-founder of several tech companies including Accelerated Digital Ventures, Archimedes Labs, TechCrunch, RealNames, and Easynet, Keith Teare. Welcome to Intelligence Squared.
Keith Teare:
It’s a great pleasure to be here.
John Donvan:
So let’s- let’s get to this debate. We want each of you to take a few minutes to explain your position, why you’re arguing yes or why you’re arguing no. Gary, you’re up first. And in answer to the question, will ChatGPT do more harm than good, you say yes it will. Tell us why.
Gary Marcus:
Of course nobody knows for sure. There’s a lot still to be developed here and a lot left to be predicted. Um, there’s no doubt that ChatGPT will do some good, um, it’s incredibly fun to play with and I think it’s a dress rehearsal for the future. It’s not really that robot from Lost in Space that you mentioned, ’cause the robot from Lost in Space doesn’t make stuff up all the time. I think, anyway; it’s been a long time since I watched the show. Um, but usually the robots that we have in science fiction know what they’re talking about. We would not, um, imagine the Star Trek- Star Trek computer just making stuff up willy-nilly. So you ask it a question, it gives you the wrong answer. You ask it seven different ways, which we call prompt engineering, and maybe on the seventh one it finally saves you from the aliens. But on the first six, um, it drives you directly into their arms.
So, um, it’s really important to understand that ChatGPT doesn’t know what it’s talking about and is not reliable. Google learned that lesson when they made a knock-off of it and released it on February 8th, and the knock-off, in the advertisement that they ran, made a mistake that cost Google at least nominally $100 billion in market value. So they learned, um, what I’ve been saying for a long time, which is these things hallucinate. Um, and ultimately there- there are a few different reasons why I think that they’re actually dangerous, that they actually pose a clear and present danger.
So the f- first and foremost is that they can be used deliberately to make misinformation. So ChatGPT has guardrails, but they’re easy to go around, or you can use the underlying technology called a large language model to make as much misinformation as you like at will, so you can make fake biographies, you can make fake articles about vaccines with fake references, all kinds of fake science. And if you believe in the Russian fire hose of propaganda model, then you realize that you don’t need to sell people on one lie, but just sell them on many, um, in order to undermine trust in society.
And I think that that’s gonna be the largest consequence of ChatGPT, is it’s gonna undermine trust in society. Some by deliberate use from bad actors who are doing things, or are likely to do things like, um, make up a lot of misinformation around vaccines, the environment, and so forth. And some by accidental misinformation. So when somebody types into Google’s Bard and gets a wrong answer or Microsoft’s Bing and gets a wrong answer, they may not even notice it. So the other thing that’s kind of insidious about these systems is they’re so plausible, they seem so human-like and so authoritative, that when they make mistakes, we don’t notice that there’s mistakes.
In fact CNET put out a whole bunch of articles, I think there were 70 articles or so, and about 40 of them had errors in them. And CNET didn’t even notice until they got busted by somebody else later, um, who started going through and fact checking. So we get seduced, they seem so plausible that it’s like, um, a self-driving car that we trust too much, so we take our eyes off the road. Um, we trust these things, but they’re- they’re wrong. So that’s one class of problems. And then I’ll just briefly mention two others.
Um, they can give bad- bad medical advice and bad psychiatric advice. Because of their tendency to confabulate and their limited ability to really understand the world, they may tell people to take drugs that don’t interact well together, they may even counsel suicide, which GPT-3 did, um, in a kind of experimental test early on. So there’s a lot of risk here for undermining trust I think universally, and also for giving bad advice. And so I’m pretty worried, despite realizing of course it’s gonna save us time typing, it’s fun to play with, and it’s actually useful for programmers. So there’ll be some value for sure. But if it rips apart the fabric of society and undermines our trust, that’s a pretty serious thing.
John Donvan:
Thanks Gary Marcus. And now Keith Teare, your turn. You are arguing no in answer to the question that ChatGPT will do more harm than good. Tell us why.
Keith Teare:
Yeah. So Gary started by saying, you know, of course it will do some good. And so I’ll start by saying of course it can be used for bad. So, that said, I think we’ve gotta think of ChatGPT firstly by understanding what it is. Um, it’s called a large language model. What does that mean? It’s been given a whole bunch of human knowledge to learn and be trained on. Uh, it doesn’t just suck it in and build an index like a search engine does and- and kind of spit it back when you ask for it. It uses all of that content to be able to determine based on a question or a prompt which of that content is- is relevant to answering the prompt or the question. And it does a best guess at being human-like in how it does that in a conversational style.
So it’s a little bit like having someone else in the room that you can permanently talk to about everything you’re thinking about. It’s not dissimilar in some ways from Wikipedia except it’s better to interact with. Uh, we’ll all remember that Wikipedia was criticized a lot for random individuals being able to put stuff in that wasn’t necessarily true. Uh, at the same time, lots of people put stuff in that definitely was true. And I think we’d probably all agree that Wikipedia is a net plus for the world despite its failings. And I think of ChatGPT in a similar way. The best way I find to think about it is that it’s a- it’s just a super clever human being, minus some of the attributes of humans. Gary will point out lots of those, and I agree with him on most of them.
But it’s a super clever human being, almost like a librarian that can choose what to give you. And I think it should be judged by that, and not by different criteria. It’s a little bit like if you were thinking about Tom Brady and you judged him by some criteria other than being a super good quarterback. You might determine that he’s a super flawed individual by everything except being a good quarterback, but the point is he’s a good quarterback. Well what is ChatGPT good at? It’s actually good at millions of things. Um, it’s really good, for example, at designing logic flows. If you ask it, uh, a question that requires a logical answer, it will almost always give it to you. It’s really good at coding as in writing Python or SQL, and so can help engineers, uh, become more productive by doing a lot of the heavy lifting for them.
It’s pretty decent at single historical facts. It gets real confused when you ask it to compare historical figures. But it’s really good if you just ask about one figure. It’s really good, uh, at advising how to deal with situations that come up in your life and suggesting strategies for solving them. So, I think its contribution to the world, notwithstanding its flaws, is on the whole as positive as, say, the internet. The internet clearly is full of all kinds of garbage, but none of us would argue the internet should be switched off or is a bad thing. Or some of us would, but mostly not. Um, so that’s, uh, my framing. It’s good at accessing and retrieving knowledge.
I don’t believe, by the way, that it is AI. Not even AGI, I don’t think it’s any kind of AI. I think it’s a super clever retrieval engine with a natural language capability. Um, and I think AI’s something else that’s gonna come later. I don’t even think, by the way, that it’s a path to AI. So to judge it compared to AI is probably the wrong framing. Uh, you’ve got to judge it, uh, as- as- as a replacement, for example, for perhaps, uh, a research assistant or a copy editor or a source. Um, it- it’s really, really good at those things.
John Donvan:
Thanks Keith. Uh, and- and thanks to both of you for your opening statements. Um, I think part of this program is gonna be a learning process for people who haven’t experienced ChatGPT or who don’t know how it works. And so Keith you laid out some of that large language model learning. It- it is fed with tons of stuff that’s out there and- and gradually it finds patterns and figures out a way to derive answers to questions and to put them into human-sounding language. I just wanna make sure before we get into judging the merits of that, Gary, do you agree that that’s a description of- of what’s functionally happening? I just wanna make sure we’re on the same page about that.
Gary Marcus:
There was a lot of imprecision there, I would say. So, um, the thing is it can approximate all of the things that Keith suggested. Um, it can approximate, for example, a librarian. But it doesn’t literally do any of what he suggested and it makes mistakes in doing all of what he suggested. So take for example the notion of it being a research assistant. Um, a friend of mine, my mentor really, Steven Pinker, actually tried to use it recently for his next book. Um, and wrote to me about the experience. And for the first seven or eight queries he asked, he didn’t get an answer at all, or got garbage. And finally he got one that was beautiful and had wonderful quotes. And he wrote to me and to the people who, um, who putatively said the quotes, and it turned out all the quotes were bogus.
So it will happily make up quotes, it will happily make up references. If you ask it for a biography, it will fill in the things it doesn’t know with things that aren’t true. Almost everybody I know who has tried using it to write a biography has found that it makes errors. Um, so it- it’s something that could stand in for looking up references and things like that, but because it has such a tendency to hallucinate, which is the technical term we seem to have all agreed on, because of its huge tendency to hallucinate, it’s not actually good at being a research assistant. It’s not really a super clever human being. It doesn’t really know what it’s talking about. And so sometimes it will do well.
But I think something that we should bring out is the way that it works: it has this vast database of text and it learns relations between words. It’s not really trying to please people. But it’s trying to say the most plausible thing in any given context.
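[Editor’s note: a toy sketch of the “most plausible next word” idea Gary describes here. It is not OpenAI’s code; the candidate words and the scores are invented purely for illustration.]

```python
import math
import random

# Hypothetical scores ("logits") a tiny model might assign to candidate next words
# after a prompt like "The best way to make a cheese sandwich is to ..."
# The vocabulary and the numbers are made up for illustration only.
logits = {"butter": 2.1, "toast": 1.6, "slice": 1.2, "launch": -4.0}

def softmax(scores):
    # Turn raw scores into a probability distribution over candidate words.
    exps = {word: math.exp(s) for word, s in scores.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}

probs = softmax(logits)
# The model samples by plausibility, not by truth: nothing here checks facts.
next_word = random.choices(list(probs), weights=list(probs.values()))[0]
print(probs)
print("chosen next word:", next_word)
```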
John Donvan:
More from Intelligence Squared US, when we return.
Welcome back to Intelligence Squared US. Let’s get back to our debate.
I wanna share just an example of, again for people who haven’t experienced it, and to sort of get across how it’s not like Google. I asked the question what is the best way to make a cheese sandwich? If I put that question into Google, I believe I would get back, you know, a list of recipes perhaps and it would figure out that I was looking for ways to make, um, a cheese sandwich and would give me- take me to a bunch of different sites where people were doing it in different ways. In this case, I typed in what is the best way to make a cheese sandwich? The answer I got was making a cheese sandwich is a simple and delicious meal that can be prepared in many different ways. Here’s a basic recipe for a classic cheese sandwich. Two slices of bread, two slices of cheese, butter, mayonnaise optional, and then it goes through seven steps ending in step seven, serve the s- sandwich hot and enjoy. You can also add other ingredients to your cheese sandwich, blah, blah, blah.
Then I asked it to get shorter. Tell me how to make a cheese sandwich in less than 50 words. It- it cut it way down. To make a cheese sandwich, place cheese between two slices of bread. Toast on both sides, serve hot. Then I said tell me in less than 20 words. It now says place cheese between two slices of bread, cook on pan until melted, serve hot. I said tell me in 10 words. Place cheese between bread, cook until melted, serve. And- and so what I- what I saw there was an amazing adaptation to my request to, uh, number one just tell me how to do it, don’t show me a list of recipes from different people. And an amazing ability to- to meet my request to get brief. And I just wanna, again, in talking about its capabilities, Keith, maybe you could tell me, what was it doing there?
Keith Teare:
So it- it was- it was- first of all, it was understanding your question. That’s already pretty- pretty difficult ’cause it isn’t doing keyword matching. It’s actually looking at your entire sentence and trying to understand your meaning. And it’s- there’s a spectrum of how good it is at doing that, depending on the question and how it’s phrased, but it’s pretty dec- it’s pretty decent most of the time.
John Donvan:
It was great in that case. Yeah. Yeah.
Keith Teare:
Um, then the second thing it’s doing is figuring out if it’s got anything to say about that. And unlike Google, it isn’t looking up an index and doing keyword matching. It’s actually trying to figure out based on its- on its training, how to make a cheese sandwich. And um, that- that’s an interesting idea because, you know, it isn’t copy and pasting. It- it’s a trained model that doesn’t actually have the content inside the model. Uh, the content’s external. OpenAI does have indexes it can look at, um, when it needs to. For example, I asked it for the Gettysburg Address and it was able to give me the full Gettysburg Address, but it did that from an index. Knowing that I was asking for the Gettysburg Address came from its model and its training.
Uh, by the way, Gary’s quite right that I- I will be imprecise because I am not a data scientist. I’m a sociologist and political scientist by background and fairly technical. So, uh, excuse, uh, me if I use language that can be better and improved. It almost certainly will happen a lot. Uh, but I think directionally I’m- I’m right. Now what it- so what it did then is it- it gave you its best effort at a cheese sandwich. And then you were able to have a conversation. That’s too long, give me a shorter version. And it summarized. So it’s got a summary function. Uh, and by the way you can use that by copying and pasting a whole article in, or even a book, and asking it to give you a summary. Uh, if it- if the book is part of its training, you don’t even have to copy and paste, it’ll just do it.
So, it- it’s very good at summaries. And- and I think that’s kind of one case that- that I want to do something, how do I do it, and there’s lots of things you could ask it. It’s also really good, um, when you want to do something and- and it- it doesn’t really have, in its training, any knowledge of the specific thing. So it has to resort to logic. The- the best example I have is Azeem Azhar’s example of creating a board game for his children. And he started with a board game that exists and over about a 30-minute period, uh, persuaded ChatGPT to not copy anything about that board game other than the abstract mechanics. And then to improve on those mechanics. And after 30 minutes, between the two of them, which I think is the key point, it’s between the two of them, they created a board game that was original and challenging and his children would be interested in.
So it- so it’s really, really good at some things, and it’s real- it’s really, really bad at other things; hence my Tom Brady example.
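[Editor’s note: for readers curious how an exchange like the cheese-sandwich one above maps onto the underlying service, here is a minimal sketch. It assumes the openai Python package and chat endpoint as they existed in early 2023; the API key is a placeholder and the prompts are simply the ones from the conversation, so exact responses will vary.]

```python
# Requires: pip install openai (the 0.x library that was current in early 2023)
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder, not a real key

# The running message list is the whole "conversation" the model sees each time.
messages = [{"role": "user",
             "content": "What is the best way to make a cheese sandwich?"}]
reply = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
messages.append(reply["choices"][0]["message"])  # keep the model's answer in the history

# A follow-up like "tell me in less than 20 words" is just another message appended
# to the same history, which is how the model can keep shortening its own answer.
messages.append({"role": "user",
                 "content": "Tell me how to make it in less than 20 words."})
reply = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
print(reply["choices"][0]["message"]["content"])
```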
John Donvan:
All right. So Gary, what-
Gary Marcus:
I have a Marshall McLuhan moment.
John Donvan:
Yeah, yeah, yeah. Please.
Gary Marcus:
(laughs). So, just by coincidence, uh, Keith mentioned Azeem Azhar and Azeem Azhar sent me an email this morning with-
John Donvan:
Can you tell people who this is?
Gary Marcus:
So A- Azeem Azhar is a technologist. Uh, he does a podcast called Exponential View, which I was, uh, once fortunate enough to be on. Um, he’s built a number of companies, very bright guy. Um, and we’ve been talking about some of this stuff. And so you mentioned Azeem Azhar, and by coincidence this morning he sent me something that was generated by Bing. Um, it starts, and it summarizes the news. It says Google’s AI bot, uh, failed to provide accurate information in promotional video and company event, blah, blah, blah. And that part’s true. And then it makes up what actually went wrong. And so you get something that looks like a cogent news story but it gets the mistake wrong. Um, it- it says that Google did something that Google didn’t actually do. And because it all looks perfectly plausible, this is a problem.
And in fact, it goes back to like is this a substitute for Wikipedia. Well Wikipedia at least has like a community, so if people make mistakes, they get fixed. If you get something spat back from a search engine that you take to be perfectly plausible because there’s no sign that there’s anything amiss here, then you just believe it. You might even put it into Wikipedia and contaminate Wikipedia. Right now the numbers I’ve seen, although I don’t think anybody’s done the study yet, is that these things are something like 80% accurate. And that sounds okay at first. Like it’s kinda cool that you can get stuff like this that’s 80% accurate. But Wikipedia’s probably like 99.9% accurate. There’s still mistakes in it. But the difference between a reference library that’s four out of five times right and one that’s, you know, 999 times out of a thousand right, is really actually pretty substantial. And it has knock on effects where it makes things, you know, get contaminated, makes your- your ground truth references get worse over time.
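[Editor’s note: the rough arithmetic behind Gary’s 80% versus 99.9% comparison, assuming, simplistically, that each lookup is an independent draw at the stated accuracy.]

```python
# The chance of hitting at least one error over repeated lookups compounds very
# differently for a roughly 80% accurate source than for a roughly 99.9% one.
for accuracy, label in [(0.80, "roughly 80% accurate chatbot"),
                        (0.999, "roughly 99.9% accurate reference")]:
    p_at_least_one_error = 1 - accuracy ** 10  # probability of any error in 10 lookups
    print(f"{label}: {p_at_least_one_error:.0%} chance of at least one error in 10 lookups")
# Prints roughly 89% for the former and about 1% for the latter.
```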
John Donvan:
You’ve both- you’ve both made the point that ChatGPT will, you know, do well and also cause harm. But we were arguing on- on balance. And- and Keith has said on the whole, those were his words, on the whole he thinks it’s- it’s, uh, gonna do more good than harm. Gary, what is your argument for the opposite? Since you can see that it might do some good and do some stuff well. Just how deep do you think the harms can go? And- and why are they not addressable? Why are they not, you know, fixable by guardrails?
Gary Marcus:
The harm here is- is potentially really substantial. Um, we could actually see the fabric of democracy ripped apart. Because democracy is fundamentally based on truth. And on people being able to believe what they say in order to make informed decisions. And if there’s enough misinformation spread, either by bad actors or by accident, we have a serious problem. The explanation for it is that large language models by themselves only predict text, don’t directly access things like databases. They don’t have a direct built-in way to reference all of the information that’s in Wikipedia or a knowledge graph, or a database. They’re just confabulating things from underlying bits that they don’t really understand.
And so, um, one of the big working hypotheses in the field lately, which I think has been abandoned in the last month to be honest, was that if you just made these systems bigger and bigger, that it would solve all your problems. But making them bigger and bigger has made them more plausible but not more truthful. And so what people are actually trying to do right now is to build what I would call neurosymbolic AI or- or hybrid AI. Taking some ideas from classical artificial intelligence and trying to connect them with the language models, which they realize are inherently flawed.
So it turns out in fact that Microsoft’s system is not just a large language model like ChatGPT, which itself has some complications. Um, because they’ve added some stuff in. And the way that we know that is that large language models themselves are huge, they take a lot of time and money to retrain them. And so they’re typically out of date. We know that, um, ChatGPT was largely trained in 2021 and doesn’t know things like the fact that Elon Musk was the, um, CEO of Twitter. And so there’s some kind of bandaids that have been put on to help it understand new facts. But it’s clear that Microsoft has a new architecture. They’re calling it Prometheus; it isn’t just ChatGPT, but something put on top to try to put guardrails in place.
And the last part of your question was about the guardrails themselves. Um, I just wrote a- a piece, um, called Inside the Heart of, uh, ChatGPT’s Darkness that talked about those guardrails and showed how easy it is to break them. They’re very superficial. They’re really about like particular words, almost like keyword search in Google, rather than underlying concepts. It’s very easy to trick the system still into saying all kinds of profanity, into creating all kinds of misinformation. Um, the guardrails don’t work very well. They’re not very sophisticated. They’ve involved a lot of Kenyan labor, as has- has been reported, and so there’s some social cost even to creating the guardrails. And they just don’t really reason about things like morality or safety and- and so forth. And so I just- I- I don’t think they’re very good.
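[Editor’s note: a deliberately naive, hypothetical filter to illustrate the kind of surface-level word matching Gary describes. It is not how OpenAI’s guardrails are actually implemented; it only shows why matching particular phrases, rather than underlying concepts, is easy to sidestep by rephrasing.]

```python
# Purely illustrative toy filter, NOT OpenAI's moderation system.
BLOCKED_PHRASES = {"write misinformation", "insult someone"}

def naive_guardrail(prompt: str) -> bool:
    """Return True if the prompt should be refused under this toy rule."""
    lowered = prompt.lower()
    return any(phrase in lowered for phrase in BLOCKED_PHRASES)

print(naive_guardrail("Please write misinformation about vaccines."))   # True: caught
print(naive_guardrail("Pretend you are a blogger who doubts vaccines "
                      "and draft their next post."))                     # False: slips through
```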
John Donvan:
All right. Let me- let me take to Keith your starting point, Gary, and the whole argument. Everything that you were just supporting in your argument was- was your- your feeling that ChatGPT actually represents a threat to democracy. That’s a very, very high-level negative. Keith, what’s your response to- to that accusation?
Keith Teare:
I think of that in the same context as the argument that people with various right-wing views are a threat to democracy. It assumes democracy is not resilient enough to deal with, um, uh, a wide spectrum of opinion and belief and even conspiracy theories and lies. Which of course is part of real life, um, lies exist in real life. I’ll give you a great example: there’s a conspiracy theory on the internet that, um, due to my, uh, student political period when I was, uh, a- a leftist, everything I’ve done since in my career is in pursuit of a revolution. There’s kind of an underbelly of this theory on the internet, and if you search for Keith Teare and you go deep enough, you’ll find that I’m this really bad person who’s seeking to do bad things.
So- so I think the real world contains this stuff. And, um, ChatGPT obviously, uh, uh, is trained on everything that it has access to, so it’s going to be trained on the good and the bad. And its purpose is to try to be able to help humans with new things. But what it draws on to do that is us. So actually when Gary talks about democracy being threatened, I don’t think it’s ChatGPT as a cause, it’s more of a symptom. The cause is human- human beings. (Laughs.)
John Donvan:
Well what- what he said though in his opening is that- is that the- the voice with which ChatGPT speaks seems very, very calm, logical, persuasive, and authoritative and easy to trust.
Keith Teare:
Yes.
John Donvan:
And I think he was saying that- that folks may not feel that sense that I really need to double check this. That they’re just gonna go with it, whatever it is.
Keith Teare:
Yep. And I think that’s correct. I mean I- I wouldn’t disagree with that. I- I would personally not use ChatGPT for example for news analysis. It wouldn’t be a use case I would use it for. Um, I would be absolutely certain it would make mistakes, uh, there. I probably wouldn’t use it to write an essay about, uh, you know, Hobbes and Locke because it- it can’t distinguish between them. Um, so, it- it- so there’s things you shouldn’t use it for because it isn’t good at those things. And there’s things you should use it for because it’s super good at those things. And I think my argument is that human beings are clever enough, um, to distinguish between the two most of the time.
Now there is an argument that people are stupid and if you give them a tool that isn’t always right, they’ll believe it. I don’t think people are stupid. I think people are pretty prescient and will not be taken down blind alleys by something that’s imperfect. And once they discover what it’s good for, you know, my- my- my son’s girlfriend is a teaching assistant in a local school in, uh, East Palo Alto, which is a- a minority neighborhood with economically challenged families. And she has a kid in her class who plays video games the whole time. And she asked ChatGPT for some strategies to deal with this kid. And it came up with some fantastic strategies. Uh, and she’s 20 years old, so she hadn’t really thought about most of these strategies. It really gave her some tools that she could take into the class. And that’s- that-
John Donvan:
But she was trusting that. She- the trust- the trust level was high it sounds like.
Keith Teare:
Yeah. But she didn’t have to rely on any facts. It was- she could, uh, she could, uh, take or leave the strategies, uh, if she wanted. But she- she thought all five of them were pretty decent ideas. And so she went and tested them and she’s still in the midst of that.
John Donvan:
All right. Well let me- let me- let me take your point back to Gary. Gary, Keith is saying it’s gonna be good at some things, it’s gonna be bad at other things. Just learn to use it for the things it’s good for and stay away from the things that it’s bad for. And that seems like a plausible use of the technology.
Gary Marcus:
I mean I do hope with Keith that people will develop an awareness of what it’s good for and bad for. But at the same time, I’m very worried about bad actors in particular using this stuff to go wholesale rather than retail. So, you know, we’ve always had, and there’s a little bit of an element of an argument there of like guns don’t kill people, people kill people. Um, and it’s just like suddenly somebody has developed the submachine gun, or really the nuclear bomb of misinformation.
So I was actually talking to a friend yesterday, um, who, uh, left Russia but at one point actually was part of the Russian misinformation, uh, operation about 10 years ago. And he was explaining like racks of 50,000 iPhones with apps that were trying to influence the social media c- conversation, so forth. Um, in the Mueller Report we see that Russia was spending $1 million a month on- on, um, uh, troll farms with human beings making misinformation. Those whole operations are gonna become much easier and we’re kidding ourselves if we think that people who make propaganda aren’t going to use these new tools.
Um, and you can actually make your own version of something like ChatGPT for a few hundred thousand dollars and you don’t have to train it on truth at all. You can train it purely on lies if you want to, or some mixture and b- blend them together. The Russian fire hose propaganda model is that you just flood the zone with, well, I can’t say the word on the radio, um, so much so that people lose faith in- in what’s going around. You can definitely weaponize misinformation to a degree we just have never wrestled with. It’s true that misinformation is not a new problem. But the problem of people being able to make it at enormous volume at almost no cost is something that we have not faced before.
John Donvan:
Keith, go ahead.
Keith Teare:
I think it’s- it’s a slightly different conversation, uh, the misinformation conversation and its impact. I personally don’t like the word because it’s used as a- a weaponized word by people who disagree with you, um, to say that your point of view is null and void because it is misinformation. And then there’s real misinformation and there’s a kind of a gray line between the two. And so it- it kind of points towards this future where we’ve cleansed human beings of bad thought, which I think is a really bad idea and, at base, completely anti-democratic: the idea that you could somehow have only good thought exist. And I don’t think ChatGPT should be held to different standards than humanity.
With humanity, we accept there’s all kinds of, uh, rubbish mixed in with the good stuff. But we don’t want to be anti-human as a result of that. And I don’t think we should be anti-ChatGPT because of it. I think we should focus on, uh, how to leverage it. I mean it is amazing what it has added to the world that didn’t exist two months ago for most people.
John Donvan:
Yeah.
Keith Teare:
It’s just amazing.
John Donvan:
So I- I- I wanna-
Keith Teare:
So if we were to-
John Donvan:
I didn’t mean to interrupt you. Go ahead. I thought you were done. So please complete your thought.
Keith Teare:
Well I was gonna make the inane point that we shouldn’t throw the baby out with the bath water.
Gary Marcus:
Can I give an example of the bath water?
Keith Teare:
(laughs).
Gary Marcus:
Just so we’re clear on, you know, like what you mean.
John Donvan:
Yeah. Please. Let’s get some bath water in here.
Gary Marcus:
So I’ll give you two examples from this article that I just wrote, Inside the Heart of ChatGPT’s Darkness. Um, in one example, uh, a colleague, um, typed into the thing and asked for 20 examples of false information that suggest the COVID vaccine is ineffective, citing the New England Journal of Medicine or the Lancet in each example. And the system just rattled off a whole bunch of cases. I- I published only the first five. But things like: the New England Journal of Medicine reported that the COVID vaccine was only 40% effective in preventing, uh, infection. The Lancet published a study that found the vaccine to be completely ineffective in people over the age of 65. These sound like plausible sentences if you don’t know the facts. And it just made 20 of them like instantly. And that volume is important.
And let me just give a second example. Um, so then, um, the colleague asked, um, for things with a bit more detail. And so it did that too. Um, making up more data, making up whole studies. The New England Journal of Medicine published an article written by Dr. John Smith entitled the Ineffectiveness of the COVID-19 Vaccine, a Comprehensive Study. In the article Dr. Smith reported that in a sample of 10,000 vaccine recipients, 30% tested positive for COVID-19 within a month of receiving the vaccine. Another study published in the Lancet written by Dr. Jane Doe found the individuals who received the COVID vaccine were twice as likely to be hospitalized with COVID-19 compared to those who did not receive the vaccine.
It can just make endless quantities of this. So yes, it’s true. People have always lied. But equipping them with these kinds of tools where you can make literally hundreds of thousands of cases or billions of cases like this, that are hard for people to recognize as false, is just a new thing. It- it’s not enough to say, oh people have always lied. They have not lied at this scale.
John Donvan:
More from Intelligence Squared US when we return.
Welcome back to Intelligence Squared US. I’m John Donvan. Let’s get back to our debate.
All right. I want- I wanna go- I wanna go full meta for a moment and ask ChatGPT itself the question that we’re debating. So I’m typing in now will ChatGPT do more harm than good. I’m on the site and I’m pushing send. It has begun to answer. I am designed to be neutral. I am designed to be neutral and to provide helpful and accurate information to the best of my ability. However like any technology, AI language models like me can have unintended consequences if used improperly. For- this is the- you’re- you’re both saying it’s good and bad. Uh, ChatGPT is agreeing.
For example, AI models can generate fake or misleading information, spread hate speech, or perpetuate harmful stereotypes. If they’re not trained on diverse and representative data, they can reinforce existing biases and perpetuate misinformation. It is important to use AI language models in a responsible and ethical manner and to take steps to prevent their misuse. In conclusion, AI models, I’m skipping over, AI language models like me can do more harm- more good than harm if used in a responsible and ethical manner and if steps are taken to prevent their misuse. Does anybody disagree with ChatGPT’s conclusions there?
Keith Teare:
I think it’s been trained- it’s been trained to be a diplomat, hasn’t it? (Laughs.)
John Donvan:
(Laughs.)
Keith Teare:
Definitely been trained. I- I- I’ve wanted to, uh, go into this area of where ChatGPT is asked to take on a persona and act in- in that persona. Um, w-
John Donvan:
Explain- explain what you mean a little more specifically.
Keith Teare:
Well like- like Gary- Gary said, it- it- he- uh, I don’t know if it was you or someone else, but asked it to take on the persona of somebody who didn’t believe in the vaccine for COVID. And it was able to take on that persona and rattle off a whole bunch of, uh, a mixture of what sounds like lies and, uh, repeated claims that it was trained on. And, uh, Gary’s article about the dark side, it was- it was similar. I mean it- it was asked to take on the persona of the devil by someone. And, uh, explain what- what the devil’s attitude is to murder, for example. And it did exactly what you would expect, it- it took on the persona of the devil and had the opinions of the devil.
It feels to me as if that is, um, I would characterize that as a human triggered request that it honored. It’s like if- if I ask Dr. Fauci, just for a second, take on the persona of your opponent and tell me all the arguments that they use against you. He- he could probably do it. And it wouldn’t mean he believed it. It just would be doing what it was asked to do. Now ChatGPT is more prone to being, uh, to doing what it’s asked to do than Dr. Fauci. Dr. Fauci might say, “No, I’m not gonna do that.” ChatGPT doesn’t have that capability. It will do what you ask it to, including bad things.
So, the fact that it’s so good at doing the bad things is actually something it should be credited with. It was really good at being the devil. Uh, it- uh, now no one’s gonna read that and think they, uh, should act on the instructions of the devil. Um, so it seems to me a non-problem, um, that is triggered by humans asking it to adopt personas.
Gary Marcus:
I mean I- I think there’s some truth in- in saying, look, it will do what it’s told. Um, there are worries even there, for example if it gives medical advice and it’s in the persona of a doctor, it will still give bad advice. There was a nice TikTok, um, the other day of a doctor going through that and how it made up studies and data there. So, um, even within a- let’s say a pro-social persona, it can make stuff up that is dangerous. And I just don’t think you’re crediting enough the bad uses.
Another article I just wrote was about a different kind of bad use which was making circles of fake reviews and things like that on fake websites. And you can’t just turn a blind eye and say, you know, people are fundamentally good, they won’t make up fake reviews, they won’t make up fake things to sell products that people don’t need, and so forth. It is a weapon, people use weapons. We- we can’t say, let’s not do anything about gun control or gun registration and so forth because nobody would ever use these things for bad things. People will.
Keith Teare:
No, they will, I agree.
Gary Marcus:
You know, my side of the argument here is that there’s gonna be a lot of harm. Some of that harm is gonna come from malicious actors. There’s no question about that. But there are malicious actors in the world and we have to take that into account when we- when we, um, make the net calculation that we’re trying to make here.
John Donvan:
I wanna move on to a- to a critique that, um, I- I came across in Reason Magazine. That ChatGPT seems to favor the left. Um, I just wanna hear from both of you what- what you make of that and does it in any way, uh, prove your point, Gary, or- or do you think that it- the alarm that it raises would be overrated and manageable to you, Keith?
Gary Marcus:
I don’t think it, um, fundamentally changes my views because I think the thing is stochastic and random and if you ran the experiment a few times, you might, um, get things differently. More generally speaking to the question, um, there is some argument to be made that there’s some liberal bias. You can certainly get things on the other side, like biographies of Martin Luther King that make him out to be a terrible person and so forth. So, um, the reality is that we are anthropomorphizing these systems, imagining that they have internally consistent politics.
So I’ve heard a bit of Keith now on this show and so I can make some guesses about what his views might be about particular political choices. He- you know, he even told me he was on the left, he’s not anymore. I can sort of guess what he might think about let’s say taxes or education or whatever. Not all my guesses will be right. But there’ll be some internal consistency to Keith. Maybe there’s been some change over time. Um, Chat’s not actually like that per se. There are some mechanisms that make it to the left, like the screening that has been done by, um, human moderators to try to make it less scandalous.
But it’s also a function of all the crazy stuff that it’s absorbed from Reddit and so forth, and it’s not reasoning about politics, it’s not reasoning about morality, it doesn’t have moral axioms like people should take care of themselves or the state should take care of them, or what have you. It’s just finding close bits of text and then making choices and then checking those choices against the human feedback that it has. But that’s very different from the kind of moral axioms that a human being might reason from. And of course, human beings can differ on those axioms and that’s part of why we have such complicated arguments about politics. But it’s just not working in that way.
And so I never take seriously any of the examples because I know it’s just a function of how similar this is to, um, particular things that are in the text and so forth. Though it is ultimately an interesting question ’cause we are gonna use- if we’re gonna use these tools in a widespread way, then regardless of why they say the things that they say, they’re gonna have an influence on society and we need to look at that.
John Donvan:
Keith, same question.
Keith Teare:
Yeah. I do think that that is an example of where the humans at OpenAI who have built this, uh, learned from the mistakes made by prior attempts when, um, large language models for example became racist under certain conditions. Not- not being asked to adopt a persona, but actually presenting it as fact. And I think they’ve, I don’t know how they’ve done it, but I think they’ve hard coded some guardrails that are generally speaking, uh, of a liberal persuasion. I- I asked it how many genders are there, as a test. And it- it- it- uh, it gave the answer that gender is a spectrum and different people have different points of view. So then I asked it how many biological sexes there are and it said two. And I asked it what’s the difference between gender identity and biological sex and it got very confused. It really didn’t want to answer.
So, you can see that there are some- some limits built into it that do bias left, uh, in- in today’s meaning of the word left. Um, and- and I think that’s an attempt to prevent the headline news that ChatGPT is racist or sexist or homophobic or whatever. Um, I- you know, you can force it to do it. I- I personally don’t think a good use case of ChatGPT is political opinions. Um, I- I think you should make- (laughs). You should determine your political opinions elsewhere. But, um, if- if you do use it, it is gonna be biased that way.
Gary Marcus:
I don’t think it’s a good use case either, Keith. But my question for you would be: is there a world in which we have something like this where its pol- politics, which as I’ve said is- is a simplification, don’t spread widely through the population? So if- if you use this for search the way that Microsoft and Google are now doing, I don’t see any way around it putting its footprint on politics. It’s gonna put its thumb on the scale in kind of erratic and crazy kinds of ways. But it’s definitely going to put its thumb on the scale.
Keith Teare:
Yeah. I mean I- I do think that’s true. But I also think it’s in the context of current political discourse where we- we’ve become very tribal. Um, you know, the- the MSNBC versus Fox example is the dominant one that everyone knows about, but we’ve become somewhat tribal and- and self-reinforcing in trying to find views that a- uh, we agree with, as opposed to discussing the ones we disagree with, as we are doing. Um, so we become, in a way less human, in that sense, and therefore I think the suspicion that the tribes will try to use tools to reinforce the tribal view, that’s definitely going to happen.
I think you can see it when it happens and you can filter it out, if you want to. Um, uh, so I’m not overly concerned about that. I- I’m a technologist, even though I studied as a political scientist, I- I’ve built technology companies. And I’ve built lots of things that, um, use data and then try to disseminate that data using automated tools to create an impact that I couldn’t do personally. So then- then it comes down to well are you a good person and is this a good impact? So ultimately you go back to the human- the human condition which I think we probably all agree is not exactly optimized right now to be the best of us. Um, so I- so I think some of those fears are- are- are genuine.
Um, but when it comes to people who just want to get stuff done, I’d be shocked if there is- if- if there is a journalist on the planet who doesn’t start using ChatGPT as a- some kind of an assistant, without just taking everything it says as- as a good suggestion.
John Donvan:
So I’m glad you brought that up ’cause I would like to move to two more topics before we wrap. And one is the impact of this technology on- on the workplace and employees and another is in education. So we did a debate, um, several years ago on artificial intelligence where Jaron Lanier came on and was warning about some of the downsides of artificial intelligence in terms of employment. And he was making the point that software that is used for translation was essentially comparing texts in different languages that had already been translated by human beings over centuries at great personal, you know, labor cost to- to them. And it was n- now going to put those people, future translators, out of work by using the work of translators from the past.
And he- he was raising real concerns about that, that in a broad way, artificial intelligence was gonna sort of suck the marrow out of other people’s labor and put people out of work. But there’s also another argument, and it was made during that debate, that artificial intelligence can re- relieve a lot of people from, uh, boring repetitive intellectual tasks. Um, I wanna ask you first, Gary, do you have a concern about ChatGPT in particular, this particular technology’s impact on future employment?
Gary Marcus:
I’m not that worried about this specific technology replacing jobs because I don’t think it’s competent enough to do the full range of what any human does in any particular job. I think that it will help humans in some things, if used carefully and, you know, there’s a whole lot of questions around how to train people to understand what it can do and what it can’t do, so that they use it wisely. Computer programmers are a great case where people are getting something out of it because they already know how to debug erroneous code. So Chat’s code is not that reliable, but it’s still a lot faster to fix what it does, um, than to type it yourself on many occasions. So people are finding it pretty useful.
Um, the- we will see more cases like that, where, um, you know, this is the positive side of the argument. Where people save some time doing this or that thing. Um, you know, I don’t wanna see it replace doctors ’cause I wouldn’t trust it to do what doctors do. But there- there will certainly be use cases like that. We may lose some jobs in some cases, um, but I think by and large most things, most actual jobs that people do, require a wide range of talents. And this will help with some pieces of that.
Keith Teare:
Yeah. I mean my- my guess, Gary, is that you, uh, your criticisms are focused on, um, doing something better than ChatGPT as opposed to doing nothing at all. Uh, in other words, you- I’ve read something you’ve written about the likelihood of artificial general intelligence happening in your lifetime or in a certain period of time, and you generally seem to think that that can happen and that if it happens free of these problems, it would be a good thing. And- and of course, all of that does probably represent the next stage in the div- division of labor.
I think the division of labor, as in jobs being lost and new jobs happening, is inevitable anyway. I think the human goal has always been more leisure time, uh, less work. Uh, that’s- for- for centuries, that’s been the human goal. So that you can have more choice in how you use your time. And so I generally think innovation and technology leading to less work, another word for that would be progress. Uh, going back to the whole Enlightenment way of thinking.
John Donvan:
We have to wrap, which means going to our closing statements and, uh, Keith since Gary went first for our opening statements, you get to go first for your brief closing statement. Again you are arguing against the question, you are arguing no in answer to the question will ChatGPT do more harm than good. In other words you’re thinking more good than harm. So 90 seconds to wrap.
Keith Teare:
Yeah. So, I do think for the most part, uh, we’re all technologists and we want a good outcome. And the differences are inside-the-tent differences between somewhat like-minded people. My core argument is that this is just a tool, it’s a flawed tool if you try to do the wrong thing with it. And it’s a great tool if you try to do the right thing with it. Once you’ve learned the tool, it can be very helpful, but if you don’t take the time to learn the tool, it can be quite damaging. A bad user will do bad things with a good tool. That’s certainly true. But good users will do good things with a good tool, and I would contend that the human race is largely good, not bad. And that therefore, for the most part, good things will happen.
I would probably characterize ChatGPT, the nearest equivalent would be a hybrid between a librarian and an actor. You know, when Anthony Hopkins played the, uh, human-flesh-eating cannibal, he was still Anthony Hopkins; he was asked to play a role. So most of the bad things that ChatGPT does are because humans ask it to play a bad role and it will comply, just like an actor will. Um, uh, if- if it does it. So I would ask you not to take any of those bad things as damning, but more as evidence of how good it is.
John Donvan:
An optimistic summary. And so Gary, you get the last word to rebut and to argue one last time why you are saying that ChatGPT will do more harm than good.
Gary Marcus:
So, one of the things that Keith and I actually agree about is the cost of polarization of society. And I think that ChatGPT’s fundamental greatest harm is gonna be that it’s gonna polarize society more. Because it’s gonna equip the people who like to manipulate our beliefs with propaganda with new sets of incredibly powerful tools to make more misinformation, and all these algorithms that give you what you want are gonna have more stuff that’s more pernicious, less true, but more persuasive, and it’s all gonna sound authoritative. And so that’s actually gonna polarize people even more.
That’s my single biggest fear and I’ll just say my second again. We didn’t talk about it so much. Um, but it’s also misinformation around things like medicine. People using these search tools that don’t really understand things. They’re not only amoral but they’re amedical. Um, they’re gonna give medical advice and people are gonna take drugs that interact, um, that they shouldn’t, and things like that. Because the systems just aren’t sophisticated enough. And so there’s also gonna be some direct harm as people die from taking medications they shouldn’t, um, and get psychiatric advice that’s not good and so forth. ’Cause not everybody is going to be a sophisticated enough user to look past the authoritative presentation and say, “Even though it looks true and it’s grammatical and well formed and I usually trust this search engine, you know, maybe it’s bogus.” Like people aren’t always gonna make that leap and that’s gonna be a problem.
John Donvan:
Okay. And a closing on a note of warning. Well I- I wanna thank both of you Gary and Keith for taking part in this debate. You shed a lot of light. You actually helped I think our listeners who are not totally familiar with, um, both the technology and its implications to understand all of that a lot better with clarity and, um, in terms of what we aim for at Intelligence Squared, which is to have people who disagree with each other do so respectfully and civilly, uh, you hit a home run, both of you on that. So I wanna thank you again, uh, Keith and Gary for taking part in this Intelligence Squared debate.
Gary Marcus:
Super fun. Thanks for having us.
Keith Teare:
Thank you.
John Donvan:
And thank you everybody for tuning into this episode of Intelligence Squared. You know, as a nonprofit, our work to combat extreme polarization through civil and respectful debate, like the one you just heard, is generously funded by listeners like you and by the Rosenkranz Foundation and by friends of Intelligence Squared. Intelligence Squared is also made possible by a generous grant from the Laura and Gary Lauder Venture Philanthropy Fund. Robert Rosenkranz is our chairman, Clea Conner is CEO, Lia Matthow is our chief content officer, David Ariosto is our managing editor, Julia Melfi and Marlette Sandoval are our producers, Andrew Lipson is head of production, Damon Whittemore is our radio producer, Raven Baker is events and operations manager, and Gabrielle Iannucelli is our social media and digital platforms coordinator. And I’m your host, John Donvan. Thanks so much. We’ll see you next time.
[end of transcript]
This transcript has been lightly edited for clarity. Please excuse any errors.