Artificial intelligence has pervaded much of our daily life, whether it’s in the form of scarily believable deepfakes, online news containing “written by AI” taglines or novel tools that could diagnose health conditions. It can feel like everything we do is run through some sort of software, interpreted by some mysterious program and kept on a server who knows where. When will the robots take over already? Have they already taken over?
The recent developments in AI offer existential questions we’ve been wrestling with since we put pen to proverbial paper: Who wrote this, and can I trust it? Fake news is old news, but some still argue over whether Shakespeare existed or represented multiple authors. Large language models (LLMs) are combinations of authors, each with their own style, voice and expertise. If the generative AI program ChatGPT keeps trying—and we keep feeding it Shakespeare—will it write our next great tragedy?
Linguist Naomi S. Baron of American University has been wading in the AI waters for years. In her latest book, Who Wrote This? How AI and the Lure of Efficiency Threaten Human Writing, she dives into the crux of the matter: If we hand over the written word to AI, what will we lose? Scientific American spoke with Baron on the issue of the ownership and trustworthiness of written communication now that AI is on the scene.
[An edited transcript of the interview follows.]
Did you use ChatGPT to write any of this book?
Sort of but just a smidge. I completed Who Wrote This? in mid-November 2022, two weeks before ChatGPT burst on the scene. It was a no-brainer that I needed to incorporate something about the new wonder bot.
My solution was to query ChatGPT about the intersection of this cutting-edge form of AI with issues such as creativity, education and copyright. In the book, I quote some of ChatGPT’s responses.
When I asked ChatGPT if it could hold copyright on short stories that it authored, the answer was “no” the first time I asked and “yes” the second. The discrepancy reflected the particular part of the dataset that the program dipped into. For the “no” answer, ChatGPT informed me that as an LLM, it was “not capable of holding copyrights or owning any form of intellectual property.”
By U.S. copyright law, that’s true. But for the “yes” response, the bot invoked other aspects of U.S. copyright: “In order for a work to be protected by copyright, it must be original and fixed in a tangible form, such as being written down or recorded. If a short story written by GPT meets these criteria, [ChatGPT said], then it would be eligible for copyright protection.”
Consistency is the hobgoblin of large language models.
When thinking about AI-written news, is it all just a snake eating its own tail? Is AI writing just fodder to train other AIs on?
You’re right. The only thing relevant to a large language dataset is having text to consume. AI isn’t sentient, and it’s incapable of caring about the source.
But what happens to human communication when it’s my bot talking to your bot? Microsoft, Google and others are building out AI-infused e-mail functions that increasingly “read” what’s in our inbox and then draft replies for us. Today’s AI tools can learn your writing style and produce a reasonable facsimile of what you might have written yourself.
My concern is that it’s all too tempting to yield to such wiles in the name of saving time and minimizing effort. Whatever else makes us human, the ability to use words and grammar for expressing our thoughts and feelings is a critical chunk of that essence.
In your book, you write, “We domesticate technology.” But what does that “domestication” look like for AI?
Think about our canine companions. They descended from wolves, and it took many generations of evolution for some of those wolves to become dogs, that is, to be domesticated.
Social scientists talk about “domestication” of technology. Forty years ago personal computers were novelties. Now they’re ubiquitous, as are software programs running on them. Even Wikipedia—once seen as a dubious information source—has become domesticated.
We take editing tools such as spell-check and autocomplete and predictive texting for granted. The same goes for translation programs. What remains to be seen is how domesticated we will make text-generation programs, such as ChatGPT, that create documents out of whole virtual cloth.
How has your understanding of AI and LLMs changed how you read and approach writing?
What a difference three years makes! For my own writing, I remain old-fashioned. I sometimes still draft by hand. By contrast, in my role as a university professor, I’ve changed how I approach students’ written work. In years past I assumed the text was their own—not so today. With AI-infused editing and style programs such as Microsoft Editor or Grammarly, not to mention full-blown text-generation tools, at students’ beck and call, I no longer know who wrote what.
What are the AI programs that you feel are the least threatening, or that you think should be embraced?
AI’s writing ability is an incredible tour de force. But like the discovery of fire, we must figure out how best to harness it. Given the novelty of current programs, it will take at least several years to feel our way.
Today’s translation programs, while not perfect, are remarkably good, and the benefit is that everyday users who don’t know a language can get immediate access to documents they would have no other way of reading. Of course, a potential drawback is losing motivation for learning foreign languages.
Another promising use of generative AI is for editing human-generated text. I’m enthusiastic when AI becomes a pedagogical tool but less so when it simply mops up after the writer, with no lessons learned. It’s on users to be active participants in the composition process.
As you say in your book, there is a risk of valuing the speed and potential efficiency of ChatGPT over the development of human skills. With the benefit of spell-check, we can lose our own spelling proficiency. What do you think we’ll similarly lose first from ChatGPT’s ability to write legal documents, e-mails or even news articles?
As I argue in my book, the journalism business will likely feel the effects on employment numbers, though I’m not so much worried about the writing skills of the journalists who remain.
E-mails are a more nuanced story. On the one hand, if you use Microsoft Outlook or Gmail, you’ve already been seeing a lot of autocomplete when you write e-mails. On the other hand, the new versions of AI (think of GPT-4) are writing entire e-mails on their own. It can now literally be my bot writing to your bot. I worry that the likes of ChatGPT will lull us into not caring about crafting our own messages, in our own voice, with our own sentiments, when writing to people who are personally important to us.
The copyright infringement cases are interesting because we really are in uncharted territory. You’ll remember the case of Authors Guild v. Google, in which the guild claimed Google Books enabled copyright infringement when it digitized books without permission and then displayed snippets. After many years of litigation, Google won ... under the ruling of fair use.
From what I’ve been reading from lawyers who are copyright experts, I suspect that OpenAI [the company that developed ChatGPT] will end up winning as well. But here’s the difference from the Authors Guild case: With Google Books, authors stood to lose royalties because users of Google Books were presumably less likely to purchase copies of the books themselves. With ChatGPT, however, if a user invokes the bot to generate a text, and then said user looks to sell that text for a profit, it could be a different ball game. This is the basis of cases in the world of generative art. It’s a brave new legal world.