At London Book Fair, Publishers Urge Permission for AI Training

No prizes for guessing the hot topic for publishers attending the London Book Fair this week: AI.

AI – artificial intelligence – will transform the global economy, according to the International Monetary Fund. And as they have done since the digital age dawned, scholarly and academic publishers are already embracing this latest technology.

While they explore how AI can improve their business practices for the benefit of researchers and information professionals, many publishers are also proactively asserting their rights when generative AI solutions, which create text, images, and other media, use copyrighted material to develop and train large language models.

Publishers recognize that gen AI depends on high-quality content for success, and they contend the tech industry must seek permission for its use.


To open the show on Tuesday, CCC organized a panel discussion, “Publishing, Copyright & AI: Taking Action,” that included representatives of leading UK scholarly publishers.

Sarah Fricker, Group Head of Legal for IOP Publishing – the Institute of Physics, and Claire Harper, Head of Global Rights and Licensing at Sage, shared with me what they are doing to facilitate permission for copyrighted works to train AI models.

“This is work that’s been created, that’s original, that people have spent a lot of time putting together and publishing, and it only to me seems fair and reasonable that if tech companies want to use [published content] to improve their tools that they get the permissions they need from the copyright owners to do that,” Fricker said.

“And there are benefits on both sides,” she continued. “The big tech companies can be sure they’ve got the right version of the article, which is so important. But equally, there’s value placed on that work, and the copyright owner is getting recompense for that value. I think that is a very important balance.”

Harper also cited licensed content as trustworthy. “We want to make sure that the content going in [to train Large Language Models] is reliable. Are we going to trust what an AI tool is telling us if it’s just pulling from a random source on the internet? The internet is full of fake news. We know that. So if we can license the content in, we can be a bit more assured that the content coming out is going to be more reliable and accurate.

“As publishers, we’re trying hard to make sure that we get streamlined processes, we get ways of licensing to make this easy as possible, because that’s in our interest as well,” Harper told a standing-room-only audience at the London Book Fair.

“We want licensed content to go into it. We want to be remunerated for that content. So we want to make it as easy as possible. I think having a collective license would be great. When photocopying came about, that was very disruptive to the industry. But we found a way to license that. So I feel like we could do that in this case as well.”
