The Financial Times (FT) has granted OpenAI access to its news archives under a data-sharing agreement, part of a growing trend of AI companies securing private data sources. Under the deal, OpenAI’s chatbot, ChatGPT, will provide summaries, direct quotes, and links that lead back to the full articles on the FT’s website. OpenAI has also committed to working with the FT to develop new AI-driven products.
The FT’s CEO, John Ridding, hailed the partnership, saying it would benefit the FT and have wider implications for the industry. He stressed the importance of AI platforms compensating publishers for their content, adding that OpenAI understands the significance of transparency, attribution, and fair compensation. However, OpenAI’s understanding of transparency and attribution has itself been contested.
OpenAI’s COO, Brad Lightcap, said the partnership is about finding innovative ways for AI to boost news organizations and journalists, and about enriching the ChatGPT experience with world-class journalism for millions of people worldwide. Although the FT’s data is a valuable asset for OpenAI, the company’s existing datasets already comprise trillions of words whose ‘public’ or ‘open source’ labels are dubious.
The agreement comes as AI organizations like OpenAI acknowledge, under growing legal pressure, that they need to start paying for data. Such deals also benefit AI companies by keeping their models updated with fresh, high-quality data.
Nonetheless, this quest for data raises substantial ethical challenges. Technology giants, including OpenAI, Google, and Meta, have been accused of using questionable methods that potentially violate legal and ethical guidelines. For instance, OpenAI reportedly used Whisper, its speech recognition tool, to transcribe YouTube videos, despite possibly contravening YouTube’s policies.
Google and Meta, too, have allegedly explored or pursued strategies that evade or reinterpret prevailing copyright and privacy laws to gather more data. Controversial tactics include modifying privacy policies to permit AI applications to use publicly available content from platforms like Google Docs. So although AI businesses are now willing to pay for data, that does not prevent them from breaking rules in other areas.