ChatGPT and other AI-powered bots may soon be "running out of text in the universe" that trains them to know what to say, says Stuart Russell, an artificial intelligence expert and professor at the University of California, Berkeley.

Russell said that the technology that hoovers up mountains of text to train artificial intelligence bots like ChatGPT is "starting to hit a brick wall." In other words, there's only so much digital text for these bots to ingest, he told an interviewer last week from the International Telecommunication Union, a UN communications agency.

This may change the way generative AI developers collect data and train their technologies in the coming years, but Russell still thinks AI will replace humans in many jobs that he characterized in the interview as "language in, language out."

Russell's predictions widen the spotlight shone in recent weeks on the data harvesting conducted by OpenAI and other generative AI developers to train large language models, or LLMs. The data-collection practices integral to ChatGPT and other chatbots are facing increased scrutiny, including from creatives concerned about their work being replicated without their consent and from social media executives disgruntled that their platforms' data is being used freely.

But Russell's insights point toward another potential vulnerability: a shortage of the text used to train these models. A study conducted last November by Epoch, a group of AI researchers, estimated that machine learning datasets will likely deplete all "high-quality language data" before 2026. Language data in "high-quality" sets comes from sources such as "books, news articles, scientific papers, Wikipedia, and filtered web content," according to the study.