A recent article in The Guardian stirred up a lot of excitement—and a little fear—on social media. The reason: The initial draft was reportedly written by GPT-3, OpenAI’s new text generator.
Since its beta release, GPT-3, an artificial intelligence system that takes a text prompt and generates text to continue it, has captivated the tech community and the media. Developers and computer scientists have been using it to write articles, website markup, and even software code. Some entrepreneurs are contemplating building new products on top of GPT-3.
While flawed in fundamental ways, GPT-3 still shows how far natural language processing has advanced. It is by far the largest and most coherent text-generation model created to date.
But it also highlights some of the problems the AI research community faces, including its growing dependence on the wealth of large tech companies. This is a problem that could endanger the scientific mission for which OpenAI and other AI research labs were founded.
The Cost of GPT-3
GPT-3 is a massive deep-learning model. Deep learning is a type of AI system that develops its behavior through experience. Every deep learning model is composed of many layers of parameters that start at random values and gradually tune themselves as the model is trained on examples.
Before deep learning, programmers and domain experts had to manually write the commands that defined the logic and rules to parse and make sense of text. With deep learning, you provide a model with a large corpus of text—say, Wikipedia articles—and it adjusts its parameters to capture the relations between the different words. You can then use the model for a variety of language tasks such as answering questions, automatic email-reply suggestions, and advanced search.
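To make that concrete, here is a minimal sketch of such a training loop, written in Python with PyTorch. The toy next-word model and one-sentence corpus are stand-ins chosen purely for illustration; GPT-3's actual architecture and training pipeline are vastly more elaborate.

```python
# Minimal sketch of how a deep-learning language model tunes its
# parameters on a text corpus. Toy model and toy data; GPT-3's real
# pipeline is vastly larger, but the principle is the same.
import torch
import torch.nn as nn

corpus = ["deep learning models learn their behavior from examples".split()]
vocab = {w: i for i, w in enumerate(sorted({w for s in corpus for w in s}))}

# A tiny next-word predictor: word embeddings followed by a linear layer.
model = nn.Sequential(nn.Embedding(len(vocab), 16), nn.Linear(16, len(vocab)))
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(100):                      # repeatedly revisit the examples
    for sentence in corpus:
        ids = torch.tensor([vocab[w] for w in sentence])
        inputs, targets = ids[:-1], ids[1:]   # predict each next word
        logits = model(inputs)
        loss = loss_fn(logits, targets)       # how wrong were the predictions?
        optimizer.zero_grad()
        loss.backward()                       # how should each parameter change?
        optimizer.step()                      # nudge the parameters accordingly
```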
Research and development in the past few years has shown that in general, the performance of deep-learning models improves as they are given larger numbers of parameters and trained on bigger data sets.
In this respect, GPT-3 has broken all records: It is composed of 175 billion parameters, which makes it more than a hundred times larger than its predecessor, GPT-2. And the data set used to train the AI is at least 10 times larger than GPT-2’s 40-gigabyte training corpus. Although there’s much debate about whether larger neural networks will solve the fundamental problem of understanding the context of language, GPT-3 has outperformed all of its predecessors in language-related tasks.
But the benefits of larger neural networks come with trade-offs. The more parameters and layers you add to a neural network, the more expensive its training becomes. According to an estimate by Chuan Li, the Chief Science Officer of Lambda, a provider of hardware and cloud resources for deep learning, it would take around 355 GPU-years and about $4.6 million to train GPT-3 on a server with a single Nvidia V100 graphics card.
“Our calculation with a V100 GPU is extremely simplified. In practice, you can’t train GPT-3 on a single GPU; you need a distributed system with many GPUs, like the one OpenAI used,” Li says. “One will never get perfect scaling in a large distributed system due to the overhead of device-to-device communication. So in practice, it will take more than $4.6 million to finish the training cycle.”
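Li’s single-GPU simplification can be reproduced with back-of-envelope arithmetic. The figures below are assumptions taken from Lambda’s published analysis (roughly 3.1 × 10²³ total floating-point operations to train GPT-3, about 28 teraflops of sustained V100 throughput, and a discounted cloud rate of around $1.50 per GPU-hour):

```python
# Back-of-envelope version of the single-V100 estimate quoted above.
# All three constants are assumptions drawn from Lambda's analysis.
TOTAL_FLOPS = 3.1e23          # rough total compute needed to train GPT-3
V100_FLOPS_PER_SEC = 28e12    # sustained mixed-precision throughput
PRICE_PER_GPU_HOUR = 1.50     # assumed discounted cloud rate, in USD

seconds = TOTAL_FLOPS / V100_FLOPS_PER_SEC
gpu_years = seconds / (365 * 24 * 3600)
cost_usd = (seconds / 3600) * PRICE_PER_GPU_HOUR

print(f"{gpu_years:.0f} GPU-years, ${cost_usd / 1e6:.1f} million")
# Prints roughly: 351 GPU-years, $4.6 million
```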
This estimate is still simplified. Training a neural network is hardly a one-shot process. It involves a lot of trial and error, and engineers must often change the settings and retrain the network to obtain optimal performance.
“There are certainly behind-the-scenes costs as well: parameter tuning, the prototyping that it takes to get a finished model, the cost of researchers, so it certainly was expensive to create GPT-3,” says Nick Walton, the co-founder of Latitude and the creator of AI Dungeon, a text-based game built on GPT-2.
Walton says the real cost of the research behind GPT-3 could be anywhere from 1.5 to 5 times the cost of training the final model, but he adds, “It’s really hard to say without knowing what their process looks like internally.”
Going to a For-Profit Model
OpenAI was founded in late 2015 as a nonprofit research lab with the mission to develop human-level AI for the benefit of all humanity. Among its founders were Tesla CEO Elon Musk and Sam Altman, then president of Y Combinator, who collectively pledged $1 billion to the lab’s research. Altman later became the CEO of OpenAI.
But given the huge costs of training deep-learning models and hiring AI talent, $1 billion would cover only a few years’ worth of OpenAI’s research. It was clear from the beginning that the lab would run into cash problems long before it reached its goal.
“We’ll need to invest billions of dollars in upcoming years into large-scale cloud compute, attracting and retaining talented people, and building AI supercomputers,” the lab declared in 2019, when it renamed itself OpenAI LP and restructured as a “capped-profit” company. The change allowed venture capital firms and large tech companies to invest in OpenAI for returns “capped” at a hundred times their initial investment.
Shortly after the announcement, Microsoft invested $1 billion in OpenAI. The infusion of cash allowed the company to continue to work on GPT-3 and other expensive deep-learning projects. But investor money always comes with strings attached.
Shifting Toward Obscurity
In June, when it announced GPT-3, the company did not release its AI model to the public, as is the norm in scientific research. Instead, it released an application programming interface (API) that allows developers to give GPT-3 input and obtain the results. In the future, the company will commercialize GPT-3 by renting out access to the API.
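For developers, then, working with GPT-3 looks less like downloading a model and more like calling a web service. The Python sketch below shows roughly what such a request looks like; the endpoint and field names follow OpenAI’s documentation at launch, but they should be read as illustrative rather than authoritative:

```python
# Illustrative sketch of querying GPT-3 through OpenAI's hosted API.
# Endpoint and JSON fields mirror the completions API as documented
# at launch; treat the details as an approximation.
import os
import requests

response = requests.post(
    "https://api.openai.com/v1/engines/davinci/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={"prompt": "Once upon a time", "max_tokens": 40},
)
print(response.json()["choices"][0]["text"])  # the generated continuation
```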
“Commercializing the technology helps us pay for our ongoing AI research, safety, and policy efforts,” OpenAI wrote in a blog post announcing the GPT-3 API.
But to make GPT-3 profitable, OpenAI will have to make sure other companies can’t replicate it, which is why it is not making the source code and trained model public. Organizations and individuals can request access to the GPT-3 API—but not every request is approved.
Among those who weren’t given access to the GPT-3 API are Gary Marcus, a cognitive scientist and AI researcher, and Ernest Davis, a computer science professor at New York University, who were interested in testing the capabilities and limits of GPT-3.
“OpenAI has thus far not allowed us research access to GPT-3, despite both the company’s name and the nonprofit status of its oversight organization. Instead, OpenAI put us off indefinitely despite repeated requests—even as it made access widely available to the media,” Marcus and Davis wrote in an article published in MIT Technology Review. “OpenAI’s striking lack of openness seems to us to be a serious breach of scientific ethics, and a distortion of the goals of the associated nonprofit.”
The two scientists managed to run the experiments through a colleague who had access to the API, but their research was limited to a small number of tests. Marcus had been a vocal critic of the hype surrounding GPT-3’s predecessor.
Can AI Research Be Saved?
GPT-3 illustrates the growing challenges of scientific AI research. The focus on creating ever-larger neural networks is driving up the cost of research. And, for the moment, the only organizations that can afford to spend that kind of money are large tech companies such as Google, Microsoft, and SoftBank.
But those companies are interested in short-term returns on investment, not long-term goals that benefit humanity in its entirety.
OpenAI now has a commitment to Microsoft and other potential investors, and it must demonstrate a path to profitability to secure future funding. At the same time, it wants to pursue its scientific mission of creating beneficial AGI (artificial general intelligence, essentially human-level AI), a goal that offers no short-term returns and is at least decades away.
Those two goals conflict in other ways. Scientific research is predicated on transparency and information sharing among different communities of scientists. In contrast, creating profitable products requires hiding research and hoarding company secrets to keep the edge over competitors.
Finding the right balance between the nonprofit mission and the for-profit commitment will be extremely difficult. And OpenAI’s situation is not an isolated example. DeepMind, the UK-based research lab that is considered one of OpenAI’s peers, faced similar problems after it was acquired by Google in 2014.
Many scientists believe that AGI—if ever achieved—will be one of the most impactful inventions of humanity. If this is true, then achieving AGI will require the concerted efforts and contributions of the international community, not merely the deep pockets of companies whose main focus is their bottom line.
A good model might be the Large Hadron Collider project, which obtained a $9 billion budget from funding agencies in CERN’s member and non-member states. While member states will eventually benefit from the results of CERN’s work, they don’t expect the organization to turn a profit in the short term.
A similar initiative might help OpenAI and other research labs to continue chasing the dream of human-level AI without having to worry about returning investor money.