Earlier this month, Meta announced the development of its large language model Open Pretrained Transformer (OPT-175B), a 175-billion-parameter model trained on publicly available datasets.
Unlike many other large language models, OPT-175B will be available for free to all researchers or institutions that request access. The company notes that this effort is an attempt to “democratize” large language models, which will allow for further research into the models’ potential benefits — and dangers — to society.
“We believe the entire AI community — academic researchers, civil society, policymakers, and industry — must work together to develop clear guidelines around responsible AI in general and responsible large language models in particular, given their centrality in many downstream language applications,” a team of researchers at the company wrote in a May 3 blog post.
Meta notes that large language models have played a major role in artificial intelligence and natural language processing research. However, because established models like GPT-3 have mostly been accessible to the public through paid APIs, “full research access” has been limited. “This restricted access has limited researchers’ ability to understand how and why these large language models work,” the blog post reads.
Models like OPT-175B can perform a variety of complex tasks, from generating coherent, grammatically sound text to solving math problems and answering reading comprehension questions. However, many have raised concerns about the models’ potential to parrot inappropriate or offensive language. Meta hopes that taking a transparent approach and sharing the model with researchers will shed light on these issues. Some researchers have also raised environmental concerns about these models; Meta notes that OPT-175B has a significantly smaller carbon footprint than GPT-3.
“We hope that OPT-175B will bring more voices to the frontier of large language model creation, help the community collectively design responsible release strategies, and add an unprecedented level of transparency and openness to the development of large language models in the field,” the company’s blog post reads.