A 134-page report from the University of Michigan’s Gerald R. Ford School of Public Policy warns researchers and policymakers alike of the potential dangers of large language models (LLMs), which have been on the rise in recent years.
The comprehensive report, compiled by a team of seven researchers, addresses numerous apprehensions regarding the development and use of LLMs, including their potential impact on the environment, privacy and security concerns, and the possibility of amplifying hate speech. The researchers also highlight potential solutions to these problems, in an attempt to recognize the models’ greater potential to improve societal conditions.
“LLMs have already generated serious concerns,” the report reads. “Because they are trained on text from old books and webpages, LLMs reproduce historical biases and hateful speech towards marginalized communities.”
LLMs are trained on extremely large amounts of data, including outdated language from historical documents as well as texts drawn from all over the internet. If hateful language is included within the training data, the LLMs may reproduce such language, replicating and perhaps intensifying human biases.
“Trained on texts that have marginalized the experiences and knowledge of certain groups, and produced by a small set of technology companies, LLMs are likely to systematically misconstrue, minimize, and misrepresent the voices of historically excluded people while amplifying the perspectives of the already powerful,” the researchers write.
The researchers also note that the physical spaces used to train and store the models could have a harmful impact on the environment. According to the report, LLMs are trained in data centers that use roughly 360,000 gallons of water per day, in addition to significant amounts of other natural resources. The researchers fear that as the need for data centers increases, new ones will be located near marginalized communities, thereby draining resources from particularly vulnerable groups of people.
While the researchers acknowledge that LLMs can contribute to the greater good, they ultimately argue that regulations must be put in place to ensure that the models do not exacerbate inequalities. The proposed measures include evaluating LLMs through the Federal Trade Commission, scrutinizing the ways in which they are implemented in apps, and encouraging bodies like the International Organization for Standardization to put out yearly reports evaluating major LLMs.
“LLMs have great potential to benefit society,” the research team reports. “However, the priorities of the current development landscape make it difficult for the technology to achieve this goal.”