ACL 2023: Computational Linguistics in the Age of Large Language Models

As is the case nearly everywhere, large language models are an important topic of conversation at this year’s meeting of the Association for Computational Linguistics (ACL).

Yang Liu, a senior principal scientist with Alexa AI and general chair of this year’s meeting of the Association for Computational Linguistics.

“We have several sessions on large language models, which was not a session topic at previous conferences,” says Yang Liu, a senior principal scientist with Alexa AI and general chair of this year’s ACL meeting. “And both keynote speeches are related to this topic.”

According to the ACL website, one of the keynote speakers, Geoffrey Hinton, who shared the 2018 Turing Award for his seminal contributions to deep learning, will address “the contentious question of whether current multimodal LLMs have subjective experience.” The second keynote speaker, Alison Gopnik, a professor of psychology and philosophy at the University of California, Berkeley, has titled her talk “Large language models as cultural technologies.”

“We also have a panel on large language models, and there is another session about ethics and NLP [natural-language processing], as these models become more and more powerful,” Liu adds. “These are issues that the whole community is aware of. And not just our community: the whole world is looking at the development of these technologies and their impact on society.”

Hallucination

One of the biggest problems with large language models (LLMs), of course, is their tendency to “hallucinate”, or generate claims that sound plausible but are in fact false. Currently, Liu says, NLP researchers are trying to tackle this problem in several ways. One is a post-hoc approach that tries to verify LLMs’ outputs.

Related content

Generative AI raises new challenges in defining, measuring, and mitigating concerns about fairness, toxicity, and intellectual property. But work has begun on the solutions.

“Once you have the system’s response, you can do fact checking,” Liu explains. “‘Can you find a source for this?’ When a model says Joe Biden is the current president, you can easily search and find something credible to support it.”

At the moment, however, the “error rate is pretty high,” says Liu. “Even if I give you two texts, A and B, and ask you, ‘Do they mean the same thing?’, it’s not a solved problem in NLP.”
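Liu’s point can be made concrete with a deliberately crude baseline. The sketch below, which is purely illustrative (the function names and scoring scheme are invented, not anything from the research described here), checks a claim against a retrieved source by lexical overlap alone. It gives the right answer on the easy Biden example while making clear why deciding whether two texts "mean the same thing" is far harder than matching words.

```python
import re

def token_set(text):
    """Lowercase a sentence and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))

def support_score(claim, evidence):
    """Fraction of the claim's tokens that appear in the evidence.

    A crude lexical stand-in for the much harder semantic question
    'do these two texts mean the same thing?'
    """
    claim_tokens = token_set(claim)
    if not claim_tokens:
        return 0.0
    return len(claim_tokens & token_set(evidence)) / len(claim_tokens)

claim = "Joe Biden is the current president"
evidence = "As of 2023, the current president of the United States is Joe Biden."
print(support_score(claim, evidence))  # 1.0: every claim token is supported
```

A paraphrase that uses different words would score near zero here despite meaning the same thing, which is exactly the unsolved problem Liu describes.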

Another approach, says Liu, is to more carefully curate the data used to train LLMs. “They are trained on trillions of tokens,” she says (where a “token” is a word, a multiword expression treated as a unit, or a subword unit). “If you want to vet the information fed to these models, the first step is to ensure that the data is high quality.”
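The notion of a subword token can be illustrated with a toy greedy longest-match segmenter, in the spirit of the WordPiece/BPE vocabularies used to tokenize LLM training data. This is a simplified sketch with a made-up six-entry vocabulary, not any production tokenizer:

```python
def greedy_subword_tokenize(word, vocab):
    """Split a word into subword pieces by greedy longest match against
    a vocabulary, falling back to single characters when nothing matches."""
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            pieces.append(word[i])  # unknown character becomes its own token
            i += 1
    return pieces

vocab = {"token", "iza", "tion", "un", "believ", "able"}
print(greedy_subword_tokenize("tokenization", vocab))  # ['token', 'iza', 'tion']
```

One English word can thus become several tokens, which is why training corpora are measured in trillions of tokens rather than words.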

Researchers are also trying to modify the inner workings of trained LLMs to bias their outputs toward factually accurate claims. An LLM works by computing the probability of the next token in a sequence of tokens; the LLM’s attention heads (perhaps dozens of them per network layer) determine how heavily the model should weight each preceding token when computing the probability of the next token.
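The weighting a single attention head performs can be sketched as scaled dot-product attention. The vectors below are made-up numbers for illustration; real heads operate on learned, high-dimensional representations:

```python
import math

def attention_weights(query, keys):
    """Scaled dot-product attention for one head: how heavily the model
    weights each past token when predicting the next one."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    m = max(scores)                          # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# One 4-dimensional query against three past-token keys (toy values).
weights = attention_weights([1.0, 0.0, 1.0, 0.0],
                            [[1.0, 0.0, 1.0, 0.0],
                             [0.0, 1.0, 0.0, 1.0],
                             [0.5, 0.5, 0.5, 0.5]])
print([round(w, 3) for w in weights])
```

The weights sum to one, and the past token most similar to the query receives the largest share, which is what lets a head decide which context matters for the next-token probability.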


Related content

The Amazon-sponsored FEVEROUS dataset and shared task challenge researchers to create more advanced fact-checking systems.

“One line of work aimed at improving factual accuracy is activation editing, which changes such probability distributions,” Liu explains. “These methods do not change the trained models but use strategies to alter the inference or prediction results. For example, a recent paper on this topic first identifies a sparse set of attention heads that are highly correlated with truthfulness. Then it performs ‘inference-time intervention’: it shifts activations along these truth-correlated directions. There are also various methods that change model parameters to reduce hallucination.”
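The core operation Liu describes, shifting an activation along a truth-correlated direction, can be sketched in a few lines. This is a schematic illustration under invented values, not the cited paper’s actual procedure; in practice the direction comes from a probe trained on the model’s internal states, and the shift is applied only to selected heads:

```python
def intervene(activation, direction, alpha):
    """Shift one attention head's activation by alpha along a (unit-normalized)
    'truth-correlated' direction, leaving other dimensions untouched."""
    norm = sum(d * d for d in direction) ** 0.5
    unit = [d / norm for d in direction]
    return [a + alpha * u for a, u in zip(activation, unit)]

# Hypothetical head activation and probe direction (made-up numbers).
activation = [0.2, -0.1, 0.4]
direction = [1.0, 0.0, 0.0]
print(intervene(activation, direction, alpha=0.5))
```

Because the trained weights are untouched, this kind of edit happens purely at inference time, matching Liu’s distinction between activation editing and methods that change model parameters.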

“Explicit knowledge grounding can also be used to address hallucination,” she adds. “In these methods, a knowledge retrieval component is used first.”
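The retrieval step in knowledge grounding can be sketched with a bare-bones keyword matcher. Real systems use dense embeddings or learned retrievers; this toy version, with invented documents, only shows the retrieve-then-generate ordering Liu mentions:

```python
import re

def tokens(text):
    """Lowercased word tokens of a text, as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(query, documents):
    """Return the document sharing the most word tokens with the query,
    a stand-in for the retrieval component in knowledge grounding."""
    q = tokens(query)
    return max(documents, key=lambda d: len(q & tokens(d)))

docs = [
    "The FEVEROUS shared task targets fact verification.",
    "Attention heads weight past tokens during prediction.",
]
print(retrieve("fact verification task", docs))
```

The retrieved evidence would then be fed to the model alongside the user’s question, so that the generated answer is grounded in it rather than produced from parametric memory alone.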

Training by proxy

Part of the difficulty of preventing hallucination has to do with the way LLMs are trained, Liu explains. LLM training uses input masking, in which words in input sentences are randomly removed, and the LLM must supply them. The masking is performed automatically, and the output error is straightforward to compute. But explicitly training the models for factual accuracy would complicate the picture.
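The masking step is easy to sketch, which is precisely why it scales so well. The function below is a minimal illustration (a typical mask rate is around 15%; a higher rate and a fixed seed are used here just to make the effect visible), not any particular framework’s implementation:

```python
import random

def mask_tokens(tokens, mask_rate=0.15, seed=0):
    """Randomly replace tokens with [MASK]; the model is trained to
    recover them. Labels record what was hidden at each position."""
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_rate:
            masked.append("[MASK]")
            labels.append(tok)   # the training target for this position
        else:
            masked.append(tok)
            labels.append(None)  # no loss computed here
    return masked, labels

sentence = "the model must fill in the missing words".split()
masked, labels = mask_tokens(sentence, mask_rate=0.3, seed=1)
print(masked)
```

Both the corrupted input and the correct answer come for free from raw text, so the loss is cheap to compute at trillion-token scale. A factual-accuracy objective has no such automatic supervision signal, which is Liu’s point.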


Related content

With an encoder-decoder architecture, rather than a decoder-only one, the Alexa Teacher Model outperforms other large language models on few-shot tasks such as summarization and machine translation.

“What people have found is that this is a good proxy for many downstream uses,” says Liu. “It builds the base foundation model, and then on top of it, you can try to improve it to get it to follow instructions and perform different tasks. But changing this foundation model by adding additional training loss objectives is difficult and computationally expensive.”

“I think it makes sense to continuously improve these models after pretraining, for example, via reward models with humans in the loop,” Liu adds. Reinforcement learning with human feedback is a popular method for improving the performance of LLMs, in which the model, during training, solicits human feedback to distinguish between choices to which it assigns low probabilities.
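The reward model at the heart of this approach is typically trained on pairwise human preferences with a Bradley-Terry-style loss. The sketch below shows that loss in isolation, with invented reward values; it is a simplified illustration, not a full RLHF pipeline:

```python
import math

def preference_loss(reward_chosen, reward_rejected):
    """Pairwise (Bradley-Terry) loss for reward-model training: the
    negative log probability that the human-preferred response wins.
    The loss shrinks as the reward gap favors the chosen response."""
    gap = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-gap)))

# When the reward model already prefers the human's choice, loss is small;
# when it prefers the rejected response, loss is large.
print(round(preference_loss(2.0, 0.0), 3))  # 0.127
print(round(preference_loss(0.0, 2.0), 3))  # 2.127
```

Minimizing this loss over many human judgments yields a scalar reward function, which can then steer the base model toward responses people actually prefer.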

“If factuality is something you are concerned about, you can get models optimized along these dimensions,” says Liu. “I think the model performance along these dimensions is improving; it’s just that the acceptance criterion is very high. Say, 95%: that seems very accurate from a classification point of view. But in search, if you have a single error, then people say, ‘Oh no, wrong answer!’ It’s a problem.”
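A quick calculation shows why a 95% acceptance criterion that sounds high for classification is unforgiving in search: over repeated queries, the chance of hitting at least one wrong answer compounds. (The query count below is an illustrative choice, not a figure from the article.)

```python
def chance_of_error(per_answer_accuracy, n_queries):
    """Probability of at least one wrong answer over n independent queries."""
    return 1 - per_answer_accuracy ** n_queries

# 95% per-answer accuracy, ten independent searches:
print(round(chance_of_error(0.95, 10), 2))  # 0.4
```

Roughly a 40% chance of at least one visible failure in just ten searches, which is why users conclude “Oh no, wrong answer!” far more often than the per-answer accuracy suggests.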


Related content

Two papers from Amazon Web Services AI present algorithms that alleviate the intensive hyperparameter search and fine-tuning required by privacy-preserving deep learning at very large scales.

One option, says Liu, is that, while researchers find ways to improve the factual accuracy of LLMs, the public becomes better educated about how to use them.

“Maybe users will change their attitudes, and companies will change, too,” she says. “People play with LLMs, they see some errors, and people do their own fact checking. They treat them like any online news source. This is related to our panel on ethics: the whole community is looking at this new tool. How should we treat it? Is it ground truth, or is it a tool that gives you something, and you double-check it?”
