Friday, June 14, 2024

Mixture of Agents Research Summary

The topic of this text is the Mixture of Agents (MoA) research, which surpasses GPT-4o on AlpacaEval 2.0 by a substantial margin: a score of 65.1% compared to GPT-4o's 57.5%.

The researchers published the code, and the author is considering doing a tutorial using it.

The basic architecture of MoA consists of multiple layers, where each layer has three different agents working together in collaboration to arrive at the final output for a given prompt.

Each agent takes the outputs of the previous layer as auxiliary information to generate a refined response. This approach allows MoA to integrate diverse capabilities and insights from various models, resulting in a more robust and versatile combined model.
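To make that layered flow concrete, here is a minimal Python sketch. Everything in it is illustrative rather than the authors' implementation: query_model is a stub standing in for a real inference API call, and the prompt wording is my own paraphrase of the idea described above.

```python
def query_model(model: str, prompt: str) -> str:
    # Placeholder: in practice this would call an inference API
    # (e.g., an OpenAI-compatible chat endpoint). Stubbed here so
    # the sketch is self-contained and runnable.
    return f"[{model}'s answer to: {prompt[:40]}...]"

def build_prompt(user_prompt: str, prior_responses: list[str]) -> str:
    """Prepend the previous layer's outputs as auxiliary information."""
    if not prior_responses:
        return user_prompt
    context = "\n\n".join(
        f"Response {i + 1}: {r}" for i, r in enumerate(prior_responses)
    )
    return (
        "You are given responses from other models to the user's prompt.\n"
        f"{context}\n\nUser prompt: {user_prompt}\n"
        "Use these responses as auxiliary information to produce a "
        "refined, higher-quality answer."
    )

def mixture_of_agents(user_prompt: str,
                      layers: list[list[str]],
                      aggregator: str) -> str:
    """Run the prompt through successive layers of agents, then aggregate."""
    prior: list[str] = []
    for agents in layers:
        # Every agent in this layer sees all outputs from the previous layer.
        prior = [query_model(m, build_prompt(user_prompt, prior))
                 for m in agents]
    # A final aggregator synthesizes the last layer's outputs.
    return query_model(aggregator, build_prompt(user_prompt, prior))
```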

The researchers note that while MoA achieves higher accuracy, it comes at the cost of slower time-to-first-token latency. Reducing this latency is an exciting direction for future work.

The article also mentions the collaborativeness of LLMs (large language models): an LLM tends to generate better responses when presented with outputs from other models, even if those other models are less capable on their own.

The researchers evaluated each model's score when leveraging responses from other models and found that it increases significantly over the model's base score on AlpacaEval 2.0.

The article also discusses the roles of proposers (models that generate initial reference responses) and aggregators (models that synthesize different responses into a single high-quality response).

The researchers propose a layered process to improve responses: several proposers first independently generate responses to a given prompt, and these are then presented to aggregators in the next layer, which synthesize them into higher-quality responses.
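For a flavor of how an aggregator might be instructed, here is a paraphrased aggregate-and-synthesize prompt. The wording below is a sketch in the spirit of the paper, not the authors' exact prompt (which is available in their released code).

```python
# Paraphrased aggregate-and-synthesize prompt; the exact wording used
# by the authors lives in their released repository.
AGGREGATOR_TEMPLATE = (
    "You have been provided with a set of responses from various models "
    "to the latest user query. Your task is to synthesize these responses "
    "into a single, high-quality response. Critically evaluate the "
    "information they contain, recognizing that some of it may be biased "
    "or incorrect, and offer a refined, accurate, and comprehensive "
    "answer rather than copying any single response.\n\n"
    "Responses from models:\n{responses}"
)

def aggregator_prompt(responses: list[str]) -> str:
    """Number the proposer responses and slot them into the template."""
    numbered = "\n".join(f"{i + 1}. {r}" for i, r in enumerate(responses))
    return AGGREGATOR_TEMPLATE.format(responses=numbered)
```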

The article concludes by discussing the results of using six open-source models as proposers with Qwen1.5-110B-Chat as the final aggregator.
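A rough configuration for that setup might look like the following; only the aggregator name comes from the text, and the six proposer names are placeholders rather than the paper's actual list.

```python
# Hypothetical configuration mirroring the setup described above: six
# open-source proposers per layer, with Qwen1.5-110B-Chat as the final
# aggregator. The proposer names are placeholders, not the paper's list.
PROPOSERS = [
    "open-proposer-1", "open-proposer-2", "open-proposer-3",
    "open-proposer-4", "open-proposer-5", "open-proposer-6",
]

MOA_CONFIG = {
    "layers": [PROPOSERS] * 3,          # e.g., three MoA layers
    "aggregator": "Qwen1.5-110B-Chat",  # final synthesis model
}

# Reusing the mixture_of_agents sketch from earlier:
# answer = mixture_of_agents("Explain MoA in one paragraph.",
#                            MOA_CONFIG["layers"],
#                            MOA_CONFIG["aggregator"])
```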

The researchers also benchmark the length-controlled (LC) win rate after each layer and find a consistent, monotonic performance gain from layer to layer.

