Math ability surpasses ChatGPT: the 70B open-source large model is on fire — fine-tuning AI with AI, from an all-Chinese Microsoft team
Source: "Qubit" (ID: QbitAI), Author: Feng Se
Fine-tune the LLaMA ("alpaca") large model with AI-generated instructions, and its math ability surpasses ChatGPT —
Microsoft's latest open-source model, WizardMath, is here.
And it does this with only 70 billion key parameters, far fewer than ChatGPT, Claude Instant 1, and PaLM 2-540B, the three models it edges out on the benchmark.
For example, it can solve quartic polynomial equations; the original post shows one such problem as a screenshot.
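The equation itself appears only as an image in the original article and is not reproduced here. As a purely hypothetical stand-in, the snippet below shows how an answer to this kind of quartic problem could be checked with SymPy; the equation is made up for illustration.

```python
import sympy as sp

x = sp.symbols("x")
# Hypothetical quartic (not the one from the article's screenshot):
# x**4 - 5*x**2 + 4 = 0, which factors as (x**2 - 1)(x**2 - 4) = 0.
equation = sp.Eq(x**4 - 5 * x**2 + 4, 0)

print(sp.solve(equation, x))  # the four real roots: -2, -1, 1, 2
```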
Some netizens said to the author:
Enhance large model capabilities with AI-generated instructions
OpenAI's large models (InstructGPT, GPT-4, etc.) have been able to perform a variety of complex and diverse tasks with great success, partly due to fine-tuning using open-domain instruction data generated by real human users.
However, not everyone has access to instruction datasets like this company's.
For one thing, the annotation process is extremely expensive and time-consuming; for another, it is hard for humans to create a sufficiently large proportion of difficult instructions.
Therefore, developing a relatively low-cost way to automatically produce large-scale open-domain instructions has become key to instruction-tuned language models.
Here, the authors name their method Evol-Instruct.
It is a new way of using AI, instead of humans, to automatically generate open-domain instructions covering a range of difficulty levels.
Specifically, Evol-Instruct consists of two parts: the Instruction Evolver and the Instruction Eliminator.
The Instruction Evolver can upgrade a simple instruction into a more complex one, or create a brand-new instruction, via one of two paths: in-depth evolving (the blue line in the method diagram) or in-breadth evolving (the red line).
Which path to take? It is simply chosen at random.
In-depth evolving covers five operations: adding constraints, deepening, concretizing, increasing reasoning steps, and complicating the input.
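A minimal sketch of how these evolution paths can be expressed as prompts is shown below; it assumes a generic text-completion call, and the template wording is paraphrased for illustration rather than taken verbatim from the paper.

```python
import random

# Paraphrased prompt fragments, for illustration only (not the paper's exact wording).
# In-depth evolving: make the given instruction harder in one of five ways.
IN_DEPTH_OPS = [
    "Add one more constraint or requirement to the instruction below.",                  # add constraints
    "If the instruction can be answered shallowly, increase its depth.",                 # deepening
    "Replace general concepts in the instruction with more specific ones.",              # concretizing
    "Rewrite the instruction so that it explicitly requires multi-step reasoning.",      # increase reasoning steps
    "Attach a more complex input (a table, code snippet, or dataset) to the instruction."# complicate input
]

# In-breadth evolving: create a brand-new instruction in the same domain.
IN_BREADTH_OP = ("Draw inspiration from the instruction below and create a completely new, "
                 "rarer instruction in the same domain with comparable difficulty.")

def build_evolution_prompt(instruction: str) -> str:
    """Randomly pick in-depth or in-breadth evolving and build the rewriting prompt."""
    op = random.choice(IN_DEPTH_OPS) if random.random() < 0.5 else IN_BREADTH_OP
    return f"{op}\n\n#Given Instruction#:\n{instruction}\n\n#New Instruction#:"
```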
Since all of the instructions are generated by AI, mistakes are inevitable. The Instruction Eliminator is therefore used to filter out failed instructions.
The paper gives a concrete example that starts from "1+1=?" and, through the steps above, automatically generates a whole batch of new instructions.
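Below is a sketch of the overall evolve-then-filter loop. It assumes a placeholder `call_llm` function standing in for the ChatGPT API and reuses `build_evolution_prompt` from the sketch above; the `looks_failed` checks are simplified stand-ins for the paper's elimination rules, not the authors' actual criteria.

```python
from typing import Callable, List

def looks_failed(original: str, evolved: str) -> bool:
    """Simplified stand-in for the Instruction Eliminator: reject empty results,
    copies of the original, refusals, and outputs that just echo the template."""
    evolved = evolved.strip()
    return (not evolved
            or evolved.lower() == original.strip().lower()
            or "sorry" in evolved.lower()
            or "#given instruction#" in evolved.lower())

def evolve_dataset(seeds: List[str],
                   call_llm: Callable[[str], str],
                   build_prompt: Callable[[str], str],
                   rounds: int = 4) -> List[str]:
    """Run several rounds of evolution, filtering failed instructions each round."""
    pool = list(seeds)
    current = list(seeds)
    for _ in range(rounds):
        survivors = []
        for instruction in current:
            evolved = call_llm(build_prompt(instruction))  # Instruction Evolver
            if not looks_failed(instruction, evolved):     # Instruction Eliminator
                survivors.append(evolved)
        pool.extend(survivors)
        current = survivors
    return pool

# e.g., starting from the article's "1+1=?" seed (my_chatgpt_wrapper is hypothetical):
# evolved = evolve_dataset(["1+1=?"], my_chatgpt_wrapper, build_evolution_prompt)
```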
The authors take Alpaca's training data (generated from just 175 human-written seed instructions) as the initial dataset, then run four rounds of evolution through ChatGPT's API, ending up with 250,000 instructions.
For a fair comparison with Vicuna's 70k items of real user data (ShareGPT), the authors sampled an equal number of examples from the 250,000, trained the LLaMA-7B model on them, and obtained WizardLM. The result: WizardLM significantly outperformed Vicuna.
(Alpaca is Stanford's fine-tune of LLaMA-7B; Vicuna is UC Berkeley's fine-tune of LLaMA-13B.)
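A minimal supervised fine-tuning sketch with Hugging Face transformers follows; it assumes the evolved instructions have already been paired with responses and stored as a JSONL file with "instruction" and "output" fields. This is not the authors' training code, and the base model name, file name, prompt format, and hyperparameters are placeholders.

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "huggyllama/llama-7b"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical file holding the evolved instruction/response pairs.
dataset = load_dataset("json", data_files="evolved_instructions.jsonl")["train"]

def to_text(example):
    # Simple prompt format for illustration; the real projects use their own templates.
    return {"text": f"### Instruction:\n{example['instruction']}\n\n### Response:\n{example['output']}"}

def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=1024)

tokenized = dataset.map(to_text).map(
    tokenize, remove_columns=dataset.column_names + ["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="wizard-sft", per_device_train_batch_size=1,
                           gradient_accumulation_steps=16, num_train_epochs=3,
                           learning_rate=2e-5, bf16=True, logging_steps=10),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
)
trainer.train()
```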
In addition, under more complex test instructions, human evaluators preferred WizardLM's output to ChatGPT's, suggesting that this method can significantly improve an LLM's ability to handle complex instructions.
Building on this, the authors used Evol-Instruct to generate a large number of math-related instructions and fine-tuned LLaMA with them to obtain WizardMath.
Its performance is as shown at the beginning: measured on the GSM8K dataset, its math ability surpasses many large models including ChatGPT, Claude Instant 1, and PaLM 2-540B, ranking fifth overall, behind only GPT-4, Claude 1.3 and 2.0, and the 540-billion-parameter Flan-PaLM 2.
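For readers who want to reproduce this kind of measurement, here is a rough sketch of a GSM8K accuracy check. It assumes a `generate_answer(question)` wrapper around the fine-tuned model (a hypothetical helper, not part of any release) and uses a simple last-number extraction rather than the authors' exact evaluation script.

```python
import re
from datasets import load_dataset

def extract_number(text: str):
    """Pull the last number out of a model response (commas stripped)."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return numbers[-1] if numbers else None

def gsm8k_accuracy(generate_answer, limit: int = 200) -> float:
    """Compare extracted answers against GSM8K references, which end with '#### <answer>'."""
    test = load_dataset("gsm8k", "main", split="test").select(range(limit))
    correct = 0
    for sample in test:
        gold = sample["answer"].split("####")[-1].strip().replace(",", "")
        pred = extract_number(generate_answer(sample["question"]) or "")
        correct += int(pred is not None and pred == gold)
    return correct / len(test)
```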
In the same way, the authors also produced WizardCoder, a LLaMA fine-tune specialized for coding, whose performance surpasses Claude and Bard (see the links at the end of this article for details).
Team introduction
The paper has nine authors, all Chinese.
Three of them are co-first authors:
Can Xu, senior applied scientist in the S+D NLP group of Microsoft's Asia Internet Engineering Academy, who previously worked on chatbot systems in the Microsoft Xiaoice group and at Microsoft Research Asia;
Qingfeng Sun, a Microsoft Research scientist working on natural language processing and information retrieval, skilled at building efficient search systems, who has contributed core deep models to Microsoft Bing and Office 365;
Kai Zheng, a Microsoft Research scientist working on natural language processing and search and recommendation ranking, who has likewise contributed core deep models to Microsoft Bing and Office 365.
Another author, Jiazhan Feng, is a Peking University student; his contribution to the paper was made during an internship at Microsoft.
Project home page:
Paper address: