Xinhua News Agency Research Institute releases report on domestic large models: Xunfei Xinghuo ranks first, Baidu Wenxin second

2023-08-14 07:19:23

Author: Bu Shuqing

Since ChatGPT triggered a wave of investment in artificial intelligence, domestic large models have sprung up like mushrooms. As of early July, China had more than 80 large AI models with over 1 billion parameters. Amid this trend, how should users choose among domestic large models? Which one is the most capable?

On August 12, the China Enterprise Development Research Center of the Research Institute of Xinhua News Agency released the "Artificial Intelligence Large Model Experience Report 2.0" (hereinafter referred to as the "Report"), which evaluates the most popular general-purpose large model applications from domestic companies, such as Baidu Wenxin Yiyan and Ali Tongyi Qianwen.

A total of 500 random questions were designed for this evaluation, which benchmarks the models against humans who have received higher education and emphasizes practical value to industry and daily life. Scores were weighted strictly across four evaluation dimensions (the basic ability index, the IQ index, the EQ index, and the tool efficiency index) to ensure the rigor of the entire evaluation process.

The final result was a little surprising. **Xunfei Xinghuo ranked first in this evaluation with a total score of 1013 points, and also ranked first in two of the four evaluation dimensions, the IQ index and the tool efficiency index; Baidu Wenxin Yiyan and SenseTime ranked second and third respectively.**

According to the "Report", Xunfei Xinghuo has seven core capabilities: text generation, language understanding, knowledge question answering, logical reasoning, mathematical ability, coding ability, and multimodal ability. It possesses cross-domain knowledge and language understanding, and can understand and perform tasks based on natural dialogue.

**In the basic ability part, the gap between humans and AI is not significant.** Baidu Wenxin Yiyan's performance was the most eye-catching among the models; SenseTime, Zhipu AI ChatGLM, and 360 Smart Brain performed well; Xunfei Xinghuo, Ali Tongyi Qianwen, Lanzhou Technology Mchat, and Kunlun Wanwei Tiangong also performed reasonably well.

In terms of IQ assessment, **humans still have a clear advantage, with the highest scores.** Among all the evaluated models, Xunfei Xinghuo and Zhipu AI ChatGLM ranked at the top; Baidu Wenxin Yiyan and Kunlun Wanwei Tiangong performed well.

**The gap between AI and humans is most pronounced when it comes to emotional intelligence.** The "Report" stated that no obvious signs of an ability to perceive emotions have been observed in AI. Even so, SenseTime showed a high EQ that surpassed its peers, ranking first with a score of 346. Baidu Wenxin Yiyan and Lanzhou Technology Mchat ranked second and third respectively.

Finally, in terms of improving work efficiency, the "Report" believes that AI provides strong support for humans, and that its processing speed far exceeds that of humans. However, despite AI's advantages in speed and efficiency, human intelligence and imagination still play an irreplaceable role in some complex and innovative tasks. The evaluation results show that Xunfei Xinghuo ranked first by a wide margin with a score of 350, while Baidu Wenxin Yiyan and SenseTime ranked second and third respectively.

The "Report" believes that, compared with June 2023, current large model products in China have made significant progress. However, a certain gap remains between large models and highly educated humans in terms of IQ and EQ. Although AI and humans show different strengths and weaknesses in different fields, on the whole the development of large AI models has had an important positive impact on the quality and efficiency of human work and life, and large models are accelerating their entry into daily life and industry.
