With the widespread application of blockchain technology, various range proof protocols based on zero-knowledge proofs have been proposed. However, existing range proof protocols suffer from issues such as high commun...
With the continuous development of the intelligent connected vehicle industry, cameras and other vehicle-mounted devices are widely used, so the amount of data collected is increasing. There is a large amount of sensitiv...
As user demands grow, the functions of in-vehicle infotainment systems are becoming richer, and their security also affects vehicle safety. Therefore, it is more and more i...
Automated surface defect detection is crucial for ensuring product quality in industrial settings. This paper presents a multi-scale texture network that addresses this challenge by effectively analyzing textures at v...
The stream computing engine is an important part of a big data system, and benchmarking is one of the main means of measuring an engine's performance. In this paper, we compare the differences between two engines, Spark ...
This paper presents an approach to software development which uses a generative AI model as a compiler to translate human-language requirements into a high-level programming language. We propose an executable human-langua...
Smart contracts have emerged as one of the most successful applications in the blockchain domain, playing a significant role in various blockchain ecosystems. Inspired by smart contracts, a multitude of cryptographic ...
Forecasting human mobility is of great significance in the simulation and control of infectious diseases like COVID-19. To get a clear picture of potential future outbreaks, it is necessary to forecast multi-step Ori...
ISBN (digital): 9798350376982
ISBN (print): 9798350376999
We designed a large language model evaluation system based on open-ended questions. The system performs multidimensional evaluation of LLMs using open-ended questions and presents the results as evaluation reports. Current evaluation of large language models has two prominent limitations: (1) the evaluation methods are often one-dimensional, resulting in less credible results; (2) most evaluations are based on datasets with closed-ended questions, treating generative large language models as discriminative models, which fails to adequately reflect the high output flexibility characteristic of these models. To address these two limitations, we proposed an evaluation system for LLMs based on open-ended questions. Our experiments on adapted open-source datasets demonstrated the effectiveness of this system. The code of the system was released at https://***/JerryMazeyu/GreatLibrarian.
ISBN (digital): 9798350389500
ISBN (print): 9798350389517
Natural language processing (NLP) is developing rapidly. A series of Large Language Models (LLMs) have emerged, represented by ChatGPT, which have made significant breakthroughs in natural language understanding and generation, enabling fluent dialogue with humans, understanding of human intentions, and completion of complex tasks. However, in addition to the fairness and toxicity problems of traditional language models, some new problems, including hallucination, have also emerged in LLMs, making them hard to use. Evaluating LLMs manually is challenging due to subjectivity and inefficiency. In this paper, we focus on fuzzy matching, toxicity detection, and hallucination detection for the automatic evaluation of LLMs. We fine-tune the Mixtral-8x7B model, which can be deployed in a private cloud environment, and demonstrate the effectiveness of our method through experiments.