Advanced Encryption Standard (AES), one of the most popular encryption algorithms, has been widely studied on single-GPU and CPU platforms. However, research on multi-GPU platforms remains limited, and with the rapid ...
Sparse Matrix-Matrix Multiplication (SpMM) is a widely used algorithm in Machine Learning, particularly in the increasingly popular Graph Neural Networks (GNNs). SpMM is an essential arithmetic operation in GNNs and h...
In this study, an adaptive neural network (NN) control is proposed for nonlinear two-degree-of-freedom (2-DOF) helicopter systems considering the input constraints and global prescribed ***, radial basis function NN (RBFNN) is employed to estimate the unknown dynamics of the helicopter system. Second, a smooth nonaffine function is exploited to approximate and address nonlinear constraint functions. Subsequently, a new prescribed function is proposed, and the original constrained error is transformed into an equivalent unconstrained error using the error transformation and barrier function transformation methods. Analysis of the established Lyapunov function proves that the controlled system is globally uniformly bounded. Finally, simulation and experimental results on a Quanser test platform verify the rationality and feasibility of the proposed control.
Large models have recently played a dominant role in natural language processing and multimodal vision-language learning. However, their effectiveness in text-related visual tasks remains relatively unexplored. In this paper, we conduct a comprehensive evaluation of large multimodal models, such as GPT4V and Gemini, on various text-related visual tasks, including text recognition, scene text-centric visual question answering (VQA), document-oriented VQA, key information extraction (KIE), and handwritten mathematical expression recognition (HMER). To facilitate the assessment of optical character recognition (OCR) capabilities in large multimodal models, we propose OCRBench, a comprehensive evaluation benchmark. OCRBench contains 29 datasets, making it the most comprehensive OCR evaluation benchmark available. Furthermore, our study reveals both the strengths and weaknesses of these models, particularly in handling multilingual text, handwritten text, non-semantic text, and mathematical expression *** importantly, the baseline results presented in this study could provide a foundational framework for the conception and assessment of innovative strategies targeted at enhancing zero-shot multimodal *** evaluation pipeline and benchmark are available at https://***/Yuliang-Liu/Multimodal OCR.
In China's healthcare system, the quality of medical records, as an important medical document that records the complete course of a patient's illness, is related to medical safety and clinical research. Since...
Discrete cosine transform for 8×8 blocks (DCT8×8) is widely used in image compression due to its high signal decorrelation rate. Current research on DCT mainly focuses on CPU and single-GPU platforms, and ...
Mobile Edge Computing (MEC) is a promising technology that provides computing services at the edge of wireless networks to reduce the latency and the energy consumption for Smart Mobile Devices (SMDs). Additionally, t...
Text-to-image generation (T2I) models acquire implicit social biases during the training process, which can easily cause social disputes and negative impacts in sensitive fields such as news broadcasting, educatio...
Intracranial hemorrhage (ICH) is a serious disease with high morbidity, recurrence, and disability rates, and computed tomography (CT) is considered an important standard for diagnosis. Since CT images of ICH have prob...
Mobile edge computing improves data processing efficiency and reduces latency by deploying computing and storage resources at the network edge, making it suitable for real-time applications. In vehicular networks, due...