The rapid advancement and proliferation of Cyber-Physical Systems (CPS) have led to an exponential increase in the volume of data generated continuously. Efficient classification of this streaming data is crucial for ...
Reduplication is a highly productive process in Bengali word formation, with significant implications for various natural language processing (NLP) applications, such as parts-of-speech tagging and sentiment analysis....
Light clients implement a simple solution for Bitcoin’s scalability problem, as they do not store the entire blockchain but only the state of particular addresses of interest. To be able to keep track of the updated ...
Supply chain management and Hyperledger are two interconnected domains. They leverage blockchain technology to enhance efficiency, transparency, and security in supply chain operations. Together, they provide a decent...
Despite the effectiveness of vision-language supervised fine-tuning in enhancing the performance of vision large language models (VLLMs), existing visual instruction tuning datasets have the following limitations. (1) Instruction annotation quality: although existing VLLMs exhibit strong performance, instructions generated by those advanced VLLMs may still suffer from inaccuracies, such as hallucinations. (2) Instruction and image diversity: the limited range of instruction types and the lack of diversity in image data may impair the model's ability to generate diverse outputs that are closer to real-world scenarios. To address these challenges, we construct a high-quality, diverse visual instruction tuning dataset, MMInstruct, which consists of 973k instructions from 24 domains. There are four instruction types: judgment, multiple-choice, long visual question answering, and short visual question answering. To construct MMInstruct, we propose an instruction generation data engine that leverages GPT-4V, GPT-3.5, and manual correction. Our instruction generation engine enables semi-automatic, low-cost, and multi-domain instruction generation at 1/6 the cost of manual construction. Through extensive validation and ablation experiments, we demonstrate that MMInstruct can significantly improve the performance of VLLMs, e.g., a model fine-tuned on MMInstruct achieves new state-of-the-art performance on 10 out of 12 benchmarks. The code and data will be available at https://***/yuecao0119/MMInstruct.
Semi-supervised learning techniques use both labeled and unlabeled images to enhance classification performance in scenarios where labeled images are limited. However, challenges such as integrating unlabeled images with incorrect pseudo-labels, determining appropriate thresholds for the pseudo-labels, and label prediction fluctuations on low-confidence unlabeled images hinder the effectiveness of existing methods. This research introduces a novel framework named Interpolation Consistency for Bad Generative Adversarial Networks (IC-BGAN) that uses a new loss function. The proposed model combines bad adversarial training, fusion techniques, and regularization to address the limitations of semi-supervised learning. IC-BGAN creates three types of image augmentation with label consistency regularization, interpolating between bad fake images, between real and bad fake images, and between unlabeled images. It demonstrates linear interpolation behavior, reducing fluctuations in predictions, improving stability, and facilitating the identification of decision boundaries in low-density areas. The regularization techniques boost the discriminative capability of the classifier and discriminator, and send a better signal to the bad generator. This improves generalization and the generation of diverse inter-class fake images that act as support vectors carrying information near the true decision boundary, which helps to correct the pseudo-labeling of unlabeled images. The proposed approach reduces the error rate from 2.87 to 1.47 on the Modified National Institute of Standards and Technology (MNIST) dataset, from 3.59 to 3.13 on the Street View House Numbers (SVHN) dataset, and from 12.13 to 9.59 on the Canadian Institute for Advanced Research, 10 classes (CIFAR-10) dataset using 1000 labeled training images. Additionally, it reduces the error rate from 22.11 to 18.40 on the CINIC-10 dataset when using 700 labeled images per class. The experiments demonstrate the IC-BGAN framework outp
With the popularity of the Internet of Vehicles (IoV), a large amount of data is generated every day. How to securely share data between the IoV operator and various value-added service providers has become a critical issue. Thanks to its flexible and efficient fine-grained access control, Ciphertext-Policy Attribute-Based Encryption (CP-ABE) is well suited to data sharing in IoV. However, most existing CP-ABE schemes have flaws, such as attribute privacy leakage and key misuse. This paper proposes a Traceable and Revocable CP-ABE-based Data Sharing scheme with Partially hidden policy for IoV (TRE-DSP). A partially hidden access structure is adopted to hide sensitive user attribute values, and only attribute categories are sent along with the ciphertext, which effectively avoids privacy exposure. In addition, key tracking and malicious user revocation are introduced with broadcast encryption to prevent key misuse. Since the main computation task is outsourced to the cloud, the burden on the user side is relatively low. Security and performance analysis demonstrates that TRE-DSP is more secure and practical for data sharing in IoV.
Thyroid nodules, a common disorder in the endocrine system, require accurate segmentation in ultrasound images for effective diagnosis and ***, achieving precise segmentation remains a challenge due to various factors, including scattering noise, low contrast, and limited resolution in ultrasound *** existing segmentation models have made progress, they still suffer from several limitations, such as high error rates, low generalizability, overfitting, limited feature learning capability, *** address these challenges, this paper proposes a Multi-level Relation Transformer-based U-Net (MLRT-UNet) to improve thyroid nodule *** MLRT-UNet leverages a novel Relation Transformer, which processes images at multiple scales, overcoming the limitations of traditional encoding *** transformer integrates both local and global features effectively through self-attention and cross-attention units, capturing intricate relationships within the *** approach also introduces a Co-operative Transformer Fusion (CTF) module to combine multi-scale features from different encoding layers, enhancing the model’s ability to capture complex patterns in the ***, the Relation Transformer block enhances long-distance dependencies during the decoding process, improving segmentation *** results show that the MLRT-UNet achieves high segmentation accuracy, reaching 98.2% on the Digital Database Thyroid Image (DDT) dataset, 97.8% on the Thyroid Nodule 3493 (TG3K) dataset, and 98.2% on the Thyroid Nodule 3K (TN3K) *** findings demonstrate that the proposed method significantly enhances the accuracy of thyroid nodule segmentation, addressing the limitations of existing models.
Point clouds offer realistic 3D representations of objects and scenes at the expense of large data volumes. To represent such data compactly in real-world applications, Video-Based Point Cloud Compression (V-PCC) conv...
This study proposes a malicious code detection model DTL-MD based on deep transfer learning, which aims to improve the detection accuracy of existing methods in complex malicious code and data scarcity. In the feature...