The goal of infrared and visible image fusion (IVIF) is to integrate the unique advantages of both modalities to achieve a more comprehensive understanding of a scene. However, existing methods struggle to handle modal disparities effectively, resulting in visual degradation of the details and prominent targets in the fused images. To address these challenges, we introduce PromptFusion, a prompt-based approach that harmoniously combines multi-modality images under the guidance of semantic prompts. First, to better characterize the features of different modalities, a contourlet autoencoder is designed to separate and extract the high- and low-frequency components of each modality, thereby improving the extraction of fine details and textures. We also introduce a prompt learning mechanism using positive and negative prompts, leveraging Vision-Language Models to improve the fusion model's understanding and identification of targets in multi-modality images, leading to improved performance in downstream tasks. Furthermore, we employ bi-level asymptotic convergence optimization. This approach simplifies the intricate non-singleton non-convex bi-level problem into a series of convergent and differentiable single optimization problems that can be effectively resolved through gradient *** approach advances the state-of-the-art, delivering superior fusion quality and boosting the performance of related downstream tasks. Project page: https://***/hey-it-s-me/PromptFusion.
Detecting surface defects on unused rails is crucial for evaluating rail quality and durability to ensure the safety of rail ***, existing detection methods often struggle with challenges such as complex defect morphology, texture similarity, and fuzzy edges, leading to poor accuracy and missed *** order to resolve these problems, we propose MSCM-Net (Multi-Scale Cross-Modal Network), a multi-scale cross-modal framework focused on detecting rail surface ***-Net introduces an attention mechanism to dynamically weight the fusion of RGB and depth maps, effectively capturing and enhancing features at different scales for each *** further enrich feature representation and improve edge detection in blurred areas, we propose a multi-scale void fusion module that integrates multi-scale feature *** improve cross-modal feature fusion, we develop a cross-enhanced fusion module that transfers fused features between layers to incorporate interlayer *** also introduce a multimodal feature integration module, which merges modality-specific features from separate decoders into a shared decoder, enhancing detection by leveraging richer complementary ***, we validate MSCM-Net on the NEU RSDDS-AUG RGB-depth dataset, comparing it against 12 leading methods, and the results show that MSCM-Net achieves superior performance on all metrics.
Machine learning models are increasingly being adopted across various fields, such as medicine, business, autonomous vehicles, and cybersecurity, to analyze vast amounts of data, detect patterns, and make predictions ...
Highway safety researchers focus on crash injury severity, utilizing deep learning, specifically deep neural networks (DNN), deep convolutional neural networks (D-CNN), and deep recurrent neural networks (D-RNN), as the preferred method for modeling accident *** learning's strength lies in handling intricate relationships within extensive datasets, making it popular for accident severity level (ASL) prediction and *** prior success, there is a need for an efficient system recognizing ASL in diverse road *** address this, we present an innovative Accident Severity Level Prediction Deep Learning (ASLP-DL) framework, incorporating DNN, D-CNN, and D-RNN models fine-tuned through iterative hyperparameter selection with Stochastic Gradient *** framework optimizes hidden layers and integrates data augmentation, Gaussian noise, and dropout regularization for improved *** and factor contribution analyses identify influential *** on three diverse crash record databases (NCDB 2018–2019, UK 2015–2020, and US 2016–2021), the D-RNN model excels with an ACC score of 89.0281%, an ROC area of 0.751, an F-estimate of 0.941, and a Kappa score of 0.0629 over the NCDB *** proposed framework consistently outperforms traditional methods, existing machine learning, and deep learning techniques.
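The abstract reports accuracy and a Kappa score side by side. As a reading aid, the sketch below computes both on a tiny, made-up label set to show how the two metrics are defined and that they can diverge on imbalanced classes. The labels, class names, and the `cohen_kappa` helper are hypothetical; nothing here reproduces the paper's data or pipeline, and the abstract does not say whether class imbalance explains its reported values.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples where the prediction matches the label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(y_true)
    observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement from the marginal label frequencies.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1:
        return 0.0
    return (observed - expected) / (1 - expected)


# Hypothetical, heavily imbalanced toy data: nine minor crashes, one severe.
y_true = ["minor"] * 9 + ["severe"]
# A degenerate model that always predicts the majority class.
y_pred = ["minor"] * 10

acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
print(acc, kappa)
```

In this toy case the always-majority predictor scores well on accuracy while its kappa is zero, because all of its agreement is what chance alone would produce.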
A pull request (PR) is an event in Git where a contributor asks project maintainers to review code they want to merge into a project. The PR mechanism greatly improves the efficiency of distributed software development in the open-source community. Nevertheless, the massive number of PRs in an open-source software (OSS) project increases the workload of developers. To reduce the burden on developers, many previous studies have investigated factors that affect the chance of PRs getting accepted and built prediction models based on these factors. However, most prediction models are built on data that only becomes available some time after a PR is submitted (e.g., comments on PRs), which makes them of little use in practice because integrators still need to spend a large amount of effort on inspecting PRs. In this study, we propose an approach named E-PRedictor (earlier PR predictor) to predict whether a PR will be merged when it is created. E-PRedictor combines three dimensions of manual statistic features (i.e., contributor profile, specific pull request, and project profile) and deep semantic features generated by BERT models based on the description and code changes of PRs. To evaluate the performance of E-PRedictor, we collect 475,192 PRs from 49 popular open-source projects on GitHub. The experiment results show that our proposed approach can effectively predict whether a PR will be merged or not. E-PRedictor significantly outperforms the baseline models (e.g., Random Forest and VDCNN) built on manual features. In terms of F1@Merge, F1@Reject, and AUC (area under the receiver operating characteristic curve), the performance of E-PRedictor is 90.1%, 60.5%, and 85.4%, respectively.
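The metrics F1@Merge and F1@Reject can be read as the F1 score computed with "merge" and "reject", respectively, treated as the positive class. That reading is an assumption based on the metric names. The sketch below illustrates it on made-up labels; the helper `f1_for_class` and the data are hypothetical and are not taken from E-PRedictor.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1 score with `positive` treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical outcomes for six pull requests.
y_true = ["merge", "merge", "merge", "reject", "reject", "merge"]
y_pred = ["merge", "merge", "reject", "reject", "merge", "merge"]

f1_merge = f1_for_class(y_true, y_pred, "merge")
f1_reject = f1_for_class(y_true, y_pred, "reject")
print(f1_merge, f1_reject)
```

Reporting the two scores separately shows how well a model handles the minority outcome, which a single aggregate figure can hide when most PRs are merged.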
Glaucoma is an ophthalmic disorder which results in permanent vision loss because high intraocular pressure damages the optic nerve in the eye. This paper proposes a two-stage network for automated glaucoma identifica...
Accurate automatic segmentation of gliomas in various sub-regions, including peritumoral edema, necrotic core, and enhancing and non-enhancing tumor core from 3D multimodal MRI images, is challenging because of its highly heterogeneous appearance and *** convolutional neural networks (CNNs) have recently improved glioma segmentation ***, extensive down-sampling such as pooling or strided convolution in CNNs significantly decreases the initial image resolution, resulting in the loss of accurate spatial and object parts information, especially information on the small sub-region tumors, affecting segmentation ***, this paper proposes a novel multi-level parallel network comprising three different level parallel subnetworks to fully use low-level, mid-level, and high-level information and improve the performance of brain tumor *** also introduce the Combo loss function to address input class imbalance and false positives and negatives imbalance in deep *** proposed method is trained and validated on the BraTS 2020 training and validation *** the validation dataset, our method achieved a mean Dice score of 0.907, 0.830, and 0.787 for the whole tumor, tumor core, and enhancing tumor core, *** with state-of-the-art methods, the multi-level parallel network has achieved competitive results on the validation dataset.
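The Dice score used to report the results above measures the overlap between a predicted mask and a reference mask. The sketch below computes it for one binary region on flattened, made-up masks. The function name `dice_score` and the voxel values are hypothetical; the actual BraTS evaluation runs on 3D volumes, once per tumor sub-region.

```python
def dice_score(pred, target):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for flat binary masks."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        # Both masks are empty: treat as perfect agreement.
        return 1.0
    return 2 * intersection / total


# Hypothetical flattened masks (1 = voxel labeled as tumor).
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]

score = dice_score(pred, target)
print(score)
```

Because the denominator counts only foreground voxels, the score is not inflated by the large background region, which is one reason it suits small sub-regions.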
To handle input and output time delays that commonly exist in many networked control systems (NCSs), a new robust continuous sliding mode control (CSMC) scheme is proposed for output tracking in uncertain single-input single-output (SISO) networked control systems. This scheme consists of three consecutive steps. First, although the network-induced delay in those systems can be effectively handled by using Padé approximation (PA), the unmatched disturbance emerges as another difficulty in the control design. Second, to actively estimate this unmatched disturbance, a generalized proportional integral observer (GPIO) technique is utilized based on only one measured state. Third, by constructing a new sliding manifold with the aid of the estimated unmatched disturbance and states, a GPIO-based CSMC is synthesized, which is employed to cope with not only matched and unmatched disturbances, but also network-induced delays. The stability of the entire closed-loop system under the proposed GPIO-based CSMC is detailedly *** promising tracking efficiency and feasibility of the proposed control methodology are verified through simulations and experiments on Quanser's servo module for motion control under various test conditions.
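For reference, the first-order Padé approximation commonly used to replace a pure time delay with a rational transfer function is shown below. The delay symbol $\tau$ and the choice of first order are assumptions made for illustration; the abstract does not state which order the authors use.

```latex
e^{-\tau s} \approx \frac{1 - \dfrac{\tau s}{2}}{1 + \dfrac{\tau s}{2}}
```

This approximation introduces a right-half-plane zero at $s = 2/\tau$, so the approximated model is non-minimum phase.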
As a pivotal enabler of intelligent transportation systems (ITS), the Internet of Vehicles (IoV) has attracted extensive attention from academia and industry. The exponential growth of computation-intensive, latency-sensitive, and privacy-aware vehicular applications in IoV results in the transformation from cloud computing to edge computing, which enables tasks to be offloaded to edge nodes (ENs) closer to vehicles for efficient execution. In ITS environments, however, due to dynamic and stochastic computation offloading requests, it is challenging to efficiently orchestrate offloading decisions for application requirements. How to accomplish complex computation offloading of vehicles while ensuring data privacy remains challenging. In this paper, we propose an intelligent computation offloading with privacy protection scheme, named COPP. In particular, an Advanced Encryption Standard-based encryption method is utilized to implement privacy protection. Furthermore, an online offloading scheme is proposed to find optimal offloading policies. Finally, experimental results demonstrate that COPP significantly outperforms benchmark schemes in terms of both delay and energy consumption.
Mashup developers often need to find open application programming interfaces (APIs) for their composition application development. Although most enterprises and service organizations have encapsulated their businesses or resources online as open APIs, finding the right high-quality open APIs in a library containing several open APIs is not an easy task. To solve this problem, this paper proposes a deep learning-based open API recommendation (DLOAR) approach. First, a topic model based on hierarchical density-based spatial clustering of applications with noise (HDBSCAN) is constructed to build topic models for Mashup clusters. Second, developers' requirement keywords are extracted by the TextRank algorithm, and the language model is built. Third, a neural network-based three-level similarity calculation is performed to find the most relevant open APIs. Finally, we complement the relevant information of open APIs in the recommended list to help developers make better choices. We evaluate the DLOAR approach on a real dataset and compare it with commonly used open API recommendation approaches: term frequency-inverse document frequency, latent Dirichlet allocation, Word2Vec, and Sentence-BERT. The results show that the DLOAR approach has better performance than the other approaches in terms of precision, recall, F1-measure, mean average precision, and mean reciprocal rank.
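Of the ranking metrics listed above, mean reciprocal rank (MRR) is the simplest to illustrate: for each query it takes the reciprocal of the position of the first relevant item, then averages over queries. The sketch below is a minimal version under that standard definition. The API names, the queries, and the `mean_reciprocal_rank` helper are hypothetical and are not drawn from the DLOAR dataset.

```python
def mean_reciprocal_rank(ranked_lists, relevant_sets):
    """Average of 1/rank of the first relevant item per query (0 if none)."""
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for rank, item in enumerate(ranked, start=1):
            if item in relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)


# Hypothetical recommendation lists for three Mashup requirements.
recommendations = [
    ["maps", "weather", "sms"],   # first relevant item at rank 2
    ["sms", "maps", "pay"],       # first relevant item at rank 1
    ["pay", "sms", "maps"],       # no relevant item recommended
]
relevant = [{"weather"}, {"sms"}, {"video"}]

mrr = mean_reciprocal_rank(recommendations, relevant)
print(mrr)
```

MRR rewards placing one correct API near the top of the list, whereas mean average precision also accounts for every further relevant API in the list.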