Dialogue-based relation extraction (DialogRE) aims to predict relationships between two entities in dialogue. Current approaches to dialogue relation extraction struggle with long-distance entity relationships in dialogue data as well as complex entity relationships, such as a single entity with multiple types of connections. To address these issues, this paper presents a novel approach for dialogue relation extraction termed the hypergraphs and heterogeneous graphs model (HG2G). This model introduces a two-tiered structure, comprising dialogue hypergraphs and dialogue heterogeneous graphs. The dialogue hypergraph establishes connections between similar nodes using hyper-edges and utilizes hypergraph convolution to capture multi-level features. Simultaneously, the dialogue heterogeneous graph connects nodes and edges of different types, employing heterogeneous graph convolution to aggregate cross-sentence information. Ultimately, the integrated nodes from both graphs capture the semantic nuances inherent in dialogue. Experimental results on the DialogRE dataset demonstrate that the HG2G model outperforms existing state-of-the-art methods.
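The abstract does not spell out the convolution used by HG2G. As an illustration only, the sketch below implements one step of a commonly used normalized hypergraph convolution, Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2} X Theta, in plain Python. The function names, the incidence-matrix layout, and the assumption that every node belongs to at least one hyper-edge are choices made for this example, not details taken from the paper.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]

def hypergraph_conv(x, h, theta, edge_w=None):
    """One normalized hypergraph convolution step (no nonlinearity).

    x      : node features, n x d
    h      : incidence matrix, n x m; h[v][e] = 1 if node v is in hyper-edge e
    theta  : weight matrix, d x d_out
    edge_w : optional hyper-edge weights (assumed 1.0 when omitted)
    Assumes every node lies in at least one hyper-edge and every
    hyper-edge contains at least one node, so no degree is zero.
    """
    n, m = len(h), len(h[0])
    w = edge_w or [1.0] * m
    dv = [sum(w[e] * h[v][e] for e in range(m)) for v in range(n)]  # node degrees
    de = [sum(h[v][e] for v in range(n)) for e in range(m)]         # edge degrees
    # Scale node features by Dv^{-1/2}.
    xs = [[val / math.sqrt(dv[v]) for val in x[v]] for v in range(n)]
    # Gather node features into hyper-edges, then apply W De^{-1}.
    edge_feat = matmul(transpose(h), xs)
    edge_feat = [[w[e] / de[e] * val for val in edge_feat[e]] for e in range(m)]
    # Scatter hyper-edge features back to nodes and scale by Dv^{-1/2}.
    node_feat = matmul(h, edge_feat)
    node_feat = [[val / math.sqrt(dv[v]) for val in node_feat[v]] for v in range(n)]
    return matmul(node_feat, theta)

# Three nodes joined by a single hyper-edge: each node receives the edge mean.
features = [[1.0], [2.0], [3.0]]
incidence = [[1], [1], [1]]
print(hypergraph_conv(features, incidence, [[1.0]]))
```

With a single hyper-edge covering all nodes, the step reduces to averaging the node features over that hyper-edge, which is a quick way to sanity-check the normalization.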
Although ray tracing produces high-fidelity, realistic images, it is considered computationally burdensome when implemented on a system with a high rendering rate. Perception-driven rendering techniques generate images with minimal noise and distortion that are generally acceptable to the human visual system, thereby reducing rendering costs. In this paper, we introduce a perception-entropy-driven temporal reusing method to accelerate real-time ray tracing. We first build a just noticeable difference (JND) model to represent the uncertainty of ray samples and image-space masking effects. Then, we expand the shading gradient through gradient max-pooling and gradient filtering to enlarge the visual receptive field. Finally, we dynamically optimize reusable time segments to improve the accuracy of temporal reusing. Compared with Monte Carlo ray tracing, our algorithm increases frames per second (fps) by 1.93× to 2.96× at 8 to 16 samples per pixel, significantly accelerating the Monte Carlo ray tracing process while maintaining visual quality.
Recently, weak supervision has received growing attention in the field of salient object detection due to the convenience of ***, there is a large performance gap between weakly supervised and fully supervised salient object detectors because the scribble annotation can only provide very limited foreground/background ***, an intuitive idea is to infer annotations that cover more complete object and background regions for *** this end, a label inference strategy is proposed based on the assumption that pixels with similar colours and close positions should have consistent ***, the k-means clustering algorithm is first performed on both the colours and the coordinates of the original annotations, and the same labels are then assigned to points whose colours are similar to the colour cluster centres and whose positions are near the coordinate cluster ***, the same annotations for pixels with similar colours within each kernel neighbourhood are set *** experiments on six benchmarks demonstrate that our method can significantly improve performance and achieve state-of-the-art results.
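The label inference idea described above (cluster the scribble annotations by colour and by position, then propagate labels to pixels that are close to a cluster centre in both spaces) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: clustering each label separately, the deterministic initialization, and the thresholds `colour_tol` and `pos_tol` are all hypothetical choices made for this example.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; initial centres are the first k points."""
    centres = [list(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda c: dist(p, centres[c]))
            groups[idx].append(p)
        for c, g in enumerate(groups):
            if g:  # keep the old centre if a cluster becomes empty
                centres[c] = [sum(col) / len(g) for col in zip(*g)]
    return centres

def infer_labels(scribbles, pixels, k=2, colour_tol=30.0, pos_tol=5.0):
    """Propagate scribble labels to unlabelled pixels.

    scribbles : {label: [(colour, (row, col)), ...]} annotated pixels
    pixels    : [(colour, (row, col)), ...] unlabelled pixels
    Returns one inferred label (or None) per unlabelled pixel.
    colour_tol and pos_tol are hypothetical thresholds for this sketch.
    """
    centres = {}
    for label, pts in scribbles.items():
        kk = min(k, len(pts))
        colour_centres = kmeans([c for c, _ in pts], kk)
        pos_centres = kmeans([xy for _, xy in pts], kk)
        centres[label] = (colour_centres, pos_centres)
    inferred = []
    for colour, xy in pixels:
        best = None
        for label, (colour_centres, pos_centres) in centres.items():
            dc = min(dist(colour, c) for c in colour_centres)
            dp = min(dist(xy, c) for c in pos_centres)
            # A pixel must be close in BOTH colour and position.
            if dc <= colour_tol and dp <= pos_tol and (best is None or dc < best[0]):
                best = (dc, label)
        inferred.append(best[1] if best else None)
    return inferred

scribbles = {
    "fg": [((255, 0, 0), (0, 0)), ((250, 5, 5), (1, 1))],
    "bg": [((0, 0, 255), (10, 10)), ((5, 5, 250), (11, 11))],
}
pixels = [((252, 2, 2), (2, 2)), ((2, 2, 252), (9, 9)), ((0, 255, 0), (5, 5))]
print(infer_labels(scribbles, pixels))
```

Pixels that are far from every cluster centre in colour or position stay unlabelled, which keeps the inferred annotation conservative.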
Data race is one of the most important concurrent anomalies in multi-threaded *** constraint-based techniques are leveraged for race detection, which can find all the races that can be found by any other sound race ***, this constraint-based approach has serious limitations in helping programmers analyze and understand data ***, it may report a large number of false positives due to the unrecognized dataflow propagation of the ***, it recommends a wide range of thread context switches to schedule the reported race (including the false one) whenever this race is exposed during the constraint-solving *** ad hoc recommendation imposes too many context switches, which complicates the data race *** address these two limitations in the state-of-the-art constraint-based race detection, this paper proposes DFTracker, an improved constraint-based race detector to recommend each data race with minimal thread context ***, we reduce the false positives by analyzing and tracking the dataflow in the *** this means, DFTracker reduces the unnecessary analysis of false race *** further propose a novel algorithm to recommend an effective race schedule with minimal thread context switches for each data *** experimental results on real applications demonstrate that 1) without removing any true data race, DFTracker effectively prunes false positives by 68% in comparison with the state-of-the-art constraint-based race detector; 2) DFTracker recommends as few as 2.6 to 8.3 (4.7 on average) thread context switches per data race in the real world, which is 81.6% fewer context switches per data race than the state-of-the-art constraint-based race ***, DFTracker can be used as an effective tool for programmers to understand data races.
Models based on the MLP-Mixer architecture are becoming popular, but they still suffer from adversarial *** it has been shown that MLP-Mixer is more robust to adversarial attacks compared to convolutional neural networks (CNNs), there has been no research on adversarial attacks tailored to its *** this paper, we fill this *** propose a dedicated attack framework called Maxwell’s demon Attack (MA). Specifically, we break the channel-mixing and token-mixing mechanisms of the MLP-Mixer by perturbing inputs of each Mixer layer to achieve high *** demonstrate that disrupting the MLP-Mixer’s capture of the main information of images by masking its inputs can generate adversarial examples with cross-architectural *** evaluations show the effectiveness and superior performance of *** generated based on masked inputs obtain a higher success rate of black-box attacks than existing transfer ***, our approach can be easily combined with existing methods to improve the transferability both within MLP-Mixer based models and to models with different *** achieve up to 55.9% attack performance *** work exploits the true generalization potential of the MLP-Mixer adversarial space and helps make it more robust for future deployments.
Video question answering (VideoQA) is a challenging yet important task that requires a joint understanding of low-level video content and high-level textual semantics. Despite the promising progress of existing efforts, recent studies have revealed that current VideoQA models tend to over-rely on superficial correlations rooted in dataset bias while overlooking the key video content, thus leading to unreliable results. Effectively understanding and modeling the temporal and semantic characteristics of a given video for robust VideoQA is crucial but, to our knowledge, has not been well investigated. To fill this research gap, we propose a robust VideoQA framework that can effectively model cross-modality fusion and force the model to focus on the temporal and global content of videos when making a QA decision instead of exploiting the shortcuts in datasets. Specifically, we design a self-supervised contrastive learning objective to contrast the positive and negative pairs of multimodal input, where the fused representation of the original multimodal input is forced to be closer to that of the intervened input based on video perturbation. We expect the fused representation to focus more on the global context of videos rather than on a few static keyframes. Moreover, we introduce an effective temporal order regularization to enforce the inherent sequential structure of videos for video representation. We also design a Kullback-Leibler divergence-based perturbation invariance regularization of the predicted answer distribution to improve the robustness of the model against temporal content perturbation of videos. Our method is model-agnostic and is easily compatible with various VideoQA backbones. Extensive experimental results and analyses on several public datasets show the advantage of our method over the state-of-the-art methods in terms of both accuracy and robustness.
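To make the Kullback-Leibler perturbation invariance term concrete, the sketch below computes KL(p_orig || p_pert) between the answer distributions predicted for an original and a perturbed video. The direction of the divergence, the function names, and the small `eps` smoothing constant are assumptions made for this illustration; the abstract does not specify them.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of answer logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions; eps avoids log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def perturbation_invariance_loss(logits_orig, logits_pert):
    """Penalty that grows as the perturbed prediction drifts from the original.

    Assumed form: KL between the answer distribution for the original video
    and the one for the temporally perturbed video.
    """
    return kl_divergence(softmax(logits_orig), softmax(logits_pert))

# Identical predictions incur no penalty; diverging predictions are penalized.
print(perturbation_invariance_loss([2.0, 0.0, -1.0], [2.0, 0.0, -1.0]))
print(perturbation_invariance_loss([2.0, 0.0, -1.0], [0.0, 2.0, -1.0]))
```

In training, a term like this would typically be added to the main answer-classification loss with a weighting coefficient, so the model is rewarded for giving stable answers when the video's temporal content is perturbed.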
Denoising (DN) and demosaicing (DM) are the first crucial stages in the image signal processing pipeline. Recently, researchers have paid more attention to solving DN and DM jointly, which is a severely underdetermined inverse problem. Existing deep learning methods learn the desired prior on synthetic datasets, which limits the generalization of the learned network to real-world data. Moreover, existing methods mainly focus on the high sampling rate of green information in raw data for DM, but seldom exploit the high intensity and signal-to-noise ratio (SNR) of the green channel. In this work, a deep guided attention network (DGAN) is presented for real image joint DN and DM (JDD), which considers both the high SNR and the high sampling rate of green information for DN and DM, respectively. To ease the training and fully exploit the data properties of the green channel, we first train the DN and DM sub-networks sequentially and then learn them jointly, which can alleviate error accumulation. In addition, to support real image JDD, we collect paired raw clean RGB and noisy mosaic images to construct a realistic dataset. The experimental results on the real JDD dataset show that the presented approach performs better than the state-of-the-art methods in terms of both quantitative metrics and qualitative visualization.
Constructing an effective common latent embedding by aligning the latent spaces of cross-modal variational autoencoders (VAEs) is a popular strategy for generalized zero-shot learning (GZSL). However, due to the lack of fine-grained instance-wise annotations, existing VAE methods can easily suffer from the posterior collapse problem. In this paper, we propose an innovative asymmetric VAE network by aligning enhanced feature representation (AEFR) for GZSL. Unlike general VAE structures, we design two asymmetric encoders for visual and semantic observations and one decoder for visual reconstruction. Specifically, we propose a simple yet effective gated attention mechanism (GAM) in the visual encoder to enhance the information interaction between observations and latent variables, effectively alleviating the possible posterior collapse problem. In addition, we propose a novel distributional decoupling-based contrastive learning (D2-CL) to guide the learning of classification-relevant information while aligning the representations at the taxonomy level in the latent representation space. Extensive experiments on publicly available datasets demonstrate the state-of-the-art performance of our method. The source code is available at https://***/seeyourmind/AEFR.
This paper examines fault-tolerant quantized control for neural networks under persistent dwell-time switching, considering the presence of actuator faults and dynamic output quantization. The dynamic scaling factor (...
With the development of information technology and cloud computing, data sharing has become an important part of scientific *** traditional data sharing, data is stored on a third-party storage platform, which causes the owner to lose control of the *** a result, there are issues of intentional data leakage and tampering by third parties, and the private information contained in the data may lead to more significant ***, data is frequently maintained on multiple storage platforms, posing significant hurdles in terms of enlisting multiple parties to engage in data sharing while maintaining *** this work, we propose a new architecture for applying blockchains to data sharing and achieve efficient and reliable data sharing among heterogeneous *** design a new data sharing transaction mechanism based on the system architecture to protect the security of the raw data and the processing *** also design and implement a hybrid concurrency control protocol to overcome issues caused by the large differences in blockchain performance in our system and to improve the success rate of data sharing *** took Ethereum and Hyperledger Fabric as examples to conduct cross-blockchain data sharing *** results show that our system achieves data sharing across heterogeneous blockchains with reasonable performance and has high scalability.