ISBN: (Print) 9798350330205
The proceedings contain 5 papers. The topics discussed include: Aardvark: comparative visualization of data analysis scripts; NeighViz: towards better understanding of neighborhood effects on social groups with spatial data; a declarative specification for authoring metrics dashboards; visual comparison of text sequences generated by large language models; and HPCClusterScape: increasing transparency and efficiency of shared high-performance computing clusters for large-scale AI models.
Information security (InfoSec) is becoming one of the most important fields of information systems at the corporate level, where it accounts for a large proportion of data losses, attacks and breaches. One of the possible proactive...
ISBN: (Print) 9798350360332; 9798350360325
In recent years, we have observed a growing interest in using satellite technology to monitor air quality on Earth. This article presents an attempt to correlate carbon monoxide (CO) levels between ground-based data and data obtained from the Sentinel-5P satellite mission. Using modern artificial intelligence models, such as deep neural networks, a multidimensional data analysis was conducted to identify patterns and dependencies. This study represents an important step towards the integration of ground and satellite data for more accurate monitoring and forecasting of global air quality, as well as an attempt to provide information on the carbon monoxide levels on Earth in the absence of ground-based sensors. It highlights the potential for using advanced technology in environmental science to tackle global challenges. The article further delves into the methodology used for data processing and analysis, emphasizing the robustness and accuracy of AI techniques in handling large datasets. This research also opens avenues for future studies focusing on other harmful pollutants, offering a comprehensive view of the environmental health of our planet. This innovative approach is not only critical for environmental monitoring but also holds significant implications for public health, policy-making, and sustainable development.
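The core of the study is correlating paired ground-station and satellite CO retrievals. As a minimal sketch of that first step, the snippet below computes a Pearson correlation between two hypothetical CO series; all readings are illustrative, not data from the paper, and the paper's deep-network modeling would build on top of such paired series.

```python
# Hypothetical CO readings for the same locations/times: one series from
# ground stations, one from Sentinel-5P retrievals. Values are made up.
ground = [0.12, 0.15, 0.22, 0.30, 0.28, 0.35, 0.40]
satellite = [0.10, 0.16, 0.20, 0.33, 0.25, 0.37, 0.42]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

r = pearson(ground, satellite)
print(f"ground/satellite CO correlation r = {r:.3f}")
```

A strong correlation on such pairs is what would justify training a regressor to estimate ground-level CO where no sensor exists.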
ISBN: (Digital) 9798350330991
ISBN: (Print) 9798350330991; 9798350331004
Content addressable memory (CAM) is a special-purpose search engine that can support parallel search directly in memory. CAMs are of increasing interest for machine learning and data analytics applications that require intensive search operations. However, conventional CMOS CAMs have large cell areas and high energy consumption, which limits applicability. Also, many data-intensive applications need more efficient data representation and approximate matching functions, which may not be efficiently realized by conventional ternary CAMs. As such, we introduce a more compact and high-performance CAM design based on non-volatile ferroelectric FET devices. Furthermore, we present a reconfigurable CAM design, MHCAM, to support approximate search for multi-dimensional data. We use DNA alignment as a proxy application to illustrate the design's application-level benefits.
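To make the approximate-matching function concrete, here is a small software analogue of an approximate CAM lookup: every stored word within a Hamming-distance threshold of the query matches. The stored words and threshold are illustrative assumptions; a real FeFET CAM performs this comparison in parallel across all rows in hardware rather than in a loop.

```python
# Toy CAM contents: four 8-bit stored words (values are made up).
STORED = [0b1011_0010, 0b1111_0000, 0b0001_1101, 0b1011_0110]

def hamming(a, b):
    """Number of differing bits between two words."""
    return bin(a ^ b).count("1")

def cam_search(query, threshold):
    """Return indices of all stored words within `threshold` bit flips
    of the query, mimicking an approximate-match CAM lookup."""
    return [i for i, w in enumerate(STORED) if hamming(query, w) <= threshold]

print(cam_search(0b1011_0011, threshold=1))  # exact-or-1-bit matches
```

Raising the threshold widens the match set, which is the kind of tolerance useful for noisy queries such as DNA reads.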
This paper presents the prototyping and validation of a Financial Data Management and Analysis System designed to enhance fraud detection using advanced Big Data and Cloud technologies. The system incorporates a Java-...
In the Internet era, the exponential growth of fine-grained image databases poses a considerable challenge for efficient information retrieval. Hashing-based approaches have gained traction for their computational and storage efficiency, yet fine-grained hashing retrieval presents unique challenges due to small inter-class and large intra-class variations inherent to fine-grained entities. Thus, traditional hashing algorithms falter in discerning these subtle, yet critical, visual differences and fail to generate compact yet semantically rich hash codes. To address this, we introduce a Dual Activation Hashing Network (DAHNet) designed to convert high-dimensional image data into optimized binary codes via an innovative feature activation paradigm. The architecture consists of dual branches specifically tailored for global and local semantic activation, thereby establishing direct correspondences between hash codes and distinguishable object parts through a hierarchical activation pipeline. Specifically, our spatial-oriented semantic activation module modulates dominant visual regions while amplifying the activations of subtle yet semantically rich areas in a controlled manner. Building on these activated visual representations, the proposed inter-region semantic enrichment module further enriches them by unearthing semantically complementary cues. Concurrently, DAHNet integrates a channel-oriented semantic activation module that exploits channel-specific correlations to distill contextual cues from spatially-activated visual features, thereby reinforcing robust learning to hash. To maintain the similarity of the original entities, we amalgamate final hash codes from both activation branches, capturing both local textural details and global structural information. Comprehensive evaluations on five fine-grained image retrieval benchmarks demonstrate DAHNet's superior performance over existing state-of-the-art hashing solutions, especially on 12-bit, improving performance by
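The retrieval mechanics that make compact codes attractive can be sketched independently of DAHNet's architecture: images become short binary codes and candidates are ranked by Hamming distance. The 12-bit length matches the shortest code the abstract mentions; the codes and names below are invented for illustration, not outputs of the paper's network.

```python
# Toy database of 12-bit hash codes (all values are hypothetical).
database = {
    "sparrow_01": 0b1010_1100_0110,
    "sparrow_02": 0b1010_1100_0111,  # near-identical fine-grained class
    "finch_07":   0b0101_0011_1001,
}

def hamming(a, b):
    """Number of differing bits between two hash codes."""
    return bin(a ^ b).count("1")

def retrieve(query_code, k=2):
    """Rank database entries by Hamming distance to the query; return top-k."""
    return sorted(database, key=lambda name: hamming(database[name], query_code))[:k]

print(retrieve(0b1010_1100_0110))
```

The fine-grained difficulty the paper targets is visible even here: the two sparrow codes differ by a single bit, so the hashing network must place subtle visual distinctions into exactly those few bits.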
large Language Models (LLMs) have demonstrated a huge impact on education and literacy in recent years. We evaluated the recommendations provided by two popular LLMs (OpenAI's ChatGPT and Google's Bard) to edu...
Virtual and augmented reality technologies have significantly advanced and come down in price during the last few years. These technologies can provide a great tool for highly interactive visualization approaches of a...
Vision-Language Navigation (VLN) requires the agent to follow language instructions to reach a target position. A key factor for successful navigation is to align the landmarks implied in the instruction with diverse visual observations. However, previous VLN agents fail to perform accurate modality alignment, especially in unexplored scenes, since they learn from limited navigation data and lack sufficient open-world alignment knowledge. In this work, we propose a new VLN paradigm, called COrrectable LaNdmark DiScOvery via large ModEls (CONSOLE). In CONSOLE, we cast VLN as an open-world sequential landmark discovery problem, by introducing a novel correctable landmark discovery scheme based on two large models, ChatGPT and CLIP. Specifically, we use ChatGPT to provide rich open-world landmark cooccurrence commonsense, and conduct CLIP-driven landmark discovery based on these commonsense priors. To mitigate the noise in the priors due to the lack of visual constraints, we introduce a learnable cooccurrence scoring module, which corrects the importance of each cooccurrence according to actual observations for accurate landmark discovery. We further design an observation enhancement strategy for an elegant combination of our framework with different VLN agents, where we utilize the corrected landmark features to obtain enhanced observation features for action decision. Extensive experimental results on multiple popular VLN benchmarks (R2R, REVERIE, R4R, RxR) show the significant superiority of CONSOLE over strong baselines. Especially, our CONSOLE establishes the new state-of-the-art results on R2R and R4R in unseen scenarios.
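The correction idea — language-model priors reweighted by what the agent actually sees — can be illustrated with a deliberately simplified fusion rule. The scores, landmark names, and linear blend below are assumptions for illustration only; CONSOLE's scoring module is learned, not a fixed weighted average.

```python
# Hypothetical scores: a commonsense cooccurrence prior (ChatGPT-style)
# versus observation-grounded visual similarity (CLIP-style).
priors = {"sofa": 0.9, "lamp": 0.6, "door": 0.3}
observed = {"sofa": 0.2, "lamp": 0.8, "door": 0.7}

def corrected_scores(priors, observed, alpha=0.5):
    """Blend prior and observed scores; alpha sets trust in the prior.
    A learned module would replace this fixed linear rule."""
    return {k: alpha * priors[k] + (1 - alpha) * observed[k] for k in priors}

scores = corrected_scores(priors, observed)
best = max(scores, key=scores.get)
print(best, round(scores[best], 2))
```

Note how the correction flips the decision: the prior alone would pick "sofa", but the observation-corrected score selects "lamp", which is exactly the failure mode (noisy priors without visual constraints) the module is meant to fix.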
ISBN: (Print) 9798350326598; 9798350326581
Metagenomics, the study of the genome sequences of diverse organisms in a common environment, has led to significant advances in many fields. Since the species present in a metagenomic sample are not known in advance, metagenomic analysis commonly involves the key tasks of determining the species present in a sample and their relative abundances. These tasks require searching large metagenomic databases containing information on different species' genomes. Metagenomic analysis suffers from significant data movement overhead due to moving large amounts of low-reuse data from the storage system to the rest of the system. In-storage processing can be a fundamental solution for reducing this overhead. However, designing an in-storage processing system for metagenomics is challenging because existing approaches to metagenomic analysis cannot be directly implemented in storage effectively due to the hardware limitations of modern SSDs. We propose MegIS, the first in-storage processing system designed to significantly reduce the data movement overhead of the end-to-end metagenomic analysis pipeline. MegIS is enabled by our lightweight design that effectively leverages and orchestrates processing inside and outside the storage system. Through our detailed analysis of the end-to-end metagenomic analysis pipeline and careful hardware/software co-design, we address in-storage processing challenges for metagenomics via specialized and efficient 1) task partitioning, 2) data/computation flow coordination, 3) storage technology-aware algorithmic optimizations, 4) data mapping, and 5) lightweight in-storage accelerators. MegIS's design is flexible, capable of supporting different types of metagenomic input datasets, and can be integrated into various metagenomic analysis pipelines. Our evaluation shows that MegIS outperforms the state-of-the-art performance- and accuracy-optimized software metagenomic tools by 2.7x-37.2x and 6.9x-100.2x, respectively, while matching the accuracy of
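The task-partitioning idea can be sketched at a toy scale: a cheap k-mer presence filter runs "near the data" so only matching reads cross to the host for abundance counting. The two-species database, reads, and k=4 below are invented for illustration; MegIS's actual partitioning, data mapping, and accelerators operate inside real SSD hardware, not Python.

```python
from collections import Counter

# Toy k-mer database mapping 4-mers to species (entirely hypothetical).
SPECIES_KMERS = {"ACGT": "E. coli", "GGCC": "B. subtilis"}
READS = ["TTACGTTT", "AAAAAAAA", "GGGGCCAA"]

def kmers(read, k=4):
    """All length-k substrings of a read."""
    return {read[i:i + k] for i in range(len(read) - k + 1)}

# Stage 1 (in-storage analogue): filter reads with any database k-mer hit,
# so non-matching reads never leave the "storage" side.
matching = [r for r in READS if kmers(r) & SPECIES_KMERS.keys()]

# Stage 2 (host analogue): tally species abundances over matching reads only.
abundance = Counter(sp for r in matching
                    for km, sp in SPECIES_KMERS.items() if km in kmers(r))
print(matching, dict(abundance))
```

The data-movement saving comes from stage 1: the all-A read is discarded before the transfer, which is the low-reuse traffic the abstract identifies as the bottleneck.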