Existing multimodal summarization methods primarily focus on multimodal fusion to efficiently utilize the visual information for summarization. However, they fail to exploit the deep interaction between textual and vi...
ISBN: 9781728186719 (print)
Boolean Matrix Factorization (BMF) aims to find an approximation of a given binary matrix as the Boolean product of two low-rank binary matrices. Binary data is ubiquitous in many fields, and representing data by binary matrices is common in medicine, natural language processing, bioinformatics, and computer graphics, among many others. Factorizing a matrix into low-rank matrices is used to gain more information about the data, such as discovering relationships between features and samples, roles and users, or topics and articles. In many applications, the binary nature of the factor matrices can greatly increase the interpretability of the data. Unfortunately, BMF is computationally hard, and heuristic algorithms are used to compute Boolean factorizations. Very recently, a theoretical breakthrough was obtained independently by two research groups: Ban et al. (SODA 2019) and Fomin et al. (Trans. Algorithms 2020) show that BMF admits an efficient polynomial-time approximation scheme (EPTAS). However, despite their theoretical importance, the double-exponential dependence of the running times on the rank makes these algorithms unimplementable in practice. The primary research question motivating our work is whether the theoretical advances on BMF could lead to practical algorithms. The main conceptual contribution of our work is the following: while the EPTAS for BMF is a purely theoretical advance, the general approach behind these algorithms can serve as the basis for designing better heuristics. We also use this strategy to develop new algorithms for the related F_p-Matrix Factorization problem. Here, given a matrix A over a finite field GF(p), where p is a prime, and an integer r, the objective is to find a matrix B over the same field with GF(p)-rank at most r minimizing some norm of A - B. Our empirical research on synthetic and real-world data demonstrates the advantage of the new algorithms over previous works on BMF and F_p-Matrix Factorization.
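To make the Boolean product in the abstract above concrete, the following minimal sketch computes it with NumPy and counts the entries where a candidate factorization disagrees with the input. It is not the authors' algorithm; the function names (`boolean_product`, `bmf_error`) and the toy matrices are hypothetical and chosen only for illustration.

```python
import numpy as np

def boolean_product(U, V):
    """Boolean matrix product: entry (i, j) is OR over k of (U[i, k] AND V[k, j])."""
    # The integer product counts how many k satisfy U[i, k] = V[k, j] = 1;
    # any positive count means the OR is true, so 1 + 1 collapses to 1.
    return (U.astype(int) @ V.astype(int) > 0).astype(int)

def bmf_error(A, U, V):
    """Number of entries where A differs from the Boolean product of U and V."""
    return int(np.sum(A != boolean_product(U, V)))

# Hypothetical toy data: a 3x4 binary matrix with an exact rank-2 Boolean factorization.
A = np.array([[1, 1, 1, 0],
              [1, 1, 1, 1],
              [0, 0, 1, 1]])
U = np.array([[1, 0],
              [1, 1],
              [0, 1]])
V = np.array([[1, 1, 1, 0],
              [0, 0, 1, 1]])

print(boolean_product(U, V))  # equals A
print(bmf_error(A, U, V))     # number of mismatched entries
```

The ordinary integer product of U and V has a 2 in row 1, column 2, where both rank-1 blocks overlap. The Boolean product maps that entry to 1, which is what lets overlapping blocks describe binary data exactly.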
Federated Learning (FL) offers a privacy-preserving solution by enabling multiple clients to train a shared model collaboratively without centralizing data. However, the decentralized nature of FL presents challenges,...
Logs as semi-structured text are rich in semantic information, making their comprehensive understanding crucial for automated log analysis. With the recent success of pre-trained language models in natural language pr...
ISBN: 9798400700552 (print)
Large multimodal deep learning models such as Contrastive Language-Image Pretraining (CLIP) have become increasingly powerful in recent years, with applications across several domains. CLIP works on visual and language modalities and forms a part of several popular models, such as DALL-E and Stable Diffusion. It is trained on a large dataset of millions of image-text pairs crawled from the internet. Such large datasets are often used for training without filtering, leading to models inheriting social biases from internet data. Given that models such as CLIP are being applied in a wide variety of applications ranging from social media to education, it is vital that harmful biases are detected. However, due to the unbounded nature of the possible inputs and outputs, traditional bias metrics such as accuracy cannot detect the range and complexity of biases present in the model. In this paper, we present an audit of CLIP using an established technique from natural language processing, the Word Embedding Association Test (WEAT), to detect and quantify gender bias in CLIP, and we demonstrate that it can provide a quantifiable measure of such stereotypical associations. We detected, measured, and visualised various types of stereotypical gender associations with respect to character descriptions and occupations, and found that CLIP shows evidence of stereotypical gender bias.
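As a rough illustration of the WEAT statistic mentioned above, the sketch below computes the standard effect size from two target sets (X, Y) and two attribute sets (A, B) of embedding vectors. The 2-D vectors are hypothetical stand-ins, not CLIP embeddings, and the use of the sample standard deviation (`ddof=1`) is an assumption; how the audit itself extracts embeddings from CLIP is not specified in the abstract.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(w, A, B):
    """s(w, A, B): mean cosine similarity of w to set A minus that to set B."""
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    """Difference of mean associations of X and Y, normalized by the
    standard deviation of the associations over the union of X and Y."""
    sx = [association(x, A, B) for x in X]
    sy = [association(y, A, B) for y in Y]
    # Assumption: sample standard deviation (ddof=1) over X union Y.
    return float((np.mean(sx) - np.mean(sy)) / np.std(sx + sy, ddof=1))

# Hypothetical toy embeddings: X leans toward A, Y leans toward B.
X = [np.array([1.0, 0.1]), np.array([0.9, 0.2])]
Y = [np.array([0.1, 1.0]), np.array([0.2, 0.9])]
A = [np.array([1.0, 0.0])]
B = [np.array([0.0, 1.0])]

print(weat_effect_size(X, Y, A, B))  # positive: X is closer to A than Y is
```

A positive value indicates that the X targets are more strongly associated with attribute set A than the Y targets are; swapping X and Y flips the sign.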
The ever-growing usage and popularity of Internet of Things devices, coupled with Big Data technologies and machine learning algorithms, have allowed data engineers to explore new opportunities in healthcare and continuous care. Furthermore, there is a need to reduce the time gap between when information is created and when actions and insights can be offered. However, a challenge in implementing a large-scale data processing architecture is deciding which tools are appropriate and how best to apply them. For example, streaming systems are now mature enough that hospitals worldwide can use their extremely large datasets, along with data producers, to predict and influence future events. Thus, the main objective of this systematic review is to identify the state of the art in healthcare data platforms that allow the creation of metrics and actions in real time. The PRISMA guideline for reporting systematic reviews was followed to deliver a transparent and consistent report, validating the technological advances in a critical sector. Multiple pertinent articles and papers were retrieved from the SCOPUS abstract and citation database on May 13, 2022, using several relevant keywords to identify potentially relevant documents published from January 2020 onward. These documents had to be already published, written in English, and accessible through the B-ON consortium, which allows Portuguese students to legally download from most publishers. Over seven studies have been selected for deeper discussion based on their relevance and impact for this review, showcasing their main objectives, data sources, and tools used, as well as their approaches to interoperability and support of machine learning algorithms for decision support. In closing, the collected articles have shown that while Big Data is currently in use at health institutions of all sizes, the ability of processing large amounts of data from sensors and events, a
ISBN: 9781728198354 (print)
Colonoscopic polyp segmentation is essential and valuable to the early diagnosis and treatment of colorectal cancer. It remains challenging to accurately extract these polyps due to their small sizes, irregular shapes, image artifacts, and illumination variations. This work proposes a new encoder-decoder architecture, called pyramid transformer driven multibranch fusion, to precisely segment different types of colorectal polyps during colonoscopy. Specifically, our architecture employs a simple, convolution-free pyramid transformer as its encoder, which serves as a flexible and powerful feature extractor. Next, a multibranch fusion decoder is employed to preserve the detailed appearance information and fuse semantic global cues, which can deal with blurred polyp edges caused by nonuniform illumination and the shaky colonoscope. Additionally, a hybrid spatial-frequency loss function is introduced for accurate training. We evaluate our proposed architecture on colonoscopic polyp images covering four types of polyps with different pathological features, and the experimental results show that our architecture significantly outperforms other deep learning models. In particular, our method improves the average Dice similarity and intersection over union to 90.7% and 84.8%, respectively.
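The two metrics reported above can be computed from binary masks as in the following minimal sketch. The function name `dice_and_iou`, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions for illustration; the abstract does not describe the authors' evaluation code.

```python
import numpy as np

def dice_and_iou(pred, target):
    """Dice similarity and intersection over union for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()
    # Assumed convention: two empty masks count as a perfect match.
    dice = 2.0 * intersection / total if total > 0 else 1.0
    iou = intersection / union if union > 0 else 1.0
    return float(dice), float(iou)

# Hypothetical 2x3 masks: 3 predicted pixels, 3 true pixels, 2 in common.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]

dice, iou = dice_and_iou(pred, target)
print(dice, iou)  # Dice = 2*2/(3+3), IoU = 2/4
```

For a single pair of masks the two scores are related by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU.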
Training robots by model-free deep reinforcement learning (DRL) to carry out robotic manipulation tasks without sufficient successful experiences is challenging. Hindsight experience replay (HER) was introduced to enable DRL agents to learn from failed experiences. However, HER-enabled model-free DRL still suffers from limited training performance due to its uniform sampling strategy and the scarcity of reward information in the task environment. Inspired by the progress incentive mechanism in human psychology, we propose Progress Intrinsic Motivation-based HER (P-HER) in this work to overcome these difficulties. First, the Trajectory Progress-based Prioritized Experience Replay (TPPER) module is developed to prioritize sampling valuable trajectory data, thereby achieving more efficient training. Second, the Progress Intrinsic Reward (PIR) module is introduced in agent training to add extra intrinsic rewards that encourage the agents throughout the exploration of the task space. Experiments in challenging robotic manipulation tasks demonstrate that our P-HER method outperforms the original HER and state-of-the-art HER-based methods in training performance. Our code for P-HER and its experimental videos in both virtual and real environments are available at https://***/weixiang-smart/P-HER. Note to Practitioners: This work is motivated by the need to develop a fast and effective learning method for intelligent robotic manipulation of typical industrial tasks, including pushing, picking, and placing workpieces, which are essential and fundamental processing plan activities for accomplishing robotic machining and assembly applications towards smart manufacturing. The introduction of reinforcement learning enables robots to learn manipulation tasks autonomously, which can save engineers the effort of teaching or hard-coding the robot and also reduce labor costs. However, the existing HER-based reinforcement learning algorithms suffer from low training efficiency and performance due to
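For readers unfamiliar with the baseline, the following minimal sketch shows the hindsight relabeling idea behind original HER, using the "final" strategy (replace the intended goal with the state actually reached at the end of the episode). It does not implement P-HER, TPPER, or PIR; the dictionary layout, the tolerance `tol`, and the 1-D toy episode are hypothetical.

```python
import numpy as np

def sparse_reward(achieved, goal, tol=0.05):
    """0 when the achieved position is within tol of the goal, otherwise -1."""
    dist = np.linalg.norm(np.asarray(achieved) - np.asarray(goal))
    return 0.0 if dist <= tol else -1.0

def her_relabel_final(episode):
    """Relabel each transition with the goal achieved at the end of the episode."""
    new_goal = episode[-1]["achieved"]
    relabeled = []
    for t in episode:
        relabeled.append({
            "obs": t["obs"],
            "action": t["action"],
            "achieved": t["achieved"],
            "goal": new_goal,  # hindsight goal replaces the original one
            "reward": sparse_reward(t["achieved"], new_goal),
        })
    return relabeled

# Hypothetical failed episode: the agent aims for position 1.0 but only reaches 0.3.
goal = [1.0]
episode = []
for obs, achieved in [([0.0], [0.1]), ([0.1], [0.2]), ([0.2], [0.3])]:
    episode.append({"obs": obs, "action": [0.1], "achieved": achieved,
                    "goal": goal, "reward": sparse_reward(achieved, goal)})

relabeled = her_relabel_final(episode)
print([t["reward"] for t in episode])    # every step fails against the original goal
print([t["reward"] for t in relabeled])  # the last step succeeds against the hindsight goal
```

Both the original and the relabeled transitions would typically be stored in the replay buffer, so a failed episode still yields at least one transition with a non-negative reward.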
Computing the optimal solution to a spatial filtering problem in a Wireless Sensor Network can incur large bandwidth and computational requirements if an approach relying on data centralization is used. The so-called...
ISBN: 9798350349184; 9798350349191 (print)
Amidst the rapid advancement of Internet of Things (IoT) technology, achieving precise indoor localization has emerged as a pivotal research area. Localization algorithms relying on the Radio Frequency Identification (RFID) received signal strength indicator (RSSI) have gained widespread adoption in numerous indoor positioning systems due to their straightforward implementation and cost-effectiveness. However, in indoor settings, challenges such as building obstructions and multipath effects often lead to signal reception failures by RFID antennas, consequently compromising the reliability of positioning outcomes. Recent research has approached indoor localization as a regression problem, employing deep learning models for analysis and prediction. However, most current indoor localization models focus on either spatial or temporal features within RSSI data, leading to suboptimal localization outcomes. To tackle these challenges, this paper proposes an enhanced methodology that leverages Generative Adversarial Networks (GAN) to impute missing RSSI data. Additionally, Convolutional Neural Networks (CNN) are utilized to extract spatial domain features, while Long Short-Term Memory Networks (LSTM) are employed to extract temporal domain features. Ultimately, this paper designs a novel model, GCLA, which integrates an attention mechanism with a location coding strategy to fuse features for precise location prediction. Experimental results show that the proposed GCLA model can obtain stable localization results after brief training on a small number of datasets.