Gas leakage and its fatal effects are a major concern throughout the world, especially in developing countries such as Bangladesh. Every year many people die and extensive damage to assets occurs due to the fire caus...
The continuous advancement of remote sensor technology is contributing to a daily surge in data production, necessitating improvements in the accuracy of big data classification. This research proposes a unique featur...
As one of the most effective methods to improve the accuracy and robustness of speech tasks, the audio-visual fusion approach has recently been introduced into the field of Keyword Spotting (KWS). However, existing audio-visual keyword spotting models are limited to detecting isolated words, while keyword spotting for unconstrained speech is still a challenging ***. To this end, an Audio-Visual Keyword Transformer (AVKT) network is proposed to spot keywords in unconstrained video ***. The authors present a transformer classifier with learnable CLS tokens to extract distinctive keyword features from the variable-length audio and visual ***. The outputs of audio and visual branches are combined in a decision fusion ***. *** humans can easily notice whether a keyword appears in a sentence or not, our AVKT network can detect whether a video clip with a spoken sentence contains a pre-specified ***. ***, the position of the keyword is localised in the attention map without additional position ***. Experimental results on the LRS2-KWS dataset and our newly collected PKU-KWS dataset show that the accuracy of AVKT exceeded 99% in clean scenes and 85% in extremely noisy ***. The code is available at https://***/jialeren/AVKT.
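To make the idea of decision-level fusion concrete, here is a minimal Python sketch. The abstract does not state which fusion rule AVKT uses, so the weighted average of per-branch probabilities below is an assumption, and the names `decision_fusion` and `audio_weight` are hypothetical.

```python
import numpy as np


def sigmoid(z):
    """Map a raw branch score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def decision_fusion(audio_logit, visual_logit, audio_weight=0.5):
    """Fuse two branch decisions by a weighted average of probabilities.

    Assumption: each branch emits one logit for "keyword present".
    The weighting rule is illustrative, not the rule used by AVKT.
    """
    p_audio = sigmoid(audio_logit)
    p_visual = sigmoid(visual_logit)
    return audio_weight * p_audio + (1.0 - audio_weight) * p_visual


# Example: in a noisy scene one might trust the visual branch more.
score = decision_fusion(audio_logit=-0.3, visual_logit=2.1, audio_weight=0.3)
keyword_present = score > 0.5
```

Fusing at the decision level keeps the two branches independent, so a degraded audio stream can be down-weighted without retraining the visual branch.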
Recent advances in the field of image generation have attracted attention due to the growing number of diverse data sources and test samples. A primary driver of this evolution is the application of neural networks, p...
Human recognition technology based on biometrics has become a fundamental requirement in all aspects of life due to increased concerns about security and privacy ***. ***, biometric systems have emerged as a technology with the capability to identify or authenticate individuals based on their physiological and behavioral ***. Among different viable biometric modalities, the human ear structure can offer unique and valuable discriminative characteristics for human recognition ***. In recent years, most existing traditional ear recognition systems have been designed based on computer vision models and have achieved successful ***. However, such traditional models can be sensitive to several unconstrained environmental ***. As such, some traits may be difficult to extract automatically but can still be semantically perceived as soft ***. This research proposes a new group of semantic features to be used as soft ear biometrics, mainly inspired by conventional descriptive traits used naturally by humans when identifying or describing each ***. ***, the research study is focused on the fusion of the soft ear biometric traits with traditional (hard) ear biometric features to investigate their validity and efficacy in augmenting human identification ***. The proposed framework has two subsystems: first, a computer vision-based subsystem, extracting traditional (hard) ear biometric traits using principal component analysis (PCA) and local binary patterns (LBP), and second, a crowdsourcing-based subsystem, deriving semantic (soft) ear biometric ***. *** feature-level fusion experiments were conducted using the AMI database to evaluate the proposed algorithm's ***. The obtained results for both identification and verification showed that the proposed soft ear biometric information significantly improved the recognition performance of traditional ear biometrics, reaching up to 12% for LBP and 5% for PCA descriptors; when fusing all three capa
Writing comprehensive commit messages is tedious yet important, because these messages describe changes of code, such as fixing bugs or adding new features. However, most existing methods focus on either only the chan...
Osteoarthritis (OA) of the knee is an inflammation that impacts the knee bone due to the significant weight-bearing of the body. The disease results in degeneration and rupture of the cartilage elements in the knee jo...
We develop a technique to design efficiently computable estimators for sparse linear regression in the simultaneous presence of two adversaries: oblivious and adaptive. Consider the model y∗ = X∗β∗ + η, where X∗ is an n × d random design matrix, β∗ ∈ ℝ^d is a k-sparse vector, and the noise η is independent of X∗ and chosen by the oblivious adversary. Apart from the independence of X∗, we only require a small fraction of the entries of η to have magnitude at most 1. The adaptive adversary is allowed to arbitrarily corrupt an ε-fraction of the samples (X₁∗, y₁∗), ..., (Xₙ∗, yₙ∗). Given the ε-corrupted samples (X₁, y₁), ..., (Xₙ, yₙ), the goal is to estimate β∗. We assume that the rows of X∗ are iid samples from some d-dimensional distribution D with zero mean and (unknown) covariance matrix Σ with bounded condition number. We design several robust algorithms that outperform the state of the art even in the special case of Gaussian noise η ∼ N(0, 1)^n. In particular, we provide a polynomial-time algorithm that with high probability recovers β∗ up to error O(√ε) as long as n ≥ Õ(k²/ε), only assuming some bounds on the third and the fourth moments of D. In addition, prior to this work, even in the special case of Gaussian design D = N(0, Σ) and noise η ∼ N(0, 1), no polynomial-time algorithm was known to achieve error o(√ε) in the sparse setting n < d². We show that under some assumptions on the fourth and the eighth moments of D, there is a polynomial-time algorithm that achieves error o(√ε) as long as n ≥ Õ(k⁴/ε³). For Gaussian distribution D = N(0, Σ), this algorithm achieves error O(ε^(3/4)). Moreover, our algorithm achieves error o(√ε) for all log-concave distributions if ε ≤ 1/polylog(d). Our algorithms are based on the filtering of the covariates that uses sum-of-squares relaxations, and weighted Huber loss minimization with an ℓ₁ regularizer. We provide a novel analysis of weighted penalized Huber loss that is suitable for heavy-tailed designs in the presence of two adversaries.
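As a rough illustration of the last ingredient named in this abstract, weighted Huber loss minimization with an ℓ₁ penalty, the following Python sketch minimizes (1/n) Σ wᵢ h_τ(yᵢ − Xᵢβ) + λ‖β‖₁ by proximal gradient descent (ISTA). This is not the authors' algorithm: the sum-of-squares filtering step is omitted, the weights are simply taken as given (uniform by default), and the function name, the solver choice, and the parameters `lam`, `tau`, and `n_iter` are assumptions made for the example.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (entrywise shrinkage toward 0)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def huber_l1_regression(X, y, lam, tau=1.0, weights=None, n_iter=500):
    """Proximal gradient descent for l1-penalized weighted Huber loss.

    Assumptions: weights are non-negative with at least one positive
    entry; in the paper they would come from a filtering step that is
    not implemented here.
    """
    n, d = X.shape
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    # Upper bound on the Lipschitz constant of the smooth part's gradient.
    lipschitz = w.max() * np.linalg.norm(X, 2) ** 2 / n
    step = 1.0 / lipschitz
    beta = np.zeros(d)
    for _ in range(n_iter):
        residual = y - X @ beta
        # The Huber derivative is the residual clipped to [-tau, tau].
        grad = -(X.T @ (w * np.clip(residual, -tau, tau))) / n
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta


# Small demo on synthetic data (all values are illustrative).
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((200, 50))
beta_true = np.zeros(50)
beta_true[:3] = [3.0, -2.0, 1.5]
noise = 0.1 * rng.standard_normal(200)
noise[:20] += 50.0 * rng.standard_normal(20)  # a few gross outliers
y_demo = X_demo @ beta_true + noise
beta_hat = huber_l1_regression(X_demo, y_demo, lam=0.1)
```

Clipping the residuals bounds the influence of any single sample on the gradient, which is why large noise entries do not dominate the fit the way they would under squared loss.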
The FOD-R Dataset is a collection of images that depict common types of foreign object debris (FOD) that can be found on runways or taxiways. The dataset has primarily been annotated using bounding boxes to facilitate...
Globally, cardiovascular disease is the leading cause of death. In CVDs, the heart is a vital organ for supplying blood to other body organs. Complex datasets with heart problems are predicted by machine learning meth...