Speech is a fundamental means of human interaction. Speaker Identification (SI) plays a crucial role in various applications, such as authentication systems, forensic investigation, and personal voice assistance. However, achieving robust and secure SI in both open and closed environments remains challenging. To address this issue, researchers have explored new techniques that enable computers to better understand and interact with humans. Smart systems leverage Artificial Neural Networks (ANNs) to mimic the human brain in identifying speakers. However, speech signals often suffer from interference, leading to signal degradation. The performance of a Speaker Identification System (SIS) is influenced by various environmental factors, such as noise in open environments and reverberation in closed environments. This paper investigates SI using Mel-Frequency Cepstral Coefficients (MFCCs) and polynomial coefficients, with an ANN serving as the classifier. To tackle the challenges posed by environmental interference, we propose a novel approach based on symmetric comb filters for modeling. In closed environments, we study the effect of reverberation on speech signals, which arises from multiple reflections, and we model this effect with comb filters. We explore different domains for feature extraction, including the time, Discrete Wavelet Transform (DWT), Discrete Cosine Transform (DCT), and Discrete Sine Transform (DST) domains, to determine the best combination for SI in reverberant environments. Simulation results reveal that the DWT outperforms the other transforms, leading to a recognition rate of 93.75% at a Signal-to-Noise Ratio (SNR) of 15 dB. Additionally, we investigate the concept of cancelable SI to ensure user privacy while maintaining high recognition rates. Our simulation results show a recognition rate of 97.5% at 0 dB using features extracted from speech signals and their DCTs. Fo
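As a rough illustration of how a comb filter can stand in for a single reflection, the sketch below applies a generic feedforward comb filter, y[n] = x[n] + g·x[n − D], to a signal. This is an assumption-laden toy and not the symmetric comb filter design of the paper, whose details the abstract does not give; the function name `feedforward_comb`, the delay, and the gain are hypothetical.

```python
import numpy as np

def feedforward_comb(x, delay, gain):
    """Generic feedforward comb filter: y[n] = x[n] + gain * x[n - delay].

    Hypothetical helper for illustration only; it adds one delayed,
    attenuated copy of the signal, mimicking a single reflection.
    """
    x = np.asarray(x, dtype=float)
    y = x.copy()
    # Only add the echo when the delay fits inside the signal.
    if 0 < delay < len(x):
        y[delay:] += gain * x[:-delay]
    return y

# A unit impulse makes the added reflection easy to see.
impulse = np.zeros(8)
impulse[0] = 1.0
echoed = feedforward_comb(impulse, delay=3, gain=0.5)
```

In a speaker identification experiment, a filter of this kind would be applied to clean speech before feature extraction so that the classifier is tested on reverberant input.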
Visual question answering (VQA) is a multimodal task, involving a deep understanding of the image scene and the question's meaning and capturing the relevant correlations between both modalities to infer the appropriate *** this paper, we propose a VQA system intended to answer yes/no questions about real-world images, in *** support a robust VQA system, we work in two directions: (1) Using deep neural networks to semantically represent the given image and question in a fine-grained manner, namely ResNet-152 and Gated Recurrent Units (GRU). (2) Studying the role of the utilized multimodal bilinear pooling fusion technique in the *** the model complexity and the overall model *** fusion techniques could significantly increase the model complexity, which seriously limits their applicability for VQA *** far, there is no evidence of how efficient these multimodal bilinear pooling fusion techniques are for VQA systems dedicated to yes/no ***, a comparative analysis is conducted between eight bilinear pooling fusion techniques, in terms of their ability to reduce the model complexity and improve the model performance in this case of VQA *** indicate that these multimodal bilinear pooling fusion techniques have improved the VQA model's performance, until reaching the best performance of 89.25%. Further, experiments have proven that the number of answers in the developed VQA system is a critical factor that *** the effectiveness of these multimodal bilinear pooling techniques in achieving their main objective of reducing the model *** Multimodal Local Perception Bilinear Pooling (MLPB) technique has shown the best balance between the model complexity and its performance, for VQA systems designed to answer yes/no questions.
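To make the complexity argument concrete, the sketch below shows a generic low-rank bilinear fusion of a question vector and an image vector, together with a parameter count against a full bilinear map. It is not MLPB or any of the eight techniques compared in the paper; the function name, the dimensions, and the rank are all hypothetical choices for illustration.

```python
import numpy as np

def low_rank_bilinear_fusion(q, v, U, V, P):
    """Project both modalities to a shared rank-d space, multiply
    element-wise, then map the joint vector to answer scores.
    Hypothetical sketch of low-rank bilinear pooling."""
    joint = np.tanh(q @ U) * np.tanh(v @ V)
    return joint @ P

rng = np.random.default_rng(0)
dim_q, dim_v, rank, n_answers = 64, 128, 16, 2  # assumed toy sizes

q = rng.standard_normal(dim_q)           # question embedding (e.g. from a GRU)
v = rng.standard_normal(dim_v)           # image embedding (e.g. from a CNN)
U = rng.standard_normal((dim_q, rank))
V = rng.standard_normal((dim_v, rank))
P = rng.standard_normal((rank, n_answers))

scores = low_rank_bilinear_fusion(q, v, U, V, P)

# A full bilinear map needs one dim_q x dim_v matrix per answer.
full_params = dim_q * dim_v * n_answers
low_rank_params = (dim_q + dim_v) * rank + rank * n_answers
```

With these toy sizes the full bilinear map has 16384 parameters and the low-rank version 3104. Because the full count scales with the number of answers while the projection cost does not, the saving shrinks when there are only two answers, which is one way to read the abstract's remark about the number of answers.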
Computer vision methods for depth estimation usually use simple camera models with idealized optics. For modern machine learning approaches, this creates an issue when attempting to train deep networks with simulated data, especially for focus-sensitive tasks like Depth-from-Focus. In this work, we investigate the domain gap caused by off-axis aberrations that affect the choice of the best-focused frame in a focal stack. We then explore bridging this domain gap through aberration-aware training (AAT). Our approach involves a lightweight network that models lens aberrations at different positions and focus distances, which is then integrated into the conventional network training pipeline. We evaluate the generality of network models on both synthetic and real-world data. The experimental results demonstrate that the proposed AAT scheme can improve depth estimation accuracy without fine-tuning the model for different datasets. The code will be available at ***/vccimaging/Aberration-Aware-Depth-from-Focus.
W-type barium-nickel ferrite ($\mathrm{BaNi_2Fe_{16}O_{27}}$) is a highly promising material for electromagnetic wave (EMW) absorption because of its magnetic loss capability for EMW, low cost, large-scale production potential, high-temperature resistance, and excellent chemical ***, the poor dielectric loss of magnetic ferrites hampers their utilization, hindering enhancement in their EMW-absorption *** efficient strategies that improve the EMW-absorption performance of ferrite is highly desired but remains ***, an efficient strategy substituting $\mathrm{Ba^{2+}}$ with rare earth $\mathrm{La^{3+}}$ in W-type ferrite was proposed for the preparation of novel La-substituted ferrites ($\mathrm{Ba_{1-x}La_{x}Ni_2Fe_{15.4}O_{27}}$). The influences of $\mathrm{La^{3+}}$ substitution on ferrites' EMW-absorption performance and the dissipative mechanism toward EMW were systematically explored and ***$^{3+}$ efficiently induced lattice defects, enhanced defect-induced polarization, and slightly reduced the ferrites' bandgap, enhancing the dielectric properties of the ***$^{3+}$ also enhanced the ferromagnetic resonance loss and strengthened magnetic *** effects considerably improved the EMW-absorption performance of $\mathrm{Ba_{1-x}La_{x}Ni_2Fe_{15.4}O_{27}}$ compared with pure W-type *** $x = 0.2$, the best EMW-absorption performance was achieved with a minimum reflection loss of -55.6 dB and effective absorption bandwidth (EAB) of 3.44 GHz.
Dual-buck (DB) structured ac-ac converters are becoming advanced due to their inherent protection from open- and short-circuit risks, and elimination of commutation issue. However, the existing DB ac-ac converters pro...
The healthcare system currently relies on the facility to store and process large amounts of health data, supported by efficient management. The Internet of Things (IoT) has driven the growth of Adroit Healthcare, whi...
In the process of protecting power systems against different types of cyberattacks, the primary step is to precisely model such frameworks from attacker's perspective. This paper investigates a false data injectio...
The strong impact of the strain-induced Dzyaloshinskii-Moriya interaction (SIDMI) on the magnetization dynamics of skyrmions in nanomagnetic structures is demonstrated. The effects of SIDMI are characterized by skyrmion equations (SEs) of motion and magnetoelastic (ME) equations. The study is performed on a model system of MgO/CoFe/Pt stacked on a piezoelectric substrate. The results demonstrate a major nonlinear amplification in both the first- and higher-harmonic magnitudes of the skyrmion breathing mode due to SIDMI. Remarkably, this enhancement can trigger a skyrmion collapse, enabling its deletion with ultraweak strain-induced excitations. The SIDMI effect is shown to be much more significant than the conventional ME effect. These findings open different avenues for the efficient manipulation of nanomagnetic structures through strain.
This innovative practice full paper describes an innovative pedagogical framework that integrates community engaged learning and social justice principles into a traditional web development course, fostering a holisti...
This study introduces a novel control framework for human-drone interaction (HDI) in industrial warehouses, targeting pick-and-delivery operations. The goals are to enhance operator safety and well-being and, at the same time, to improve efficiency and reduce production costs. To this end, the speed and separation monitoring (SSM) operation method is employed for the first time in HDI, drawing an analogy to the safety requirements outlined in the ISO standards for collaborative robots. The so-called protective separation distance is used to ensure the safety of operators engaged in collaborative tasks with drones. In addition, we employ the rapid upper limb assessment (RULA) method to evaluate the ergonomic posture of operators during interactions with drones. To validate the proposed approach in a realistic industrial setting, a quadrotor is deployed for pick-and-delivery tasks along a predefined trajectory from the picking bay to the palletizing area, where the interaction between the drone and a moving operator takes place. The drone navigates toward the interaction space while avoiding collisions with shelves and other drones in motion. The control strategy for the drone cruise navigation simultaneously integrates the time-variant artificial potential field (APF) technique for trajectory planning and the iterative linear quadratic regulator (LQR) controller for trajectory tracking. In the descent phase, by contrast, the receding horizon LQR algorithm is employed to follow a trajectory planned in accordance with the SSM, which starts from the approach point at the border of the interaction space and ends in the volume with the operator's minimum RULA. The presented control strategy facilitates drone management by adapting the drone's position to changes in the operator's position while satisfying HDI safety requirements. The results of the proposed HDI framework simulations for the case study demonstrate the effectiveness of the method in ensuring a safe and er