Visual question answering (VQA) is a multimodal task, involving a deep understanding of the image scene and the question’s meaning and capturing the relevant correlations between both modalities to infer the appropriate *** this paper, we propose a VQA system intended to answer yes/no questions about real-world images, in *** support a robust VQA system, we work in two directions: (1) Using deep neural networks to semantically represent the given image and question in a fine-grained manner, namely ResNet-152 and Gated Recurrent Units (GRU). (2) Studying the role of the utilized multimodal bilinear pooling fusion technique in the *** the model complexity and the overall model *** fusion techniques could significantly increase the model complexity, which seriously limits their applicability for VQA *** far, there is no evidence of how efficient these multimodal bilinear pooling fusion techniques are for VQA systems dedicated to yes/no ***, a comparative analysis is conducted between eight bilinear pooling fusion techniques, in terms of their ability to reduce the model complexity and improve the model performance in this case of VQA *** indicate that these multimodal bilinear pooling fusion techniques have improved the VQA model’s performance, until reaching the best performance of 89.25%. Further, experiments have proven that the number of answers in the developed VQA system is a critical factor that *** the effectiveness of these multimodal bilinear pooling techniques in achieving their main objective of reducing the model *** Multimodal Local Perception Bilinear Pooling (MLPB) technique has shown the best balance between the model complexity and its performance, for VQA systems designed to answer yes/no questions.
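To make the complexity trade-off in the abstract above concrete, the following is a minimal sketch (not the paper's implementation) that compares the weight count of a full bilinear map with that of a generic low-rank factorisation fused by element-wise product. The 2048-dimensional image feature matches ResNet-152's pooled output; the GRU size `Q_DIM`, the joint dimension `RANK`, and all function names are assumptions chosen for illustration only.

```python
def full_bilinear_params(img_dim, q_dim, n_answers):
    """Weights in a full bilinear map: one img_dim x q_dim matrix per answer."""
    return img_dim * q_dim * n_answers


def low_rank_params(img_dim, q_dim, rank, n_answers):
    """Weights in a low-rank variant: two projections into a shared
    rank-dimensional space, followed by a linear classifier."""
    return img_dim * rank + q_dim * rank + rank * n_answers


def low_rank_fuse(img_vec, q_vec, U, V):
    """Project both vectors (rows of U and V are projection weights)
    and fuse them with an element-wise product."""
    img_proj = [sum(w * x for w, x in zip(row, img_vec)) for row in U]
    q_proj = [sum(w * x for w, x in zip(row, q_vec)) for row in V]
    return [a * b for a, b in zip(img_proj, q_proj)]


IMG_DIM = 2048  # ResNet-152 pooled feature size
Q_DIM = 1024    # assumed GRU hidden size (hypothetical)
RANK = 1000     # assumed joint embedding size (hypothetical)

# Compare a yes/no answer set with a large open-ended answer vocabulary.
for n_answers in (2, 3000):
    full = full_bilinear_params(IMG_DIM, Q_DIM, n_answers)
    low = low_rank_params(IMG_DIM, Q_DIM, RANK, n_answers)
    print(f"answers={n_answers}: full={full:,} low-rank={low:,} "
          f"ratio={full / low:.1f}x")
```

Under these assumed sizes, the factorised form saves roughly a thousandfold in weights for a 3000-answer vocabulary but only a small factor for two answers, which is one way to read the abstract's remark that the number of answers matters for how much these techniques reduce model complexity.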
While deep learning techniques have shown promising performance in the Major Depressive Disorder (MDD) detection task, they still face limitations in real-world scenarios. Specifically, given the data scarcity, some e...
With the increasing popularity of smart portable electronic gadgets, voice-based online person verification systems have become prevalent. However, these systems are susceptible to attacks where illegitimate individuals exploit the recorded voices of legitimate users, leading to false confirmations—spoofing attacks. To overcome this limitation, this article presents an innovative solution by combining speech and online handwritten signatures to mitigate the risks associated with spoofing attacks in voice-based authentication systems because a person has to be present in front of the system to produce an online handwritten signature. To accomplish this objective, this work proposes a novel bidirectional Legendre memory unit (BLMU), a type of recurrent neural network (RNN), for person authentication (verification) and recognition. The Legendre memory unit (LMU) is an innovative memory cell for RNNs that efficiently retains temporal/non-temporal sequential information over a long period with minimal resources. It achieves information orthogonalization by solving coupled ordinary differential equations (ODEs) and leveraging Legendre polynomials, ensuring effective data representation. The proposed framework for person authentication and recognition comprises seven convolution layers, four BLMU layers, two dense layers, and one output layer. The performance of the proposed BLMU-based deep learning framework has been evaluated on a self-generated/private dataset of combined feature matrix of voice signals and online handwritten signatures in the Devanagari script. To assess performance, experiments have also been conducted using various RNN architectures, such as LSTM, BLSTM, and ordinary differential equation recurrent neural network (ODE-RNN), to have a performance comparison with the proposed BLMU-based deep learning (DL) framework. The results demonstrate the superiority of the proposed BLMU-based DL framework in enhancing the accuracy of person verification systems,
Railway accidents are an under-scrutinised cause of death in India. Train accidents are caused by various consequences of collisions, derailments, signal errors and so on. Furthermore, when train derailments become di...
Mobile devices within Fifth Generation (5G) networks, typically equipped with Android systems, serve as a bridge to connect digital gadgets such as global positioning system, mobile devices, and wireless routers, which are vital in facilitating end-user communication ***, the security of Android systems has been challenged by the sensitive data involved, leading to vulnerabilities in mobile devices used in 5G *** vulnerabilities expose mobile devices to cyber-attacks, primarily resulting from security ***-permission apps in Android can exploit these channels to access sensitive information, including user identities, login credentials, and geolocation *** such attack leverages "zero-permission" sensors like accelerometers and gyroscopes, enabling attackers to gather information about the smartphone's *** underscores the importance of fortifying mobile devices against potential future *** research focuses on a new recurrent neural network prediction model, which has proved highly effective for detecting side-channel attacks in mobile devices in 5G *** conducted state-of-the-art comparative studies to validate our experimental *** results demonstrate that even a small amount of training data can accurately recognize 37.5% of previously unseen user-typed ***, our tap detection mechanism achieves a 92% accuracy rate, a crucial factor for text *** findings have significant practical implications, as they reinforce mobile device security in 5G networks, enhancing user privacy and data protection.
Automobiles are the inevitable mode of ***, increasing fuel prices and carbon dioxide emissions are posing a serious threat to automobile users and the ***, the development of new lightweight materials has been a key area of ***-based commercial alloys (AZ and ZK series alloys) are the lightest among all structural ***, there is still a question about the replacement of Aluminum-based alloys due to HCP crystal *** this connection, Mg-Al-Ca-Mn (AXM) Mg alloy can be a choice as an alternative to the existing Mg-based commercial alloys for structural *** contains (Al,Mg)₂Ca, Al₂Ca, Mg₂Ca, and Al₈Mn₅ as the secondary phases, contributing to the microstructural refinement and property ***, the formation of those precipitates depends on the amount of Al, Ca, and Mn, especially the Ca/Al *** addition, the secondary processes influence the grain refinement and property enhancement of texture ***, this review article focuses on elaborating on the significance of the Ca/Al ratio for the precipitate formation, secondary process, and texture *** co-segregation behavior of other micro-alloying elements like Cerium, Lanthanum, and Zinc in AXM Mg alloy systems has also been discussed for property enhancement.
Dear Editor, In this letter, a constrained networked predictive control strategy is proposed for the optimal control problem of complex nonlinear high-order fully actuated (HOFA) systems with noises. The method can effectively deal with nonlinearities, constraints, and noises in the system, optimize the performance metric, and present an upper bound on the stable output of the system.
The integration of cloud and edge computing, along with machine learning, plays a vital role in the development of efficient healthcare systems in smart cities. However, machine and deep learning (DL) models are prone...
Diabetic retinopathy (DR), a type of eye disease, is a danger for diabetics. Manual work, which is error-prone and time-consuming, makes dealing with this illness considerably more difficult. Computer-assisted diagnosis has emerged as a promising tool for the early identification and severity grading of DR. As technology advances day by day, deep learning algorithms offer substantial support to the healthcare field. This article proposes an efficient DR classification model that categorizes DR into different grades and identifies its severity. Various prediction techniques are employed in DR detection. Radial Basis Network, Multilayer Perceptron, and Recurrent Neural Network are employed as binary classifiers for DR classification. Further, Bag of Visual Words and Convolutional Neural Networks are implemented for classification into three stages. The results show that the Convolutional Neural Network outperforms the other methods and attains 98.3%. It is of great significance to apply deep-learning techniques for DR recognition. However, deep-learning algorithms often depend on large amounts of labeled data, which is expensive and time-consuming to obtain in the medical imaging area. In addition, the DR features are inconspicuous and spread out over high-resolution fundus images. Therefore, it is a big challenge to learn the distribution of such DR features. To overcome this, this research work proposes a multichannel-based generative adversarial network (M-GAN) for data augmentation as well as classification to grade DR. The usefulness and effectiveness of GAN for classification of fundus images are explored for the first *** medical data is also tedious and challenging because it is quite expensive and confidential; to overcome this, the proposed model acts as a data augmentation model. Moreover, the features in the input data are reduced by a Dimensionality Reduction Module (DRM) based on Pri
This study addresses the fixed-time-synchronized control problem of perturbed multi-input multi-output (MIMO) systems. In the task of fixed-time-synchronized control, different dimensions of the output signal in MIMO systems are required to reach the desired value simultaneously within a fixed time *** MIMO system is categorized into two cases: the input-dimension-dominant and the state-dimension-dominant cases. The classification is defined according to the dimension of system signals and, more importantly, the capability of converging at the same time. For each kind of MIMO system, sufficient Lyapunov conditions for fixed-time-synchronized convergence are explored, and the corresponding robust sliding mode controllers are designed. Moreover, perturbations are compensated using the super-twisting technique. The brake control of the vertical takeoff and landing aircraft is considered to verify the proposed method for the input-dimension-dominant case, which shows the essential advantages of decreasing the energy consumption and the output trajectory length. Furthermore, comparative numerical simulations are performed to show the semi-time-synchronized property for the state-dimension-dominant case.