Image captioning is the task of generating text that describes the content of an image. With the rapid progress of deep learning, automatic image caption generation has become an active research problem. Image captioning draws on several fields of computer science, including computer vision and natural language processing (NLP), and is useful in applications such as assistive tools for people with visual impairments. Although many studies have already addressed this topic with different deep learning models, there is still room to improve accuracy by combining the strengths of these models. In this paper, we focus on combining several deep learning models to achieve the desired accuracy. VGG16, a CNN architecture, is used to extract essential features from the image. To generate relevant captions, an LSTM is adopted rather than a plain RNN, because the LSTM generates better captions and takes less time. Finally, the model is trained on the Flickr 8k dataset.
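The abstract above names the two components (VGG16 features, LSTM caption generator) but not how captions are produced at inference time. A common choice for such encoder-decoder models is greedy decoding, sketched below under that assumption. The names `greedy_caption`, `next_word_probs`, `toy_decoder` and the `<start>`/`<end>` tokens are hypothetical; `toy_decoder` is a stand-in for a trained LSTM and ignores the image features entirely.

```python
from typing import Callable, Dict, List

START, END = "<start>", "<end>"  # hypothetical sentinel tokens


def greedy_caption(
    image_features: List[float],
    next_word_probs: Callable[[List[float], List[str]], Dict[str, float]],
    max_len: int = 20,
) -> str:
    """Build a caption by repeatedly taking the most probable next word.

    `next_word_probs` stands in for the trained decoder: given the image
    feature vector and the words generated so far, it returns a
    probability for each vocabulary word.
    """
    words = [START]
    for _ in range(max_len):
        probs = next_word_probs(image_features, words)
        best = max(probs, key=probs.get)  # greedy choice
        if best == END:
            break
        words.append(best)
    return " ".join(words[1:])  # drop the start token


def toy_decoder(features: List[float], words: List[str]) -> Dict[str, float]:
    """Toy stand-in for an LSTM decoder: always follows a fixed script."""
    script = ["a", "dog", "runs", END]
    target = script[min(len(words) - 1, len(script) - 1)]
    return {w: (0.9 if w == target else 0.1 / 3) for w in script}


if __name__ == "__main__":
    print(greedy_caption([0.0, 0.0], toy_decoder))
```

In a real pipeline the feature vector would come from the CNN encoder and `next_word_probs` would wrap the trained LSTM; beam search is a frequent alternative to the greedy choice shown here.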
Recent advances in generative artificial intelligence (AI), and particularly the integration of large language models (LLMs), have had considerable impact on multiple domains. Meanwhile, enhancing dynamic network perf...
AIDS is a sexually transmitted disease. The medical term is called Acquired Immune Deficiency Syndrome. Medical methods such as cocktail therapy and anticancer drugs can kill AIDS. Although there are currently cured c...
Eye tracking is underpinned by the eye-mind hypothesis, which posits that individuals tend to direct their gaze toward the information they are currently cognitively engaged with. This aspect has garnered significant ...
The service delivery has lately witnessed strides in the form of paradigm shifts from conventional logistics to drone-oriented supply chain to conserve ecosystem. The use of drones or unmanned aerial vehicles (UAVs) f...
Given the prospects of the low-altitude economy (LAE) and the popularity of unmanned aerial vehicles (UAVs), there are increasing demands on monitoring flying objects at low altitude in wide urban areas. In this work,...
Chinese spelling check (CSC) detects and corrects spelling errors in Chinese texts. Previous approaches have combined character-level phonetic and graphic information, ignoring the importance of segment-level informat...
ISBN (digital): 9798350359299
ISBN (print): 9798350359305
In the realm of disaster response, the integration of Internet of Things (IoT) technologies has paved the way for innovative solutions to enhance human detection and rescue operations. This research work introduces a revolutionary approach through the development of the Challenging Human Detection and Rescue Robot (CHDRR). This robot incorporates advanced sensors, including the MQ6 Gas Sensor, LM335 Temperature Sensor, Humidity Sensor, and a Passive Infrared (PIR) Human Body Detection sensor, collectively forming a comprehensive sensory suite. The CHDRR leverages the power of IoT to enable real-time data collection, analysis, and communication, transforming disaster response capabilities. The MQ6 Gas Sensor ensures the detection of hazardous gases in the environment, providing crucial information for safe rescue operations. The LM335 Temperature Sensor and Humidity Sensor contribute to environmental monitoring, allowing the robot to adapt its strategies based on prevailing conditions. The PIR Human Body Detection sensor serves as a key element in the CHDRR's human detection capabilities. Through sophisticated algorithms and machine learning, the robot can identify and locate human presence amidst challenging disaster scenarios. The fusion of these sensors with IoT technology empowers the CHDRR to make informed decisions autonomously, optimizing its efficiency in dynamically changing environments.
Analysis of public sentiment is useful for understanding how the general public responds to important events, and the FIFA World Cup 2022 was no exception. In this study, we used the deep learning models RoBERTa, DistilBERT, and XLNet to analyze the views expressed on Twitter during the first day of the tournament. The models were fine-tuned on a preprocessed dataset of 30,000 tweets, and their performance was assessed using measures such as accuracy, precision, recall, and F1-score. In addition, we used an explainable AI technique, Local Interpretable Model-Agnostic Explanations (LIME), to better understand how the models made their sentiment classification decisions. Our results show that RoBERTa is an excellent model for classifying sentiment, and they also show the value of the interpretability provided by LIME. Our research enhances the understanding of sentiment analysis during major sports events and suggests future directions for research in this domain.
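The evaluation measures named above can be computed directly from true and predicted labels. The sketch below is illustrative only: the function name `classification_metrics` is hypothetical, and macro averaging across classes is an assumption, since the abstract does not say how per-class scores were aggregated.

```python
from typing import Dict, List


def classification_metrics(y_true: List[str], y_pred: List[str]) -> Dict[str, float]:
    """Accuracy plus macro-averaged precision, recall, and F1-score."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)  # true positives
        fp = sum(t != label and p == label for t, p in pairs)  # false positives
        fn = sum(t == label and p != label for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(precisions) / n,  # macro average (assumption)
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    truth = ["pos", "pos", "neg", "neg"]
    predicted = ["pos", "neg", "neg", "neg"]
    print(classification_metrics(truth, predicted))
```

Macro averaging weights every sentiment class equally regardless of how many tweets it contains; a weighted or micro average would give different numbers on imbalanced data.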
Understanding consumer attitudes toward specific products is crucial for boosting sales in the e-commerce industry. To effectively target customers with popular products based on reviews, the classification of consumer feedback becomes imperative. However, classifying product reviews can be challenging, particularly when dealing with imbalanced data labels, which often result in suboptimal classification performance. This study builds upon previous efforts that utilized the Amazon Fine Food Reviews dataset for classification tasks. While these prior attempts showed promise, they were hindered by either poor embeddings or the prevalent class imbalance issue. In response, this research tries to solve these problems by using word embeddings with RoBERTa, a pre-trained transformer-based language model, to classify reviews. Additionally, the XGBoost classifier was implemented, along with embeddings from the language model. Losses were first calculated with equal weights for all class labels, and a re-weighted loss was subsequently adopted to balance the impact of each class on the loss function during training. The incorporation of RoBERTa and XGBoost, along with the class label re-weighting, contributed to improved capturing of intricate word relationships within reviews. As a result, this approach achieves significantly improved accuracy in both binary and multiclass classifications compared to earlier endeavors. Notably, it attained an impressive accuracy of 83.84% in multiclass classification and 93.29% in binary classification tasks, marking a substantial advancement in the field of consumer review analysis.
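The move from equal class weights to a re-weighted loss can be illustrated with a small sketch. Inverse-frequency weighting is an assumption here; the abstract does not state which weighting scheme was used, and the function names `inverse_frequency_weights` and `weighted_cross_entropy` are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def inverse_frequency_weights(labels: Sequence[int]) -> Dict[int, float]:
    """Weight each class by n / (k * count) so rare classes count more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in counts}


def weighted_cross_entropy(
    probs: List[List[float]], labels: List[int], weights: Dict[int, float]
) -> float:
    """Cross-entropy where each example is scaled by its class weight.

    `probs[i][c]` is the predicted probability of class c for example i.
    With all weights equal to 1 this reduces to the ordinary mean loss.
    """
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    norm = sum(weights[y] for y in labels)
    return total / norm


if __name__ == "__main__":
    labels = [0, 0, 0, 1]  # class 1 is the minority
    probs = [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1], [0.6, 0.4]]
    equal = weighted_cross_entropy(probs, labels, {0: 1.0, 1: 1.0})
    reweighted = weighted_cross_entropy(
        probs, labels, inverse_frequency_weights(labels)
    )
    print(equal, reweighted)
```

In this toy example the misclassified minority example contributes more to the re-weighted loss than to the equally weighted one, which is the effect that re-weighting is meant to have on imbalanced review labels.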