This project leverages eye-tracking technology and machine learning algorithms to provide a real-time, non-invasive solution that allows individuals with motor disabilities to communicate through natural eye movements. The pipeline starts with the collection of precise eye movement data using eye-tracking hardware. Image processing techniques then preprocess the acquired data, filtering out noise and detecting blink patterns. This computer vision project shows the potential of eye blink detection for text-based communication and highlights the importance of solutions that let individuals with physical limitations interact with technology. The proposed approach was tested repeatedly under different lighting conditions and at different distances between the user's eyes and a mobile device to assess positional accuracy. According to the test results, it offers 90% overall accuracy, and 100% recognition accuracy at a distance of 15 cm under artificial light.
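The abstract does not say how blinks are detected after noise filtering. As a rough illustration only, the sketch below assumes each video frame has already been reduced to a single eye-openness score, and treats a run of consecutive low scores as one blink. The function name `detect_blinks`, the threshold `0.2`, and the `min_frames` noise filter are all hypothetical choices, not details from the paper.

```python
from typing import List, Tuple


def detect_blinks(
    openness: List[float],
    threshold: float = 0.2,   # assumed cut-off: scores below this count as "eye closed"
    min_frames: int = 2,      # assumed noise filter: shorter closures are ignored
) -> List[Tuple[int, int]]:
    """Return (start_frame, length) for every closed-eye run of at least min_frames."""
    blinks: List[Tuple[int, int]] = []
    run_start = None  # index of the first frame of the current closed-eye run
    for i, value in enumerate(openness):
        if value < threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            length = i - run_start
            if length >= min_frames:
                blinks.append((run_start, length))
            run_start = None
    # Handle a closed-eye run that lasts until the final frame.
    if run_start is not None and len(openness) - run_start >= min_frames:
        blinks.append((run_start, len(openness) - run_start))
    return blinks


# Hypothetical per-frame openness scores: one 2-frame blink, one 1-frame
# glitch that is filtered out, and one 3-frame blink at the end.
signal = [0.3, 0.3, 0.1, 0.1, 0.3, 0.1, 0.3, 0.1, 0.1, 0.1]
blinks = detect_blinks(signal)
```

The run lengths could then be mapped to symbols (for example, short versus long blinks) for text entry, but that mapping is not described in the abstract.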
ISBN: (print) 9783031804373; 9783031804380
Text-to-image generation is a cutting-edge technology that enables computers to generate images from textual descriptions. While this technology has been extensively researched and applied to English-language text, applying it to Arabic-language text is still in its early stages. Additionally, the Arabic language is challenging due to its right-to-left writing system and extensive vocabulary of 1.3 million words. In this paper, we explore text-to-image generation from Arabic text descriptions. First, we fine-tune a transformer-based model pre-trained on Arabic text to map the text information into the affine transformations within the DF-GAN generator. Second, we present a text transformer combined with LSTM layers to address the limitation of unrecognized words. Third, a mask predictor is trained within the generator using a weakly supervised method and incorporated into the affine transformation for a more effective integration of image and text features. In addition, we add the DAMSM loss as a regularization term in the loss function to achieve convergence and stability in the training phase. Experiments on two challenging datasets, CUB and Oxford-flower, show that our architectures can accurately generate high-quality images that faithfully represent the Arabic textual descriptions. We believe that scaling this task could have critical applications in fields such as Arabic visual learning, e-commerce, advertising, and entertainment.
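To make the text-conditioned affine transformation concrete, here is a minimal pure-Python sketch under stated assumptions. Each feature channel is scaled and shifted by per-channel values `gamma` and `beta`; in a real generator these would be predicted from the sentence embedding by small networks, but here they are passed in directly. How the mask enters the computation is an assumption for illustration (it scales the modulated value at each spatial position); the abstract does not give the exact formula. The name `affine_modulate` is hypothetical.

```python
from typing import List, Optional


def affine_modulate(
    features: List[List[float]],       # features[c][p]: channel c, flattened spatial position p
    gamma: List[float],                # per-channel scale (assumed to come from the text encoder)
    beta: List[float],                 # per-channel shift (assumed to come from the text encoder)
    mask: Optional[List[float]] = None,  # optional per-position weight in [0, 1]
) -> List[List[float]]:
    """Apply a channel-wise affine transformation, optionally weighted by a spatial mask."""
    out: List[List[float]] = []
    for c, channel in enumerate(features):
        row = []
        for p, h in enumerate(channel):
            value = gamma[c] * h + beta[c]
            if mask is not None:
                # Assumption: the mask decides where text information is injected.
                value = mask[p] * value
            row.append(value)
        out.append(row)
    return out


# Hypothetical toy input: 2 channels, 2 spatial positions.
features = [[1.0, 2.0], [3.0, 4.0]]
plain = affine_modulate(features, gamma=[2.0, 0.5], beta=[1.0, -1.0])
masked = affine_modulate(features, gamma=[2.0, 0.5], beta=[1.0, -1.0], mask=[1.0, 0.0])
```

With the mask set to zero at a position, no text-conditioned signal reaches that location in this sketch, which is one way a predicted mask could focus the text features on the relevant image regions.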
Integrating deep learning in speech separation has revolutionized audio signal processing, impacting fields like speech recognition, audio-visual content creation, telecommunication, hearing aid technologies, etc. In ...
Considering the varying advantages, disadvantages, and implementation difficulties of current indoor positioning algorithms, this paper conducts a comparative analysis of common UWB ranging methods. The Two-Way Rangin...
Accurate self-localization of unmanned aerial systems (UAS) is needed to reduce their dependency on global navigation satellite systems (GNSS). Image retrieval techniques comparing aerial images with a reference data...
Interpreting human neural signals to decode static speech intentions such as text or images and dynamic speech intentions such as audio or video is showing great potential as an innovative communication tool. Human co...
ISBN: (print) 9783031785375; 9783031785382
The Russo-Ukrainian conflict underscores the challenges in obtaining reliable firsthand accounts. Traditional methods such as satellite imagery and journalism fall short due to limited access to conflict zones. Secure social media platforms such as Telegram offer safer communication from conflict zones but lack effective message grouping, which hinders insight collection. The proposed framework aims to enhance the gathering of firsthand accounts by crowdsourcing secure social media data. We gathered 250,000 Telegram messages on the conflict and developed a language model-based framework to identify contextual groupings. Evaluation reveals 477 new groupings from 13 news sources, enriching firsthand information. This research emphasizes the significance of secure social media crowdsourcing in conflict zones, paving the way for future advancements.
With the development of autonomous driving technology, vehicles integrate camera, lidar, and radar modules for environment perception. In mass-produced models, cameras and lidars mostly serve as primary sensors, while...
With the increasing requirements for power supply systems and the continuous improvement of emergency rescue command center system construction, the requirements for early warning emergency systems are constantly incr...
With the progress of science and technology, Internet technology has also developed greatly. People can communicate over the network, so a great deal of interpersonal interaction now takes place there. NSGA-II (No...