This article introduces an automatic data acquisition system for the calibration device of the inclinometer using a circular grating encoder as the standard device. The system utilizes LabVIEW to develop a complete ho...
ISBN: (print) 9798400706776
Over the past decade, considerable research has investigated Vision-Based Assistive Technologies (VBAT) that support people with vision impairments in understanding and interacting with their immediate environment using machine learning, computer vision, image enhancement, and/or augmented/virtual reality. However, this research has almost entirely overlooked a growing demographic: people with Cerebral Visual Impairment (CVI). Unlike ocular vision impairments, CVI arises from damage to the brain's visual processing centres. Through a scoping review, this paper reveals a significant research gap in addressing the needs of this demographic. Three focus studies involving 7 participants with CVI explored the challenges, current strategies, and opportunities for VBAT. We also discuss the assistive technology needs of people with CVI compared with those of people with ocular low vision. Our findings highlight the opportunity for the Human-Computer Interaction and Assistive Technologies research community to explore and address this underrepresented domain, thereby enhancing the quality of life for people with CVI.
Because many people neither understand the concept of garbage classification nor keep it in mind when sorting waste, a home intelligent sorting trash bin design based...
ISBN: (print) 9781510666184; 9781510666191
Image caption generation combines the visual domain with natural language processing, and the transformer framework has become the mainstream approach. This paper combines reinforcement learning and transformer methods to reward dynamics backpropagation and normalization in the testing phase. Its characteristic is that, as the number of reinforcement learning steps increases, the agent model gains more knowledge of the full information, which reduces the computing cost of the system. The experimental results show that the reinforcement transformer structure achieves a certain improvement in speed.
ISBN: (print) 9798350318920; 9798350318937
In today's ever-changing world, the ability of machine learning models to continually learn from new data without forgetting previous knowledge is of utmost importance. In few-shot class-incremental learning (FSCIL), where models have limited access to new instances, this task becomes even more challenging. Current methods use prototypes as a replacement for classifiers: the cosine similarity of an instance to each prototype is used for prediction. However, we have identified that the embedding space created by the ReLU activation function is incomplete and crowded for future classes. To address this issue, we propose the Expanding Hyperspherical Space (EHS) method for FSCIL. In EHS, we use an odd-symmetric activation function to ensure the completeness and symmetry of the embedding space. Additionally, we specify a region for base classes and reserve space for unseen future classes, which increases the distance between class distributions. Pseudo instances are also used to enable the model to anticipate possible upcoming samples. During inference, we rectify the confidence to prevent bias towards base classes. Experiments on benchmark datasets such as CIFAR100 and miniImageNet demonstrate that our proposed method achieves state-of-the-art performance.
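The prototype-based prediction described in this abstract can be illustrated with a minimal sketch. It is not the authors' EHS implementation: the helper names (`class_prototype`, `predict`), the toy 2-D embeddings, and the choice of `tanh` as an example of an odd-symmetric activation are all assumptions made for illustration. It shows that ReLU outputs are confined to the non-negative orthant, whereas an odd-symmetric function preserves sign and so can use the whole sphere.

```python
import math

def relu(v):
    # ReLU zeroes negative coordinates, so outputs lie in the non-negative orthant.
    return [max(0.0, x) for x in v]

def tanh_act(v):
    # tanh is odd-symmetric: tanh(-x) == -tanh(x). Assumed here as an example only.
    return [math.tanh(x) for x in v]

def cosine(u, v):
    # Cosine similarity between two vectors; 0.0 if either has zero norm.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def class_prototype(embeddings):
    # A prototype is the coordinate-wise mean of a class's embeddings.
    n = len(embeddings)
    return [sum(col) / n for col in zip(*embeddings)]

def predict(x, prototypes):
    # Assign x to the class whose prototype has the highest cosine similarity.
    return max(prototypes, key=lambda c: cosine(x, prototypes[c]))

# Hypothetical toy embeddings for two classes.
prototypes = {
    "cat": class_prototype([[1.0, 0.1], [0.9, -0.1]]),
    "dog": class_prototype([[-1.0, 0.2], [-0.8, 0.0]]),
}
label = predict([0.7, 0.05], prototypes)
```

With ReLU, any two embeddings have non-negative cosine similarity, which is one way to read the abstract's remark that the space is "incomplete and crowded".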
Rainfall image restoration is one of the research hotspots in the field of computer vision. Raindrops attached to the camera lens or glass hinder the visualization of the complete image and also lead to i...
The crankshaft is one of the mechanical components of the vehicle engine, and its quality control is of significant importance on the production line. In this paper, a vision-based system was developed to detect apparent structural defects on the crankshaft surface. After examining different approaches to computer vision tasks, semantic segmentation was chosen to solve this problem. In the first stage, a dataset of 400 experimental crankshaft images with structural defects such as scratches, pitting, and grinding marks was collected. In the second stage, a Convolutional Neural Network (CNN) with the MobileNet architecture was trained to detect apparent defects, and an Intersection over Union (IoU) of 64.7% was obtained. In the third stage, image processing techniques were used to improve performance: applying the DexiNed edge detection filter to the training-set images increased the IoU by 8.4 percentage points. Given the importance of this issue in the automotive industry, a further attempt was made to boost performance by augmenting the dataset images, which can also help prevent overfitting. With the model trained under the same conditions as in the previous stages, the IoU increased by a further 13.2 percentage points and reached 86.3%.
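For readers unfamiliar with the IoU metric reported in this abstract, here is a minimal sketch of how it is commonly computed for binary segmentation masks. The function name `iou`, the nested-list mask representation, and the toy masks are assumptions for illustration; the paper's actual evaluation code is not given in the source.

```python
def iou(pred, target):
    """Intersection over Union of two equally shaped binary masks (nested lists of 0/1)."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1  # pixel marked as defect in both masks
            if p or t:
                union += 1         # pixel marked as defect in at least one mask
    # Convention assumed here: two empty masks count as a perfect match.
    return intersection / union if union else 1.0

# Hypothetical 2x3 masks: 1 = defect pixel, 0 = background.
pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]
score = iou(pred, target)  # 2 shared pixels out of 4 in the union
```

The stage-wise figures in the abstract are additive in percentage points: 64.7 + 8.4 + 13.2 = 86.3.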
ISBN: (print) 9789897586934
The proceedings contain 11 papers. The topics discussed include: toward objective variety testing score based on computer vision and unsupervised machine learning: application to apple shape; using deep learning for the dynamic evaluation of road marking features from laser imaging; Belfort birth records transcription: preprocessing, and structured data generation; fitting tree model with CNN and geodesics to track blood vessels in 2D medical images and application to ultrasound localization microscopy data; production-ready end-to-end visual quality inspection for defect detection on surfaces based on a multi-stage AI system; multimodal deepfake detection for short videos; HERO-GPT: zero-shot conversational assistance in industrial domains exploiting large language models; leveraging temporal context in human pose estimation: a survey; and chaotic convolutional long short-term memory network for respiratory motion prediction.
In traditional industry, quality inspection of automobile components is mainly performed by the human eye. Low detection accuracy, high labor consumption, and slow detection speed are important reasons for the slo...
In recent years, the integration of advanced imaging techniques and deep learning methods has significantly advanced computer-aided diagnosis (CAD) systems for breast cancer detection and classification. Transformers,...