Natural resources are the material basis for human survival and national economic and social development. Natural resources survey is related to major national decision-making and deployment, fully supports and serves...
Infrared thermal imaging technology is now widely applied to target detection, which lays the foundation for research on situational awareness for Maritime Autonomous Surface Ships. Compare...
ISBN (print): 9783031451690; 9783031451706
As a promising technology, the brain-computer interface assists people by exploring various aspects of brain functionality. Analyzing and decoding brain signals to classify mental workload can help in the diagnosis and treatment of various neurodegenerative diseases and neurological disorders such as learning disability. In this pursuit, functional near-infrared spectroscopy has emerged as a promising non-invasive technique that uses blood flow patterns to analyze brain signals. In this study, open-access functional near-infrared spectroscopy datasets are used for mental workload classification. We treat the task as a two-class problem and propose a one-dimensional convolutional neural network to distinguish between mental task and rest, comparing its performance with two other machine learning methods: a support vector machine and a deep neural network. The proposed one-dimensional convolutional neural network eliminates the need for complex brain image processing, providing a computationally efficient alternative with comparable accuracy.
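To make the idea of a one-dimensional convolutional classifier concrete, here is a minimal forward-pass sketch in plain Python. It is not the architecture from the paper: the layer layout (one convolution, ReLU, global average pooling, a logistic output), the function names, and the hand-set weights are all assumptions for illustration, and no training is shown.

```python
import math
from typing import List, Sequence


def conv1d(signal: Sequence[float], kernel: Sequence[float], bias: float = 0.0) -> List[float]:
    """'Valid' 1D cross-correlation of a signal with a single kernel."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(signal) - k + 1)
    ]


def relu(values: Sequence[float]) -> List[float]:
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def global_avg_pool(values: Sequence[float]) -> float:
    """Collapse a feature map to one number by averaging over time."""
    return sum(values) / len(values)


def predict_task_probability(
    signal: Sequence[float],
    kernels: Sequence[Sequence[float]],
    biases: Sequence[float],
    out_weights: Sequence[float],
    out_bias: float,
) -> float:
    """Return P(mental task) for one single-channel signal window.

    All weights are hypothetical placeholders; a real model would learn them.
    """
    features = [
        global_avg_pool(relu(conv1d(signal, kernel, bias)))
        for kernel, bias in zip(kernels, biases)
    ]
    logit = sum(w * f for w, f in zip(out_weights, features)) + out_bias
    return 1.0 / (1.0 + math.exp(-logit))
```

Because the convolution runs directly over the time series, no image reconstruction step is needed before classification, which is the efficiency argument the abstract makes.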
In current cooperative robot competitions, a robot must use its own camera module to collect real-time images and identify the enemy robot from them. However, interference from external factors in the arena leaves the collected images with poor contrast and clarity, which hinders accurate recognition. This paper therefore preprocesses the collected images, reducing the interference caused by off-site factors through image filtering, image enhancement, and grayscale processing. HSV-based color detection is then used to identify enemy robots accurately. Finally, the target contour method yields coordinates of the enemy robot that meet the accuracy requirements. With these methods, the robot can quickly and accurately obtain the position and shape of the enemy robot.
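The detection step can be sketched as HSV thresholding followed by locating the matching pixels. The version below is a simplified stand-in in plain Python, not the paper's implementation: it uses the standard-library `colorsys` module, replaces the contour step with a bounding box and centroid over all matching pixels, and the function names and threshold values are hypothetical. Note that a single hue interval like this does not handle red, whose hue wraps around 0.

```python
import colorsys
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255


def hsv_mask(
    image: List[List[Pixel]],
    hue_range: Tuple[float, float],
    min_saturation: float,
    min_value: float,
) -> List[List[bool]]:
    """Mark pixels whose HSV values fall inside the given thresholds.

    Hue, saturation, and value are all on colorsys's 0..1 scale.
    """
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            in_hue = hue_range[0] <= h <= hue_range[1]
            mask_row.append(in_hue and s >= min_saturation and v >= min_value)
        mask.append(mask_row)
    return mask


def locate_target(
    mask: List[List[bool]],
) -> Optional[Tuple[Tuple[int, int, int, int], Tuple[float, float]]]:
    """Return ((x_min, y_min, x_max, y_max), (cx, cy)) of the masked pixels.

    Returns None when no pixel matched. This treats all matching pixels as
    one target; a real contour method would separate connected regions.
    """
    points = [
        (x, y)
        for y, row in enumerate(mask)
        for x, hit in enumerate(row)
        if hit
    ]
    if not points:
        return None
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    bbox = (min(xs), min(ys), max(xs), max(ys))
    centroid = (sum(xs) / len(xs), sum(ys) / len(ys))
    return bbox, centroid
```

Thresholding in HSV rather than RGB separates color (hue) from brightness (value), which makes the color test less sensitive to the lighting variation described above.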
Nanomaterials are used in almost every field of engineering. Synthesis techniques and conditions greatly affect the properties of synthesized nanomaterials. Identifying the nanomaterial from FESEM and TEM images with ...
This paper focuses on recognizing individual Tamil characters in images of natural scenes. Even though natural-scene character datasets are publicly available for different languages, no specific standardized dataset is ...
The application of artificial intelligence (AI) in aquaculture may improve the efficiency of fish farming management. Computer vision is one of the fields in AI beneficial for aquaculture. However, the underwater imag...
ISBN (print): 9783031705427; 9783031705434
We propose a novel computational approach to automatically analyze the physical process behind the printing of early modern letterpress books by clustering the running titles found at the top of their pages. Specifically, we design and compare custom neural and feature-based kernels for computing pairwise visual similarity of a scanned document's running titles, and we cluster the titles in order to track any deviations from the expected pattern of a book's printing. Unlike body text, which must be reset for every page, running titles are among the static type elements in a skeleton forme (the frame used to print each side of a sheet of paper) and were often reused during a book's printing. To evaluate the effectiveness of our approach, we manually annotate the running title clusters on about 1600 pages across eight early modern books of varying sizes and formats. Our method can detect potential deviations from the expected patterns of such skeleton formes, which helps bibliographers understand the phenomena associated with a text's transmission, such as censorship. We also validate our results against a manual bibliographic analysis of a counterfeit early edition of Thomas Hobbes' Leviathan (1651) [27]. (Code and data available at https://***/nvog/clustering-running-titles)
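Once pairwise similarities between running titles are available, grouping them can be sketched as below. This is an illustrative stand-in, not the clustering procedure from the paper: it links any two titles whose similarity meets a threshold and returns the connected components (single-linkage style), using a small union-find. The function name, the threshold, and the example matrix are hypothetical.

```python
from typing import Dict, List, Sequence


def cluster_by_similarity(sim: Sequence[Sequence[float]], threshold: float) -> List[int]:
    """Group items whose pairwise similarity is at least `threshold`.

    `sim` is a symmetric n x n matrix. Returns one cluster label per item,
    numbered in order of first appearance.
    """
    n = len(sim)
    parent = list(range(n))

    def find(i: int) -> int:
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= threshold:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    # Merge the two components under the smaller root index.
                    parent[max(root_i, root_j)] = min(root_i, root_j)

    label_of_root: Dict[int, int] = {}
    labels = []
    for i in range(n):
        root = find(i)
        labels.append(label_of_root.setdefault(root, len(label_of_root)))
    return labels
```

For example, if the titles on pages 0 and 2 were printed from one setting of type and those on pages 1 and 3 from another, the labels alternate; a page whose label breaks the expected alternation is a candidate deviation worth inspecting.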
ISBN (print): 9781728188089
The joint understanding of vision and language has recently gained a lot of attention in both the Computer Vision and Natural Language Processing communities, with the emergence of tasks such as image captioning, image-text matching, and visual question answering. As both images and text can be encoded as sets or sequences of elements, such as regions and words, proper reduction functions are needed to transform a set of encoded elements into a single response, like a classification or similarity score. In this paper, we propose a novel fully-attentive reduction method for vision and language. Specifically, our approach computes a set of scores for each element of each modality employing a novel variant of cross-attention, and performs a learnable and cross-modal reduction, which can be used for both classification and ranking. We test our approach on image-text matching and visual question answering, building fair comparisons with other reduction choices, on both COCO and VQA 2.0 datasets. Experimentally, we demonstrate that our approach leads to a performance increase on both tasks. Further, we conduct ablation studies to validate the role of each component of the approach.
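The general shape of a cross-modal attentive reduction can be sketched as follows. This is a simplified illustration, not the paper's cross-attention variant: it has no learnable parameters, scores each element by its mean scaled dot product with the other modality, and combines the two pooled vectors with cosine similarity. All function names and the scoring rule are assumptions.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_reduce(
    elements: Sequence[Vector], context: Sequence[Vector]
) -> Tuple[List[float], List[float]]:
    """Pool `elements` into one vector, weighting each by its relevance
    to the other modality (`context`).

    Returns (pooled_vector, attention_weights).
    """
    dim = len(elements[0])
    scale = len(context) * math.sqrt(dim)
    # Score each element by its mean scaled dot product with the context.
    scores = [sum(dot(e, c) for c in context) / scale for e in elements]
    weights = softmax(scores)
    pooled = [
        sum(w * e[k] for w, e in zip(weights, elements)) for k in range(dim)
    ]
    return pooled, weights


def match_score(regions: Sequence[Vector], words: Sequence[Vector]) -> float:
    """Cosine similarity between the two cross-modally pooled vectors."""
    pooled_regions, _ = attentive_reduce(regions, words)
    pooled_words, _ = attentive_reduce(words, regions)
    norm = math.sqrt(dot(pooled_regions, pooled_regions)) * math.sqrt(
        dot(pooled_words, pooled_words)
    )
    return dot(pooled_regions, pooled_words) / norm if norm else 0.0
```

Compared with plain mean or max pooling, the weights here depend on the other modality, so the same set of image regions is summarized differently for different sentences.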
A novel normalized-centered image moment invariant-based visual servoing scheme is presented. A fresh combination of moment invariants is proposed for planar symmetrical objects that are parallel to the image pla...