The enhancement of machine learning (ML) models relies heavily on the volume and integrity of data, emphasizing the importance of efficient data collection and rigorous ground-truth labeling. While deep learning, particularly its subset convolutional neural network (CNN), has shown promise in crack detection, there remains a need for sophisticated algorithms to identify structural defects accurately. This study presents a novel deep CNN model tailored for the binary classification of concrete surfaces, addressing a significant need in infrastructure engineering. The deep CNN model was developed using a comprehensive dataset (40,000 images each measuring 227 x 227 pixels). Various metrics, including precision, sensitivity, binary accuracy, and F1 score, were utilized to evaluate the model's performance. Additionally, the model's generalization capability was assessed by testing its proficiency in accurately classifying unseen data. The study demonstrates that the model's predictive performance improves with additional epochs, indicating enhanced learning over learning cycles. Validation metrics suggest potential generalization capability despite slight accuracy declines, showcasing the model's robustness in accurately classifying positive instances. The findings reveal significant advancements in deep CNN models for concrete material classification, surpassing previous comparable models. Employing CNN models holds promising outcomes for quality control and repair processes in infrastructure engineering applications. Future research directions include exploring the application of the deep CNN model to classify alternative materials and assessing its generalization capability using larger and more diverse datasets. Overall, this study contributes to the advancement of ML techniques in infrastructure engineering, with implications for optimizing material classification processes and enhancing infrastructure repair outcomes.
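As an illustration of the evaluation metrics named in this abstract (precision, sensitivity, binary accuracy, and F1 score), the following minimal sketch computes them from binary labels. It is not the authors' evaluation code; the function name `binary_metrics` and the convention that label 1 marks the positive class are assumptions made for this example.

```python
def binary_metrics(y_true, y_pred):
    """Compute precision, sensitivity (recall), accuracy and F1 for 0/1 labels.

    Assumption for this sketch: label 1 is the positive class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(y_true)
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0

    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "accuracy": accuracy,
        "f1": f1,
    }
```

Sensitivity here is the fraction of true positive-class images that the classifier recovers, which is the quantity that matters when a missed positive is the costly error.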
Quick and reliable measurement of wood chip moisture content is a persistent problem for numerous forest-reliant industries such as biofuel, pulp and paper, and bio-refineries. Moisture content is a critical attribute of wood chips because it directly affects final product quality. Conventional techniques for determining moisture content, such as oven-drying, are time-consuming, can damage the sample, and are not feasible in real time. Alternative techniques, including NIR spectroscopy, electrical capacitance, X-rays, and microwaves, have demonstrated potential; nevertheless, they are still constrained by issues of portability, precision, and equipment cost. Hence, there is a need for a moisture content determination method that is instant, portable, non-destructive, inexpensive, and precise. This study explores the use of deep learning and machine vision to predict moisture content classes from RGB images of wood chips. A large-scale image dataset comprising 1,600 RGB images of wood chips has been collected and annotated with ground-truth labels obtained from the oven-drying technique. Two high-performing neural networks, MoistNetLite and MoistNetMax, have been developed using Neural Architecture Search (NAS) and hyperparameter optimization. The developed models are evaluated and compared with state-of-the-art deep learning models. Results demonstrate that MoistNetLite achieves 87% accuracy with minimal computational overhead, while MoistNetMax achieves 91% accuracy in wood chip moisture content class prediction. With improved accuracy (9.6% improvement in accuracy by MoistNetMax compared to the best baseline model ResNet152V2) and faster prediction speed (MoistNetLite being twice as fast as MobileNet), our proposed MoistNet models hold great promise for the wood chip processing industry to be efficiently deployed on p
Models based on the transformer architecture have seen widespread application across fields such as natural language processing (NLP), computer vision, and robotics, with large language models (LLMs) like ChatGPT revolutionizing machine understanding of human language and demonstrating impressive memory capacity and reproduction capabilities. Traditional machine learning algorithms struggle with catastrophic forgetting, detrimental to the diverse and generalized abilities required for robotic deployment. This article investigates the receptance weighted key value (RWKV) framework, known for its advanced capabilities in efficient and effective sequence modeling, integration with the decision transformer (DT), and experience replay architectures. It focuses on potential performance enhancements in sequence decision-making and lifelong robotic learning tasks. We introduce the decision-RWKV (DRWKV) model and conduct extensive experiments using the D4RL database within the OpenAI Gym environment and on the D'Claw platform to assess the DRWKV model's performance in single-task tests and lifelong learning scenarios, showing its ability to handle multiple subtasks efficiently. The code for all algorithms, training, and image rendering in this study is available online (open source).
Chromosome analysis and classification are essential in clinical applications to diagnose various structural and numerical abnormalities. Recently, karyotype analysis using intelligent image processing methods, especi...
Depth information is useful in many image processing and computer vision applications, but in photography, depth information is lost in the process of projecting a real-world scene onto a 2D plane. Extracting depth in...
The use of AI technologies in remote sensing (RS) tasks has attracted wide attention in both the professional and academic domains. This integration offers the potential for more accessible interfaces and tools that let people with little or no experience interact intuitively with RS data in multiple formats. However, the use of AI and AI agents to help automate RS-related tasks is still in its infancy, with some frameworks and interfaces built on top of well-known vision language models (VLM) such as GPT-4, segment anything model (SAM), and grounding DINO. These tools show promise and outline the potential and limitations of existing solutions built on these models. In this work, state-of-the-art AI foundation models (FM) are reviewed and used in a multi-modal manner to ingest RS imagery input and perform zero-shot object detection using natural language. The natural language input defines the classes or labels the model should look for, and both inputs are then fed to the pipeline. The pipeline presented in this work makes up for the shortcomings of general-knowledge FMs by stacking pre-processing and post-processing applications on top of the FMs; these applications include tiling to produce uniform patches of the original image for faster detection, and outlier rejection of redundant bounding boxes using statistical and machine learning methods. The pipeline was tested with UAV, aerial, and satellite images taken over multiple areas. The accuracy for the semantic segmentation improved from the original 64% to approximately 80%-99% by utilizing the pipeline and techniques proposed in this work. GitHub Repository: MohanadDiab/LangRS.
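The tiling step mentioned in this abstract can be pictured with a short sketch. This is not the LangRS implementation; the function name `tile_boxes`, its parameters, and the choice to shift the last tile back so that every patch has the same size are assumptions made for illustration only.

```python
def tile_boxes(width, height, tile, overlap=0):
    """Return (left, top, right, bottom) pixel boxes that cover an image.

    Hypothetical helper: tiles are `tile` pixels on a side and neighbouring
    tiles share `overlap` pixels. When the image size is not a multiple of
    the stride, the last tile is shifted back so all tiles stay uniform.
    """
    if tile <= 0 or not 0 <= overlap < tile:
        raise ValueError("need tile > 0 and 0 <= overlap < tile")
    stride = tile - overlap

    def starts(length):
        # Image smaller than one tile: a single (clipped) tile at the origin.
        if length <= tile:
            return [0]
        offsets = list(range(0, length - tile + 1, stride))
        # Cover any remainder with one extra tile aligned to the far edge.
        if offsets[-1] + tile < length:
            offsets.append(length - tile)
        return offsets

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]
```

Each box could then be cropped from the image and passed to a detector independently, with the box's top-left corner added back to any detections to recover full-image coordinates.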
The exceptional ability of optical metasurfaces to manipulate light has enabled integrated analog computing and image processing in ultracompact, energy-efficient platforms that support high speeds. To date, metasurfaces have demonstrated various analog processing functions, including differentiation, convolution, and classification. However, a fundamental limitation of existing designs is their static functionality, which restricts adaptability to diverse application scenarios. To address this challenge, momentum-space reconfigurable metasurfaces operating in the near-infrared range are experimentally demonstrated, capable of switchable image processing functions including image differentiation and bright-field imaging. These meta-devices are achieved by integrating nematic liquid crystals with silicon metasurfaces that support resonances of quasi-bound states in the continuum (quasi-BICs). The quasi-BIC modes enable further design freedom over the angular dispersion of metasurfaces. The results showcase an electrically tunable, CMOS-compatible approach to reconfigurable optical computing, offering significant potential for applications such as online training of diffractive neural networks, machine vision, and augmented reality.
Computer vision, driven by artificial intelligence, has become pervasive in diverse applications such as self-driving cars and law enforcement. However, the susceptibility of these systems to attacks has raised signif...
Recent developments in image analysis and interpretation using computer vision techniques have shown potential for novel applications in microbiology laboratories to support the task of automation, aiming for faster a...
Stereoscopic imaging from multiple viewpoints offers critical advantages in machine vision and related applications, yet traditional multi-camera setups require complex synchronization mechanisms. This study introduce...