Artificial Intelligence is a fast-growing domain that facilitates innovation in various fields of business and manufacturing. This field of machine learning provides the automatic inspection of the manu...
In this article, we present a CMOS image sensor (CIS) for coded-exposure-based compressive focal-stack imaging. The proposed CIS has a pixel design that includes two capacitive trans-impedance amplifiers (CTIAs) and a static random access memory (SRAM), and is capable of per-frame exposure encoding with adjustable spatiotemporal resolutions. A proof-of-concept CIS prototype with a 192 × 192 pixel array is designed and fabricated in a 0.13-μm CMOS process with a pixel size of 12.6 × 12.6 μm². Operating at 30 frames per second (fps), the CIS demonstrates spatiotemporal coded exposure at a maximum rate of 768 masks/frame. The column-wise 10-bit single-slope (SS) analog-to-digital converter (ADC) includes a ramp-slope adaptation feature used for power optimization. During a frame of coded exposure, a linear focal sweep is implemented by a voice-coil motor (VCM) lens mounted in front of the proposed CIS. Through sparse reconstruction of the coded image, a focal stack consisting of a volume of defocused images is used to synthesize the scene depth map. By introducing coded exposure, the proposed on-chip compressive focal-stack imaging approach provides a frame-saving method for passive depth sensing in machine vision and other imaging applications.
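The per-pixel coded-exposure idea in this abstract can be illustrated with a small numpy sketch. All sizes, the random binary mask pattern, and the synthetic focal stack below are illustrative assumptions, not the paper's actual parameters or reconstruction algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 8 sub-exposures (focal positions) within one frame,
# over a tiny 4x4 pixel array (the actual prototype is 192x192).
T, H, W = 8, 4, 4

# Synthetic focal stack: one sub-image per focal position of the sweep.
focal_stack = rng.random((T, H, W))

# Per-pixel binary exposure code (held in the in-pixel SRAM on the sensor).
masks = rng.integers(0, 2, size=(T, H, W))

# Each pixel integrates only the sub-exposures its code enables, producing
# a single compressively coded frame from which the stack is reconstructed.
coded_frame = (masks * focal_stack).sum(axis=0)
```

Sparse reconstruction would then recover the full focal stack from `coded_frame` and the known `masks`; that solver is beyond this sketch.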
This paper discusses the critical importance of precise forecasting in liver disease, as well as the need for early identification and categorization to enable immediate action and personalized treatment strategies. The paper describes a unique strategy for improving liver disease classification using ultrasound image processing. The recommended technique combines an Extreme Learning Machine (ELM), a Convolutional Neural Network (CNN), and Grey Wolf Optimization (GWO) into an integrated model known as CNN-ELM-GWO. The data is provided by Pakistan's Multan Institute of Nuclear Medicine and Radiotherapy and is pre-processed using bilateral and optimal wavelet filtering techniques to increase the dataset's quality. To extract significant visual information, feature extraction employs a deep CNN architecture with six convolutional layers, batch normalization, and max-pooling. The CNN serves as a feature extractor, whereas the ELM serves as the classifier. The GWO algorithm, based on grey wolf searching strategies, refines the CNN and ELM hyperparameters in two stages, progressively boosting the system's classification accuracy. When implemented in Python, CNN-ELM-GWO exceeds traditional machine learning algorithms (MLP, RF, KNN, and NB) in terms of accuracy, precision, recall, and F1-score. The proposed technique achieves an impressive 99.7% accuracy, revealing its potential to significantly enhance liver disease classification from ultrasound images. The CNN-ELM-GWO technique outperforms conventional approaches in liver disease forecasting by a substantial margin of 27.5%, showing its potential to advance medical imaging.
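The ELM stage of such a pipeline is simple enough to sketch: a random, untrained hidden layer followed by output weights solved in closed form. The toy Gaussian features below merely stand in for CNN-extracted features, and the GWO hyperparameter tuning is not shown:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy stand-in for CNN feature vectors: 2 classes, 20-dim features.
X = np.vstack([rng.normal(0, 1, (50, 20)), rng.normal(3, 1, (50, 20))])
y = np.array([0] * 50 + [1] * 50)
Y = np.eye(2)[y]                       # one-hot targets

# ELM: a random (never-trained) hidden layer ...
n_hidden = 64
W = rng.normal(size=(20, n_hidden))
b = rng.normal(size=n_hidden)
Hmat = np.tanh(X @ W + b)

# ... and output weights solved in one shot via the pseudoinverse,
# which is what makes ELM training fast compared with backpropagation.
beta = np.linalg.pinv(Hmat) @ Y

pred = (Hmat @ beta).argmax(axis=1)
acc = (pred == y).mean()
```

In the paper's design, GWO would search over quantities such as `n_hidden` and the CNN hyperparameters rather than leaving them fixed as here.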
In the real world, knowledge comes from books and papers. Currently, that information reaches only those with clear vision. In the community, some people suffer either from poor eyesight or blindness. Bra...
ISBN (digital): 9781510662117
ISBN (print): 9781510662100; 9781510662117
In the billions of faces that are shaped by thousands of different cultures and ethnicities, one thing remains universal: the way emotions are expressed. To take the next step in human-machine interactions, a machine must be able to classify facial emotions. Allowing machines to recognize micro-expressions gives them deeper insight into a person's true feelings at any instant, which allows designers to create more empathetic machines that take human emotion into account while making optimal decisions; e.g., such machines will potentially be able to detect dangerous situations, alert caregivers to challenges, and provide appropriate responses. Micro-expressions are involuntary and transient facial expressions capable of revealing genuine emotions. We propose to design and train a set of neural network (NN) models capable of micro-expression recognition in real-time applications. Different NN models are explored and compared in this study to design a hybrid deep learning model combining a convolutional neural network (CNN), a recurrent neural network (RNN, e.g., long short-term memory [LSTM]), and a vision transformer. The CNN extracts spatial features (of a neighborhood within an image), whereas the LSTM summarizes temporal features. In addition, a transformer with an attention mechanism can capture sparse spatial relations within an image or between frames in a video clip. The inputs of the model are short facial videos, while the outputs are the micro-expressions gleaned from the videos. The deep learning models are trained and tested with publicly available facial micro-expression datasets to recognize different micro-expressions (e.g., happiness, fear, anger, surprise, disgust, sadness). The results of our proposed models are compared with those of literature-reported methods tested on the same datasets. The proposed hybrid models perform the best.
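The CNN-then-LSTM division of labor (spatial features per frame, temporal summary across frames) can be sketched end to end in numpy. Everything below is a toy stand-in: a single convolution filter plays the CNN, a hand-rolled LSTM cell plays the RNN, and the transformer branch is omitted:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy video clip: 10 frames of 8x8 grayscale "faces".
T, H, W = 10, 8, 8
clip = rng.random((T, H, W))

# CNN stand-in: one 3x3 filter + ReLU + global average pooling per frame,
# yielding a single spatial feature per frame (a real CNN outputs a vector).
kernel = rng.normal(size=(3, 3))

def conv_pool(frame):
    out = np.zeros((H - 2, W - 2))
    for i in range(H - 2):
        for j in range(W - 2):
            out[i, j] = (frame[i:i + 3, j:j + 3] * kernel).sum()
    return np.maximum(out, 0).mean()

feats = np.array([[conv_pool(f)] for f in clip])       # (T, 1) sequence

# Minimal LSTM cell summarizing the per-frame features over time.
d_in, d_h = 1, 4
Wx = rng.normal(size=(d_in, 4 * d_h)) * 0.5
Wh = rng.normal(size=(d_h, 4 * d_h)) * 0.5
h, c = np.zeros(d_h), np.zeros(d_h)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
for x in feats:
    g = x @ Wx + h @ Wh
    i_g, f_g, o_g = (sigmoid(g[:d_h]), sigmoid(g[d_h:2 * d_h]),
                     sigmoid(g[2 * d_h:3 * d_h]))
    c = f_g * c + i_g * np.tanh(g[3 * d_h:])
    h = o_g * np.tanh(c)

# h is the temporal summary a classifier head would map to an emotion label.
```

In the actual models, `h` would feed a softmax head over the micro-expression classes, and all weights would be learned rather than random.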
The ability of computer vision systems to detect abnormalities in various medical imaging processes, such as dual-energy X-ray absorptiometry, magnetic resonance imaging (MRI), ultrasonography, and computed tomography, has significantly improved as a result of recent developments in deep learning. Current techniques and algorithms for identifying, categorizing, and detecting diabetic foot ulcers (DFU) are discussed. On small datasets, a variety of techniques based on traditional machine learning and image processing are utilized to find the DFU. These prior works have kept their datasets and algorithms private. Therefore, the need for end-to-end automated systems that can identify DFU of all grades and stages is critical. The study's goals were to create new CNN-based automatic segmentation techniques to separate surrounding skin from DFU on full foot images, because surrounding skin serves as a critical visual cue for evaluating the progression of DFU, and to create reliable and portable deep learning techniques for localizing DFU that can be deployed on mobile devices for remote monitoring. The second goal was to examine the various diabetic foot diseases in accordance with well-known medical categorization schemes. From a computer vision viewpoint, the authors examined the various DFU circumstances, including site, infection, neuropathy, bacterial infection, area, and depth. Machine learning techniques have been utilized in this study to identify key DFU conditions such as ischemia and bacterial infection.
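Segmentation work of this kind is typically scored by mask overlap, most often with the Dice coefficient. The sketch below is a generic illustration of that metric on synthetic masks, not code or data from the study:

```python
import numpy as np

def dice(pred, gt):
    """Dice coefficient between two binary masks (1 = ulcer region)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    total = pred.sum() + gt.sum()
    return 2.0 * inter / total if total else 1.0

# Hypothetical 8x8 masks: a 4x4 ground-truth ulcer and a prediction
# shifted down by one row, so 12 of 16 pixels overlap.
gt = np.zeros((8, 8), dtype=int)
gt[2:6, 2:6] = 1
pred = np.zeros((8, 8), dtype=int)
pred[3:7, 2:6] = 1

score = dice(pred, gt)   # 2*12 / (16 + 16) = 0.75
```

A Dice of 1.0 means perfect overlap with the annotated ulcer boundary; segmentation papers usually report the mean Dice over a held-out test set.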
ISBN (print): 9798350391558; 9798350379990
Object detection is a computer vision method for identifying specific items inside an image or video. The most effective object detection systems make use of machine learning or deep learning. Labeling and counting items in a scene, as well as pinpointing their locations and following their movement, are all possible thanks to object detection's ability to precisely identify and localize objects. For instance, it is easy to recognize circles as a distinct class because of their shared characteristic of being round. Such unique characteristics are used for object class recognition. Facial traits such as skin tone and eye distance are employed, in a manner analogous to fingerprinting, to positively identify a person by their face. The object detection task is typically made much more challenging when the test images are sampled from a distinct data distribution. Many unsupervised domain adaptation approaches have been presented to address the difficulties introduced by the discrepancy between the domains of the training and test data. Cross-domain object detection has many applications, including autonomous driving, because labels can be generated easily for a large number of scenes in video games. Object detection methods can be categorized as either neural network-based or non-neural. This research presents a Superior Attribute Weighted Set for Object Skeleton Detection using ResNet50 (SAWS-OSD-ResNet50). Compared with traditional methods, the proposed model performs better in object detection.
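Localization quality in object detection is conventionally measured by intersection-over-union (IoU) between predicted and ground-truth boxes; a detection counts as correct when IoU clears a threshold (commonly 0.5). This is a generic sketch of the metric, not the SAWS-OSD-ResNet50 evaluation code:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# Two 10x10 boxes overlapping in half their width: inter = 50,
# union = 150, so IoU = 1/3.
score = iou((0, 0, 10, 10), (5, 0, 15, 10))
```

Under domain shift, IoU-based metrics on the target domain are exactly where the accuracy drop that domain adaptation methods try to close shows up.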
ISBN (print): 1577358872
Recent advances in multimodal learning have resulted in powerful vision-language models whose representations are generalizable across a variety of downstream tasks. Recently, their generalization ability has been further extended by incorporating trainable prompts, borrowed from the natural language processing literature. While such prompt learning techniques have shown impressive results, we identify that these prompts are trained based on global image features, which is limiting in two respects: First, by using global features, these prompts could focus less on the discriminative foreground of the image, resulting in poor generalization to various out-of-distribution test cases. Second, existing work weights all prompts equally, whereas intuitively, prompts should be reweighted according to the semantics of the image. We address these issues as part of our proposed Contextual Prompt Learning (CoPL) framework, capable of aligning the prompts to the localized features of the image. Our key innovations over earlier works include using local image features as part of the prompt learning process and, more crucially, learning to weight these prompts based on local features that are appropriate for the task at hand. This gives us dynamic prompts that are both aligned to local image features and aware of local contextual relationships. Our extensive set of experiments on a variety of standard and few-shot datasets shows that our method produces substantially improved performance compared to current state-of-the-art methods. We also demonstrate both few-shot and out-of-distribution performance to establish the utility of learning dynamic prompts that are aligned to local image features.
In the era of digital imagery, there is great interest in finding new and creative ways to express ourselves and make our images look beautiful. One such fascinating method is cartoonization, a process that transfor...
Computer vision and its technologies are being used in agricultural automation to identify, locate, and track targets for further image processing. Agricultural production has been highly dependent on natural resources such as soil, water, and related natural minerals from the soil. Soil classification is a way of arranging soils that have similar characteristics into groups. Identifying and classifying soils plays a great role in agricultural productivity, as it provides relevant information that helps agricultural experts recommend the type of crop best suited for a specific type of soil. This study concentrated on classifying soil types such as clay soil, loam soil, sandy soil, peat soil, silt soil, and chalk soil. The soil images were collected from the Amhara region at different locations using a Sony digital camera. To reduce image noise due to hand shake, a camera stand was used, and care was taken to avoid other types of noise such as environmental lighting effects and shadow. Once the dataset was collected, preprocessing such as resizing and gamma correction was performed to remove noise from the images, and contrast adjustment was also performed. Experimental research was applied as the general methodology, and the experiment was conducted with two approaches. The first used CNN as an end-to-end classifier, and the second used a hybrid approach with CNN as a feature extractor and SVM as a classifier. When CNN was used as an end-to-end classifier, a classification accuracy of 88% was achieved, whereas the hybrid approach with CNN as feature extractor and SVM as classifier achieved a classification accuracy of 95%. Finally, we conclude that the hybrid approach is better than end-to-end classification for our proposed model.
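The hybrid approach's second stage, an SVM on CNN-extracted features, can be sketched as a linear SVM trained by subgradient descent on the hinge loss. The toy Gaussian "soil features" are illustrative; in practice a library SVM (e.g., scikit-learn's SVC, possibly with an RBF kernel) would be used:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy stand-in for CNN-extracted soil features: two separable classes.
X = np.vstack([rng.normal(-2, 1, (40, 10)), rng.normal(2, 1, (40, 10))])
y = np.array([-1] * 40 + [1] * 40)
n = len(y)

# Linear SVM via subgradient descent on the regularized hinge loss:
#   L = (lam/2)||w||^2 + mean(max(0, 1 - y*(w.x + b)))
w, b = np.zeros(10), 0.0
lr, lam = 0.01, 0.01
for _ in range(200):
    margins = y * (X @ w + b)
    viol = margins < 1                               # margin violators
    grad_w = lam * w - (y[viol, None] * X[viol]).sum(axis=0) / n
    grad_b = -y[viol].sum() / n
    w -= lr * grad_w
    b -= lr * grad_b

acc = (np.sign(X @ w + b) == y).mean()
```

The division of labor mirrors the study: the CNN supplies a discriminative feature space, and the max-margin classifier draws the decision boundary in it.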