We present a method to formulate algorithm discovery as program search, and apply it to discover optimization algorithms for deep neural network training. We leverage efficient search techniques to explore an infinite and sparse program space. To bridge the large generalization gap between proxy and target tasks, we also introduce program selection and simplification strategies. Our method discovers a simple and effective optimization algorithm, Lion (EvoLved Sign Momentum). It is more memory-efficient than Adam as it only keeps track of the momentum. Unlike adaptive optimizers, its update has the same magnitude for each parameter, computed through the sign operation. We compare Lion with widely used optimizers, such as Adam and Adafactor, for training a variety of models on different tasks. On image classification, Lion boosts the accuracy of ViT by up to 2% on ImageNet and saves up to 5x the pre-training compute on JFT. On vision-language contrastive learning, we achieve 88.3% zero-shot and 91.1% fine-tuning accuracy on ImageNet, surpassing the previous best results by 2% and 0.1%, respectively. On diffusion models, Lion outperforms Adam by achieving a better FID score and reducing the training compute by up to 2.3x. For autoregressive, masked language modeling, and fine-tuning, Lion exhibits similar or better performance compared to Adam. Our analysis of Lion reveals that its performance gain grows with the training batch size. It also requires a smaller learning rate than Adam because of the larger norm of the update produced by the sign function. Additionally, we examine the limitations of Lion and identify scenarios where its improvements are small or not statistically significant.
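The abstract does not list the update equations; as a rough illustration of a sign-momentum update of the kind described (a single momentum buffer, per-parameter updates of equal magnitude via the sign operation), here is a minimal NumPy sketch. The interpolation coefficients, learning rate, and decoupled weight-decay term are illustrative assumptions, not values from the abstract.

```python
import numpy as np

def lion_like_step(param, grad, momentum, lr=1e-4, beta1=0.9, beta2=0.99, weight_decay=0.0):
    """One sign-momentum update of the kind the abstract describes.

    Only a single momentum buffer is kept, and the update direction is
    obtained through the sign operation, so every coordinate moves by the
    same magnitude (lr). The beta coefficients and decoupled weight decay
    are illustrative assumptions, not values taken from the abstract.
    """
    # Update direction: sign of an interpolation between momentum and gradient.
    update = np.sign(beta1 * momentum + (1.0 - beta1) * grad)
    # Apply the update together with decoupled weight decay.
    param = param - lr * (update + weight_decay * param)
    # Refresh the single momentum buffer.
    momentum = beta2 * momentum + (1.0 - beta2) * grad
    return param, momentum
```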
ISBN (print): 9781665460859
Object classification and detection have numerous applications, including image processing, picture retrieval, security and surveillance, video communication, and robot vision and observation. Objects are often classified based on properties such as colour, shape, quality and texture; however, accuracy is largely determined by colour recognition. Colour-based detection has a discriminating quality that allows an object of any primary colour to be detected. The extraction of two- or three-dimensional objects based on colour plays a vital role in real-time image processing technology. This review paper proposes a method for colour-based object classification applied in the Lay's industry using the k-nearest neighbour algorithm, which is the simplest form of machine learning classification method. The ML model is designed in such a way that once the colour of the Lay's pack is detected, it also specifies its particular flavour. Hence, for efficient machine-based colour identification in industry, this method would be an appropriate one and can be used for many other applications.
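As a rough sketch of the kind of colour-based k-nearest-neighbour classification the review describes, the following scikit-learn example maps the mean RGB colour of a detected pack region to a flavour label. The feature choice, training colours, and label names are assumptions for illustration only.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Illustrative training data: mean RGB of a detected pack region -> flavour label.
# The feature values and labels are assumptions for demonstration only.
X_train = np.array([
    [200,  40,  40],   # predominantly red pack
    [210,  50,  35],
    [ 60, 170,  70],   # predominantly green pack
    [ 55, 160,  80],
    [230, 200,  50],   # predominantly yellow pack
    [240, 210,  60],
])
y_train = ["red-flavour", "red-flavour", "green-flavour",
           "green-flavour", "yellow-flavour", "yellow-flavour"]

# k-nearest-neighbour classifier on the colour features.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# Classify a new pack from its mean RGB colour.
print(knn.predict([[205, 45, 42]]))   # -> ['red-flavour']
```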
ISBN (print): 9781510638808
Modern systems of parameter control and decision making are most often based on the analysis of images or video data. Images are frequently distorted by a noise component. Another interfering factor may be incorrect white-balance or colour-contrast settings. This problem arises with a sharp change in the luminance of the scene or with insufficient lighting. These problems can occur in various uses of photo and video data, such as autonomous control systems, unmanned aerial vehicles, safety systems, art photography, medicine and telemedicine, and satellite imagery. Image quality can be improved either by developing new sensors and electronic components or by preprocessing algorithms. Developing software does not require changes to existing equipment and is therefore the more pressing task. In parallel processing of multichannel images, the task of primary processing becomes more complicated. Such images may include RGB, infrared, ultraviolet, x-ray, 3D or combined channels. Methods designed for one type of image often require adjustment for another and may not be optimal for this problem. The article considers the use of a multi-criteria smoothing method with adaptive parameter changes for various types of images. To improve a group of images, the work proposes staged processing for each multi-channel image. As a first step, an algorithm for changing the colour space is applied, in which multiple adaptive compression of the range occurs based on a change in cluster sizes. This algorithm allows adaptive absorption of adjacent pixel regions through analysis of histograms of the gradients. This approach permits primary localization and simplification of the image. In the next step, we search for areas of significance (maximum number of transitions or complexity of an object). We check the coincidence of areas in a multi-cha
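The abstract only outlines the pipeline, so the following NumPy/OpenCV sketch illustrates just one plausible reading of the "areas of significance" step: blocks of a single channel are scored by how many strong gradient transitions they contain. The block size and gradient threshold are assumptions, not parameters from the paper.

```python
import cv2
import numpy as np

def significance_map(channel, block=32, thresh=30.0):
    """Score image blocks by the number of strong intensity transitions.

    This is only an illustrative reading of the 'areas of significance'
    step (maximum number of transitions / object complexity); the block
    size and gradient threshold are assumptions.
    """
    # Gradient magnitude of the single-channel image.
    gx = cv2.Sobel(channel, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(channel, cv2.CV_32F, 0, 1, ksize=3)
    mag = cv2.magnitude(gx, gy)

    h, w = channel.shape
    scores = np.zeros((h // block, w // block), dtype=np.int32)
    for i in range(scores.shape[0]):
        for j in range(scores.shape[1]):
            patch = mag[i * block:(i + 1) * block, j * block:(j + 1) * block]
            # Count strong transitions inside the block.
            scores[i, j] = int(np.count_nonzero(patch > thresh))
    return scores
```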
Children are commonly affected by many neurological disorders nowadays. One of the most common is hydrocephalus, occurring in about 1 in 1000 infants and also in adults as a result of congenital or acquired causes, tumours, spina bifida, bleeding or infection. Hydrocephalus may cause disability and even death when left untreated. It occurs when, due to such causes, excessive cerebrospinal fluid builds up in the brain. Image processing plays a vital role in diagnosing tumours and hydrocephalus. The gap between the visual representation of data captured by MRI and the information relevant to the individual is a major challenge in the medical field. The latest image processing and data mining technologies are used to classify images with high precision. This research paper proposes image processing and segmentation algorithms for the evaluation of hydrocephalus in children and the determination of its volume from MRI. Earlier research aimed at identifying hydrocephalus from CT brain images, whereas the proposed work uses MRI images for diagnosis. This paper presents image processing and segmentation techniques and algorithms that will be very useful in diagnosing hydrocephalus.
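The abstract does not give the segmentation details; as a rough illustration of the final volume-determination step, the sketch below thresholds a cerebrospinal-fluid mask in a normalised MRI volume and converts the voxel count to millilitres. The intensity threshold and voxel spacing are assumptions, not values from the paper.

```python
import numpy as np

def csf_volume_ml(volume, csf_threshold=0.7, voxel_size_mm=(1.0, 1.0, 1.0)):
    """Estimate CSF/ventricular volume from a normalised MRI volume.

    `volume` is a 3-D array of intensities scaled to [0, 1]. The simple
    intensity threshold and the voxel spacing are illustrative assumptions;
    a real pipeline would use a proper segmentation algorithm.
    """
    # Binary mask of voxels assumed to be cerebrospinal fluid.
    csf_mask = volume > csf_threshold
    # Volume of one voxel in millilitres (1 ml = 1000 mm^3).
    voxel_ml = np.prod(voxel_size_mm) / 1000.0
    return csf_mask.sum() * voxel_ml

# Example: a synthetic 256x256x180 volume with 1 mm isotropic voxels.
vol = np.random.rand(256, 256, 180)
print(f"Estimated CSF volume: {csf_volume_ml(vol):.1f} ml")
```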
The paper studies a technology for building Earth remote sensing systems based on an object-oriented approach to complex system analysis and design. The technology relies upon a system of knowledge combining knowledge of a certain subject domain, knowledge of image processing algorithms, and a typology of the problems in question. Another important element of the technology is a set of human-computer interaction methods that make it possible to arrange a task setting and solving cycle with little or no involvement of image processing experts. The paper presents the input conditions and the main stages of the technology, as well as examples of possible use of the system in agricultural monitoring.
Assessing sow posture is essential for understanding their physiological condition and helping farmers improve herd productivity. Deep learning-based techniques have proven effective for image interpretation, offering a better alternative to traditional image processing methods. However, distinguishing transitional postures such as sitting and kneeling is challenging with only conventional top-view RGB images. This study aimed to develop and compare deep learning-based sow posture classifiers using different architectures and image types. Using Kinect v2 cameras, RGB and depth images were collected from 9 sows housed individually in farrowing crates. A total of 26,362 images were manually labelled by posture: "standing", "kneeling", "sitting", "ventral recumbency" and "lateral recumbency". Different deep learning algorithms were developed to detect sow postures from three types of images: colour (RGB), depth (depth image transformed into greyscale), and fused (colour-depth composite images). Results indicated that the ResNet-18 model gave the best results and that including depth information improved the performance of all models tested. Depth and fused models achieved higher accuracies than the models using only RGB images. The best model used only depth images as input and achieved an accuracy of 98.3%. The mean precision and recall values were 97.04% and 97.32%, respectively (F1-score = 97.2%). The study shows improved posture classification using depth images. Future research can improve model accuracy and speed by expanding the database, exploring fused methods and computational models, considering different breeds of sows, and incorporating more postures. These models can be integrated into computer vision systems to automatically characterise sow behaviour.
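A minimal PyTorch sketch of a ResNet-18 posture classifier of the kind the study compares is shown below. Replicating the single depth channel across the three input channels, the 224x224 input resolution, and the absence of pre-trained weights are implementation assumptions; the study does not state these details.

```python
import torch
import torch.nn as nn
from torchvision import models

# The five posture classes used in the study.
POSTURES = ["standing", "kneeling", "sitting",
            "ventral recumbency", "lateral recumbency"]

# ResNet-18 backbone with the final layer replaced for 5 classes.
model = models.resnet18(weights=None)
model.fc = nn.Linear(model.fc.in_features, len(POSTURES))

# Example forward pass on a batch of greyscale depth images (N, 1, 224, 224),
# repeated across 3 channels because the backbone expects RGB-shaped input.
depth = torch.rand(4, 1, 224, 224)
logits = model(depth.repeat(1, 3, 1, 1))
pred = logits.argmax(dim=1)
print([POSTURES[i] for i in pred])   # predicted posture names
```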
Health services and telemedicine have proven to be an important area for information protection in research, especially with medical services and smart health care applications. In these systems, medical imaging prote...
Purpose: The aim of the present study, conducted by a working group of the Italian Association of Medical Physics (AIFM), was to define typical z-resolution values for different digital breast tomosynthesis (DBT) models to be used as a reference for quality control (QC). Currently, there are no typical values published in internationally agreed QC protocols. Methods: To characterize the z-resolution of the DBT models, the full width at half maximum (FWHM) of the artifact spread function (ASF), a technical parameter that quantifies the signal intensity of a detail along reconstructed planes, was analyzed. Five different commercial phantoms, CIRS Model 011, CIRS Model 015, Modular DBT phantom, Pixmam 3-D, and Tomophan, were evaluated on reconstructed DBT images, and 82 DBT systems (6 vendors, 9 models) in use at 39 centers in Italy were involved. Results: The ASF was found to depend on the detail size, the DBT angular acquisition range, the reconstruction algorithm and the applied image processing. In particular, a progressively greater signal spread was observed as the detail size increased and the acquisition angle decreased. However, a clear correlation between signal spread and angular range width was not observed, due to the different signal reconstruction and image processing strategies implemented in the algorithms developed by the vendors studied. Conclusions: The analysis led to the identification of typical z-resolution values for different DBT model-phantom configurations that could be used as a reference during a QC program.
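As a small illustration of the quantity being measured, the sketch below computes the FWHM of an ASF curve sampled along the reconstructed planes, using linear interpolation at the half-maximum crossings; QC protocols may prescribe a different fitting procedure.

```python
import numpy as np

def asf_fwhm(z_positions_mm, asf_values):
    """Full width at half maximum of an artifact spread function.

    `z_positions_mm` are the reconstructed-plane positions and `asf_values`
    the signal intensity of the detail in each plane. Linear interpolation
    at the half-maximum crossings is an illustrative choice.
    """
    z = np.asarray(z_positions_mm, dtype=float)
    s = np.asarray(asf_values, dtype=float)
    half = s.max() / 2.0
    idx = np.where(s >= half)[0]
    left, right = idx[0], idx[-1]

    def cross(i_lo, i_hi):
        # z position where the curve crosses the half maximum between two
        # samples, ordered so signal values increase for np.interp.
        s_pair, z_pair = [s[i_lo], s[i_hi]], [z[i_lo], z[i_hi]]
        if s_pair[0] > s_pair[1]:
            s_pair.reverse()
            z_pair.reverse()
        return np.interp(half, s_pair, z_pair)

    z_left = z[left] if left == 0 else cross(left - 1, left)
    z_right = z[right] if right == len(s) - 1 else cross(right, right + 1)
    return z_right - z_left

# Example: a detail whose signal spreads over neighbouring planes.
print(asf_fwhm([-4, -3, -2, -1, 0, 1, 2, 3, 4],
               [0.05, 0.1, 0.3, 0.7, 1.0, 0.7, 0.3, 0.1, 0.05]))
```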
ISBN (print): 9781510637191
Nowadays, imaging and spectroscopy systems operating in the long-wavelength infrared range (LWIR) are being rapidly developed and extensively applied in numerous demanding branches of science and technology. This pushes further development toward improving the sensitivity and performance of LWIR systems, as well as reducing their dimensions and cost. Among modern LWIR technologies, uncooled shutterless bolometric matrices form a favorable platform for addressing these challenging problems, being technologically reliable, compact, and cost-effective. Nevertheless, such detectors feature high noise and require real-time digital signal processing. In this work, which consists of two parts, we developed a portable LWIR camera relying on a commercial uncooled bolometric matrix and proposed several approaches aimed at improving image acquisition. The first part describes algorithms for image calibration. These algorithms were implemented experimentally in a processing module based on a Field-Programmable Gate Array (FPGA) and high-speed double data rate Synchronous Dynamic Random Access Memory (SDRAM). The developed LWIR camera holds strong potential in applications such as non-destructive sensing and medical imaging.
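The abstract does not spell out the calibration algorithms; a common approach for uncooled bolometric detectors is two-point (gain/offset) non-uniformity correction, sketched below with assumed reference frames. This is a generic illustration, not the FPGA implementation described in the paper.

```python
import numpy as np

def two_point_nuc(raw, cold_ref, hot_ref, target_cold=0.0, target_hot=1.0):
    """Two-point non-uniformity correction for a bolometric frame.

    `cold_ref` and `hot_ref` are per-pixel mean responses to two uniform
    blackbody scenes. This is a generic calibration sketch, not the
    specific algorithm the paper implements on the FPGA.
    """
    # Per-pixel gain and offset mapping the reference responses onto the
    # target output range; the epsilon guards against dead pixels.
    gain = (target_hot - target_cold) / np.maximum(hot_ref - cold_ref, 1e-6)
    offset = target_cold - gain * cold_ref
    return gain * raw + offset

# Example with synthetic 240x320 frames.
cold = np.random.normal(0.20, 0.02, (240, 320))
hot = np.random.normal(0.80, 0.02, (240, 320))
frame = np.random.normal(0.50, 0.05, (240, 320))
corrected = two_point_nuc(frame, cold, hot)
```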
Various performance benefits, such as low latency and high bandwidth, have turned fog computing into a well-accepted extension of the cloud computing paradigm. Many fog computing systems have been proposed so far, consisting of distributed compute nodes which are often organized hierarchically in layers. To achieve low latency, these systems commonly rely on the assumption that the nodes of adjacent layers reside close to each other. However, this assumption may not hold in fog computing systems that span large geographical areas, due to the wide distribution of the nodes. To avoid relying on this assumption, in this paper we design distributed algorithms whereby the compute nodes measure their network proximity to each other and self-organize into a hierarchical or a flat structure accordingly. Moreover, we implement these algorithms on geographically distributed compute nodes, and we experiment with image processing and smart city use cases. Our results show that, compared to alternative methods, the proposed algorithms decrease the communication latency of latency-sensitive processes by 27%-43% and increase the available network bandwidth by 36%-86%. Furthermore, we analyze the scalability of our algorithms and show that a flat structure (i.e., without layers) scales better than the commonly used layered hierarchy, since it generates less overhead as the size of the system grows. (C) 2020 The Authors. Published by Elsevier B.V.
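The paper's distributed algorithms are not detailed in the abstract; the toy sketch below only illustrates the general idea of latency-driven self-organisation: a node attaches to its closest candidate parent when the measured latency is below a threshold and otherwise joins a flat peer set. The threshold and data layout are assumptions.

```python
def organise(latency_ms, parents, threshold_ms=20.0):
    """Toy latency-driven self-organisation (not the paper's algorithm).

    `latency_ms[node][parent]` holds measured round-trip latencies in ms.
    A node attaches to its closest candidate parent if that latency is
    below `threshold_ms`; otherwise it stays in a flat peer set.
    """
    hierarchy, flat_peers = {}, []
    for node, to_parents in latency_ms.items():
        best = min(parents, key=lambda p: to_parents[p])
        if to_parents[best] <= threshold_ms:
            hierarchy[node] = best      # layered placement under a parent
        else:
            flat_peers.append(node)     # too far from every parent
    return hierarchy, flat_peers

# Example: two edge nodes, two candidate parent nodes.
latencies = {"edge-a": {"p1": 8.0, "p2": 35.0},
             "edge-b": {"p1": 60.0, "p2": 55.0}}
print(organise(latencies, parents=["p1", "p2"]))
# -> ({'edge-a': 'p1'}, ['edge-b'])
```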