Object detection methods for autonomous driving that rely on supervised learning typically assume a consistent feature distribution between the training and testing data, but this assumption may fail in different weather conditions. Due to the domain gap, a detection model trained under clear weather may not perform well in foggy and rainy conditions. Overcoming detection bottlenecks in foggy and rainy weather is a real challenge for autonomous vehicles deployed in the wild. To bridge the domain gap and improve the performance of object detection in foggy and rainy weather, this paper presents a novel framework for domain-adaptive object detection. Adaptations at both the image level and the object level are intended to minimize the differences in image style and object appearance between domains. Furthermore, to improve the model's performance on challenging examples, we introduce a novel adversarial gradient reversal layer that conducts adversarial mining on difficult instances in addition to domain adaptation. Additionally, we suggest generating an auxiliary domain through data augmentation to enforce a new domain-level metric regularization. Experimental findings on a public V2V benchmark show a substantial improvement in object detection, specifically for foggy and rainy driving scenarios. IEEE
Understanding and quantifying the capabilities of foundation models, particularly in text-to-image (T2I) generation, is crucial for verifying their alignment with human expectations and practical requirements. However, evaluating T2I foundation models presents significant challenges due to the complex, multi-dimensional psychological factors that influence human preferences for generated images. In this work, we propose MindScore, a multi-view framework for assessing the generation capacity of T2I models through the lens of human preference. Specifically, MindScore decomposes the evaluation into four complementary modules that align with human cognitive processing of images: matching, faithfulness, quality, and realness. The matching module quantifies the semantic alignment between generated images and prompt text, while the faithfulness module measures how accurately the images reflect specific prompt details. Furthermore, we incorporate quality and realness modules to capture deeper psychological preferences, recognizing that unpleasant or distorted images often trigger adverse human responses. Extensive experiments on three T2I datasets with human preference annotations clearly validate the superiority of our proposed MindScore over various state-of-the-art baselines. Our case studies further reveal that MindScore offers valuable insights into T2I generation from a human-centric perspective.
The substring edit error replaces a substring u of x with another string v, where the lengths of u and v are bounded by a given constant k. It encompasses localized insertions, deletions, and substitutions within a wi...
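The error model described above can be illustrated with a minimal sketch. The function name `substring_edit`, its parameters, and the choice to identify the replaced substring u by a start index and a length are assumptions made for illustration; the paper's own notation is not visible in this truncated abstract.

```python
def substring_edit(x: str, start: int, u_len: int, v: str, k: int) -> str:
    """Replace the substring u = x[start:start + u_len] with v.

    Hypothetical helper: both |u| and |v| must be at most k, matching the
    bound stated in the abstract. Indexing by (start, u_len) is an assumption.
    """
    if start < 0 or u_len < 0 or start + u_len > len(x):
        raise ValueError("u must be a substring of x")
    if u_len > k or len(v) > k:
        raise ValueError("lengths of u and v must not exceed k")
    return x[:start] + v + x[start + u_len:]


# Localized substitution/deletion: u = "10" is replaced by v = "0".
edited = substring_edit("0110100", start=2, u_len=2, v="0", k=3)

# Pure insertion (u is empty) and pure deletion (v is empty) are special cases.
inserted = substring_edit("abc", start=1, u_len=0, v="XY", k=2)
deleted = substring_edit("abcdef", start=1, u_len=2, v="", k=2)
```

Taking u or v to be empty recovers a burst of insertions or deletions, which is one way to read the abstract's remark that the model encompasses localized insertions, deletions, and substitutions.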
Wide field of view and lightweight optics are critical for advanced eyewear, with applications in augmented/virtual reality and night vision. *** refractive lenses are often stacked to correct aberrations at a wide field of view, leading to limited performance and increased size and weight. In particular, simultaneously achieving a wide field of view and large aperture for light collection is desirable but challenging to realize in a compact ***, we demonstrate a wide field of view (greater than 60°) meta-optic doublet eyepiece with an entrance aperture of 2.1 *** the design wavelength of 633 nm, the meta-optic doublet achieves comparable performance to a refractive lens-based eyepiece *** meta-doublet eyepiece illustrates the potential for meta-optics to play an important role in the development of high-quality monochrome near-eye displays and night vision systems.
In today's dynamic and highly competitive market, brand differentiation has become both essential and complex. The growth of social media and enhanced digital accessibility have transformed brand promotion into a ...
Logic locking has emerged to prevent piracy and overproduction of integrated circuits ever since the split of the design house and manufacturing foundry was established. While there has been a lot of research using a ...
Machine learning techniques have become ubiquitous in both industry and academic *** model sizes and training data volumes necessitate fast and efficient distributed training ***. Collective communications greatly simplify inter- and intra-node data transfer and are an essential part of the distributed training process, as information such as gradients must be shared between processing ***. In this paper, we survey the current state-of-the-art collective communication libraries (namely xCCL, including NCCL, oneCCL, RCCL, MSCCL, ACCL, and Gloo), with a focus on the industry-led ones for deep learning ***. We investigate the design features of these xCCLs, discuss their use cases in industry deep learning workloads, compare their performance with industry-made benchmarks (i.e., NCCL Tests and PARAM), and discuss key takeaways and interesting ***. We believe our survey sheds light on potential research directions of future designs for xCCLs.
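To make the role of a collective concrete, the sketch below simulates the semantics of an all-reduce (sum) over gradient buffers in plain Python: after the call, every worker holds the same elementwise sum. This only illustrates what the operation computes; it is not how NCCL, Gloo, or any other surveyed library implements or exposes it, and the names `all_reduce_sum` and `average_gradients` are hypothetical.

```python
from typing import List


def all_reduce_sum(worker_buffers: List[List[float]]) -> List[List[float]]:
    """Simulate all-reduce (sum): each worker receives the elementwise sum.

    Hypothetical helper; workers are modeled as plain lists in one process.
    """
    if not worker_buffers:
        return []
    length = len(worker_buffers[0])
    if any(len(buf) != length for buf in worker_buffers):
        raise ValueError("all workers must contribute buffers of equal length")
    # Reduce: sum position i across all workers.
    total = [sum(values) for values in zip(*worker_buffers)]
    # Broadcast: every worker gets its own copy of the reduced buffer.
    return [list(total) for _ in worker_buffers]


def average_gradients(worker_grads: List[List[float]]) -> List[List[float]]:
    """Average gradients across workers, as in data-parallel training."""
    n = len(worker_grads)
    return [[g / n for g in buf] for buf in all_reduce_sum(worker_grads)]


summed = all_reduce_sum([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
averaged = average_gradients([[1.0, 2.0], [3.0, 4.0]])
```

Real libraries obtain the same result without a central copy, typically by exchanging chunks between devices, which is where the design differences among them arise.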
Background: The population of Fontan patients, patients born with a single functioning ventricle, is growing. There is a growing need to develop algorithms for this population that can predict health outcomes. Artificial intelligence models predicting short-term and long-term health outcomes for patients with the Fontan circulation are needed. Generative adversarial networks (GANs) provide a solution for generating realistic and useful synthetic data that can be used to train such models. Methods: Despite their promise, GANs have not been widely adopted in the congenital heart disease research community due, in some part, to a lack of knowledge on how to employ them. In this research study, a GAN was used to generate synthetic data from the Pediatric Heart Network Fontan I dataset. A subset of data consisting of the echocardiographic and BNP measures collected from Fontan patients was used to train the ***. Two sets of synthetic data were created to understand the effect of data missingness on synthetic data generation. Synthetic data was created from real data in which the missing values were imputed using Multiple Imputation by Chained Equations (MICE) (referred to as synthetic from imputed real samples). In addition, synthetic data was created from real data in which the missing values were dropped (referred to as synthetic from dropped real samples). Both synthetic datasets were evaluated for fidelity by using visual methods which involved comparing histograms and principal component analysis (PCA) plots. Fidelity was measured quantitatively by (1) comparing synthetic and real data using the Kolmogorov-Smirnov test to evaluate the similarity between two distributions and (2) training a neural network to distinguish between real and synthetic samples. Both synthetic datasets were evaluated for utility by training a neural network with synthetic data and testing the neural network on its ability to classify patients that have ventricular dysfunction using echocardiograph measures an
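The quantitative fidelity check named in the abstract, the Kolmogorov-Smirnov test, compares the empirical distributions of one real and one synthetic feature. A minimal sketch of the two-sample KS statistic follows; the function name `ks_statistic` and the sample values are illustrative assumptions, not data or code from the study. A library routine such as `scipy.stats.ks_2samp` would additionally return a p-value.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(real: Sequence[float], synthetic: Sequence[float]) -> float:
    """Two-sample KS statistic: the largest gap between the two empirical CDFs.

    Hypothetical helper. A value near 0 means the two samples are distributed
    similarly; a value of 1 means their ranges do not overlap at all.
    """
    if not real or not synthetic:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(real), sorted(synthetic)
    n, m = len(a), len(b)
    d = 0.0
    # The supremum of |F_a - F_b| is attained at one of the observed values.
    for v in a + b:
        cdf_a = bisect_right(a, v) / n  # fraction of `real` <= v
        cdf_b = bisect_right(b, v) / m  # fraction of `synthetic` <= v
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Made-up values for one feature; not taken from the Fontan dataset.
identical = ks_statistic([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
shifted = ks_statistic([1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0])
disjoint = ks_statistic([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

In a fidelity evaluation of this kind, the statistic would be computed per feature, with smaller values indicating that the synthetic feature tracks the real one more closely.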
Exploration strategy design is a challenging problem in reinforcement learning (RL), especially when the environment contains a large state space or sparse *** exploration, the agent tries to discover unexplored (novel) areas or high reward (quality) *** existing methods perform exploration by only utilizing the novelty of *** novelty and quality in the neighboring area of the current state have not been well utilized to simultaneously guide the agent's ***. To address this problem, this paper proposes a novel RL framework, called clustered reinforcement learning (CRL), for efficient exploration in ***. CRL adopts clustering to divide the collected states into several clusters, based on which a bonus reward reflecting both novelty and quality in the neighboring area (cluster) of the current state is given to the *** leverages these bonus rewards to guide the agent to perform efficient ***, CRL can be combined with existing exploration strategies to improve their performance, as the bonus rewards employed by these existing exploration strategies solely capture the novelty of ***. Experiments on four continuous control tasks and six hard-exploration Atari-2600 games show that our method can outperform other state-of-the-art methods to achieve the best performance.
Searching the occurrences of specific code patterns (code search) is a common task in software engineering, and programming by example (PBE) techniques have been applied to ease customizing code patterns. However, pre...