Foundation models are pretrained on massive datasets. However, collecting medical datasets is expensive and time-consuming, and raises privacy concerns. Here we show that synthetic data generated via conditioning with disease labels can be leveraged for building high-performing medical foundation models. We pretrained a retinal foundation model, first with approximately one million synthetic retinal images with physiological structures and feature distribution consistent with real counterparts, and then with only 16.7% of the 904,170 real-world colour fundus photography images required in a recently reported retinal foundation model (RETFound). The data-efficient model performed as well as or better than RETFound across nine public datasets and four diagnostic tasks; and for diabetic-retinopathy grading, it used only 40% of the expert-annotated training data used by RETFound. We also support the generalizability of the data-efficient strategy by building a classifier for the detection of tuberculosis on chest X-ray images. The text-conditioned generation of synthetic data may enhance the performance and generalization of medical foundation models.
Authors: Li, Yanshan; Xu, Fan
Shenzhen Univ, Inst Intelligent Informat Proc, Shenzhen 518060, Peoples R China
Shenzhen Univ, Guangdong Key Lab Intelligent Informat Proc, Shenzhen, Peoples R China
Synthetic aperture radar (SAR) has been widely studied and applied in many fields. Although image super-resolution technology has been successfully applied to SAR imaging in recent years, there is less research on large-scale factor SAR image super-resolution methods. A more effective method is to obtain comprehensive information to guide the reconstruction of SAR images. In fact, the co-registered characteristics of high-resolution optical images have been successfully applied to improve the quality of SAR images. Inspired by this, an optical-guided multi-kernel attention based SAR image super-resolution reconstruction network (OMA-SSR) is proposed. The proposed multi-modal mutual attention (MMA) module in this network can effectively establish the dependency between SAR image features and optical image features. This network also designs a deep feature extraction module for SAR images, which includes a channel-splitted multi-kernel attention (CSMA) module and residual connections. CSMA module splits SAR image channels, extracts features in different ranges through multi-kernel convolution, and finally fuses the extracted features between different channels. Experimental results on the Sen1-2 and QXS datasets show that the proposed OMA-SSR performs well in evaluation indicators and visual effects of SAR image super-resolution reconstruction.
To address the issue of relatively simple features and methods used in methane concentration inversion, which leads to low overall accuracy, this study proposes a methane concentration inversion method based on multi-feature fusion and Stacking ensemble learning. The method leverages the series-parallel cascade structure between multiple base models and meta-models to learn different feature representations and patterns in the original data, fully exploring the intrinsic relationships between various feature factors and methane concentration. This approach improves inversion accuracy and generalization capability. Finally, the research team conducted experimental validation in the eastern region of Xinjiang. The experimental results show that, compared with other typical methods, the Stacking ensemble model proposed in this study achieves the best inversion performance, with R², RMSE, and MAE values of 0.9747, 2.8294, and 1.5299, respectively. In terms of seasonal distribution, methane concentration in eastern Xinjiang typically shows lower average values in the spring and autumn and higher average values in the summer and winter.
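The three evaluation indicators quoted above (R², RMSE, MAE) have standard definitions. The sketch below shows how they are commonly computed. It is only an illustration: the function name `regression_metrics` and the toy arrays are hypothetical, and the abstract does not say how the authors computed their scores.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return (R^2, RMSE, MAE) using the standard textbook definitions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_true - y_pred
    ss_res = np.sum(residual ** 2)                    # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)    # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    rmse = np.sqrt(np.mean(residual ** 2))
    mae = np.mean(np.abs(residual))
    return float(r2), float(rmse), float(mae)

# Toy values for illustration only (not data from the study).
r2, rmse, mae = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
```

A higher R² and lower RMSE and MAE indicate a better fit, which is the sense in which the abstract ranks the Stacking model against the other methods.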
X-ray ptychography improves the illumination method of coherent X-ray diffraction imaging (CDI) by shifting the local illumination area to image large samples, while simultaneously improving the convergence speed and reconstruction quality of phase recovery using the constraints imposed by the overlapping of adjacent illumination areas. To improve the imaging resolution and image quality of X-ray ptychography, system parameters such as ray source parameters, illumination probe size, sample-to-detector transmission distance, detector accuracy, etc. need to be rationalized. Rationalizing the imaging system parameters is even more important when using a laboratory light source with lower brightness and coherence or when the size of the experimental field is limited. In this study, we aim to ensure the high resolution of far-field imaging and explore optimizing the system parameter settings based on a simulation method, focusing on the transmission distance between the sample and the detector. First, we design a simulation method that can flexibly adjust the system parameters to overcome the strict limitation of the matching relationship between parameters in the traditional simulation method. Second, we compare the effects of different overlap rates and up-sampling methods on the imaging results under different transmission distances so that the system can realize high quality far-field ptychography imaging under the shortest possible transmission distance. Finally, the system parameters are adjusted to compare the imaging results under different transmission distances, and the rules for setting the transmission distance under different system parameter designs are proposed.
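To make the role of the sample-to-detector distance concrete, the sketch below uses two textbook far-field (Fraunhofer) relations: the reconstructed pixel size λz/(N·p) and the linear oversampling ratio λz/(D·p). These relations, the function names, and the numerical values are assumptions made for illustration. They are not the simulation method described in the abstract.

```python
import math

def recon_pixel_size(wavelength, distance, n_pixels, det_pixel):
    """Real-space pixel size of a far-field reconstruction: lambda * z / (N * p)."""
    return wavelength * distance / (n_pixels * det_pixel)

def oversampling_ratio(wavelength, distance, probe_size, det_pixel):
    """Linear oversampling ratio: lambda * z / (D * p), with D the illuminated extent."""
    return wavelength * distance / (probe_size * det_pixel)

def min_distance(wavelength, probe_size, det_pixel, ratio=2.0):
    """Shortest distance z that still reaches the requested oversampling ratio."""
    return ratio * probe_size * det_pixel / wavelength

# Hypothetical parameters: 0.1 nm wavelength, 1 m distance,
# 1000 x 50 um detector pixels, 1 um illumination probe.
wavelength, z, n_pix, pix, probe = 1e-10, 1.0, 1000, 50e-6, 1e-6
dx = recon_pixel_size(wavelength, z, n_pix, pix)
sigma = oversampling_ratio(wavelength, z, probe, pix)
z_min = min_distance(wavelength, probe, pix)
```

Under these assumptions, shortening the distance gives a finer reconstructed pixel but lowers the oversampling ratio. This is one reason the overlap rate and up-sampling matter at short transmission distances.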
Accurate traffic anomaly detection (TAD) is critical for intelligent transportation systems. Previous TAD methods mainly relied on driving scene perception or motion patterns of agents (i.e., traffic participants, such as vehicles and pedestrians) to detect traffic anomalies. Although these methods achieve promising detection performance, they lack an intuitive modeling of agent interactions, which limits their ability to handle complex driving scenarios. In fact, modeling interactions among agents helps to understand the underlying logic behind changes in agent behavior, thereby benefiting traffic anomaly detection. In our work, we introduce Interaction-Scene Collaborative Representation for Traffic Anomaly Detection (ISCRTAD), an innovative framework that leverages advanced artificial intelligence techniques to model agent interactions in dynamic driving scenarios. Unlike previous TAD methods, the proposed approach is the first attempt to collaboratively represent agent interactions and dynamic driving scenarios, significantly enhancing the perception and understanding of traffic anomalies in driving videos. First, we introduce the agent interaction modeling module, which comprehensively models the interactions between agents in driving scenarios through the designed Behavior Interaction Graph and Spatial Perception Graph. Furthermore, we design a heterogeneous modality collaborative representation (HMCR) mechanism to deeply integrate agent interactions with dynamic driving scenarios, thereby enabling a more profound understanding of agent motion patterns in dynamic driving environments. Experimental results on the DoTA and DADA datasets demonstrate significant improvements in traffic anomaly detection performance, highlighting the effectiveness of our AI-driven approach.
Guided depth super-resolution (GDSR) is a challenging task that aims to restore a high-resolution depth map from a low-resolution one, using a high-resolution RGB image of the same scene as guidance. Recently, most existing GDSR methods have been limited by insufficient prior knowledge and inappropriate guidance, leading to several challenges, such as serious boundary discontinuities, incomplete structures, and unsatisfactory colors in the reconstructed high-resolution depth maps. To overcome these limitations, we propose a dual prior guided depth image super-resolution method using a multi-scale transformer fusion network. Firstly, we design a dual prior mechanism by employing a color branch to extract color priors from RGB images and an edge branch to extract edge priors from depth images. This approach provides effective prior knowledge for high-resolution depth images, resulting in complete structures and satisfactory colors. Meanwhile, we introduce the self-attention mechanism of the Transformer to the guided depth map super-resolution task to extract global features through a transformer block that utilizes feature mapping from a semi-coupled convolutional block. In addition, we introduce a multi-scale feature fusion module to maintain the boundary continuity of the reconstructed depth image by enhancing the depth image super-resolution guided by RGB images from multiple scales. Extensive quantitative and qualitative experiments on multiple datasets demonstrate that the proposed method has significant advantages over other state-of-the-art GDSR methods. In particular, our method achieves superior super-resolution performance on depth images with noise.
For random walks on a graph G with n vertices and m edges, the mean hitting time $H_j$ from a vertex chosen from the stationary distribution to vertex j measures the importance of j, while the Kemeny constant K is the mean hitting time from one vertex to another selected randomly according to the stationary distribution. In this article, we first establish a connection between the two quantities, representing K in terms of $H_j$ for all vertices. We then develop an efficient algorithm estimating $H_j$ for all vertices and K in nearly linear time in m. Moreover, we extend the centrality $H_j$ of a single vertex to $H(S)$ of a vertex set S, and establish a link between $H(S)$ and some other quantities. We further study the NP-hard problem of selecting a group S of $k \ll n$ vertices with minimum $H(S)$, whose objective function is monotonic and supermodular. We finally propose two greedy algorithms approximately solving the problem. The former has an approximation factor $(1 - \frac{k}{k-1}\cdot\frac{1}{e})$ and $O(kn^3)$ running time, while the latter returns a $(1 - \frac{k}{k-1}\cdot\frac{1}{e} - \varepsilon)$-approximation solution in nearly linear time in m, for any parameter $0 < \varepsilon < 1$. Extensive experimental results validate the performance of our algorithms.
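As a small worked illustration of the two quantities, the sketch below computes all pairwise hitting times on a three-vertex path graph by direct linear solves, then forms the stationary-averaged hitting time of each vertex and the Kemeny constant. This is a brute-force check, not the nearly-linear-time algorithm from the article. The helper name `hitting_time_matrix` is hypothetical, and the identity it checks, K = Σ_j π_j H_j, follows directly from the definitions and may not be the exact form of the connection derived by the authors.

```python
import numpy as np

def hitting_time_matrix(adj):
    """H[i, j] = expected steps for a simple random walk to go from i to j (H[j, j] = 0)."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    P = adj / adj.sum(axis=1, keepdims=True)   # row-stochastic transition matrix
    H = np.zeros((n, n))
    for j in range(n):
        keep = [i for i in range(n) if i != j]
        # Hitting times to j solve (I - P restricted to the other vertices) h = 1.
        A = np.eye(n - 1) - P[np.ix_(keep, keep)]
        H[keep, j] = np.linalg.solve(A, np.ones(n - 1))
    return H

# Path graph 0 - 1 - 2 (toy example).
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]])
deg = adj.sum(axis=1)
pi = deg / deg.sum()          # stationary distribution of an undirected graph: d_i / 2m
H = hitting_time_matrix(adj)
H_j = pi @ H                  # H_j = sum_i pi_i * H[i, j]
K_from_each_start = H @ pi    # sum_j pi_j * H[i, j], the same value for every start i
K = float(pi @ H_j)           # K = sum_j pi_j * H_j
```

On this graph the central vertex has the smallest mean hitting time, consistent with reading a small $H_j$ as high importance.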
In this article, an event-triggered distributed sliding mode control (SMC) scheme is developed for dc microgrids composed of multiple boost converters in parallel under limited resource bandwidth. By removing the assumption of ideal voltage sources, the multiple boost converters are modeled as imperfect voltage sources that can be well addressed by SMC. A finite-time control method is suggested to design a distributed control framework for realizing voltage regulation and power sharing with a high convergence rate and low steady-state error. Furthermore, an event-triggered strategy is proposed to avoid large bandwidth overhead when carrying out tasks with a heavy resource burden. Finally, the effectiveness of the scheme in saving bandwidth and rejecting disturbances is verified through comparative simulations and experimental tests against methods from the existing literature.
Coherent diffraction imaging (CDI) overcomes the limitation of the optical component fabrication technology on imaging resolution. Ptychography, an important variant of CDI, can reconstruct the complex transmission of the object through a series of diffraction patterns, while providing a large field of view with a high resolution. However, the accuracy of the scan positions has a significant effect on the image quality of the ptychography. Herein, the translation parameters are dynamically and adaptively adjusted by the gradient optimization algorithms within the extended ptychographic iterative engine. Six advanced gradient optimization algorithms are evaluated through simulations. The results show that they can achieve subpixel correction accuracy. In addition, the method is tested on an experimental ptychography dataset using soft X-ray, which also verifies its capability for reconstruction improvement. (c) 2025 Optica Publishing Group. All rights, including for text and data mining (TDM), Artificial Intelligence (AI) training, and similar technologies, are reserved.
As a primary method to solve the optimal tracking control problem for nonlinear multi-agent systems, the policy gradient approach with traditional multi-agent actor-critic network structures suffers from lengthy training time and poor stability. In this article, we propose a novel first-order deterministic policy gradient framework with an initial admissible control policy method (I-FDPG). An additional first-order critic network and its target network are introduced to estimate the partial derivatives of the value function with respect to the tracking error. Updated by the estimation error of the values and the policy gradient alternately, the iterative process of control policy is improved. Moreover, when determining the initial control policy, a dynamic iterative termination criterion is proposed since the fixed iterative termination criterion is often unsuitable for each agent. The boundedness, convergence and optimality of the I-FDPG algorithm are proven. Finally, the effectiveness of the proposed algorithm is validated through a two-dimensional affine nonlinear multi-agent system.