ISBN (digital): 9798350368741
ISBN (print): 9798350368758
Deep-unfolded networks (DUNs) have set new performance benchmarks in fields such as compressed sensing, image restoration, and wireless communications. DUNs are built from conventional iterative algorithms, where each iteration is transformed into a layer/block of a network with learnable parameters. Despite their huge success, the reasons behind their superior performance over their iterative counterparts are not fully understood. This paper aims to enhance the explainability of DUNs by investigating potential reasons for this performance gap. We concentrate on the Learnt Iterative Shrinkage-Thresholding Algorithm (LISTA), a foundational contribution that achieves sparse recovery with significantly fewer layers than the number of iterations required by its iterative counterpart, ISTA. Our findings reveal that the learnt matrices in LISTA always have Gaussian-distributed entries, regardless of whether the sensing matrix is random Gaussian, Bernoulli, exponential, or uniform. The findings also show that the singular values of the learnt matrices exceed unity, yet the reconstruction scheme remains stable. We conjecture that the activation function may play a role in ensuring stability. We also present an unbiasing technique that substantially improves sparse recovery performance by re-estimating the amplitudes on the converged support.
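To make the relationship between ISTA and LISTA concrete, the following is a minimal NumPy sketch, not the authors' implementation. It writes ISTA in the two-matrix form in which LISTA replaces the fixed matrices and threshold with learnt ones. The `debias` helper assumes that "re-estimating the amplitudes on the converged support" means a least-squares refit restricted to the nonzero entries; the abstract does not specify the method, so this choice, the function names, and the tolerance `tol` are all illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, theta):
    """Shrinkage nonlinearity: pulls every entry towards zero by theta."""
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def ista(A, y, lam, n_iter=500):
    """ISTA for min_x 0.5*||y - A x||^2 + lam*||x||_1 in two-matrix form."""
    L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the gradient
    W1 = A.T / L                             # fixed here; a learnt matrix in LISTA
    W2 = np.eye(A.shape[1]) - A.T @ A / L    # fixed here; a learnt matrix in LISTA
    theta = lam / L                          # fixed here; a learnt threshold in LISTA
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):                  # one iteration corresponds to one unfolded layer
        x = soft_threshold(W1 @ y + W2 @ x, theta)
    return x

def debias(A, y, x_hat, tol=1e-8):
    """Assumed unbiasing step: least-squares refit on the estimated support."""
    support = np.flatnonzero(np.abs(x_hat) > tol)
    x_db = np.zeros_like(x_hat)
    if support.size > 0:
        x_db[support] = np.linalg.lstsq(A[:, support], y, rcond=None)[0]
    return x_db
```

Soft thresholding shrinks the surviving amplitudes towards zero, which is the bias a refit on the support is meant to remove. By construction the refit cannot increase the data residual relative to the thresholded estimate.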
ISBN (digital): 9798350388961
ISBN (print): 9798350388978
This study uses Mutual Information (MI) and Machine Learning (ML) to investigate the feasibility of extracting information about tissue dimensions, types, and distances to the antenna from the reflection coefficient of a single antenna, with a view to early diagnosis of breast cancer with microwave imaging systems. Tissue radius, distance to the antenna, and tissue type were examined using 3468 $|S_{11}|$ curves obtained from various simulation scenarios. ML models were trained to extract information from the curves for classification and regression. The best-performing ML model identified tissue size with $99.3\%$ accuracy. Additionally, regression models determined the distance between the tissue and the antenna with a standard deviation of 0.74 mm. In conclusion, classification and regression models could learn and predict the relationship between $|S_{11}|$ curves and the location and dimensions of tissues. The results are expected to contribute significantly to microwave imaging system design and tomographic image reconstruction.
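The abstract does not say how MI was estimated. As one common possibility, the sketch below computes a plug-in (histogram) estimate of the mutual information between two scalar variables, for example a value sampled from an $|S_{11}|$ curve and a tissue parameter. The function name, the bin count, and the choice of estimator are assumptions for illustration only.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Plug-in estimate of I(X;Y) in bits from a 2-D histogram."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                 # joint probability table
    px = pxy.sum(axis=1, keepdims=True)       # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)       # marginal of y, shape (1, bins)
    nz = pxy > 0                              # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))
```

A feature that carries information about the target gives a clearly larger value than an unrelated one. Note that histogram estimates are biased upward for small sample sizes.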
ISBN (digital): 9798350388961
ISBN (print): 9798350388978
In this paper, a novel receiver structure is proposed to mitigate the inter-channel interference (ICI) effects that limit the data rate in multi-color visible light communication (MC-VLC) systems. Reconfigurable optical filters (ROFs) are used at the receiver side of MC-VLC systems to separate the color channels. ROFs are optical elements whose transmission spectrum can be changed electrically. Exploiting this feature, a receiver structure is developed that, unlike traditional receiver structures, can adaptively adjust the filter transmission spectrum based on channel state information. This adaptive adjustment aims to maximize communication performance by effectively separating the color channels in MC-VLC systems. The results demonstrate that the proposed receiver structure achieves higher communication performance than traditional receivers.
ISBN (print): 9781665436564
Computer vision is a leading domain of technology that automates visual systems in applications ranging from healthcare to self-driving vehicles. With a reputation for surpassing human performance, it can be deployed in trigger systems such as wildfire smoke detection, where the emission of smoke from a wildfire is fairly unpredictable. Low contrast and brightness have a detrimental effect on computer vision tasks. We present a novel approach to detecting forest wildfire smoke that uses image translation to convert nighttime images to daytime, which eliminates the confusion between smoke, cloud, and fog. This translation helps the YOLOv5 object detection algorithm detect smoke equally well irrespective of time and lighting conditions. This paper demonstrates that the object detection model performs better, with a higher confidence score, on images translated to daytime than on the corresponding nighttime images.
With the promotion of testing work over the years, colleges and universities have accumulated a large amount of data, including students' basic information and physical fitness test results. In order to facilitate ...
Early diagnosis is very important in brain tumors. Although Magnetic Resonance Imaging (MRI) is widely used for brain tumor detection, detecting the tumor manually is difficult. Therefore, computer-aided diagnosis systems have been frequently utilized in recent years. In this study, an Efficient Channel Attention-Dense Convolutional Network (ECA-DenseNet) framework is proposed to detect tumors in patients based on brain MRI images. In addition to detecting the tumor, the framework attempts to determine which type of tumor is present in the patient. In the developed ECA-DenseNet structure, an ECA block has been added to the dense blocks. The ECA block aims to discard unimportant information and thus reduce the computation time. The improved DenseNet model has been tested on an open-source dataset and compared with DenseNet-121, DenseNet-169, DenseNet-201, and DenseNet-264. The experimental results show that the improved model has better classification performance than the others. The accuracy of the proposed model was 95.07%.
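For readers unfamiliar with efficient channel attention, the sketch below shows the generic operation on a single feature map in NumPy: global average pooling per channel, a small 1-D convolution across neighbouring channels, a sigmoid gate, and channel-wise rescaling. It illustrates the general ECA idea only. How the block is wired into the dense blocks, the kernel size, and the weights used in the paper are not given in the abstract, so the `kernel` argument here is a hypothetical stand-in for learnt weights.

```python
import numpy as np

def eca(features, kernel):
    """Channel attention on a (C, H, W) feature map.

    kernel: odd-length 1-D weight vector shared across channels
            (learnt in a real network; supplied by hand here).
    """
    kernel = np.asarray(kernel, dtype=float)
    k = len(kernel)
    pooled = features.mean(axis=(1, 2))              # global average pool -> (C,)
    padded = np.pad(pooled, k // 2)                  # zero-pad so output keeps length C
    conv = np.array([padded[i:i + k] @ kernel        # 1-D cross-correlation over channels
                     for i in range(len(pooled))])
    gate = 1.0 / (1.0 + np.exp(-conv))               # sigmoid -> per-channel weight in (0, 1)
    return features * gate[:, None, None]            # rescale each channel
```

Channels whose gate is close to zero are suppressed, which is the sense in which the block down-weights less informative channels.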
ISBN (digital): 9798350374711
ISBN (print): 9798350374728
Diabetic Retinopathy (DR) is a retinal condition that damages blood vessels within the eye and is a leading cause of vision impairment or blindness when left untreated. Manual identification of diabetic retinopathy is labor-intensive and susceptible to human error due to the intricate structure of the eye. This work offers a comprehensive method that uses image pre-processing techniques, local ternary pattern (LTP) features, machine learning, and different similarity measures to improve content-based retinal image analysis and retrieval from fundus images. For feature extraction, subtle texture details were extracted from retinal fundus images using LTP with radius values of 1, 2, and 3. The study has two parts. First, it assesses machine learning algorithms for the classification of diabetic retinopathy; of these, Random Forest performed best, with an accuracy of 92.77%. Second, it addresses retinal image retrieval and compares the performance of various LTP variants on different retrieval tasks. To evaluate model performance for particular disease classes, class-wise image retrieval was carried out, providing information on individual class precision. Several similarity measures were investigated for retinal image retrieval, and Cosine Similarity proved the most effective at achieving better retrieval results.
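The retrieval pipeline described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' code: neighbours are taken on a square ring at the given radius rather than interpolated on a circle, the threshold `t` is an arbitrary value, and all function names are hypothetical. Each pixel's ternary code is split into an "upper" and a "lower" binary pattern, the two 256-bin histograms are concatenated, and images are ranked by cosine similarity.

```python
import numpy as np

def ltp_histogram(img, radius=1, t=5):
    """Normalised 512-bin LTP histogram of a 2-D grayscale image."""
    img = img.astype(np.int32)
    r = radius
    H, W = img.shape
    center = img[r:H - r, r:W - r]
    # Eight neighbours on a square ring (simplification of circular sampling).
    offsets = [(-r, -r), (-r, 0), (-r, r), (0, r), (r, r), (r, 0), (r, -r), (0, -r)]
    upper = np.zeros_like(center)
    lower = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        neigh = img[r + dy:H - r + dy, r + dx:W - r + dx]
        upper += (neigh >= center + t).astype(np.int32) << bit   # ternary value +1
        lower += (neigh <= center - t).astype(np.int32) << bit   # ternary value -1
    hist = np.concatenate([np.bincount(upper.ravel(), minlength=256),
                           np.bincount(lower.ravel(), minlength=256)]).astype(float)
    return hist / hist.sum()

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_by_similarity(query, database):
    """Indices of database descriptors, most similar to the query first."""
    sims = np.array([cosine_similarity(query, d) for d in database])
    return np.argsort(-sims)
```

Descriptors computed at radius 1, 2, and 3 could be compared separately or concatenated; the abstract does not say which combination was used.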
Considering the objective change of human beings and the effect of temporal parallax, it is difficult to extract the required feature points accurately and recognize the recognition algorithm. In order to obtain effec...
This paper presents the design and implementation of our system for Track 1 of the Multi-modal Information based Speech Processing (MISP) 2022 Challenge. We design an end-to-end transformer-based multi-talker system. The transformer backbone is well suited to capturing long-term features, which is crucial for multi-modal speaker diarization in cases where temporal modalities are missing. In addition, we employ several loss functions and image data augmentation techniques to prevent over-fitting during training. To further improve the system’s performance, we incorporate Inter-channel Phase Difference (IPD) to model location features and pre-train an ECAPA-TDNN-based model to extract speaker embedding features. Our system achieved a diarization error rate (DER) of 10.82% on the evaluation set, which earned us second place in the audio-visual speaker diarization task of the MISP 2022 Challenge.
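As an illustration of the IPD feature mentioned above, the sketch below computes the phase difference between the short-time spectra of two microphone channels and returns its cosine and sine, a common way to avoid phase wrapping. The frame length, hop size, window, and cos/sin encoding are assumptions; the abstract does not describe the exact IPD configuration used in the system.

```python
import numpy as np

def stft(signal, n_fft=256, hop=128):
    """Hann-windowed short-time Fourier transform, shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def ipd_features(ch_a, ch_b, n_fft=256, hop=128):
    """cos and sin of the inter-channel phase difference per time-frequency bin."""
    Xa = stft(ch_a, n_fft, hop)
    Xb = stft(ch_b, n_fft, hop)
    ipd = np.angle(Xa * np.conj(Xb))     # phase(Xa) - phase(Xb), wrapped to (-pi, pi]
    return np.cos(ipd), np.sin(ipd)
```

Because a source at a different position reaches the microphones with a different delay, the IPD pattern across frequency carries coarse location information.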
Multi-modal medical image fusion can effectively integrate imaging information from multiple sensors and has become a research hotspot in medical image processing in recent years. In this paper, a new multi-modal medical image fusion method based on an adaptive pulse-coupled neural network (PCNN) and the nonsubsampled contourlet transform (NSCT) is proposed. In the proposed method, NSCT decomposition is first performed on the medical source images to obtain high-frequency and low-frequency subbands. Second, a fusion strategy based on a phase coherence model is used for feature extraction from the low-frequency subbands, which effectively retains the energy information of the source images. In addition, a new high-frequency subband fusion rule is designed for the decomposed high-frequency sublayers and further optimized for the different frequency-band characteristics of the sublayers obtained from the decomposition. A new adaptive input-parameter scheme is also used to improve the PCNN model, which serves as the strategy for high-frequency coefficient fusion. Finally, the fused image is reconstructed by performing the inverse NSCT on the fused high-frequency and low-frequency subbands. Experiments are conducted on a publicly accessible brain image dataset. The experimental results show that the algorithm achieves good visual quality and that the objective indicators of overall cross entropy (OCE), structural similarity (SSIM), feature mutual information (FMI), and visual fidelity (VIFF) reach the best level.