Detecting interaction groups is an essential task for understanding human behaviours and social activities. However, it is still challenging to identify social interactions and the resulting crowd groups using purely ...
Most existing unsupervised person re-identification (Re-ID) methods primarily depend on the cluster distance, and merely exploit the available source labeled data to assign pseudo labels for the unannotated data. Wher...
ISBN (digital): 9798350350951
ISBN (print): 9798350350968
Snoring, a recurring habit often disregarded within the Indian community, can signal a grave underlying condition: Obstructive Sleep Apnea (OSA). OSA is a severe sleep disorder characterized by recurrent interruptions in breathing lasting more than 10 seconds during sleep, typically due to partial or complete airway obstruction. Neglecting OSA can lead to significant health risks, including an increased likelihood of occupational and motor vehicle accidents, heightened susceptibility to severe depression, cardiac and cerebrovascular diseases, and reduced life expectancy. The main objective of the study is to detect snoring during sleep and to classify it as normal snoring or OSA snoring. An Arduino Nano 33 BLE Sense, which houses a built-in MP34DT05 sensor, is used to capture the snore signal. The sensor has a signal-to-noise ratio of 64 dB and a sensitivity of -26 dBFS ± 3 dB. The captured sound signal is processed to extract Mel-filter bank energy features, Mel Frequency Cepstral Coefficients, and spectrogram features. These features are used to build a model, which is trained using Edge Impulse to classify the signal. The dataset is divided into training, testing, and validation sets, with 80% of the data allocated to training, 20% to testing, and an additional 20% within the training data set aside for validation. For two-class classification (snoring and non-snoring), the spectrogram-based approach achieved an accuracy of 96.9%, while the other two methods yielded 93.8%. The accuracy for three-class classification (normal, snoring, and OSA snoring) using the Embedded Machine Learning (EML) approach is 88%. The proposed study demonstrates enhanced accuracy in identifying OSA from snoring compared to previous research. This autonomous system can facilitate the detection of OSA through the analysis of snoring patterns, subsequently alerting the subjec
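The split described in the abstract above (20% held out for testing, then 20% of the remaining training data held out for validation) works out to roughly 64% training, 16% validation, and 20% testing. The following is a minimal sketch of that arithmetic only. The function name `split_dataset`, its parameters, and the shuffling step are assumptions made for illustration and are not taken from the study or from Edge Impulse.

```python
import random


def split_dataset(items, test_frac=0.2, val_frac_of_train=0.2, seed=0):
    """Hypothetical helper: hold out a test set, then carve validation out of training."""
    items = list(items)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(items)

    # First cut: reserve the test fraction of the whole dataset.
    n_test = round(len(items) * test_frac)
    test = items[:n_test]
    train_pool = items[n_test:]

    # Second cut: reserve a validation fraction of the remaining training pool.
    n_val = round(len(train_pool) * val_frac_of_train)
    val = train_pool[:n_val]
    train = train_pool[n_val:]
    return train, val, test


train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

With 100 samples this yields 64 training, 16 validation, and 20 test samples, and the three subsets do not overlap.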
Depth adjustment aims to enhance the visual experience of stereoscopic 3D (S3D) images by improving visual comfort and depth perception. For a human expert, the depth adjustment procedure is a seq...
We address the black-box issue of VR sickness assessment (VRSA) by evaluating the level of physical symptoms of VR sickness. For VR content inducing a similar level of VR sickness, the physical symptoms can vary d...
Blind video quality assessment (BVQA) plays an indispensable role in monitoring and improving the end-users’ viewing experience in various real-world video-enabled media applications. As an experimental field, the im...
ISBN (print): 9781665475938
Human-object interaction (HOI) detection is a meaningful research topic in human activity understanding. Recent works have made significant progress by focusing on efficient triplet matching and leveraging image-wide features based on an encoder-decoder architecture. However, previous methods have a limited ability to gather relevant contextual information about humans, and they do not decouple the different sub-tasks in HOI detection. To this end, we propose a new transformer-based method for HOI detection, namely, Mask-Guided Transformer (MGT). Our model, which is composed of five parallel decoders with a shared encoder, not only emphasizes interactive regions by applying body features, but also disentangles the prediction of instance and interaction. We achieve a favorable result of 63.3 mAP on the well-known HOI detection dataset V-COCO.
Crowd anomaly detection suffers from limited training data under weak supervision. In this paper, we propose a dual-mode iterative denoiser to tackle the weak label challenge for anomaly detection. First, we use a convolutional autoencoder (CAE) in image space to act as a cluster for grouping similar video clips, where the spatial-temporal similarity helps the cluster metric to represent the reconstruction error. Then we use a graph convolutional neural network (GCN) to explore the temporal correlation and the feature similarity between video clips within different rough labels, where the classifier can be constantly updated in the label denoising process. Without specific image-level labels, our model can predict the clip-level anomaly probabilities for videos. Extensive experimental results on two public datasets show that our approach performs favorably against the state-of-the-art methods.
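To make the graph-convolution step in the abstract above more concrete, here is a minimal sketch of one generic GCN propagation step over a small graph of video clips. It uses the widely known symmetric normalization with self-loops, which is an assumption here and not necessarily the paper's exact formulation. The function name `gcn_layer`, the three-clip chain graph, and the scalar features are all hypothetical.

```python
import math


def gcn_layer(adj, feats, weight):
    """One generic graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    n = len(adj)
    # Add self-loops so each clip keeps its own feature when aggregating.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric degree normalization of the adjacency matrix.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    n_in, n_out = len(weight), len(weight[0])
    # Aggregate neighbour features, then apply the linear map and ReLU.
    agg = [[sum(norm[i][k] * feats[k][f] for k in range(n)) for f in range(n_in)]
           for i in range(n)]
    return [[max(0.0, sum(agg[i][f] * weight[f][o] for f in range(n_in)))
             for o in range(n_out)]
            for i in range(n)]


# Hypothetical example: three consecutive clips linked by temporal adjacency.
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
feats = [[1.0], [0.0], [1.0]]  # one scalar feature per clip
weight = [[1.0]]               # identity-like 1x1 weight for illustration
out = gcn_layer(adj, feats, weight)
print(out)
```

In this toy graph the middle clip starts with a zero feature and ends up with a nonzero value after propagation, because it aggregates information from its two temporal neighbours.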
The response of many-body quantum systems to an optical pulse can be extremely challenging to model. Here we explore the use of neural networks, both traditional and generative, to learn and thus simulate the response...
Crowd counting still confronts two primary challenges: a limited ability to deal with varying density levels, caused by fixed density maps, and a lack of fine-grained or coarse-grained guidance for density estimation. In this paper, a novel end-to-end crowd counting framework via multi-level regression with latent Gaussian maps is proposed, which consists of GaussianNet, EstimateNet and Discriminator. GaussianNet is composed of masked Gaussian convolutional blocks and vanilla convolutional layers, and generates latent Gaussian maps adaptively for various density levels. The latent Gaussian maps are then treated as the ground truth density maps for EstimateNet, which outputs density estimations and follows the principle of adversarial learning with Discriminator. Moreover, multi-level losses are combined for density map regression guidance. In extensive experiments on the major public datasets, the proposed framework outperforms state-of-the-art methods, illustrating its superior validity.
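For context, the "fixed density maps" that the abstract above contrasts with are conventionally built by placing a Gaussian kernel of fixed width at each annotated head position, so that the map sums to the crowd count. The sketch below illustrates that conventional baseline only; it is not the paper's GaussianNet. The function name `density_map`, the default `sigma` and `radius`, and the example coordinates are assumptions.

```python
import math


def density_map(points, height, width, sigma=2.0, radius=6):
    """Fixed-kernel density map: one normalized Gaussian per annotated point.

    Assumes every (x, y) point lies inside the image bounds.
    """
    dmap = [[0.0] * width for _ in range(height)]
    for px, py in points:
        weights = {}
        # Evaluate the Gaussian on the in-bounds window around the point.
        for y in range(max(0, py - radius), min(height, py + radius + 1)):
            for x in range(max(0, px - radius), min(width, px + radius + 1)):
                d2 = (x - px) ** 2 + (y - py) ** 2
                weights[(y, x)] = math.exp(-d2 / (2 * sigma ** 2))
        # Normalize so each person contributes exactly 1 to the map,
        # even when the kernel is clipped at the image border.
        total = sum(weights.values())
        for (y, x), w in weights.items():
            dmap[y][x] += w / total
    return dmap


# Hypothetical annotations: three heads in a 12 x 16 image, one at a corner.
dmap = density_map([(5, 5), (0, 0), (10, 7)], height=12, width=16)
count = sum(sum(row) for row in dmap)
print(round(count, 6))
```

Because `sigma` is the same for every point regardless of local crowd density, sparse and congested regions get identical kernels, which is the limitation that adaptive maps aim to address.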