The widespread dissemination of fake information on social media platforms has become a serious social problem that affects freedom of speech, democracy, and individual safety. The amount of fake information availa...
A common goal in statistical analyses is to differentiate signal from noise. This problem is ubiquitous across many fields, including mobile health (mHealth) and genomics, both of which have garnered tremendous interest in recent years as advancements in technology make them increasingly prominent in the study of human health. While the challenge of detecting signal is universal, its solutions are not. Different research applications introduce their own idiosyncrasies that can render existing signal detection approaches insufficient in a given context. In this dissertation, we present approaches for signal detection for three different problems in mHealth and genomics. In Chapter 1, we study mHealth data, which are often collected through wearable devices such as watches and other fitness trackers. These devices record and process data using algorithms that are subject to updates and glitches, which device manufacturers often do not publicize. As a result, the way a device collects and reports data can change abruptly over time. A researcher using mHealth data needs to detect these changes in order to adjust for them. We propose Automated Selection of Changepoints using Empirical P-values and Trimming (ASCEPT) as an approach for objectively identifying where these changes occur. ASCEPT relies on Monte Carlo simulations and regression models to accurately identify these algorithmic changes. We compare ASCEPT to an existing method on both simulated and real mHealth data. In Chapter 2, we look at chromatin immunoprecipitation sequencing (ChIP-seq) data, which reflect where proteins bind to a genome. Researchers often compare individuals from different experimental groups or biological conditions to detect regions of the genome in which there is differential binding (DB). DB in particular regions may then be associated with different health outcomes between the two groups, in turn helping the researcher understand risk factors or mechanism
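The Chapter 1 idea of judging a candidate changepoint by an empirical p-value obtained through Monte Carlo simulation can be illustrated with a minimal sketch. This is not ASCEPT itself: the abstract does not specify its test statistic, regression models, or trimming rule, so everything below is an assumption made for illustration. The sketch swaps in a simple mean-shift statistic and a permutation null, and the names `best_split`, `empirical_p_value`, `min_seg`, and `n_sim` are hypothetical.

```python
import random


def best_split(series, min_seg=5):
    """Return (index, statistic) for the split with the largest absolute
    difference between the mean before and the mean after the split.

    min_seg is a hypothetical guard that keeps at least that many points
    on each side of any candidate split.
    """
    best_idx, best_stat = None, -1.0
    for k in range(min_seg, len(series) - min_seg + 1):
        left, right = series[:k], series[k:]
        stat = abs(sum(left) / len(left) - sum(right) / len(right))
        if stat > best_stat:
            best_idx, best_stat = k, stat
    return best_idx, best_stat


def empirical_p_value(series, n_sim=500, min_seg=5, seed=0):
    """Estimate how often a shuffled series (no true changepoint) produces
    a split statistic at least as large as the observed one."""
    rng = random.Random(seed)
    idx, observed = best_split(series, min_seg)
    shuffled = list(series)
    exceed = 0
    for _ in range(n_sim):
        rng.shuffle(shuffled)  # permutation destroys any time ordering
        _, null_stat = best_split(shuffled, min_seg)
        if null_stat >= observed:
            exceed += 1
    # Add-one correction so the estimate is never exactly zero.
    return idx, (exceed + 1) / (n_sim + 1)


if __name__ == "__main__":
    # Simulated daily readings whose level jumps after day 30,
    # mimicking an unannounced device algorithm update.
    rng = random.Random(1)
    readings = [rng.gauss(0, 1) for _ in range(30)]
    readings += [rng.gauss(5, 1) for _ in range(30)]
    split, p = empirical_p_value(readings)
    print(f"candidate changepoint at index {split}, empirical p = {p:.4f}")
```

A permutation null is only one way to generate the Monte Carlo reference distribution; it assumes exchangeable observations, which real wearable data with trends or autocorrelation may violate.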