ISBN (digital): 9783030922702
ISBN (print): 9783030922696; 9783030922702
Ensembling is one approach that improves the performance of a neural network by combining a number of independent neural networks, usually by either averaging or summing up their individual outputs. We modify this ensembling approach by training the sub-networks concurrently instead of independently. This concurrent training of sub-networks leads them to cooperate with each other, and we refer to the result as a "cooperative ensemble". Meanwhile, the mixture-of-experts approach improves a neural network's performance by dividing up a given dataset among its sub-networks. It then uses a gating network that assigns a specialization to each of its sub-networks, called "experts". We improve on these ways of combining a group of neural networks by using a k-Winners-Take-All (kWTA) activation function that acts as the combination method for the outputs of each sub-network in the ensemble. We refer to this proposed model as "kWTA ensemble neural networks" (kWTA-ENN). With the kWTA activation function, the losing neurons of the sub-networks are inhibited while the winning neurons are retained. This results in sub-networks having some form of specialization but also sharing knowledge with one another. We compare our approach with the cooperative ensemble and mixture-of-experts, where we used a feed-forward neural network with one hidden layer having 100 neurons as the sub-network architecture. Our approach yields better performance than the baseline models, reaching the following test accuracies on benchmark datasets: 98.34% on MNIST, 88.06% on Fashion-MNIST, 91.56% on KMNIST, and 95.97% on WDBC.
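The kWTA combination step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `kwta` and `combine_subnetworks`, the choice to concatenate the sub-network outputs before selecting winners, and the final per-class sum are all assumptions made for illustration.

```python
import numpy as np


def kwta(x, k):
    """Keep the k largest entries along the last axis and zero the rest."""
    x = np.asarray(x, dtype=float)
    if not 1 <= k <= x.shape[-1]:
        raise ValueError("k must be between 1 and the size of the last axis")
    # Indices of the k largest values in each row (order within the top k is arbitrary).
    top = np.argpartition(x, -k, axis=-1)[..., -k:]
    mask = np.zeros(x.shape, dtype=bool)
    np.put_along_axis(mask, top, True, axis=-1)
    # Winners keep their activation; losers are inhibited to zero.
    return np.where(mask, x, 0.0)


def combine_subnetworks(outputs, k):
    """Combine sub-network outputs with kWTA (hypothetical combination scheme).

    outputs: list of arrays, each of shape (batch, n_classes).
    Returns an array of shape (batch, n_classes).
    """
    stacked = np.stack(outputs, axis=1)  # (batch, n_nets, n_classes)
    batch, n_nets, n_classes = stacked.shape
    # Let all neurons of all sub-networks compete with each other.
    flat = stacked.reshape(batch, n_nets * n_classes)
    winners = kwta(flat, k).reshape(batch, n_nets, n_classes)
    # Assumption: surviving activations are summed per class.
    return winners.sum(axis=1)


if __name__ == "__main__":
    net_a = np.array([[0.1, 0.9, 0.2]])
    net_b = np.array([[0.8, 0.3, 0.7]])
    print(combine_subnetworks([net_a, net_b], k=3))
```

With a small `k`, each sub-network contributes only where it is most confident, which is one way to read the "specialization with shared knowledge" behaviour the abstract reports.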
Flat clustering and hierarchical clustering are two fundamental tasks, often used to discover meaningful structures in data, such as subtypes of cancer, phylogenetic relationships, taxonomies of concepts, and cascades of particle decays in particle physics. When multiple clusterings of the data are possible, it is useful to represent uncertainty in clustering through various probabilistic quantities, such as the distribution over partitions or tree structures, and the marginal probabilities of subpartitions or subtrees. Many compact representations exist for structured prediction problems, enabling the efficient computation of probability distributions, e.g., a trellis structure and corresponding Forward-Backward algorithm for Markov models that model sequences. However, no such representation has been proposed for either flat or hierarchical clustering models. In this thesis, we present our work developing data structures and algorithms for computing probability distributions over flat and hierarchical clusterings, as well as for finding maximum a posteriori (MAP) flat and hierarchical clusterings, and various marginal probabilities, as given by a wide range of energy-based clustering models. First, we describe a trellis structure that compactly represents distributions over flat or hierarchical clusterings. We also describe related data structures that represent approximate distributions. We then present algorithms that, using these structures, allow us to compute the partition function, MAP clustering, and the marginal probabilities of a cluster (and sub-hierarchy, in the case of hierarchical clustering) exactly. We also show how these and related algorithms can be used to approximate these values, and analyze the time and space complexity of our proposed methods. We demonstrate the utility of our approaches using various synthetic data of interest as well as in two real world applications, namely particle physics at the Large Hadron Collider at CERN and in can
Buildings have a considerable impact on the environment, and it is crucial to consider environmental and energy performance in building design. Buildings account for about 40% of global energy consumption and contribute over 30% of CO2 emissions. A large proportion of this energy is used for meeting occupants’ thermal comfort in buildings, followed by lighting. The building facade forms a barrier between the exterior and interior environments; therefore, it has a crucial role in improving energy efficiency and building performance. In this regard, decision-makers are required to establish an optimal solution, considering multi-objective problems that are usually competitive and nonlinear, such as energy consumption, financial costs, environmental performance, occupant comfort, etc. Sustainable building design requires consideration of a large number of design variables and multiple, often conflicting objectives, such as the initial construction cost, energy cost, energy consumption and occupant satisfaction. One approach to address these issues is the use of building performance simulations and optimization methods. This research first investigates and highlights the key research methods, issues and tools associated with building performance simulations and the optimization methods. Then a novel method for improving building facade performance is presented, taking into consideration occupant comfort, energy consumption and energy costs. The dissertation discusses the development of a framework, which is based on multi-objective optimization and uses a genetic algorithm in combination with building performance simulations. The framework utilizes the EnergyPlus simulation engine and Python programming to implement optimization algorithm analysis and decision support. The framework enhances the process of performance-based facade design, couples simulation and optimization packages, and provides flexible and fast supplement in the facade design process by rapid generation
A long-standing assumption common in algorithm design is that any part of the input is accessible at any time for unit cost. However, as we work with increasingly large data sets, or as we build smaller devices, we must revisit this assumption. In this thesis, I present some of my work on graph algorithms designed for circumstances where traditional assumptions about inputs do not apply. 1. Classical graph algorithms require direct access to the input graph, and this is not feasible when the graph is too large to fit in memory. For computation on massive graphs we consider the dynamic streaming graph model. Given an input graph defined as a stream of edge insertions and deletions, our goal is to approximate properties of this graph using space that is sublinear in the size of the stream. In this thesis, I present algorithms for approximating vertex connectivity, hypergraph edge connectivity, maximum coverage, unique coverage, and temporal connectivity in graph streams. 2. In certain applications the input graph is not explicitly represented, but its edges may be discovered via queries which require costly computation or measurement. I present two open-source systems which solve real-world problems via graph algorithms which may access their inputs only through costly edge queries. Mesh is a memory manager which compacts memory efficiently by finding an approximate graph matching subject to stringent time and edge query restrictions. PathCache is an efficiently scalable network measurement platform that outperforms the current state of the art.
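To make the idea of a matching computed under an edge-query budget concrete, here is a small Python sketch. It is an illustrative toy, not the memory manager's actual algorithm: the bitmap page model, the names `can_mesh` and `greedy_mesh`, and the `max_probes` budget are assumptions. Each page is an integer bitmap of live slots, and an "edge" between two pages exists when no slot is live in both, which can only be discovered by probing the pair.

```python
import random


def can_mesh(a, b):
    """Edge query: two pages are compatible if no slot is live in both bitmaps."""
    return a & b == 0


def greedy_mesh(pages, max_probes, rng=None):
    """Greedily pair compatible pages using at most max_probes edge queries.

    pages: list of int bitmaps (hypothetical page model).
    Returns a list of (i, j) index pairs; each page appears in at most one pair.
    """
    rng = rng or random.Random(0)
    order = list(range(len(pages)))
    rng.shuffle(order)
    # Split the shuffled pages into two halves and only probe across halves.
    half = len(order) // 2
    left, right = order[:half], order[half:]
    matched, used, probes = [], set(), 0
    for i in left:
        for j in right:
            if probes >= max_probes:
                return matched  # query budget exhausted: return what we have
            if j in used:
                continue
            probes += 1
            if can_mesh(pages[i], pages[j]):
                matched.append((i, j))
                used.add(j)
                break
    return matched


if __name__ == "__main__":
    pages = [0b1100, 0b0011, 0b1010, 0b0101]
    print(greedy_mesh(pages, max_probes=16))
```

The matching is approximate by design: a tighter probe budget trades matching quality for bounded running time, which mirrors the time and edge-query restrictions the abstract mentions.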
This dissertation describes progress in the state-of-the-art for developing and deploying formally verified cyber security devices in industrial control networks. It begins by detailing the unique struggles that are faced in industrial control networks and why concepts and technologies developed for securing traditional networks might not be appropriate. It uses these unique struggles and examples of contemporary cyber-attacks targeting control systems to argue that progress in securing control systems is best met with formal verification of systems, their specifications, and their security properties. This dissertation then presents a development process and identifies two technologies, TLA+ and seL4, that can be leveraged to produce a high-assurance embedded security device. The method presented in this dissertation takes an informal design of an embedded device that might be found in a control system and 1) formalizes the design within TLA+, 2) creates and mechanically checks a model built from the formal design, and 3) translates the TLA+ design into a component-based architecture of a native seL4 application. The later chapters of this dissertation describe an application of the process to a security preprocessor embedded device that was designed to add security mechanisms to the network communication of an existing control system. The device and its security properties are formally specified in TLA+ in chapter 4, mechanically checked in chapter 5, and finally its native seL4 architecture is implemented in chapter 6. Finally, the conclusions derived from the research are laid out, as well as some possibilities for expanding the presented method in the future.
Society has benefited from the technological revolution and the tremendous growth in computing powered by Moore's law. However, we are fast approaching the ultimate physical limits in terms of both device sizes and the associated energy dissipation. It is important to characterize these limits in a physically grounded and implementation-agnostic manner, in order to capture the fundamental energy dissipation costs associated with performing computing operations with classical information in nano-scale quantum systems. It is also necessary to identify and understand the effect of quantum indistinguishability, noise, and device variability on these dissipation limits. Identifying these parameters is crucial to designing more energy-efficient computing systems moving forward. In this dissertation, we will provide a physical description of finite state automata, an abstract tool commonly used to describe computational operations under the Referential Approach to physical information theory. We will derive the fundamental limits of dissipation associated with a state transition in deterministic and probabilistic finite state automata, and propose efficacy measures to capture how well a particular state transition has been physically realized. We will use these dissipation bounds to understand the limits of dissipation during the training and testing phases of learning in feed-forward and recurrent neural networks. This study of dissipation in neural networks provides key hints at how dissipation is fundamentally intertwined with learning in physical systems. These ideas connecting energy dissipation, entropy and physical information provide the perfect toolkit to critically analyze the very foundations of computing, and our computational approaches to artificial intelligence. In the second part of this dissertation, we derive the non-equilibrium reliable low dissipation condition for predictive inference in self-organized systems. This brings together the central ideas
Internet of Things (IoT) devices are becoming an essential part of our everyday lives. These physical devices are connected to the internet and can measure or control the environment around us. Further, IoT devices are increasingly being used to monitor buildings, farms, health, and transportation. As these connected devices become more pervasive, they will generate vast amounts of data that can be used to gain insights and build intelligence into the system. At the same time, large-scale deployment of these devices will raise new challenges in efficiently managing and controlling them. In this thesis, I argue that IoT devices need programmability and need to provide software controls in order to be managed efficiently. Further, they will need data-driven modeling techniques to process and analyze vast amounts of data from heterogeneous devices to derive actionable insights. My thesis explores the problems posed by software-defined IoT energy infrastructure. I present four techniques that use systems and machine learning principles to design, analyze and deploy the next generation of smart IoT energy systems. First, I discuss how current state-of-the-art LIDAR-based approaches to identifying ideal locations on rooftops for deploying energy systems such as solar do not scale to many regions of the world. To address these challenges, I propose DeepRoof, a data-driven approach that uses deep learning to estimate the solar potential of roofs using satellite imagery and identify ideal locations for installation. We evaluate our approach on different types of roofs and show that our technique is comparable to LIDAR-based methods. Second, I study how excessive solar can cause problems in the grid and examine how programmatic control of the solar output can prevent congestion in the electric grid. Further, I present a decentralized approach that can control the solar arrays in a grid-friendly manner. Also, my approach provides flexible control of solar output, and
In nature there are a variety of self-assembling systems occurring at varying scales which give rise to incredibly complex behaviors. Theoretical models of self-assembly allow us to gain insight into the fundamental nature of self-assembly independent of the specific physical implementation. In Winfree's abstract tile assembly model (aTAM), the atomic components are unit square "tiles" which have "glues" on their four sides. Beginning from a seed assembly, these tiles attach one at a time during the assembly process in an asynchronous and nondeterministic manner. We can gain valuable insights into the nature of self-assembly by comparing different models of self-assembly which use fundamentally different mechanisms for local interactions. A powerful notion which allows us to compare models of self-assembly is simulation. The first result of this thesis examines the role of non-determinism in simulation. It shows that the universal simulation of directed aTAM systems requires undirectedness. A tile assembly model is said to be directed if it always assembles the same final assembly. We distinguish between two types of aTAM systems: cooperative systems and non-cooperative systems. In cooperative aTAM systems, we are able to enforce that in order for a tile to attach to an assembly, the glues of a tile must match two or more glues of neighboring tiles. On the other hand, in non-cooperative aTAM systems, tiles are able to attach to an assembly provided that one of the tile's glues matches an exposed glue on the assembly. It is well known that the cooperative aTAM is computationally universal, and it is conjectured that the non-cooperative aTAM is not computationally universal. For our second result, we show that if we allow tiles to be polygons with six or more sides, then the class of non-cooperative systems is capable of universal computation. On the other hand, we show that the class of systems consisting of polygons with six or fewer sides is not capable of computing u
This thesis introduces COMPLEXITY TUTOR, a tutoring system to assist in learning abstract proof-based topics, which has been specifically targeted towards the population of computer science students studying theoretical computer science. Existing literature has shown tremendous educational benefits produced by active learning techniques, student-centered pedagogy, gamification and intelligent tutoring systems. However, previously, there had been almost no research on adapting these ideas to the domain of theoretical computer science. As a population, computer science students receive immediate feedback from compilers and debuggers, but receive no similar level of guidance for theoretical coursework. One hypothesis of this thesis is that immediate feedback while working on theoretical problems would be particularly well-received by students, and this hypothesis has been supported by the feedback of students who used the system. This thesis makes several contributions to the field. It provides assistance for teaching proof construction in theoretical computer science. A second contribution is a framework that can be readily adapted to many other domains with abstract mathematical content. Exercises can be constructed in natural language and instructors with limited programming knowledge can quickly develop new subject material for COMPLEXITY TUTOR. A third contribution is a platform for writing algorithms in Python code that has been integrated into this framework, for constructive proofs in computer science. A fourth contribution is development of an interactive environment that uses a novel graphical puzzle-like platform and gamification ideas to teach proof concepts. The learning curve for students is reduced, in comparison to other systems that use a formal language or complex interface. A multi-semester evaluation of 101 computer science students using COMPLEXITY TUTOR was conducted. An additional 98 students participated in the study as part of control groups. C
Algorithmic bias consists of biased predictions born from ingesting unchecked information, such as biased samples and biased labels. Furthermore, the interaction between people and algorithms can exacerbate bias such that neither the human nor the algorithms receive unbiased data. Thus, algorithmic bias can be introduced not only before and after the machine learning process but sometimes also in the middle of the learning process. With a handful of exceptions, only a few categories of bias have been studied in Machine Learning, and there are few, if any, studies of the impact of bias on both human behavior and algorithm performance. Although most research treats algorithmic bias as a static factor, we argue that algorithmic bias interacts with humans in an iterative manner, producing a long-term effect on algorithms' performance. Recommender systems involve the natural interaction between humans and machine learning algorithms that may introduce bias over time during a continuous feedback loop, leading to increasingly biased recommendations. Therefore, in this work, we view a recommender system environment as generating a continuous chain of events as a result of the interactions between users and the recommender system outputs over time. For this purpose, in the first part of this dissertation, we employ an iterated-learning framework that is inspired by human language evolution to study the impact of interaction between machine learning algorithms and humans. Specifically, our goal is to study the impact of the interaction between two sources of bias: the process by which people select information to label (human action); and the process by which an algorithm selects the subset of information to present to people (iterated algorithmic bias mode). Specifically, we investigate three forms of iterated algorithmic bias (i.e. personalization filter, active learning, and a random baseline) and how they affect the behavior of machine learning algorithms. Our controlled