In the contemporary digital era, the storage of Electronic Health Records on open platforms presents significant security and privacy challenges. Addressing these concerns requires standardizing the clinical deployment models currently in use. This paper proposes a robust model that overcomes critical issues related to security, privacy, access control, and ownership transfer of patients' records. The model incorporates data collection to assess clinical needs, followed by deployment-level checks to mitigate network attacks and enhance Quality-of-Service according to scalability demands. A modified genetic algorithm is employed to improve blockchain scalability. The model also introduces mechanisms for ensuring database integrity, mitigating external attacks, and enhancing usability. It further supports platform-level modularization, access control, department-specific configurations, and patient-level confidentiality. The proposed solution outperforms existing systems, particularly in terms of ease of use, deployment delay, deployment complexity, and module-level efficiency, making it a highly suitable option for implementing secure, customized clinical security systems based on Electronic Health Records.
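As a rough sketch of the genetic-algorithm component, the toy below evolves two hypothetical blockchain deployment parameters (block size and shard count) against an invented throughput proxy. The encoding, fitness function, and operators are assumptions for illustration only; the paper does not specify its modified genetic algorithm at this level of detail.

```python
# Hedged sketch: a plain GA tuning hypothetical deployment parameters.
# Genome = (block_size, shard_count); the fitness proxy is invented.
import random

def fitness(genome):
    block_size, shard_count = genome
    # Toy proxy: throughput grows with both knobs, latency penalizes size.
    return block_size * shard_count - 0.01 * block_size ** 2

def mutate(genome, rate=0.2):
    block_size, shard_count = genome
    if random.random() < rate:
        block_size = max(1, block_size + random.randint(-64, 64))
    if random.random() < rate:
        shard_count = max(1, min(16, shard_count + random.choice([-1, 1])))
    return (block_size, shard_count)

def crossover(a, b):
    return (a[0], b[1])  # single-point crossover over the two genes

population = [(random.randint(1, 1024), random.randint(1, 16)) for _ in range(50)]
for _ in range(100):
    population.sort(key=fitness, reverse=True)
    parents = population[:10]                 # truncation selection
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(40)]
    population = parents + children           # elitist replacement
print(max(population, key=fitness))
```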
UCS is a supervised learning classifier system that was introduced in 2003 for classification in data mining tasks. The representation of a rule in UCS as a univariate classification rule is straightforward for a human to understand. However, the system may require a large number of rules to cover the input space. Artificial neural networks (NNs), on the other hand, normally provide a more compact representation. However, understanding the network is not a straightforward task. In this paper, we propose a novel way to incorporate NNs into UCS. The approach offers a good compromise between compactness, expressiveness, and accuracy. By using a simple artificial NN as the classifier's action, we obtain a more compact population size, better generalization, and the same or better accuracy while maintaining a reasonable level of expressiveness. We also apply negative correlation learning (NCL) during the training of the resultant NN ensemble. NCL is shown to improve the generalization of the ensemble.
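To make the NCL term concrete, here is a minimal sketch, assuming tiny linear models in place of the paper's neural networks and a synthetic regression task: each ensemble member minimizes its own squared error plus a penalty p_i = -(f_i - f_bar)^2 that pushes its output away from the ensemble mean f_bar, which is what decorrelates the members.

```python
# Hedged NCL sketch: linear "networks" and synthetic data stand in for the
# paper's NN ensemble; only the negative-correlation penalty is the point.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)

M, lam, lr = 5, 0.5, 0.01
W = rng.normal(size=(M, 3))          # one weight vector per ensemble member

for step in range(500):
    outputs = X @ W.T                 # (n_samples, M) member predictions
    f_bar = outputs.mean(axis=1, keepdims=True)
    for i in range(M):
        f_i = outputs[:, i:i+1]
        # Gradient of (f_i - y)^2 + lam * p_i with p_i = -(f_i - f_bar)^2,
        # using the common simplification of treating f_bar as constant.
        grad_f = 2 * (f_i - y[:, None]) - 2 * lam * (f_i - f_bar)
        W[i] -= lr * (grad_f * X).mean(axis=0)

ensemble_pred = (X @ W.T).mean(axis=1)
print("ensemble MSE:", np.mean((ensemble_pred - y) ** 2))
```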
If the class of possible fitness functions is restricted to varying fitness functions, the selection procedure of a genetic algorithm that requires reordering has complexity O(N log N), where N is the size of the population. The quantum genetic optimization algorithm (QGOA) exploits the power of quantum computation in order to speed up genetic procedures. In QGOA, the classical fitness evaluation and selection procedures are replaced by a single quantum procedure. While the quantum and classical genetic algorithms use the same number of generations, the QGOA requires fewer operations to identify the high-fitness subpopulation at each generation. We show that the complexity of our QGOA is O(1) in terms of the number of oracle calls in the selection procedure. These theoretical results are confirmed by simulations of the algorithm.
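For reference, the classical baseline the complexity result refers to looks like the rank-based selection below, which must sort the population (the O(N log N) step) before sampling parents. The fitness function and population here are placeholders.

```python
# Classical selection requiring reordering: sort the population by fitness
# (O(N log N)), then sample parents with linear rank bias. The quantum
# procedure in the paper replaces this step; only the classical side is shown.
import random

def rank_select(population, fitness, k):
    ranked = sorted(population, key=fitness, reverse=True)  # O(N log N)
    n = len(ranked)
    weights = [n - r for r in range(n)]   # best rank gets highest weight
    return random.choices(ranked, weights=weights, k=k)

pop = [random.uniform(-5, 5) for _ in range(100)]
parents = rank_select(pop, fitness=lambda x: -x * x, k=20)  # maximize -x^2
print(max(parents, key=lambda x: -x * x))
```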
Several systems that rely on consistent data to offer high-quality services, such as digital libraries and e-commerce brokers, may be affected by the existence of duplicates, quasi replicas, or near-duplicate entries in their repositories. Because of this, private and government organizations have invested significantly in developing methods for removing replicas from their data repositories. Clean, replica-free repositories not only allow the retrieval of higher-quality information but also lead to more concise data and to potential savings in the computational time and resources needed to process this data. In this paper, we propose a genetic programming approach to record deduplication that combines several different pieces of evidence extracted from the data content to find a deduplication function that is able to identify whether two entries in a repository are replicas or not. As shown by our experiments, our approach outperforms an existing state-of-the-art method found in the literature. Moreover, the suggested functions are computationally less demanding since they use fewer pieces of evidence. In addition, our genetic programming approach is capable of automatically adapting these functions to a given fixed replica identification boundary, freeing the user from the burden of having to choose and tune this parameter.
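The kind of individual the GP search evolves can be pictured as follows; the record fields, the Jaccard similarity, and the particular weighted expression are illustrative assumptions, not the functions reported in the paper.

```python
# Sketch of an evolved deduplication function: a combination of per-field
# similarity scores ("evidence") thresholded against a fixed
# replica-identification boundary. All specifics here are invented examples.
def jaccard(a, b):
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0

def evolved_function(rec1, rec2):
    # One candidate the GP might produce: a weighted blend of title and
    # author evidence, plus a product term the search could discover.
    e_title = jaccard(rec1["title"], rec2["title"])
    e_author = jaccard(rec1["author"], rec2["author"])
    return 0.7 * e_title + 0.2 * e_author + 0.1 * (e_title * e_author)

BOUNDARY = 0.5  # the fixed boundary the GP adapts its functions to

r1 = {"title": "A Survey of Record Deduplication", "author": "A. Smith et al."}
r2 = {"title": "Survey of record deduplication", "author": "Alice Smith et al."}
print("replicas" if evolved_function(r1, r2) >= BOUNDARY else "distinct")
```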
The cDNA microarray is an important tool for generating large data sets of gene expression measurements. An efficient design is critical to ensure that the experiment will be able to address relevant biological questions. Microarray experimental design can be treated as a multicriterion optimization problem. For this class of problems, evolutionary algorithms (EAs) are well suited, as they can search the solution space and evolve a design that optimizes the parameters of interest based on their relative value to the researcher under a given set of constraints. This paper introduces the use of EAs for optimizing experimental designs of spotted microarrays using a weighted objective function. The EA and the various criteria relevant to design optimization are discussed. Evolved designs are compared with designs obtained through exhaustive search, with results suggesting that the EA can find equally efficient optimal or near-optimal designs within a tractable timeframe.
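A minimal sketch of the weighted-objective idea, under invented criteria: a design is a list of arrays, each hybridizing a (red, green) pair of samples, and candidate designs are scored by a weighted sum of criteria. The two criteria and weights below are placeholders; the paper combines more, with values reflecting the researcher's priorities.

```python
# Hedged sketch: score two-dye microarray designs with a weighted objective
# and improve them with a simple (1+1)-EA. Criteria/weights are assumptions.
import random

SAMPLES = range(6)
WEIGHTS = {"dye_balance": 0.4, "connectivity": 0.6}

def dye_balance(design):
    # 1.0 when every sample is labeled red and green equally often.
    imbalance = 0
    for s in SAMPLES:
        red = sum(1 for r, g in design if r == s)
        green = sum(1 for r, g in design if g == s)
        imbalance += abs(red - green)
    return 1.0 / (1.0 + imbalance)

def connectivity(design):
    # Fraction of samples reachable from sample 0 via shared arrays.
    seen, frontier = {0}, [0]
    while frontier:
        s = frontier.pop()
        for r, g in design:
            for nbr in ((g,) if r == s else (r,) if g == s else ()):
                if nbr not in seen:
                    seen.add(nbr)
                    frontier.append(nbr)
    return len(seen) / len(SAMPLES)

def score(design):
    return (WEIGHTS["dye_balance"] * dye_balance(design)
            + WEIGHTS["connectivity"] * connectivity(design))

def mutate(design):
    d = list(design)
    d[random.randrange(len(d))] = tuple(random.sample(SAMPLES, 2))
    return d

design = [tuple(random.sample(SAMPLES, 2)) for _ in range(6)]
for _ in range(2000):               # (1+1)-EA over candidate designs
    child = mutate(design)
    if score(child) >= score(design):
        design = child
print(design, score(design))
```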
In recent years, more and more high-throughput data sources useful for protein complex prediction have become available (e.g., gene sequence, mRNA expression, and interactions). The integration of these different data sources can be challenging. Recently, it has been recognized that kernel-based classifiers are well suited for this task. However, the different kernels (data sources) are often combined using equal weights. Although several methods have been developed to optimize kernel weights, no large-scale example of an improvement in classifier performance has been shown yet. In this work, we employ an evolutionary algorithm to determine weights for a larger set of kernels by optimizing a criterion based on the area under the ROC curve. We show that setting the right kernel weights can indeed improve performance. We compare this to the existing kernel weight optimization methods (i.e., (regularized) optimization of the SVM criterion or aligning the kernel with an ideal kernel) and find that these do not result in a significant performance improvement and can even cause a decrease in performance. Results also show that an expert approach of assigning high weights to features with high individual performance is not necessarily the best strategy.
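The weighted combination being optimized is K = sum_i w_i * K_i. The sketch below, assuming scikit-learn and synthetic data in place of the paper's genomic kernels, tunes the weights with a simple (1+1)-style hill climber against held-out AUC; the paper's evolutionary algorithm and data sources differ.

```python
# Hedged sketch: combine several kernels with nonnegative weights and tune
# the weights to maximize held-out ROC AUC of a precomputed-kernel SVM.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
tr, te = slice(0, 200), slice(200, 300)

def rbf(A, B, gamma):
    d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d)

# Three "data sources" as kernels; in the paper these come from gene
# sequence, mRNA expression, interaction data, and so on.
kernels = [rbf(X, X, 0.1), rbf(X, X, 1.0), X @ X.T]

def auc_for(weights):
    K = sum(w * k for w, k in zip(weights, kernels))
    clf = SVC(kernel="precomputed").fit(K[tr, tr], y[tr])
    return roc_auc_score(y[te], clf.decision_function(K[te, tr]))

# (1+1)-style search over the weight simplex as a stand-in for the EA.
w = np.ones(len(kernels)) / len(kernels)
best = auc_for(w)
for _ in range(50):
    cand = np.clip(w + rng.normal(scale=0.1, size=w.shape), 0, None)
    cand /= cand.sum()
    if (a := auc_for(cand)) >= best:
        w, best = cand, a
print("weights:", w, "AUC:", best)
```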
Resource management is a key factor in the performance and efficient utilization of cloud systems, and many research works have proposed efficient policies to optimize such systems. However, these policies have traditionally managed the resources individually, neglecting the complexity of cloud systems and the interrelation between their elements. To illustrate this situation, we present an approach focused on virtualized Hadoop for the simultaneous and coordinated management of virtual machines and file replicas. Specifically, we propose determining the virtual machine allocation, virtual machine template selection, and file replica placement with the objective of minimizing the power consumption, physical resource waste, and file unavailability. We implemented our solution using the non-dominated sorting genetic algorithm II (NSGA-II), a multi-objective optimization algorithm. Our approach achieved substantial benefits in terms of file unavailability and resource waste, with overall improvements of approximately 400 and 170 percent, respectively, compared to three other optimization strategies. The benefits for power consumption were smaller, with an improvement of approximately 1.9 percent.
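The core of NSGA-II is ranking candidates by Pareto dominance over the three minimization objectives (power consumption, resource waste, file unavailability). A minimal non-dominated sort over hypothetical placement evaluations might look like this; a full NSGA-II adds crowding distance and genetic operators on top.

```python
# Hedged sketch of non-dominated sorting, the ranking step inside NSGA-II.
# Random (power, waste, unavailability) triples stand in for evaluating
# real VM allocation / replica placement candidates.
import random

def dominates(a, b):
    # a dominates b: no worse in every objective, strictly better in one.
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated_sort(points):
    fronts, remaining = [], list(points)
    while remaining:
        front = [p for p in remaining
                 if not any(dominates(q, p) for q in remaining if q is not p)]
        fronts.append(front)
        remaining = [p for p in remaining if p not in front]
    return fronts

placements = [tuple(random.random() for _ in range(3)) for _ in range(20)]
for rank, front in enumerate(non_dominated_sort(placements)):
    print(f"front {rank}: {len(front)} placements")
```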
Providing efficient workload management is an important issue for a large-scale heterogeneous distributed computing environment where a set of periodic applications is executed. The considered shipboard distributed system is expected to operate in an environment where the input workload is likely to change unpredictably, possibly invalidating a resource allocation that was based on the initial workload estimate. The tasks consist of multiple strings, each made up of an ordered sequence of applications. There is a quality of service (QoS) minimum throughput constraint that must be satisfied for each application in a string, and a maximum utilization constraint that must be satisfied on each of the hardware resources in the system. The challenge, therefore, is to efficiently and robustly manage both computation and communication resources in this unpredictable environment to achieve high performance while satisfying the imposed constraints. This work addresses the problem of finding a robust initial allocation of resources to strings of applications that is able to absorb some level of unknown input workload increase without rescheduling. The proposed hybrid two-stage method of finding a near-optimal allocation of resources incorporates two specially designed mapping techniques: (1) the Permutation Space Genitor-Based heuristic, and (2) the follow-up Branch-and-Bound heuristic based on an Integer Linear Programming (ILP) problem formulation. The performance of the proposed resource allocation method is evaluated under different simulation scenarios and compared to an iteratively computed upper bound.
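A Genitor-style search is steady-state: each step produces one offspring from rank-biased parents and inserts it in place of the worst individual. The sketch below applies this to permutations as a stand-in for orderings in the mapping problem; the fitness function is an invented placeholder.

```python
# Hedged sketch of a steady-state, Genitor-style GA over permutations.
# Parents are picked with linear rank bias; the offspring replaces the
# worst individual if it is better. The toy fitness is an assumption.
import random

def fitness(perm):
    # Placeholder: reward permutations close to sorted order.
    return -sum(abs(v - i) for i, v in enumerate(perm))

def rank_pick(pop):
    ranked = sorted(pop, key=fitness)              # worst ... best
    weights = range(1, len(ranked) + 1)            # linear rank bias
    return random.choices(ranked, weights=weights, k=1)[0]

def order_crossover(a, b):
    i, j = sorted(random.sample(range(len(a)), 2))
    child = [None] * len(a)
    child[i:j] = a[i:j]                            # copy a slice from parent a
    fill = [g for g in b if g not in child]        # fill the rest in b's order
    for k in range(len(child)):
        if child[k] is None:
            child[k] = fill.pop(0)
    return child

n = 12
pop = [random.sample(range(n), n) for _ in range(30)]
for _ in range(3000):
    child = order_crossover(rank_pick(pop), rank_pick(pop))
    worst = min(pop, key=fitness)
    if fitness(child) > fitness(worst):            # steady-state replacement
        pop.remove(worst)
        pop.append(child)
print(max(pop, key=fitness))
```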
Metaheuristic search techniques have been extensively used to automate the process of generating test cases, thus providing solutions for a more cost-effective testing process. This approach to test automation, often called "Search-Based Software Testing" (SBST), has been used for a wide variety of test case generation purposes. Since SBST techniques are heuristic by nature, they must be empirically investigated in terms of how costly and effective they are at reaching their test objectives and whether they scale up to realistic development artifacts. However, approaches to empirically studying SBST techniques have shown wide variation in the literature. This paper presents the results of a systematic, comprehensive review that aims at characterizing how empirical studies have been designed to investigate SBST cost-effectiveness and what empirical evidence is available in the literature regarding SBST cost-effectiveness and scalability. We also provide a framework that drives the data collection process of this systematic review and can be the starting point of guidelines on how SBST techniques can be empirically assessed. The intent is to aid future researchers doing empirical studies in SBST by providing an unbiased view of the body of empirical evidence and by guiding them in performing well-designed and executed empirical studies.
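For context, a minimal instance of the search-based test generation the review surveys: a hill climber minimizing the textbook branch-distance fitness until an input covers a target branch. The function under test and the distance formula are illustrations, not taken from any study in the review.

```python
# Hedged SBST sketch: minimize branch distance to cover a target branch.
import random

def under_test(x):
    if x * x - 4 * x + 3 == 0:       # target branch: x in {1, 3}
        return "covered"
    return "missed"

def branch_distance(x):
    # Distance to satisfying the predicate lhs == 0; 0 means covered.
    return abs(x * x - 4 * x + 3)

x = random.randint(-1000, 1000)
while branch_distance(x) > 0:         # hill climb with random restarts
    best = min((x - 1, x + 1), key=branch_distance)
    x = best if branch_distance(best) < branch_distance(x) else random.randint(-1000, 1000)
print(x, under_test(x))
```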
As the sizes of CMOS devices rapidly scale deep into the nanometer range, the manufacture of nanocircuits will become extremely complex and will inevitably introduce more defects, including more transient faults that appear during operation. For this reason, accurately calculating the reliability of future designs will be critical for nanocircuit designers as they investigate design alternatives to optimize the tradeoffs between area, power, delay, and reliability. However, accurate calculation of the reliability of large and highly connected circuits is complex and very time consuming. This paper presents a complete solution for estimating logic circuit reliability bounds with high accuracy in reasonable time, even for very large and complex circuits. The solution combines a novel criticality scoring algorithm, which ranks the reliability of individual input vectors, with a heuristic search for the input vector having the lowest reliability. The solution scales well with circuit size and is independent of the interconnect complexity and the logic depth. Extensive computational results show that our method is orders of magnitude faster than exact solutions provided by exact Bayesian network inference, while maintaining identical or sufficiently close accuracy.
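The second component can be pictured as a local search over input vectors for the one with the lowest reliability, which then bounds the circuit's reliability from below. The per-vector reliability model below is an invented stand-in, and the greedy bit-flip descent is only illustrative; the paper guides the search with its criticality scoring algorithm instead.

```python
# Hedged sketch: search input vectors for the lowest per-vector reliability,
# which lower-bounds circuit reliability. The reliability model is a toy.
import random

N_INPUTS = 16

def vector_reliability(vec):
    # Invented stand-in: reliability drops with how strongly each set bit
    # "sensitizes" the circuit, via fixed pseudo-random per-bit costs.
    rnd = random.Random(0)
    costs = [rnd.uniform(0.001, 0.01) for _ in range(N_INPUTS)]
    r = 1.0
    for bit, c in zip(vec, costs):
        r *= (1 - c) if bit else (1 - c / 2)
    return r

vec = [random.randint(0, 1) for _ in range(N_INPUTS)]
improved = True
while improved:                       # greedy bit-flip descent on reliability
    improved = False
    for i in range(N_INPUTS):
        flipped = vec[:i] + [1 - vec[i]] + vec[i + 1:]
        if vector_reliability(flipped) < vector_reliability(vec):
            vec, improved = flipped, True
print(vec, "reliability lower bound:", vector_reliability(vec))
```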