Establishing blockmodels for one- and two-mode binary network matrices has typically been accomplished using multiple restarts of heuristic algorithms that minimize functions of inconsistency with an ideal block structure. Although these algorithms likely yield exceptional performance, they are not assured to provide blockmodels that optimize the functional indices. In this paper, we present integer programming models that, for a prespecified image matrix, can produce guaranteed optimal solutions for matrices of nontrivial size. Accordingly, analysts performing a confirmatory analysis of a prespecified blockmodel structure can apply our models directly to obtain an optimal solution. In exploratory cases where a blockmodel structure is not prespecified, we recommend a two-stage procedure, where a heuristic method is first used to identify an image matrix and the integer program is subsequently formulated and solved to identify the optimal solution for that image matrix. Although best suited for ideal block structures associated with structural equivalence, the integer programming models have the flexibility to accommodate functional indices pertaining to regular equivalence. Computational results are reported for a variety of one- and two-mode matrices from the blockmodeling literature. (C) 2009 Elsevier Inc. All rights reserved.
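To make the "inconsistency with an ideal block structure" concrete, the sketch below counts the cells of a binary matrix that disagree with a prespecified image matrix under structural equivalence (complete or null blocks). It is only an illustration of the criterion, not the integer programming model from the paper; the function name `block_inconsistencies` and the toy data are hypothetical, and the matrix is treated as two-mode so that no diagonal cells are skipped.

```python
def block_inconsistencies(matrix, row_clusters, col_clusters, image):
    """Count cells that disagree with the ideal (complete/null) block.

    matrix       -- binary matrix as a list of rows
    row_clusters -- cluster index of each row
    col_clusters -- cluster index of each column
    image        -- image matrix: 1 = complete block, 0 = null block
    """
    errors = 0
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            ideal = image[row_clusters[i]][col_clusters[j]]
            if value != ideal:
                errors += 1
    return errors


# Hypothetical toy example: two row clusters, two column clusters.
toy_matrix = [
    [1, 1, 0],
    [1, 0, 0],  # the 0 in the middle violates the complete block
    [0, 0, 1],
]
toy_image = [[1, 0], [0, 1]]
toy_errors = block_inconsistencies(toy_matrix, [0, 0, 1], [0, 0, 1], toy_image)
print(toy_errors)
```

A heuristic or exact method would search over the cluster assignments to minimize this count for the given image matrix.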
Although the K-means algorithm for minimizing the within-cluster sums of squared deviations from cluster centroids is perhaps the most common method for applied cluster analyses, a variety of other criteria are available. The p-median model is an especially well-studied clustering problem that requires the selection of p objects to serve as cluster centers. The objective is to choose the cluster centers such that the sum of the Euclidean distances (or some other dissimilarity measure) of objects assigned to each center is minimized. Using 12 data sets from the literature, we demonstrate that a three-stage procedure consisting of a greedy heuristic, Lagrangian relaxation, and a branch-and-bound algorithm can produce globally optimal solutions for p-median problems of nontrivial size (several hundred objects, five or more variables, and up to 10 clusters). We also report the results of an application of the p-median model to an empirical data set from the telecommunications industry.
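The p-median objective described above can be sketched in a few lines. The code below evaluates the objective for a set of candidate centers and finds the best set by exhaustive enumeration, which is feasible only for very small instances; it is not the greedy/Lagrangian/branch-and-bound procedure from the paper. The function names and the toy points are hypothetical, and Euclidean distance is assumed.

```python
import math
from itertools import combinations


def p_median_cost(points, centers):
    """Sum over all objects of the distance to the nearest chosen center."""
    return sum(
        min(math.dist(point, points[c]) for c in centers)
        for point in points
    )


def brute_force_p_median(points, p):
    """Try every subset of p objects as centers (tiny instances only)."""
    best = min(
        combinations(range(len(points)), p),
        key=lambda centers: p_median_cost(points, centers),
    )
    return best, p_median_cost(points, best)


# Hypothetical toy data: two well-separated pairs of points.
toy_points = [(0, 0), (0, 1), (10, 0), (10, 1)]
toy_centers, toy_cost = brute_force_p_median(toy_points, 2)
print(toy_centers, toy_cost)
```

Because the centers must be actual objects, the p-median criterion differs from K-means, whose centroids are averages that need not coincide with any observation.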
This paper presents an integer linear programming formulation for the problem of extracting a subset of stimuli from a confusion matrix. The objective is to select stimuli such that total confusion among the stimuli is minimized for a particular subset size. This formulation provides a drastic reduction in the number of variables and constraints relative to a previously proposed formulation for the same problem. An extension of the formulation is provided for a biobjective problem that considers both confusion and recognition in the objective function. Demonstrations using an empirical interletter confusion matrix from the psychological literature revealed that a commercial branch-and-bound integer programming code was always able to identify optimal solutions for both the single-objective and biobjective formulations within a matter of seconds. A further extension and demonstration of the model is provided for the extraction of multiple subsets of stimuli, wherein the objectives are to maximize similarity within subsets and minimize similarity between subsets.
The clustering of two-mode proximity matrices is a challenging combinatorial optimization problem that has important applications in the quantitative social sciences. We focus on one particular type of problem related to the clustering of a two-mode binary matrix, which is relevant to the establishment of generalized blockmodels for social networks. In this context, clusters for the rows of the two-mode matrix intersect with clusters of the columns to form blocks, which should ideally be either complete (all 1s) or null (all 0s). A new procedure based on variable neighborhood search is presented and compared to an existing two-mode K-means clustering algorithm. The new procedure generally provided slightly greater explained variation; however, both methods yielded exceptional recovery of cluster structure. (C) 2007 Elsevier Inc. All rights reserved.
There are two well-known methods for obtaining a guaranteed globally optimal solution to the problem of least-squares unidimensional scaling of a symmetric dissimilarity matrix: (a) dynamic programming, and (b) branch-and-bound. Dynamic programming is generally more efficient than branch-and-bound, but the former is limited to matrices with approximately 26 or fewer objects because of computer memory limitations. We present some new branch-and-bound procedures that improve computational efficiency, and enable guaranteed globally optimal solutions to be obtained for matrices with up to 35 objects. Experimental tests were conducted to compare the relative performances of the new procedures, a previously published branch-and-bound algorithm, and a dynamic programming solution strategy. These experiments, which included both synthetic and empirical dissimilarity matrices, yielded the following findings: (a) the new branch-and-bound procedures were often drastically more efficient than the previously published branch-and-bound algorithm, (b) when computationally feasible, the dynamic programming approach was more efficient than each of the branch-and-bound procedures, and (c) the new branch-and-bound procedures require minimal computer memory and can provide optimal solutions for matrices that are too large for dynamic programming implementation.
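As a reminder of what is being optimized, the following sketch evaluates the least-squares unidimensional scaling loss for a given set of coordinates: the sum, over all object pairs, of the squared difference between the observed dissimilarity and the distance on the line. It only evaluates the criterion and does not implement the dynamic programming or branch-and-bound searches discussed above; the function name and the toy matrix are hypothetical.

```python
def uds_loss(dissimilarities, coords):
    """Least-squares loss: sum over pairs i < j of (d_ij - |x_i - x_j|)^2."""
    n = len(coords)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            fitted = abs(coords[i] - coords[j])
            total += (dissimilarities[i][j] - fitted) ** 2
    return total


# Hypothetical toy matrix generated by points at 0, 1, and 3 on a line.
toy_d = [
    [0, 1, 3],
    [1, 0, 2],
    [3, 2, 0],
]
perfect_fit = uds_loss(toy_d, [0, 1, 3])
worse_fit = uds_loss(toy_d, [0, 1, 2])
print(perfect_fit, worse_fit)
```

The combinatorial difficulty lies in choosing the order of the objects along the line; the exact methods in the abstract search over those orderings.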
A comparison is made among four different optimization strategies for the linear unidimensional scaling task in the L2-norm: (1) dynamic programming; (2) an iterative quadratic assignment improvement heuristic; (3) the Guttman update strategy as modified by Pliner's technique of smoothing; (4) a nonlinear programming reformulation by Lau, Leung, and Tse. The methods are all implemented through (freely downloadable) MATLAB m-files; their use is illustrated by a common data set carried throughout. For the computationally intensive dynamic programming formulation that can guarantee a globally optimal solution, several possible computational improvements are discussed and evaluated using (a) a transformation of a given m-function with the MATLAB Compiler into C code and compiling the latter; (b) rewriting an m-function and a mandatory MATLAB gateway directly in Fortran and compiling into a MATLAB callable file; (c) comparisons of the acceleration of raw m-files implemented under the most recent release of MATLAB Version 6.5 (and compared to the absence of such acceleration under the previous MATLAB Version 6.1). Finally, and in contrast to the combinatorial optimization task of identifying a best unidimensional scaling for a given proximity matrix, an approach is given for the confirmatory fitting of a given unidimensional scaling based only on a fixed object ordering, and to nonmetric unidimensional scaling that incorporates an additional optimal monotonic transformation of the proximities.
Minimization of the within-cluster sums of squares (WCSS) is one of the most important optimization criteria in cluster analysis. Although cluster analysis modules in commercial software packages typically use heuristic methods for this criterion, optimal approaches can be computationally feasible for problems of modest size. This paper presents a new branch-and-bound algorithm for minimizing WCSS. Algorithmic enhancements include an effective reordering of objects and a repetitive solution approach that precludes the need for splitting the data set, while maintaining strong bounds throughout the solution process. The new algorithm provided optimal solutions for problems with up to 240 objects and eight well-separated clusters. Poorly separated problems with no inherent cluster structure were optimally solved for up to 60 objects and six clusters. The repetitive branch-and-bound algorithm was also successfully applied to three empirical data sets from the classification literature.
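For reference, the WCSS criterion itself is simple to compute for a fixed partition. The sketch below does only that: it evaluates the sum of squared deviations from cluster centroids for given labels. It is not the repetitive branch-and-bound algorithm described above, and the function name `wcss` and the toy data are hypothetical.

```python
def wcss(data, labels):
    """Within-cluster sum of squared deviations from cluster centroids."""
    clusters = {}
    for point, label in zip(data, labels):
        clusters.setdefault(label, []).append(point)

    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        # Centroid = coordinate-wise mean of the cluster members.
        centroid = [
            sum(m[d] for m in members) / len(members) for d in range(dim)
        ]
        total += sum(
            (m[d] - centroid[d]) ** 2 for m in members for d in range(dim)
        )
    return total


# Hypothetical toy data: two clusters of two points each.
toy_data = [(0, 0), (2, 0), (10, 0), (10, 4)]
toy_wcss = wcss(toy_data, [0, 0, 1, 1])
print(toy_wcss)
```

An exact method must, in effect, show that no other assignment of objects to clusters gives a smaller value of this function, which is why strong bounds matter.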
This paper is concerned with a problem where K (n x n) proximity matrices are available for a set of n objects. The goal is to identify a single permutation of the n objects that provides an adequate structural fit, as measured by an appropriate index, for each of the K matrices. A multiobjective programming approach for this problem, which seeks to optimize a weighted function of the K indices, is proposed, and illustrative examples are provided using a set of proximity matrices from the psychological literature. These examples show that, by solving the multiobjective programming model under different weighting schemes, the quantitative analyst can uncover information about the relationships among the matrices and often identify one or more permutations that provide good to excellent index values for all matrices. (C) 2002 Elsevier Science (USA).
Multiobjective programming, a technique for solving mathematical optimization problems with multiple conflicting objectives, has received increasing attention among researchers in various academic disciplines. A summary of multiobjective programming techniques and a review of their applications in quantitative psychology are provided. (C) 2011 Elsevier Inc. All rights reserved.
Perhaps the most common criterion for partitioning a data set is the minimization of the within-cluster sums of squared deviation from cluster centroids. Although optimal solution procedures for within-cluster sums of squares (WCSS) partitioning are computationally feasible for small data sets, heuristic procedures are required for most practical applications in the behavioral sciences. We compared the performances of nine prominent heuristic procedures for WCSS partitioning across 324 simulated data sets representative of a broad spectrum of test conditions. Performance comparisons focused on both percentage deviation from the "best-found" WCSS values, as well as recovery of true cluster structure. A real-coded genetic algorithm and variable neighborhood search heuristic were the most effective methods; however, a straightforward two-stage heuristic algorithm, HK-means, also yielded exceptional performance. A follow-up experiment using 13 empirical data sets from the clustering literature generally supported the results of the experiment using simulated data. Our findings have important implications for behavioral science researchers, whose theoretical conclusions could be adversely affected by poor algorithmic performances.