Vision language models (VLMs) have achieved impressive progress in diverse applications, becoming a prevalent research direction. In this paper, we build FIRE, a feedback-refinement dataset, consisting of 1.1M multi-t...
Occupants' comfort is the primary target in building operation. However, occupants' efforts are often neglected and excluded from the traditional control strategies of energy-efficient building management systems. Occupant-engaged control strategies have recently attracted considerable research attention and demonstrated great potential for energy saving. These strategies incorporate occupants' behavior into closed-loop control methods, in which occupants actively contribute to building services and energy use by explicitly expressing their preferences. This work proposes an occupant-engaged demand response (DR) strategy for building automation in which occupants are actively engaged to adapt their energy consumption in response to incentive opportunities designed by facility managers. A model-based study and a Nash-equilibrium-based solution are provided to help facility managers design social incentive policies that promote occupant participation while guaranteeing the profitability of a DR event.
Summary form only given. As the only method for studying long-term climate trends and predicting potential climate risk, climate modeling is becoming a key research topic among governments and research organizations. One of the most essential and challenging components in climate modeling is the atmospheric model. To support high resolution in climate simulation scenarios, developers have to cope with billions of mesh points and extremely complex algorithms. The Shallow Water Equations (SWEs) are a set of conservation laws that exhibit most of the essential characteristics of the atmosphere, so the study of SWEs can serve as the starting point for understanding the dynamic behavior of the global atmosphere. We choose the cubed-sphere mesh as the computational mesh because it offers better load balance in the pole regions than other meshes such as the latitude-longitude mesh. The cubed-sphere mesh is obtained by mapping a cube onto the surface of the sphere. The computational domain then consists of six patches, each covered with N × N mesh points. When written in local coordinates, the SWEs have an identical expression on all six patches:

$$\frac{\partial Q}{\partial t} + \frac{1}{\Lambda}\frac{\partial(\Lambda F^1)}{\partial x^1} + \frac{1}{\Lambda}\frac{\partial(\Lambda F^2)}{\partial x^2} + S = 0, \qquad (1)$$

where $x^1, x^2 \in [-\pi/4, \pi/4]$ are the local coordinates, $Q = (h, hu^1, hu^2)^T$ is the prognostic variable, $F^i = u^i Q$ ($i = 1, 2$) are the convective fluxes, and $S$ is the source term. Spatially discretized with a cell-centered finite volume method and integrated with a second-order accurate TVD Runge-Kutta method, the SWE solver reduces to the computation of a 13-point upwind stencil with a diamond shape. To obtain the prognostic components ($h$, $hu^1$ and $hu^2$) of the central point, its 12 neighboring points need to be accessed. The stencil kernel includes at least 434 ADD/SUB operations, 570 multiplications and 99 divisions. The high arithmetic density of the SWE algorithm makes it difficult to fit one kernel into the resource-limited FPGA card.
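The access pattern of a 13-point diamond stencil can be sketched as follows. This is a minimal illustration, not the solver described in the abstract: it assumes the diamond is the set of points within Manhattan distance 2 of the center (which yields exactly 13 points), and the helper names `diamond_offsets` and `apply_stencil`, as well as the uniform weights, are hypothetical. The real kernel computes upwind fluxes rather than a weighted sum.

```python
def diamond_offsets(radius=2):
    """Offsets (di, dj) with |di| + |dj| <= radius.

    Assumption: the diamond is a Manhattan-distance ball of radius 2,
    which contains 1 + 4 + 8 = 13 points.
    """
    return [
        (di, dj)
        for di in range(-radius, radius + 1)
        for dj in range(-radius, radius + 1)
        if abs(di) + abs(dj) <= radius
    ]


def apply_stencil(grid, i, j, weights):
    """Weighted sum over the diamond neighborhood of interior point (i, j).

    Stand-in for the real upwind flux computation, which is far more
    involved; this only shows which points are read.
    """
    return sum(w * grid[i + di][j + dj] for (di, dj), w in weights.items())


offsets = diamond_offsets()
neighbors = [off for off in offsets if off != (0, 0)]

# Illustrative uniform weights on a constant 5 x 5 patch.
uniform = {off: 1.0 / len(offsets) for off in offsets}
grid = [[1.0] * 5 for _ in range(5)]
center_value = apply_stencil(grid, 2, 2, uniform)
```

Each update of the central point reads the point itself plus the 12 entries of `neighbors`, which is the access count quoted in the abstract.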
The proposed method targets synthesis-based static circuits. A power modeling library is developed for modeling processors by means of parametric RTL and physical annotation, and all kinds of processor modules are mapped into combinations of basic components. The models are linked to an architectural simulator, which runs benchmarks to obtain power results. The power analysis of benchmark platforms proves effective and highly correlated, with an average error of 10% and little speed penalty compared with gate-level power analysis.
Memory bandwidth has become the major bottleneck to performance improvement in modern microprocessors. By investigating cache store misses, this work proposes a cache adaptive write allocate policy that significantly improves microprocessor bandwidth. The policy collects fully modified blocks in the miss queue. Fully modified blocks are written to lower-level memory under a non-write-allocate policy, which can switch to a write-allocate policy adaptively. Compared with other cache store miss policies, the cache adaptive write allocate policy avoids unnecessary memory traffic, reduces cache pollution and lowers the load and store queue full rate without increasing hardware overhead. Experimental results indicate that the policy improves memory bandwidth on the STREAM benchmarks by 62.6% on average. Performance on the SPEC CPU 2000 benchmarks also improves, with an average IPC speedup of 5.9%.
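The decision at the heart of such a policy can be sketched in a few lines. This is a simplified model under stated assumptions, not the hardware design from the abstract: the block size, the class `MissQueueEntry`, the function `drain` and the rule "fall back to write-allocate when the block is only partially written" are all hypothetical choices made for illustration.

```python
BLOCK_WORDS = 4  # words per cache block (illustrative value)


class MissQueueEntry:
    """Tracks which words of a missing block have been written by stores."""

    def __init__(self):
        self.written = [False] * BLOCK_WORDS

    def store(self, word_index):
        self.written[word_index] = True

    def fully_modified(self):
        return all(self.written)


def drain(entry):
    """Return (policy, fetches) for an entry leaving the miss queue.

    Assumed rule: a fully modified block needs no data from memory, so it
    is written straight to the lower level (no fetch, no allocation).
    A partially modified block falls back to write-allocate, which
    fetches the block once so the stores can be merged into it.
    """
    if entry.fully_modified():
        return ("non-write-allocate", 0)
    return ("write-allocate", 1)


# A streaming store that overwrites the whole block.
streaming = MissQueueEntry()
for w in range(BLOCK_WORDS):
    streaming.store(w)

# A store that touches a single word.
partial = MissQueueEntry()
partial.store(1)

streaming_result = drain(streaming)
partial_result = drain(partial)
```

In this model the streaming case avoids the block fetch entirely, which is the kind of unnecessary memory traffic the abstract says the policy removes.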
Chip multiprocessors (CMPs) have become the mainstream microprocessor architecture. In a CMP, the cache, especially the last-level cache, is critical to performance and has become a focus of current research. A CMP cache faces the conflicting requirements of low latency and high capacity, and has to trade off between techniques that reduce off-chip misses and those that reduce cross-chip misses. A private cache design minimizes cache access latency but reduces the total effective cache capacity. A shared cache design maximizes the effective cache capacity but incurs long hit latency. This paper proposes a CMP cache design called TCLC (tradeoff cache between latency and capacity), a hybrid of private and shared designs. TCLC dynamically identifies the sharing type of each cache block and optimizes each type separately: private blocks through a migration policy, shared read-only blocks through a replication policy, and shared read-write blocks through a center placement policy. TCLC aims for cache access latency close to that of a private design and effective cache capacity close to that of a shared design, which mitigates the impact of wire delay and reduces the average memory access latency. Experimental results indicate that the proposal performs 13.7% better than a private cache and 12% better than a shared cache.
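The mapping from sharing type to optimization policy can be illustrated with a small sketch. The three types and their policies come from the abstract; how TCLC actually detects the type is not described there, so the access-history rule in `classify` and the names `classify`, `POLICY` and `policy_for` are assumptions made for illustration.

```python
# Policy per sharing type, as listed in the abstract.
POLICY = {
    "private": "migration",
    "shared read-only": "replication",
    "shared read-write": "center placement",
}


def classify(accesses):
    """Classify a block from its access history.

    accesses: list of (core_id, op) pairs with op in {"R", "W"}.
    Assumed rule: one accessing core means private; several cores with
    no writes means shared read-only; otherwise shared read-write.
    """
    cores = {core for core, _ in accesses}
    if len(cores) <= 1:
        return "private"
    if any(op == "W" for _, op in accesses):
        return "shared read-write"
    return "shared read-only"


def policy_for(accesses):
    """Pick the placement policy for a block given its access history."""
    return POLICY[classify(accesses)]


private_block = [(0, "R"), (0, "W"), (0, "R")]
read_only_block = [(0, "R"), (1, "R"), (2, "R")]
read_write_block = [(0, "R"), (1, "W")]
```

For example, `policy_for(read_only_block)` selects replication, since copies of a block that nobody writes can be kept near each reader without coherence traffic.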
With the widespread adoption of embedded microprocessor-based systems in safety-critical applications such as aircraft, spaceships and nuclear power plants, evaluating their fault-tolerant mechanisms rapidly, conveniently and at low cost has become an important problem. The traditional method requires a detailed hardware protocol for evaluation, which lengthens the evaluation period and increases the cost. A new dependability evaluation technique based on a microprocessor function model is proposed, which can evaluate fault-tolerant mechanisms more rapidly, more conveniently and more economically than conventional systems. As a case study, the new system evaluates three fault-tolerant techniques: the software redundancy technique, the assertion validation technique, and the instruction re-fetching and re-execution technique. The results show that the evaluation is reasonable.
Due to mobility and frequent disconnections, the correctness of mobile interaction systems, such as mobile robot systems and mobile payment systems, is often difficult to analyze. This paper introduces three crit...
In processor architectures such as MIPS, ALPHA, SPARC and PowerPC, indirect addressing mode is always adopted to access global and static variables. Since the addresses of these variables and the corresponding values reside in different data sections of the binary file, the data locality of the program is very poor. As a result, loading the read-only addresses of these variables on every access tends to cause a non-trivial number of redundant memory accesses and data cache misses. Moreover, indirect addressing generates two sequential load instructions with a data dependence between them, which reduces the instruction-level parallelism (ILP) of the program. The authors present a feedback-based address register promotion method (ARPF) to solve these problems. The ARPF algorithm reduces redundant accesses to the read-only addresses of global and static variables, increases the instruction-level parallelism of a program, and avoids the performance loss caused by the increase in register pressure that register promotion introduces. The algorithm has been implemented in the Loongson compiler for the MIPS architecture. Experiments on the SPEC CPU2000INT benchmarks show that ARPF improves the performance of all benchmarks by 1%-6%.
Drawing support from an effective Medical Image Segmentation (MIS) is conducive to a substantial diagnostic basis for the physicians to identify the focus lesion in the patient body and give the subsequent clinical as...