ISBN: (print) 9783030856656; 9783030856649
Due to their fine-grained operations and low conflict rates, graph processing algorithms expose a large amount of parallelism that has been extensively exploited by various parallelization frameworks. Transactional Memory (TM) is a programming model that uses an optimistic concurrency control mechanism to improve the performance of irregular applications, making it a perfect candidate to extract parallelism from graph-based programs. Although fast Hardware TM (HTM) instructions are now available in the ISA extensions of some major processor architectures (e.g., Intel and ARM), balancing the usage of Software TM (STM) and HTM to compensate for capacity and conflict aborts is still a challenging task. This paper presents a Phased TM implementation for graph applications, called Graph-Oriented Transactional Memory (GoTM). It uses a three-state (HTM, STM, GLOCK) concurrency control automaton that leverages both HTM and STM implementations to speed up graph applications. Experimental results using seven well-known graph programs and real-life workloads show that GoTM can outperform other Phased TM systems and lock-based concurrency mechanisms such as the one present in Galois, a state-of-the-art framework for graph computations.
ISBN: (print) 9798331314385
Autoregressive language models have achieved remarkable advancements, yet their potential is often limited by the slow inference speeds associated with sequential token generation. Blockwise parallel decoding (BPD) was proposed by Stern et al. [42] as a method to improve the inference speed of language models by simultaneously predicting multiple future tokens, termed block drafts, which are subsequently verified by the autoregressive model. This paper advances the understanding and improvement of block drafts in two ways. First, we analyze token distributions generated across multiple prediction heads. Second, leveraging these insights, we propose algorithms to improve BPD inference speed by refining the block drafts using task-independent n-gram and neural language models as lightweight rescorers. Experiments demonstrate that by refining block drafts of open-sourced Vicuna and Medusa LLMs, the mean accepted token length increases by 5-25% relative. This results in over a 3x speedup in wall clock time compared to standard autoregressive decoding in open-source 7B and 13B LLMs.
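The draft-then-verify step described in this abstract can be illustrated with a minimal Python sketch. It assumes greedy verification, in which the base model accepts the longest draft prefix that matches its own next-token choices. The names `verify_block_draft` and `toy_next_token` are hypothetical, the toy model is only a stand-in, and a real system would score all draft positions in one parallel forward pass rather than in a loop.

```python
from typing import Callable, List, Sequence


def verify_block_draft(
    prefix: Sequence[int],
    draft: Sequence[int],
    next_token: Callable[[List[int]], int],
) -> int:
    """Count the leading draft tokens that agree with the base model's greedy choice."""
    context = list(prefix)
    accepted = 0
    for token in draft:
        # Stop at the first position where the base model disagrees with the draft.
        if next_token(context) != token:
            break
        context.append(token)
        accepted += 1
    return accepted


def toy_next_token(context: List[int]) -> int:
    # Hypothetical stand-in for the autoregressive model: predicts last token + 1.
    return context[-1] + 1


# The draft [4, 5, 9, 7] agrees with the toy model for its first two tokens only.
accepted_length = verify_block_draft([1, 2, 3], [4, 5, 9, 7], toy_next_token)
print(accepted_length)
```

Under this reading, the "accepted token length" reported in the abstract corresponds to the value returned here, so a rescorer that improves the draft raises this count per decoding step.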
ISBN: (print) 9789897584268
The increasing number of smart devices in private households has led to a large number of smart homes worldwide. To gain meaningful insights into the data they generate and to offer extended information and added value for consumers, data analytics architectures are essential. In addition, the development and improvement of machine learning techniques and algorithms in recent years has led to the availability of powerful analytics tools, which have the potential to allow even more sophisticated insights at the cost of changed challenges and requirements for analytics architectures. However, architectural solutions that offer the ability to deploy flexible, machine learning-based analytics pipelines on streaming data are missing in research as well as in industry. In this paper, we present the motivation and a concept for machine learning-based data processing on streaming data for consumer-centric Internet of Things domains, such as the smart home. This approach was evaluated in terms of its performance and may serve as a basis for further development and discussion.
ISBN: (digital) 9798350305463; (print) 9798350305470
Web3.0 apps, with their emphasis on decentralization, real-time processing, and low latency, hold the key to MEC's bright future. This calls for an all-encompassing, future-oriented strategy in MEC architecture and design. The suggested technique is a multi-stage strategy for shaping the future of MEC via cutting-edge architectures and designs for Web3.0 over 5G/6G networks. In the first stage, we examine the needs and goals of Web3.0 in detail. Real-time, low-latency interactions are essential, as is familiarity with the dynamics of distributed apps and blockchain technology. In the second stage, we examine the current MEC setup, from edge server dispersion to data center density to network topology. The technique makes use of the low latency, high throughput, and massive device connectivity offered by 5G and upcoming 6G networks to guarantee preparedness for the future. The suggested technique is founded on novel architectural designs. Latency, throughput, resource usage, scalability, and security compliance are only a few of the metrics that may be evaluated to guarantee that the design is effective. The difficulties of actual deployment are also discussed; they include issues such as where to place edge servers and how to scale resources according to demand. To ensure that the process remains in step with developing technologies and user preferences, it is recommended that improvements be made in a series of iterations.
The work is devoted to the research and development of a software and hardware platform based on control methods of a spherical parallel mechanism (SPM) with tracking of changes in the position of a human head with the help of a camera and a computer using intelligent data *** article explores the relevance and advantages of SPMs, emphasizing their high precision, enhanced mobility, reliability, and lightweight design. The paper deals with an important aspect of controlling SPMs, which directly influences their performance. The control methods for SPMs vary based on the specific tasks and operational requirements, such as actuator-based control, sensor-based control, closed-loop control, and adaptive control are ***research additionally investigates the problems of forming tasks for SPMs and the need for improved methods, especially for applications involving human-robot interaction, hazardous work, and camera positioning tasks. To resolve these problems, the study proposes a software-hardware platform that remotely controls SPMs by tracking changes in the position of the human head using a camera and a computer. By employing the OpenCV and Dlib libraries for face recognition and control point tracking, the proposed platform enables intuitive and clear control of the *** article presents the mathematical operations involved in calculating the rotation angles of the SPM based on the tracked human head movements. The obtained rotation angles serve as inputs for the servomotors, allowing precise control over the mechanism's *** results of the study demonstrate the effectiveness of the proposed software-hardware platform in controlling the SPM based on human head movements. The platform's ability to accurately track and convert head movements into servo rotation values enables remote and intuitive control of the SPM, making it a promising direction for future research in the field of robotics and automation.
ISBN: (print) 9781665492669
A message authentication code (MAC) scheme is a means of verifying a message in a symmetric-key setting. An aggregate MAC (AMAC) scheme reduces the communication cost (tag size) incurred by multiple tag generators. To date, AMAC schemes exist for parallel construction, for sequential construction, and for series-parallel construction. In this paper, we present how to construct and implement an AMAC scheme that supports a tag generation order represented by a general DAG, and we discuss the security and the tag size of the scheme. The first implementation gives a simple HMAC-like AMAC scheme, but the security of such a scheme is shown only under a weak security model. The second gives an AMAC scheme that is shown to be secure under a strong security model, but the tag size increases in proportion to the number of paths from the sources to the sinks. We then also discuss how to decrease the tag size.
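To make the idea of tag aggregation concrete, the following Python sketch shows the simplest parallel case only: each generator computes an HMAC-SHA256 tag under its own key, and the tags are combined by XOR into a single fixed-size aggregate. This is an illustrative assumption about how aggregation can work in the parallel setting, not the DAG-based construction of the paper; the function names and sample keys are hypothetical.

```python
import hashlib
import hmac
from functools import reduce
from typing import List, Sequence, Tuple


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute one generator's tag with HMAC-SHA256 from the standard library."""
    return hmac.new(key, message, hashlib.sha256).digest()


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def aggregate(tags: Sequence[bytes]) -> bytes:
    """Combine a non-empty list of tags into one tag of the same size."""
    return reduce(xor_bytes, tags)


def verify(pairs: Sequence[Tuple[bytes, bytes]], aggregate_tag: bytes) -> bool:
    """Recompute every (key, message) tag and compare the aggregate in constant time."""
    expected = aggregate([make_tag(key, message) for key, message in pairs])
    return hmac.compare_digest(expected, aggregate_tag)


# Hypothetical keys and messages for three independent tag generators.
pairs: List[Tuple[bytes, bytes]] = [
    (b"key-A", b"reading from sensor A"),
    (b"key-B", b"reading from sensor B"),
    (b"key-C", b"reading from sensor C"),
]
aggregate_tag = aggregate([make_tag(key, message) for key, message in pairs])
print(len(aggregate_tag), verify(pairs, aggregate_tag))
```

The aggregate stays at 32 bytes regardless of the number of generators, which is the communication saving the abstract refers to. XOR is order-independent, so this sketch cannot express the tag generation order that the sequential and DAG-based schemes are designed to capture.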
ISBN: (digital) 9798350376753; (print) 9798350376760
A predictive error control switching strategy (PESS) is proposed in this paper for a three-level direct matrix converter (TLDMC). In the proposed PESS, indirect space vectors of the matrix converter are used to control the input current and output voltage. During every sampling period, the switching vectors are selected by the least-squares principle, that is, by computing the minimum sum of the squared errors of both voltage and current. The envelope of the minimum error of the space vectors over a sampling period directs the overall error in the PESS technique. The PESS technique experiences considerably high input current harmonics at lower modulation indices. It is therefore modified to improve the performance of the TLDMC at lower modulation indices. The modified PESS for the TLDMC is evaluated using a MATLAB model and validated experimentally.
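The least-squares selection step can be sketched in Python as follows. This is a minimal illustration under stated assumptions: each candidate switching vector is represented by a hypothetical pair of predicted quantities (output voltage and input current as complex space vectors), the cost is the unweighted sum of the two squared error magnitudes, and the candidate values are invented for the example. A practical controller would likely normalize voltage and current to comparable scales first.

```python
from typing import Sequence, Tuple

# (predicted output voltage, predicted input current), both as complex space vectors
Candidate = Tuple[complex, complex]


def squared_error_cost(candidate: Candidate, v_ref: complex, i_ref: complex) -> float:
    """Sum of the squared voltage error and the squared current error."""
    v_pred, i_pred = candidate
    return abs(v_ref - v_pred) ** 2 + abs(i_ref - i_pred) ** 2


def select_vector(candidates: Sequence[Candidate], v_ref: complex, i_ref: complex) -> int:
    """Return the index of the candidate with the minimum summed squared error."""
    return min(
        range(len(candidates)),
        key=lambda k: squared_error_cost(candidates[k], v_ref, i_ref),
    )


# Hypothetical per-unit predictions for three candidate switching vectors.
candidates = [
    (1.0 + 0.0j, 0.5 + 0.0j),
    (0.9 + 0.1j, 0.2 + 0.0j),
    (0.0 + 0.0j, 0.0 + 0.0j),
]
best_index = select_vector(candidates, v_ref=1.0 + 0.0j, i_ref=0.2 + 0.0j)
print(best_index)
```

In this example the first candidate matches the voltage reference exactly but misses the current reference, so the second candidate, with small errors in both, gives the lower combined cost.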
ISBN: (digital) 9798350373820; (print) 9798350373837
Cloud removal is vital for the analysis of optical satellite images. To alleviate the impact of thick clouds, recent advances integrate deep learning with multimodal data. To relax the requirement for paired training samples, existing methods utilize a cycle-consistent architecture to learn the relation between cloudy and cloud-free images from unpaired training samples. For thick cloud removal based on unpaired training data, the relevant studies are insufficient and face two challenges: 1) the information in non-cloudy areas is not preserved well after cloud removal; 2) the recovered information lacks consistency with the global textures and structures. Based on optical-SAR fusion and cycle-consistent training, this paper proposes the Multiscale Cycle-consistent Fusion (MCF) model. MCF designs a preservation loss to overcome the first challenge, and proposes a global-local discriminator combined with a Parallel Dilated Channel-weighted Module (PDCM) for the second. MCF is evaluated on both a simulated dataset and a real dataset, and the results demonstrate its effectiveness.
ISBN: (print) 9781450398336
Offline imitation learning (OIL) is often used to solve complex continuous decision-making tasks. For tasks such as robot control and automatic driving, it is either difficult to design an effective reward for learning or very expensive and time-consuming for agents to collect data by interacting with the environment. However, the data used in previous OIL methods are all gathered by reinforcement learning algorithms guided by task-specific rewards, which is not a truly reward-free premise and still suffers from the problem of designing an effective reward function in real tasks. To this end, we propose the reward-free exploratory data driven offline imitation learning (ExDOIL) framework. ExDOIL first trains an unsupervised reinforcement learning agent by interacting with the environment and collects sufficient unsupervised exploration data during training; then, a task-independent yet simple and efficient reward function is used to relabel the collected data; finally, an agent is trained to imitate the expert to complete the task through a conventional RL algorithm such as TD3. Extensive experiments on continuous control tasks demonstrate that the proposed framework achieves better imitation performance (28% higher episode returns on average) than the previous SOTA method (ORIL) without any task-specific rewards.
The scheduling of tasks in a heterogeneous multiprocessor system in the cloud is still a demanding problem that is being explored by many researchers. Parallel computing, which is itself a research area, is now integrat...