With the rapid progress of generative models, the current challenge in face forgery detection is how to effectively detect realistic manipulated faces from different unseen domains. Though previous studies show that p...
Sandwich-like structures have shown remarkable efficacy in clothed human reconstruction. However, these approaches often generate unrealistic side geometries due to inadequate handling of lateral regions. This paper a...
A high-quality enrollment speech is crucial to target speaker extraction (TSE), since it provides essential cues for identifying the target speaker in the mixture. However, real applications usually only permit a shor...
The difficulty in fabricating a multifaceted composite heterojunction system based on CdxZn1-xS limits the enhancement of photocatalytic [...]. In the present scrutiny, novel ZnO/CdxZn1-xS/CdS composite heterojunctions are successfully prepared by the alkaline dissolution etching [...]. The internal electric field at the interface of the Ⅰ-type and Z-scheme heterojunction improved the effective charge [...]. The ZC 8 sample exhibits excellent photocatalytic performance, and the H₂ production efficiency is 15.67 mmol g⁻¹ h⁻¹ with good stability up to 82.9% in 24-hour [...]. The performance of CH₄ and CO capacity in the CO₂RR process is 3.47 μmol g⁻¹ h⁻¹ and 23.5 μmol g⁻¹ h⁻¹, respectively. The photogenerated accelerated charge transport is then examined in detail by in situ X-ray photoelectron spectroscopy (ISXPS) and density functional theory (DFT) [...]. This work presents a new idea for the synthesis of CdxZn1-xS solid-solution-based materials and provides a solid reference for the detailed mechanism regarding the electric field at the heterojunction interface.
Large language models (LLMs) have demonstrated remarkable performance across a wide range of tasks, largely due to their substantial model size. However, this also results in significant GPU memory demands during inference. To address these challenges on hardware with limited GPU memory, existing approaches employ offloading techniques that move unused tensors to CPU memory, thereby reducing GPU memory usage. Since offloading involves data transfer between GPU and CPU, it introduces transfer overhead. To mitigate this, prior works typically overlap data transfer with GPU computation using a fixed pipelining strategy applied uniformly across all inference iterations, referred to as static offloading. However, static offloading policies fail to maximize inference throughput because they cannot adapt to the dynamically changing transfer overhead during the inference process, leading to increasing GPU idleness and reduced inference throughput. We propose that offloading policies should be adaptive to the varying transfer overhead across inference iterations to maximize inference throughput. To this end, we design and implement an adaptive offloading-based inference system called TightLLM with two key innovations. First, its key-value (KV) distributor employs a trade-compute-for-transfer strategy to address growing transfer overhead by dynamically recomputing portions of the KV cache, effectively overlapping data transfer with computation and minimizing GPU idleness. Second, TightLLM's weight loader slices model weights and distributes the loading process across multiple batches, amortizing the excessive weight loading overhead and significantly improving throughput. Evaluation across various combinations of GPU hardware and LLMs shows that TightLLM achieves 1.3 to 23 times higher throughput during the decoding phase and 1.2 to 22 times higher throughput in the prefill phase compared to state-of-the-art offloading systems. Due to the higher throughput in prefill...
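
The trade-compute-for-transfer idea in the abstract above can be illustrated with a toy cost model. The sketch below is not TightLLM's algorithm; it only assumes, for illustration, that transfer and recomputation run fully in parallel and that each has a constant per-token cost. The function name `split_recompute` and its parameters are hypothetical.

```python
import math


def split_recompute(n_tokens, xfer_per_token, comp_per_token):
    """Toy model: pick how many cached KV tokens to recompute on the GPU.

    Assumptions (illustrative only, not taken from the paper):
      * the remaining tokens are transferred from CPU memory in parallel,
      * both costs are constant per token and strictly positive.
    Returns (tokens_to_recompute, step_time).
    """
    if n_tokens < 0 or xfer_per_token <= 0 or comp_per_token <= 0:
        raise ValueError("n_tokens must be >= 0 and costs must be > 0")
    if n_tokens == 0:
        return 0, 0.0

    def step_time(r):
        # Transfer and recomputation overlap, so the slower one dominates.
        return max(r * comp_per_token, (n_tokens - r) * xfer_per_token)

    # The continuous optimum balances both sides:
    # r * comp == (n - r) * xfer.
    ideal = n_tokens * xfer_per_token / (xfer_per_token + comp_per_token)
    # step_time is convex in r, so the best integer is floor or ceil.
    candidates = {
        min(n_tokens, max(0, c))
        for c in (math.floor(ideal), math.ceil(ideal))
    }
    best = min(candidates, key=lambda r: (step_time(r), r))
    return best, step_time(best)


if __name__ == "__main__":
    # Static offloading in this model corresponds to r = 0 (transfer all).
    n, xfer, comp = 100, 3.0, 1.0
    r, t = split_recompute(n, xfer, comp)
    print(f"recompute {r} of {n} tokens, step time {t}")
    print(f"transfer-only step time {n * xfer}")
```

In this model, as the cache grows the transfer-only time grows linearly, while the balanced split keeps both the link and the GPU busy. A real system would also have to account for batch size, per-layer pipelining, and memory limits, none of which this sketch models.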
This paper presents RoGSplat, a novel approach for synthesizing high-fidelity novel views of unseen human from sparse multi-view images, while requiring no cumbersome per-subject optimization. Unlike previous methods ...
Path planning is an important step in ensuring the safety of Unmanned Surface Vehicle (USV) navigation and executing missions quickly and efficiently. However, current USV path planning methods lack comprehensive cons...
The cube attack is a powerful cryptanalysis technique used against stream ciphers. It enables the retrieval of secret key information by computing the values of superpolys, with unknown secret key bits as variables. A...
Recent open-vocabulary detectors achieve promising performance with abundant region-level annotated data. In this work, we show that an open-vocabulary detector co-training with a large language model by generating im...
Recent advances in Large Multi-modal Models (LMMs) are primarily focused on offline video understanding. Instead, streaming video understanding poses great challenges to recent models due to its time-sensitive, omni-m...