Authors:
Minami, T; Kasai, R; Matsuda, H; Kusaba, R; Member, NTT LSI Laboratories, Atsugi, Japan 243-01
Graduated in 1980 from the Department of Electrical Engineering, Kyushu University, where he received his Master's degree in 1982, and joined NTT. Until March 1986 he was engaged in the development of application software for electronic switching systems. He then engaged in research and development of LSIs for image signal processing. At present he is a Senior Research Engineer at the Advanced LSI Laboratory, NTT LSI Laboratories. He is a member of IEEE.
Graduated in 1972 from the Department of Electrical Engineering, Osaka University, where he received his Master's degree in 1974 and his Ph.D. in 1992. He joined NTT in 1974. He is engaged in research and development on the analysis of MOS devices and the design of ASICs for communication. At present he is an Executive Research Engineer at the Advanced LSI Laboratory, NTT LSI Laboratories. He is a member of IEEE.
Graduated in 1987 from the Department of Communications, Tohoku University, where he received his Master's degree in 1989, and joined NTT. He is engaged in research and development of high-speed design for image processing LSIs. At present he is a Research Engineer at the Advanced LSI Laboratory, NTT LSI Laboratories. He is a member of the Information Processing Society.
Graduated in 1985 from the Department of Electrical Engineering, Keio University, where he received his Master's degree in 1987, and joined NTT. He is engaged in CAD research. At present he is a Senior Research Engineer at the Advanced LSI Laboratory, NTT LSI Laboratories. He is a member of the Information Processing Society and IEEE.
This paper discusses the downsizing and speed improvement of short-word multiplier-accumulators, which are frequently used in digital signal processors. As a first step, the optimal configuration for an array-type carry-save adder is considered, in which the shortest path in the full-adder is used to propagate the sum signal and the carry signal skips a stage and is sent to the full-adder two stages below. A full-adder configuration suited to this structure is proposed. For the case of eight partial product additions, the delay is reduced by 22 percent compared with a simple array-type carry-save adder. Next, a short-word carry look-ahead adder using pass-transistor logic is considered. It is shown that a single-stage carry look-ahead circuit with a four-bit-wise iterative structure exhibits nearly the same delay as a two-stage carry look-ahead circuit. In other words, the former is better suited to downsizing. The effectiveness of the new array-type carry-save adder and the single-stage carry look-ahead circuit is examined using 0.5-μm CMOS technology. A 16-bit × 14-bit + 31-bit multiplier-accumulator has been designed and evaluated for the case where the array-type carry-save adder handles accumulation as well as partial products. The resulting area and delay are 0.77 × 0.78 mm² and 6.8 ns, respectively. The approach is evaluated here by constructing a multiplier-accumulator, but it is also useful in constructing a multiplier.
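The general idea of feeding both the partial products and the accumulation input through a carry-save array, followed by a single carry-propagate addition, can be sketched behaviorally as follows. This is only an illustrative model for unsigned operands: the function names are hypothetical, the stage-skipping carry routing and the pass-transistor carry look-ahead circuit from the abstract are not modeled, and the final `+` merely stands in for the final adder.

```python
def carry_save_add(a, b, c):
    """3:2 compression of three non-negative integers.

    Returns (s, carry) such that a + b + c == s + carry.
    No carry ripples between bit positions inside this step.
    """
    s = a ^ b ^ c                                # per-bit sum
    carry = ((a & b) | (a & c) | (b & c)) << 1   # per-bit carry, shifted up one position
    return s, carry


def multiply_accumulate(x, y, acc, y_bits=14):
    """Compute x * y + acc for unsigned operands.

    The accumulation input enters the same carry-save array as the
    partial products, so only one carry-propagate addition is needed.
    """
    s, c = acc, 0
    for i in range(y_bits):
        partial_product = (x << i) if (y >> i) & 1 else 0
        s, c = carry_save_add(s, c, partial_product)
    return s + c  # single final carry-propagate addition
```

For example, `multiply_accumulate(40000, 10000, 123456)` should agree with `40000 * 10000 + 123456`, since each compression step preserves the running total.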
This paper proposes an automatic method of extracting the image of a vehicle moving along a straight road from a sequential video image. The method has been designed to be independent of the type of vehicle and the background, and to obtain an image with minimum defective areas. The image of the vehicle is extracted by analyzing a locus in a spatio-temporal image seen through a slit in a fixed video camera, by updating the background image and superimposing a time-series of the background difference image. The method can extract a full image of various types of vehicles in an arbitrary background with minimum defects, and can separate the images of two vehicles when one crosses in front of the other. The method has been tested successfully with actual traffic (including traffic starting from a traffic light). The results show that the method is practical for smoothly flowing traffic. The method has also been tested successfully for identifying vehicle types.
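A minimal sketch of the background-difference step on a single slit (one line of pixels per frame) might look like the following. The running-average update, the fixed `threshold`, the `alpha` rate, and all function names are assumptions made for illustration; the abstract does not specify how the background is updated or how the locus is analyzed.

```python
def extract_slit_locus(frames, threshold=30, alpha=0.05):
    """Build a spatio-temporal foreground mask from slit observations.

    frames: list of equal-length lists of pixel intensities, one per time step.
    Returns one boolean row per frame; True marks pixels that differ from
    the current background estimate by more than `threshold`.
    """
    background = [float(v) for v in frames[0]]  # assume the first frame is empty road
    locus = []
    for frame in frames:
        mask = [abs(f - b) > threshold for f, b in zip(frame, background)]
        locus.append(mask)
        # Update the background only where no moving object was detected,
        # so a passing vehicle does not bleed into the background estimate.
        background = [
            b if m else (1 - alpha) * b + alpha * f
            for b, f, m in zip(background, frame, mask)
        ]
    return locus
```

Stacking the returned rows over time gives a binary spatio-temporal image in which a passing vehicle appears as a connected region.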
Authors:
TSUTSUGUCHI, K; SAKAINO, H; WATANABE, Y; Member, NTT Human Interface Laboratories, Yokosuka, Japan 238
Hidetomo Sakaino received his B.E. degree in 1986 and his M.E. degree in 1988 from Hokkaido Univ., Hokkaido, Japan. He is a Research Engineer at NTT HI Labs. From 1988 to 1990 he worked as an engineer in the Visual Media Division of NTT HI Labs. In 1993 he began working in the Advanced Video Processing Laboratory, NTT HI Labs. His current research interests are representations of elastic objects and their transformations given the interaction between fluid dynamics and moving objects.
Yashuhiko Watanabe received his Bachelor's degree from Niigata Univ., Niigata, Japan, in 1981. He is a Senior Research Engineer in the Advanced Video Processing Laboratory of NTT HI Labs. He is presently researching 3D volume processing/visualization systems. Since joining the Electrical Communications Laboratories, NTT, in 1981 he has been working on facsimile communication systems and Videotex communication systems. He is a member of the Institute of Image Electronic Engineers of Japan.
This paper describes a method for automatically generating human walking sequences that follow a path set on a given terrain. The Environment Adaptive Human Walking Animation (EAW) module using this method offers users an interface that unites human walking motion with the environment in virtual worlds. The EAW module has a 4-layer structure comprising the GUI level, the control level, the motion generation level, and the rendering level. The module adapts to 3D terrain by applying "area divisions" and "step divisions" to the walking path in the control level, and generates walking motion sequences with the "E-KLAW" method, which uses dynamics and kinematics, in the motion generation level. With this module, the user's tasks become significantly simpler and the user can incorporate human motion into various scenes.
In this paper, various convolutional approaches for the 1-D and 2-D DCT are presented. For the 1-D case, the Li method [9] is introduced, and a more regular structure than the Li method is described. However, both methods are complicated to implement with systolic arrays, so we propose a structure suited to arrays. This structure can speed up the multiplication processing by using real numbers instead of complex numbers. The 2-D algorithm and structure are described as an expansion of the 1-D algorithm. The proposed structure, which processes faster than the 1-D structure and needs fewer PEs (processing elements) than a 2-D structure built from 1-D systolic arrays, is presented with a block diagram.
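For orientation, the transform itself and its expansion from 1-D to 2-D can be sketched as below. This assumes the unnormalized DCT-II and a plain row-column decomposition; it is a direct evaluation of the definition for reference, not the convolutional formulation or the systolic-array structure the abstract proposes, and the function names are hypothetical.

```python
import math


def dct_1d(x):
    """Unnormalized DCT-II: X[k] = sum_i x[i] * cos(pi * (2i + 1) * k / (2N))."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
        for k in range(n)
    ]


def dct_2d(block):
    """2-D DCT by separability: 1-D DCT on every row, then on every column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(list(col)) for col in zip(*rows)]  # transpose, transform columns
    return [list(row) for row in zip(*cols)]          # transpose back
```

Note that only real multiplications by cosine values appear here, which is the property the abstract contrasts with complex-number multiplication.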
Current methods for image-text retrieval commonly propose various fusion modules to achieve robust visual-textual alignment, primarily relying on in-batch learning to guide the matching process. Some follow-up methods seek to enlarge the number of negative samples to boost image-text contrastive learning. However, these methods often face challenges posed by semantic-consistent negatives, i.e., negative samples that share correspondence with the ground truth, leading to confusion in learning cross-modal semantics. To address this issue, we propose a novel Retrieve with Authentic negative repository Learning (ReAL) method, which constructs a specific Authentic Negative Repository filled with valuable negative sample pairs. By introducing a Unique Negative Filter with a Discriminative Triplet Ranking Loss, ReAL effectively filters out the semantic-consistent negatives through similarity distribution analysis and threshold learning. Moreover, existing fusion paradigms suffer from intricate use of fine-grained representations from word- and region-level instances to progressively refine the fused embedding. In this paper, we propose a lightweight Cluster Refinement Module to exploit cross-modal semantics in a 1-way-1-out paradigm. Each visual-textual alignment can spontaneously uncover correlations with adjacent alignments through aggregation and re-allocation, without the need for a redundant and cost-inefficient refinement stage. Furthermore, ReAL employs dual momentum encoders with two memory banks, expanding the selection range of the Authentic Negative Repository to include a broader set of negatives. Extensive experiments conducted on Flickr30K, MS-COCO, and the augmented Flickr30K (with more hard negatives) demonstrate the superiority and robustness of ReAL, while also showcasing its significantly reduced inference time compared to other competitive baselines.
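The idea of discarding semantic-consistent negatives before applying a triplet ranking loss can be illustrated with a small sketch. Everything here is an assumption for illustration: the function name, the fixed `threshold` (the abstract describes a learned threshold based on similarity distribution analysis), the `margin` value, and the choice of the hardest remaining negative. It is not the ReAL loss itself.

```python
def filtered_triplet_loss(pos_sim, neg_sims, margin=0.2, threshold=0.9):
    """Hinge-style triplet ranking loss over filtered negatives.

    pos_sim:   similarity of the matched image-text pair.
    neg_sims:  similarities of candidate negatives to the same query.
    Negatives with similarity >= threshold are treated as likely
    semantic-consistent with the ground truth and are dropped.
    """
    kept = [s for s in neg_sims if s < threshold]
    if not kept:
        return 0.0  # nothing trustworthy to contrast against
    hardest = max(kept)  # hardest remaining negative
    return max(0.0, margin + hardest - pos_sim)
```

With `pos_sim=0.8` and `neg_sims=[0.95, 0.7, 0.3]`, the 0.95 candidate is filtered out and the loss is computed against 0.7 instead, so a near-duplicate caption does not push the matched pair apart.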