While spin-orbit interaction has been extensively studied, few investigations have reported on the interaction between orbital angular momenta (OAMs). In this work, we study a new type of orbit-orbit coupling between the longitudinal OAM and the transverse OAM carried by a three-dimensional (3D) spatiotemporal optical vortex (STOV) in the process of tight *** 3D STOV possesses orthogonal OAMs in the x-y, t-x, and y-t planes, and is preconditioned to overcome the spatiotemporal astigmatism effect. x, y, and t are the axes in the spatiotemporal *** corresponding focused wavepacket is calculated by employing the Debye diffraction theory, showing that a phase singularity ring is generated by the interactions among the transverse and longitudinal vortices in the highly confined *** Fourier-transform decomposition of the Debye integral is employed to analyze the mechanism of the orbit-orbit *** is the first revelation of coupling between the longitudinal OAM and the transverse OAM, paving the way for potential applications in optical trapping, laser machining, nonlinear light-matter interactions, and more.
The Message Passing Interface (MPI) is a widely accepted standard for parallel computing on distributed ***, MPI implementations can contain defects that impact the reliability and performance of parallel applications. Detecting and correcting these defects is crucial, yet there is a lack of published models specifically designed for correcting MPI defects. To address this, we propose a model for detecting and correcting MPI defects (DC_MPI), which aims to detect and correct defects in various types of MPI communication, including blocking point-to-point (BPTP), nonblocking point-to-point (NBPTP), and collective communication (CC). The defects addressed by the DC_MPI model include illegal MPI calls, deadlocks (DL), race conditions (RC), and message mismatches (MM). To assess the effectiveness of the DC_MPI model, we performed experiments on a dataset consisting of 40 MPI codes. The results indicate that the model achieved a detection rate of 37 out of 40 codes, resulting in an overall detection accuracy of 92.5%. Additionally, the execution duration of the DC_MPI model ranged from 0.81 to 1.36 s. These findings show that the DC_MPI model is useful in detecting and correcting defects in MPI implementations, thereby enhancing the reliability and performance of parallel applications. The DC_MPI model fills an important research gap and provides a valuable tool for improving the quality of MPI-based parallel computing systems.
Fiber materials are key materials that have changed human history and promoted the progress of human civilization. In ancient times, humans used feathers and animal skins for clothing, and later they widely employed natural fibers such as cotton, hemp, silk, and wool to make fabrics (Fig. 1a). Chinese ancestors had mastered the art of natural fiber weaving as early as the Neolithic *** thousand years ago, people were already familiar with and adept at techniques for spinning natural fibers [1].
Within the electronic design automation (EDA) domain, artificial intelligence (AI)-driven solutions have emerged as formidable tools, yet they typically augment rather than redefine existing methodologies. These solutions often repurpose deep learning models from other domains, such as vision, text, and graph analytics, applying them to circuit design without tailoring to the unique complexities of electronic circuits. Such an “AI4EDA” approach falls short of achieving a holistic design synthesis and understanding, overlooking the intricate interplay of electrical, logical, and physical facets of circuit data. This study argues for a paradigm shift from AI4EDA towards AI-rooted EDA from the ground up, integrating AI at the core of the design process. Pivotal to this vision is the development of a multimodal circuit representation learning technique, poised to provide a comprehensive understanding by harmonizing and extracting insights from varied data sources, such as functional specifications, register-transfer level (RTL) designs, circuit netlists, and physical layouts. We champion the creation of large circuit models (LCMs) that are inherently multimodal, crafted to decode and express the rich semantics and structures of circuit data, thus fostering more resilient, efficient, and inventive design methodologies. Embracing this AI-rooted philosophy, we foresee a trajectory that transcends the current innovation plateau in EDA, igniting a profound “shift-left” in electronic design methodology. The envisioned advancements herald not just an evolution of existing EDA tools but a revolution, giving rise to novel instruments of design: tools that promise to radically enhance design productivity and inaugurate a new epoch where the optimization of circuit performance, power, and area (PPA) is achieved not incrementally, but through leaps that redefine the benchmarks of electronic systems' capabilities.
In large-scale information systems, storage device performance continues to improve while workloads expand in size and access characteristics. This growth puts tremendous pressure on caches and storage hierarchy in te...
When designing solar systems and assessing the effectiveness of their many uses, estimating sun irradiance is a crucial first *** study examined three approaches (ANN, GA-ANN, and ANFIS) for estimating daily global solar radiation (GSR) in the south of Algeria: Adrar, Ouargla, and *** proposed hybrid GA-ANN model, based on genetic algorithm-based optimization, was developed to improve the ANN *** GA-ANN and ANFIS models performed better than the standalone ANN-based model, with GA-ANN being better suited for forecasting at all sites; its best values in the testing phase were Coefficient of Determination (R=0.9005), Mean Absolute Percentage Error (MAPE=8.40%), and Relative Root Mean Square Error (rRMSE=12.56%). Nevertheless, the ANFIS model outperformed the GA-ANN model in forecasting daily GSR, with the best values of the indicators in the testing phase being R=0.9374, MAPE=7.78%, and rRMSE=10.54%. Generally, we may conclude that the performance of the initial standalone ANN model in forecasting solar radiation has been improved, and the results obtained after injecting the genetic algorithm into the ANN to optimize its weights were *** model can be used to forecast daily GSR in dry climates and other climates and may also be helpful in selecting solar energy system installations and sizes.
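The error indicators quoted in this abstract can be illustrated with a minimal sketch. The abstract does not give its formulas, so the definitions below are the commonly used ones and are an assumption here: MAPE as the mean absolute relative error, rRMSE as the RMSE divided by the mean observed value, and R as the Pearson correlation between observed and predicted series. The sample values are invented for illustration and are not data from the study.

```python
import math

def mape(observed, predicted):
    """Mean absolute percentage error, in percent (assumed definition)."""
    n = len(observed)
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / n

def rrmse(observed, predicted):
    """RMSE normalized by the mean observed value, in percent (assumed definition)."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return 100.0 * rmse / (sum(observed) / n)

def pearson_r(observed, predicted):
    """Pearson correlation coefficient between the two series."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    spread_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed))
    spread_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted))
    return cov / (spread_o * spread_p)

# Hypothetical daily GSR values, made up for this example only.
observed = [20.0, 25.0, 30.0, 25.0]
predicted = [22.0, 24.0, 27.0, 26.0]

mape_value = mape(observed, predicted)
rrmse_value = rrmse(observed, predicted)
r_value = pearson_r(observed, predicted)
print(f"MAPE={mape_value:.2f}%  rRMSE={rrmse_value:.2f}%  R={r_value:.4f}")
```

Lower MAPE and rRMSE and a higher R indicate a closer fit, which is the sense in which the abstract compares the ANN, GA-ANN, and ANFIS models.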
The flourish of deep learning frameworks and hardware platforms has been demanding an efficient compiler that can shield the diversity in both software and hardware in order to provide application *** the existing deep learning compilers, TVM is well known for its efficiency in code generation and optimization across diverse hardware *** the meanwhile, the Sunway many-core processor renders itself as a competitive candidate for its attractive computational power in both scientific computing and deep learning *** paper combines the trends in these two ***, we propose swTVM that extends the original TVM to support ahead-of-time compilation for architecture requiring cross-compilation such as *** addition, we leverage the architecture features during the compilation such as core group for massive parallelism, DMA for high bandwidth memory transfer and local device memory for data locality, in order to generate efficient codes for deep learning workloads on *** experiment results show that the codes generated by swTVM achieve 1.79x improvement of inference latency on average compared to the state-of-the-art deep learning framework on Sunway, across eight representative *** work is the first attempt from the compiler perspective to bridge the gap of deep learning and Sunway processor particularly with productivity and efficiency in *** believe this work will encourage more people to embrace the power of deep learning and Sunway many-core processor.
Die-stacked dynamic random access memory (DRAM) caches are increasingly advocated to bridge the performance gap between the on-chip cache and the main *** fully realize their potential, it is essential to improve DRAM cache hit rate and lower its cache hit *** order to take advantage of the high hit-rate of set-association and the low hit latency of direct-mapping at the same time, we propose a partial direct-mapped die-stacked DRAM cache called *** design is motivated by a key observation, i.e., applying a unified mapping policy to different types of blocks cannot achieve a high cache hit rate and low hit latency *** address this problem, P3DC classifies data blocks into leading blocks and following blocks, and places them at static positions and dynamic positions, respectively, in a unified set-associative *** also propose a replacement policy to balance the miss penalty and the temporal locality of different *** addition, P3DC provides a policy to mitigate cache thrashing due to block type *** results demonstrate that P3DC can reduce the cache hit latency by 20.5% while achieving a similar cache hit rate compared with typical set-associative caches. P3DC improves the instructions per cycle (IPC) by up to 66% (12% on average) compared with the state-of-the-art direct-mapped cache, BEAR, and by up to 19% (6% on average) compared with the tag-data decoupled set-associative cache, DEC-A8.
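The hit-rate side of the trade-off named in this abstract can be illustrated with a toy simulation. This is a generic sketch of direct-mapped versus set-associative placement with LRU replacement, not the P3DC design; the class names, the cache sizes, and the access trace are all hypothetical. It does not model hit latency, where direct mapping has the advantage because only one location needs to be checked per access.

```python
from collections import OrderedDict

class DirectMappedCache:
    """Each block address maps to exactly one slot."""
    def __init__(self, num_sets):
        self.num_sets = num_sets
        self.tags = [None] * num_sets

    def access(self, block_addr):
        index = block_addr % self.num_sets
        tag = block_addr // self.num_sets
        hit = self.tags[index] == tag
        self.tags[index] = tag  # on a miss, the resident block is evicted
        return hit

class SetAssociativeCache:
    """Each block address maps to a set and may occupy any way in it (LRU)."""
    def __init__(self, num_sets, ways):
        self.num_sets = num_sets
        self.ways = ways
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, block_addr):
        index = block_addr % self.num_sets
        tag = block_addr // self.num_sets
        cache_set = self.sets[index]
        if tag in cache_set:
            cache_set.move_to_end(tag)      # mark as most recently used
            return True
        if len(cache_set) >= self.ways:
            cache_set.popitem(last=False)   # evict the least recently used tag
        cache_set[tag] = True
        return False

def hit_rate(cache, trace):
    hits = sum(1 for addr in trace if cache.access(addr))
    return hits / len(trace)

# Two blocks that collide in the direct-mapped cache, accessed alternately.
trace = [0, 8, 0, 8, 0, 8]
dm_rate = hit_rate(DirectMappedCache(num_sets=8), trace)          # 8 blocks total
sa_rate = hit_rate(SetAssociativeCache(num_sets=4, ways=2), trace)  # 8 blocks total
print(f"direct-mapped: {dm_rate:.2f}, 2-way set-associative: {sa_rate:.2f}")
```

With equal capacity, the two conflicting blocks keep evicting each other in the direct-mapped cache, while the 2-way set holds both after the initial misses.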
In recent times, appropriate decision-making in challenging and critical situations has been very well supported by multicriteria decision-making (MCDM) methods. The technique for order of preference by similarity to ...
Learning from demonstration (LfD) allows for the effective transfer of human manipulation skills to a robot by building a model that represents these skills based on a limited number of demonstrated ***, a skill learning model that can comprehensively satisfy multiple requirements, such as computational complexity, modeling accuracy, trajectory smoothness, and robustness, is still ***, this work aims to provide such a model by employing fuzzy ***, we introduce an LfD model named Takagi-Sugeno-Kang fuzzy system-based movement primitives (TSKFMPs), which exploits the advantages of the fuzzy theory for effective robotic imitation learning of human *** work formulates the TSK fuzzy system and gradient descent (GD) as imitation learning models, leveraging recent advancements in GD-based optimization for fuzzy *** study takes a two-step strategy. (i) The input-output relationships of the model are established using TSK fuzzy systems based on demonstration *** this way, the skill is encoded by the model parameter in the latent space. (ii) GD is used to optimize the model parameter to increase the modeling accuracy and trajectory *** further explain how learned trajectories are adapted to new task scenarios through local *** conduct multiple tests using an open dataset to validate our method, and the results demonstrate performance comparable with those of other ***, we implement it in a real-world case study.