Search Results - Inner Mongolia University Library

arXiv, 2024

Authors: Kozyrev, Sergei V.; Lopatin, Ilya A.; Pechen, Alexander N. Affiliations: Steklov Mathematical Institute of Russian Academy of Sciences, Gubkina St. 8, Moscow 119991, Russia; Ivannikov Institute for System Programming, The Russian Academy of Sciences, Alexandra Solzhenitsyna str. 25, Moscow 109004, Russia

While many works apply machine learning, few attempt to provide theoretical justification for its efficiency. In this work, overfitting control (or the generalization property) in machine learning is explained using analogies from physics and biology. For stochastic gradient Langevin dynamics, we show that the Eyring formula of kinetic theory allows one to control overfitting in the algorithmic stability approach, where wide minima of the risk function with low free energy correspond to low overfitting. For the generative adversarial network (GAN) model, we establish an analogy between GANs and the predator–prey model in biology. Applying this analogy allows us to explain the selection of wide likelihood maxima and overfitting reduction for GANs. Copyright © 2024, The Authors. All rights reserved.

Keywords: Generative adversarial networks

Analytical and numerical methods for Zhukovsky airfoils aerodynamics coefficients

2022 Ivannikov Open Conference, ISPRAS 2022

Authors: Petrov, A.G.; Sukhov, A.D.; Sibgatullin, I.N.; Britov, A.D. Affiliations: Ishlinsky Institute for Problems in Mechanics RAS, Prospekt Vernadskogo 101-1, Moscow 119526, Russia; Institute for System Programming of the Russian Academy of Sciences, 25 Alexander Solzhenitsyn st., Moscow 109004, Russia

ISBN: (print) 9798350398533

The two-dimensional problem of viscous laminar flow around Zhukovsky airfoils at an angle of attack is considered. Based on the local similarity approach, which was proposed by Kochin and Loytsyansky for the equations of the laminar boundary layer, we find the shear stresses at the airfoil and the coordinates of the separation points. Assuming the values of the velocities at the separation points to be equal, we find the value of the circulation. A complete solution to the problem of the velocity and pressure field outside the boundary layer is also constructed. The theoretical results are compared with the available experimental data and numerical simulations of the Navier-Stokes equations. © 2022 IEEE.

Keywords: Airfoils

Collecting Influencers: a Comparative Study of Online Network Crawlers

arXiv, 2024

Authors: Drobyshevskiy, Mikhail; Aivazov, Denis; Turdakov, Denis; Yatskov, Alexander; Varlamov, Maksim; Shayhelislamov, Danil. Affiliations: Ivannikov Institute for System Programming, The Russian Academy of Sciences, Moscow, Russia; Moscow Institute of Physics and Technology State University, Moscow, Russia; Lomonosov Moscow State University, Moscow, Russia

Online network crawling tasks require considerable effort from researchers to collect the data. One such task is the identification of important nodes, which has many applications ranging from viral marketing to the prevention of disease spread. Various crawling algorithms have been suggested, but their efficiency has not been studied well. In this paper we compare six known crawlers on the task of collecting a fraction of the most influential nodes of a graph. We analyze crawler behavior for four measures of node influence: node degree, k-coreness, betweenness centrality, and eccentricity. The experiments confirm that greedy methods perform best in many settings, but cases exist in which they are very inefficient. © 2024, CC BY-NC-ND.

Keywords: Web crawler

LLM-based Interactive Code Generation: Empirical Evaluation

Ivannikov ISPRAS Open Conference (ISPRAS)

Authors: Danil Shaikhelislamov; Mikhail Drobyshevskiy; Andrey Belevantsev. Affiliations: Moscow Institute of Physics and Technology (State University), Moscow, Russia; Ivannikov Institute for System Programming of the Russian Academy of Sciences, Moscow, Russia; Lomonosov Moscow State University, Moscow, Russia

ISBN: (digital) 9798331526023

ISBN: (print) 9798331526030

Recently, large language models (LLMs) pretrained on code have demonstrated strong capabilities in generating programs from informal natural language intent. However, LLM-generated code is prone to bugs. Developers interacting with LLMs seek trusted code and, ideally, clear indications of potential bugs and vulnerabilities. Verified code can mitigate potential business risks associated with adopting generated code. We use the model-agnostic framework CodePatchLLM, an extension for LLMs that utilizes Svace feedback to enhance code generation quality. We evaluate CodePatchLLM on four popular LLMs across three datasets. Our experiments show an average absolute reduction of 19.1% in static analyzer warnings for Java across all datasets and models, while preserving pass@1 code generation accuracy.

Keywords: Analytical models; Codes; Accuracy; Large language models; Computer bugs; Natural languages programming; Safety; Security; Reliability

Numerical simulation of irregular waves by HOS method

32nd International Ocean and Polar Engineering Conference, ISOPE 2022

Authors: Huang, Congyi; Xu, Ping; Wan, Decheng; Strijhak, Sergei. Affiliations: School of Naval Architecture, Ocean and Civil Engineering, Shanghai Jiao Tong University, Shanghai, China; Marine Design and Research Institute of China, Shanghai, China; Ivannikov Institute for System Programming of the Russian Academy of Sciences, Moscow, Russia

ISBN: (print) 9781880653814

Ships and offshore structures often encounter irregular waves in the ocean, so the simulation of irregular waves is both necessary and meaningful. The potential flow method is often used to simulate irregular waves because their evolution requires a long time and a large amount of calculation. The high-order spectral (HOS) method is a potential flow method with the advantage of high efficiency. In this paper, the HOS method is used to simulate the formation and evolution of irregular waves. First, the governing equations, boundary conditions and solving procedure of the HOS-ocean model based on the HOS method are introduced. Then the accuracy and stability of the HOS-ocean model are verified by a standard example. After that, the formation and evolution of two-dimensional and three-dimensional irregular waves are simulated based on the JONSWAP spectrum and the ITTC spectrum respectively. The frequency domain curves, time domain curves, maximum wave height, maximum wave steepness, average wave steepness and other parameters of the waves generated by the two wave spectra are compared, and the differences between the two wave spectra in generating irregular waves are analyzed. © 2022 by the International Society of Offshore and Polar Engineers (ISOPE).

Keywords: Potential flow

Using Lingvodoc platform for researching genetic and areal semantic shifts: the case of Ob-Ugric basic vocabulary

Ivannikov ISPRAS Open Conference (ISPRAS)

Authors: Idaliya Fedotova. Affiliations: Ivannikov Institute for System Programming of the Russian Academy of Sciences; HSE University, Moscow, Russia

The typology of semantic shifts has been a focus of linguistic typology for the last 20 years. The emergence of cross-linguistic databases and linguistic platforms has taken the study of semantic change to a new level, as it has enlarged the sample of languages under investigation. Yet the languages of Russia are only scarcely represented in the global databases and do not make a substantial contribution to this field. The LingvoDoc platform, which stores unique materials on the languages of Russia, can fill this gap after certain enhancements.

Keywords: Vocabulary; Databases; Search methods; Semantics; Linguistics; Genetics

Docmarking: Real-Time Screen-Cam Robust Document Image Watermarking

arXiv, 2023

Authors: Yakushev, Aleksey; Markin, Yury; Obydenkov, Dmitry; Frolov, Alexander; Fomin, Stas; Akopyan, Manuk; Kozachok, Alexander; Gaynov, Arthur. Affiliations: Ivannikov Institute for System Programming of the RAS, Russia; Russian Federation Security Guard Service Federal Academy, Russia; Ministry of Defence, The Russian Federation, Moscow, Russia

This paper focuses on the investigation of confidential document leaks in the form of screen photographs. The proposed approach does not try to prevent the leak in the first place but rather aims to determine its source. The method works by displaying on the screen a unique identifying watermark as a semi-transparent image that is almost imperceptible to the human eye. The watermark image is static and stays on the screen all the time, so the watermark is present in every captured photograph of the screen. The key components of the approach are three neural networks. The first network generates an image with an embedded message in such a way that the image is almost invisible when displayed on the screen. The other two neural networks are used to retrieve the embedded message with high accuracy. The developed method was comprehensively tested on different screens and cameras. Test results showed high efficiency of the proposed approach. © 2023, CC BY-NC-ND.

Keywords: Image watermarking

Finetuning BERT on Partially Annotated NER Corpora

Ivannikov ISPRAS Open Conference (ISPRAS)

Authors: Viktor Scherbakov; Vladimir Mayorov. Affiliations: Ivannikov Institute for System Programming of the Russian Academy of Sciences; Lomonosov Moscow State University

Most Named Entity Recognition (NER) models operate under the assumption that training datasets are fully labelled. While this holds for established datasets like CoNLL 2003 and OntoNotes, it is sometimes not feasible to obtain a complete dataset annotation. Such situations may occur, for instance, after selective annotation of entities for cost reduction. This work presents an approach to finetuning BERT on such partially labelled datasets using self-supervision and label preprocessing. Our approach outperforms the previous LSTM-based label preprocessing baseline, significantly improving the performance on poorly labelled datasets. We demonstrate that finetuning RoBERTa with our approach on the CoNLL 2003 dataset with only 10% of total entities labelled is enough to reach the performance of the baseline trained on the same dataset with 50% of the entities labelled.

Keywords: Training; Costs; Annotations; Bit error rate

PyTabby: A Docreader’s module for extracting text and tables from PDF with a text layer

4th Scientific-Practical Workshop Information Technologies: Algorithms, Models, systems, ITAMS 2021

Authors: Mikhailov, Andrey A.; Shigarov, Alexey; Kozlov, Ilya S. Affiliations: Matrosov Institute for System Dynamics and Control Theory of Siberian Branch of Russian Academy of Sciences, Irkutsk 664033, Russia; Ivannikov Institute for System Programming of Russian Academy of Sciences, 25 Alexander Solzhenitsyn St., Moscow 109004, Russia

This paper presents a complete solution for the extraction of textual information and tables from PDF with a text layer. The presented solution consists of two parts: PyTabby, a tool for extracting text and tables from PDF with a complex background and layout, and a Python wrapper module for the Docreader tool. The PyTabby tool extracts text and tables from the low-level representation of the PDF format. This makes it possible to use additional information that is absent from scanned documents and improves quality and performance compared with Optical Character Recognition (OCR) methods. The presented solution is incorporated into the Docreader tool to parse PDF files with a text layer and is used as a part of the TALISMAN technology for social analytics. © 2021 Copyright for this paper by its authors.

Keywords: Optical character recognition

Nesterov’s method of dichotomy via Order Oracle: The problem of optimizing a two-variable function on a square

arXiv, 2024

Authors: Chervonenkis, Boris; Krasnov, Andrei; Gasnikov, Alexander; Lobanov, Aleksandr. Affiliations: Moscow Institute of Physics and Technology, Russia; Institute for Information Transmission Problems RAS, Moscow, Russia; Innopolis University, Russia; Skolkovo Institute of Science and Technology, Russia; The Institute for System Programming, The Russian Academy of Sciences, Russia

The challenges of black-box optimization arise from imprecise responses and limited output information. This article describes new results on optimizing multivariable functions using an Order Oracle, which provides access only to the order between function values, and only up to some small errors. We obtained convergence rate estimates for the one-dimensional search method (golden ratio method) under the condition of oracle inaccuracy, as well as convergence results for the algorithm on a "square" (also with noise), which outperforms its alternatives. The results obtained are similar to those in problems with oracles providing significantly more information about the optimized function. Additionally, the practical application of the algorithm has been demonstrated in maximizing a preference function whose parameters are the acidity and sweetness of a drink. This function is expected to be convex or at least quasi-convex. Copyright © 2024, The Authors. All rights reserved.

Keywords: Optimization algorithms
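
The one-dimensional golden ratio search mentioned in the abstract above can be illustrated with a minimal sketch that uses only comparisons between function values. This is not the authors' algorithm: it assumes a noiseless oracle and a function that is unimodal on the interval, and the names `golden_section_order_oracle` and `prefers` are hypothetical, chosen for illustration only.

```python
import math
from typing import Callable


def golden_section_order_oracle(
    prefers: Callable[[float, float], bool],
    lo: float,
    hi: float,
    tol: float = 1e-6,
) -> float:
    """Approximate the minimizer on [lo, hi] using only an order oracle.

    prefers(a, b) must return True when f(a) <= f(b). The function values
    themselves are never observed (assumption: the oracle is exact).
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0  # 1/phi, about 0.618
    x1 = hi - inv_phi * (hi - lo)  # left interior point
    x2 = lo + inv_phi * (hi - lo)  # right interior point
    while hi - lo > tol:
        if prefers(x1, x2):
            # The minimizer lies in [lo, x2]; the old x1 becomes the new x2.
            hi, x2 = x2, x1
            x1 = hi - inv_phi * (hi - lo)
        else:
            # The minimizer lies in [x1, hi]; the old x2 becomes the new x1.
            lo, x1 = x1, x2
            x2 = lo + inv_phi * (hi - lo)
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Hypothetical hidden objective; the search sees comparisons only.
    def hidden(x: float) -> float:
        return (x - 0.3) ** 2

    print(golden_section_order_oracle(lambda a, b: hidden(a) <= hidden(b), 0.0, 1.0))
```

Each iteration needs a single oracle call, because one interior point is reused and the bracket shrinks by the factor 1/φ. To maximize instead, as in the preference-function application, the oracle would return True when the first argument is at least as good as the second.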
