Search Results - Inner Mongolia University Library

CATALOGING & CLASSIFICATION QUARTERLY, 2006, vol. 42, no. 2, pp. 21-34

Author: Agenbroad, James E. Affiliation: POB 291, Garrett Park, MD, USA

For finding nonroman script library materials, catalogs with romanized access points alone are inadequate because they are unfamiliar to those who seek these materials. Relevant writers are surveyed. Information technology and MARC have eliminated the need to rely on card filers who knew only the order of letters in the roman alphabet. Two improvements are suggested: expand the MARC character repertoire and add rules to AACR to allow nonroman access points. Other issues are briefly described. (C) 2006 by The Haworth Press, Inc. All rights reserved.

Keywords: Nonroman scripts; access points; MARC; Unicode; AACR

ShamFinder: An Automated Framework for Detecting IDN Homographs

ACM Internet Measurement Conference (IMC)

Authors: Suzuki, Hiroaki; Chiba, Daiki; Yoneya, Yoshiro; Mori, Tatsuya; Goto, Shigeki. Affiliations: Waseda Univ, Tokyo, Japan; NTT Secure Platform Labs, Tokyo, Japan; Japan Registry Serv, Tokyo, Japan; NICT, Tokyo, Japan; RIKEN AIP, Tokyo, Japan

ISBN: (print) 9781450369480

The internationalized domain name (IDN) is a mechanism that enables us to use Unicode characters in domain names. The set of Unicode characters contains several pairs of characters that are visually identical to each other; e.g., the Latin character 'a' (U+0061) and the Cyrillic character 'а' (U+0430). Visually identical characters such as these are generally known as homoglyphs. IDN homograph attacks, which are widely known, abuse Unicode homoglyphs to create lookalike URLs. Although the threat posed by IDN homograph attacks is not new, the recent rise of IDN adoption in both domain name registries and web browsers has resulted in the threat of these attacks becoming increasingly widespread, leading to large-scale phishing attacks such as those targeting cryptocurrency exchange companies. In this work, we developed a framework named "ShamFinder," which is an automated scheme to detect IDN homographs. Our key contribution is the automatic construction of a homoglyph database, which can be used for direct countermeasures against the attack and to inform users about the context of an IDN homograph. Using the ShamFinder framework, we perform a large-scale measurement study that aims to understand the IDN homographs that exist in the wild. On the basis of our approach, we provide insights into an effective countermeasure against the threats caused by the IDN homograph attack.
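
The homoglyph idea described in this abstract can be illustrated with a minimal Python sketch. The table `HOMOGLYPHS` and the helpers `skeleton` and `is_homograph` are hypothetical names introduced here for illustration; the table is a tiny hand-picked sample and is not the database that ShamFinder constructs.

```python
import unicodedata

# Hypothetical, hand-picked homoglyph table for illustration only;
# it is NOT the database built by ShamFinder.
HOMOGLYPHS = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
}


def skeleton(label: str) -> str:
    """Replace every known homoglyph with its Latin lookalike."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in label)


def is_homograph(candidate: str, reference: str) -> bool:
    """True if the labels differ as code points but share a skeleton."""
    return candidate != reference and skeleton(candidate) == skeleton(reference)


spoof = "\u0430pple"  # the first letter is Cyrillic, not Latin
print(unicodedata.name(spoof[0]))    # official name of the first code point
print(spoof.encode("punycode"))      # ASCII form that follows the "xn--" prefix
print(is_homograph(spoof, "apple"))  # compares the two labels via their skeletons
```

A real detector would need a far larger table and would have to account for fonts and rendering; this sketch only shows why two labels that look alike can be different code-point sequences.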

Keywords: DNS; IDN; homograph; Unicode; Homoglyph

Bangla Text Compression Based on Modified Lempel-Ziv-Welch Algorithm

International Conference on Electrical, Computer and Communication Engineering (ECCE)

Authors: Barua, Linkon; Dhar, Pranab Kumar; Alam, Lamia; Echizen, Isao. Affiliations: Chittagong Univ Engn & Technol, Dept CSE, Chittagong 4349, Bangladesh; Natl Inst Informat, Digital Content & Media Sci Res Div, Tokyo 1018430, Japan

ISBN: (print) 9781509056279

Text compression algorithms perform compression at the character level. Bangla text has some unique features, such as no distinct upper- and lower-case letters, consonant clusters (CC), and consonants with a dependent vowel sign (CV). The conventional Lempel-Ziv-Welch (LZW) algorithm is not suitable for compressing Bangla text. Therefore, in this paper, we propose a modified LZW (MLZW) algorithm which can compress Bangla text effectively and efficiently. In our proposed method, a dictionary with Unicode ranges from 1-90 is used for Bangla characters. The compression process starts by checking the input character. If the input character is part of a CC or CV, then the CC or CV is treated as a single character and searched for in the dictionary. If the character to be encoded is already in the dictionary, it is encoded with its dictionary index. Otherwise, the character is added to the dictionary and is encoded with its corresponding dictionary index. Simulation results indicate that the proposed MLZW algorithm compresses Bangla text effectively and efficiently. We observed that the proposed MLZW provides a higher compression rate, approximately 3% for the dictionary index and 33% for the output sequence, compared with the LZW algorithm.
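
The idea of treating a consonant cluster or a consonant plus vowel sign as one dictionary symbol can be sketched as follows. This is a simplified illustration built on textbook LZW, not the authors' MLZW: the grouping rule in `tokenize` (attach combining marks, and any character following a hasanta, to the previous symbol) and the choice to seed the dictionary from the input's distinct symbols are assumptions made here.

```python
import unicodedata

VIRAMA = "\u09cd"  # BENGALI SIGN VIRAMA (hasanta), joins consonants into a cluster


def tokenize(text):
    """Group each CC / CV sequence into one symbol (assumed, simplified rule)."""
    tokens = []
    for ch in text:
        is_mark = unicodedata.category(ch).startswith("M")
        if tokens and (is_mark or tokens[-1].endswith(VIRAMA)):
            tokens[-1] += ch  # extend the current cluster
        else:
            tokens.append(ch)
    return tokens


def lzw_encode(tokens):
    """Textbook LZW over symbols; returns (codes, initial alphabet)."""
    alphabet = list(dict.fromkeys(tokens))  # distinct symbols, first-seen order
    table = {(sym,): i for i, sym in enumerate(alphabet)}
    codes, current = [], ()
    for sym in tokens:
        candidate = current + (sym,)
        if candidate in table:
            current = candidate
        else:
            codes.append(table[current])
            table[candidate] = len(table)
            current = (sym,)
    if current:
        codes.append(table[current])
    return codes, alphabet


def lzw_decode(codes, alphabet):
    """Inverse of lzw_encode, rebuilding the table on the fly."""
    if not codes:
        return []
    table = [(sym,) for sym in alphabet]
    prev = table[codes[0]]
    out = list(prev)
    for code in codes[1:]:
        # A code not yet in the table can only be prev + first symbol of prev.
        entry = table[code] if code < len(table) else prev + (prev[0],)
        table.append(prev + (entry[0],))
        out.extend(entry)
        prev = entry
    return out


text = "\u09ac\u09be\u0982\u09b2\u09be " * 3  # the word "Bangla" repeated
codes, alphabet = lzw_encode(tokenize(text))
print(len(text), len(tokenize(text)), len(codes))
```

Because clusters are single symbols, the encoder sees a shorter input sequence than a per-code-point LZW would, which is the intuition behind the modification described in the abstract.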

Keywords: Bangla Text; ASCII code; LZW; Data Compression; Unicode; Consonant Cluster

A Method for Text Steganography Using Malayalam Text

International Conference on Information and Communication Technologies (ICICT)

Authors: Vidhya, P. M.; Paul, Varghese. Affiliations: Mahatma Gandhi Univ, Sch Comp Sci, Kottayam 686560, Kerala, India; Cochin Univ Sci & Technol, Dept Informat Technol, Cochin 682022, Kerala, India

Recent research on information hiding concentrates mostly on linguistic steganography. In this paper, a steganography method is proposed that uses an Indian regional language, Malayalam. The proposed method consists of a custom Unicode-based technique with embedding based on indexing, i.e., the original message is encoded to Malayalam text with custom Unicode values generated for the Malayalam text. A comparison of the proposed method against an existing method revealed that the proposed steganography method is more precise in both the encoding and decoding processes. The method achieved a precision rate of 0.95 and a decoding rate of 0.81. (C) 2015 The Authors. Published by Elsevier B.V.
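
The abstract does not spell out the indexing scheme, so the following is only a generic sketch of what index-based embedding into Malayalam characters could look like. Everything in it is an assumption made for illustration: the 16-letter table `ALPHABET` (Malayalam consonants U+0D15 to U+0D24), the nibble-per-letter mapping, and the function names `embed` and `extract`.

```python
# Hypothetical index table: 16 Malayalam consonants, one per 4-bit value.
ALPHABET = [chr(cp) for cp in range(0x0D15, 0x0D25)]
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def embed(message: str) -> str:
    """Map each UTF-8 byte of the message to two Malayalam letters."""
    letters = []
    for byte in message.encode("utf-8"):
        letters.append(ALPHABET[byte >> 4])    # high nibble selects a letter
        letters.append(ALPHABET[byte & 0x0F])  # low nibble selects a letter
    return "".join(letters)


def extract(stego: str) -> str:
    """Recover the message by looking each letter's index up again."""
    idx = [INDEX[ch] for ch in stego]
    data = bytes((hi << 4) | lo for hi, lo in zip(idx[0::2], idx[1::2]))
    return data.decode("utf-8")


hidden = embed("hi")
print(hidden, extract(hidden))
```

This toy mapping produces letter strings rather than natural-looking Malayalam text, so it shows only the index lookup in both directions, not the cover-text generation a practical scheme would need.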

Keywords: Information hiding; text steganography; Unicode; Malayalam; index

PunyVis: A Visual Analytics Approach for Identifying Homograph Phishing Attacks

IEEE Symposium on Visualization for Cyber Security (VizSec)

Authors: Fouss, Brett; Ross, Dennis M.; Wollaber, Allan B.; Gomez, Steven R. Affiliation: MIT Lincoln Lab, Cambridge, MA 02139, USA

ISBN: (print) 9781728138763

Attackers seeking to deceive web users into visiting malicious websites can exploit limitations of the tools intended to help browsers translate domain names containing non-ASCII characters, or internationalized domain names (IDNs). These attacks, called homograph phishing, involve registering Unicode domain names that are visually similar to legitimate ones but direct users to distinct servers. Tools exist to identify when domains use non-ASCII characters, which get translated by the Punycode protocol to work with the Domain Name System (DNS); however, these tools cannot automatically distinguish between benign use cases and ones with malicious intent, leading to high rates of false-positive alerts and increasing the workload of analysts looking for evidence of homograph phishing. To address this problem, we present PunyVis, a visual analytics system for exploring and identifying potential homograph attacks on large network datasets. By targeting instances of Punycode that use easily-confusable ASCII characters to spoof popular websites, PunyVis quickly condenses large datasets into a small number of potentially malicious records. Using the interactive tool, analysts can evaluate potential phishing instances and view supporting information from multiple data sources, as well as gain insight about overall risk and threat regarding homograph attacks. We demonstrate how PunyVis supports analysts in a case study with domain experts, and identify divergent analysis strategies and the need for interactions that support how analysts begin exploration and pivot around hypotheses. Finally, we discuss design implications and opportunities for cyber visual analytics.
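
The first filtering step mentioned in this abstract, spotting Punycode-encoded labels in a list of domains, can be sketched with Python's built-in `punycode` codec. The function names `decode_label` and `flag_idn_records` are hypothetical, and the sketch does not reproduce PunyVis itself; it only shows how an `xn--` label maps back to its Unicode form.

```python
ACE_PREFIX = "xn--"  # prefix that marks a Punycode-encoded DNS label


def decode_label(label: str) -> str:
    """Return the Unicode form of one DNS label (unchanged if plain ASCII)."""
    if label.lower().startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label


def flag_idn_records(domains):
    """Return (original, decoded) pairs for domains with any Punycode label."""
    flagged = []
    for domain in domains:
        labels = domain.split(".")
        if any(lab.lower().startswith(ACE_PREFIX) for lab in labels):
            decoded = ".".join(decode_label(lab) for lab in labels)
            flagged.append((domain, decoded))
    return flagged


# Build a sample record: "apple" whose first letter is Cyrillic U+0430.
spoof = ACE_PREFIX + "\u0430pple".encode("punycode").decode("ascii") + ".com"
for original, decoded in flag_idn_records(["example.com", spoof]):
    print(original, "->", decoded)
```

As the abstract notes, flagging every Punycode label yields many benign hits; a further comparison against popular site names would be needed to narrow the list.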

Keywords: visual analytics; visualization design; cyber security; human factors; homograph phishing; Unicode

Language translation for file paths

DIGITAL INVESTIGATION, 2013, vol. 10, pp. S78-S86

Authors: Rowe, Neil C.; Schwamm, Riqui; Garfinkel, Simson L. Affiliation: US Navy Postgrad Sch, Monterey, CA 93943, USA

Forensic examiners are frequently confronted with content in languages that they do not understand, and they could benefit from machine translation into their native language. But automated translation of file paths is a difficult problem because of the minimal context for translation and the frequent mixing of multiple languages within a path. This work developed a prototype implementation of a file-path translator that first identifies the language for each directory segment of a path, and then translates to English those segments that are neither already English nor artificial words. Brown's LA-Strings utility for language identification was tried, but its performance was found inadequate on short strings, and it was supplemented with clues from dictionary lookup, Unicode character distributions for languages, country of origin, and language-related keywords. To provide better data for language inference, words used in each directory over a large corpus were aggregated for analysis. The resulting directory-language probabilities were combined with those for each path segment from dictionary lookup and character-type distributions to infer the segment's most likely language. Tests were done on a corpus of 50.1 million file paths looking for 35 different languages. Tests showed 90.4% accuracy on identifying languages of directories and 93.7% accuracy on identifying languages of directory/file segments of file paths, even after excluding 44.4% of the paths as obviously English or untranslatable. Two of seven proposed language clues were shown to impair directory-language identification. Experiments also compared three translation methods: the Systran translation tool, Google Translate, and word-for-word substitution using dictionaries. Google Translate usually performed the best, but all still made errors with European languages and a significant number of errors with Arabic and Chinese. Published by Elsevier Ltd.
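
The combination of a directory-level language prior with a character-distribution clue can be sketched in a naive-Bayes style. The likelihood table `SCRIPT_LIKELIHOOD`, its values, and the function names are all invented for illustration; the paper's actual clues and probabilities are not given in this abstract.

```python
import unicodedata
from collections import Counter

# Hypothetical likelihoods P(dominant script | language); the values are
# invented for illustration and do not come from the paper.
SCRIPT_LIKELIHOOD = {
    "LATIN": {"en": 0.90, "ar": 0.05, "zh": 0.05},
    "ARABIC": {"en": 0.01, "ar": 0.98, "zh": 0.01},
    "CJK": {"en": 0.01, "ar": 0.01, "zh": 0.98},
}


def dominant_script(segment):
    """Most frequent script among the letters, taken from Unicode names."""
    counts = Counter(
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in segment
        if ch.isalpha()
    )
    return counts.most_common(1)[0][0] if counts else None


def infer_language(segment, directory_prior):
    """Multiply the directory prior by the script clue, then normalize."""
    likelihood = SCRIPT_LIKELIHOOD.get(dominant_script(segment), {})
    scores = {
        lang: prior * likelihood.get(lang, 1.0)
        for lang, prior in directory_prior.items()
    }
    total = sum(scores.values())
    return {lang: score / total for lang, score in scores.items()}


prior = {"en": 0.6, "ar": 0.3, "zh": 0.1}  # assumed directory-level prior
segment = "\u0645\u0633\u062a\u0646\u062f\u0627\u062a"  # an Arabic-script segment
posterior = infer_language(segment, prior)
print(max(posterior, key=posterior.get), posterior)
```

Here the script clue outweighs an English-leaning directory prior; with more clues, each would contribute another multiplicative factor in the same way.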

Keywords: Digital forensics; File paths; Machine translation; Dictionary; Character distribution; Unicode; Naive Bayes inference

Aleph-bet, dits-and-dahs, zeros and ones: representing Hebrew in character code

INTERNET HISTORIES, 2022, vol. 6, no. 3, pp. 280-297

Author: Ramati, Ido. Affiliations: Hebrew Univ Jerusalem, Noah Mozes Dept Commun & Journalism, Jerusalem, Israel; Hebrew Univ Jerusalem, Program Cultural Studies, Jerusalem, Israel

One of the basic features facilitating communication on the Internet in a variety of languages is the Unicode code layout. It standardizes the representation of most of the world's writing systems on digital media, thus enabling the processing and transmission of information through such technologies. Unicode is a contemporary character code, and this paper traces its evolution out of previous code layouts, starting with Morse code in telegraphy. Focusing on the adaptations of character codes to Modern Hebrew, I show how representing languages in technology is intertwined with internal and transnational regional concerns, and argue that from its beginning character code has been a locus of struggle over power and sovereignty: first between colonial regimes and resistance movements, and then between global corporations and local agents.

Keywords: Character code; Unicode; Morse code; Hebrew; Latinization; code machines

Arabic mathematical e-documents

International Conference on TeX, XML and Digital Typography/25th Annual Meeting of the TeX-Users-Group

Authors: Eddahibi, M; Lazrek, A; Sami, K. Affiliation: Univ Cadi Ayyad, Fac Sci, Dept Comp Sci, Marrakech, Morocco

ISBN: (print) 3540228012

What problems do e-documents with mathematical expressions in an Arabic presentation present? In addition to the known difficulties of handling mathematical expressions based on Latin script on the Web, Arabic mathematical expressions flow from right to left and use specific symbols with a dynamic cursivity. How might we extend the capabilities of tools such as MathML in order to structure Arabic mathematical e-documents? Those are the questions this paper will deal with. It gives a brief description of some steps toward an extension of MathML to mathematics in Arabic exposition. In order to evaluate it, this extension has been implemented in Mozilla.

Keywords: mathematical expressions; Arabic mathematical presentation; multilingual documents; e-documents; Unicode; MathML; Mozilla

Typographical advocacy in the age of digital encoding

LANGUAGE POLICY, 2022, vol. 21, no. 4, pp. 527-543

Author: Or, Iair G. Affiliation: Tel Aviv Univ, Sch Educ, POB 20559, IL-6120402 Tel Aviv, Israel

This paper contemplates the concept of typographical advocacy, defined here as a variety of activities, strategies, and policies designed to increase or enhance language support in computing systems, facilitating typing and displaying texts in local languages. Focusing on the cases of Spanish and Paraguayan Guarani, the paper traces some existing efforts in the domain of typographical advocacy and strives to explain their dynamics. An attempt is made to examine both situations in which typographical advocacy is visible and explicit (as in the case of Spanish) and cases in which advocacy seems negligible (as in the case of Guarani). A wide range of enabling and hindering factors are examined, such as the relations between tech companies, nation-state institutions, consumers, and the habitus created by the use of technology. Through these examples, the rationale for advocacy is explored as well as alternative courses of action in cases where advocacy is not desired or fails to achieve its goals. A case will be made for raising the awareness of typographical issues in language planning and policy (LPP) both as part of broader multilingual awareness and as a tool for solving practical language problems in times of increased dependency on computing and mobile devices.

Keywords: Typographical advocacy; Digital encoding; Spanish; Guarani; Orthography; Language planning; Language and technology; Unicode; Center vs periphery

Handling math expressions in economics: recoding spreadsheet teaching tool of growth models

INTERACTIVE LEARNING ENVIRONMENTS, 2017, vol. 25, no. 1, pp. 98-112

Authors: Moro-Egido, Ana I.; Pedauga, Luis E. Affiliation: Univ Granada, Dept Econ, Granada, Spain

In the present paper, we develop a teaching methodology for economic theory. The main contribution of this paper lies in combining the interactive characteristics of spreadsheet programs such as Excel with the Unicode plain-text linear format for mathematical expressions. The advantage of the Unicode standard lies in the ease of writing and reading mathematical expressions. In this sense, our proposal offers an easily readable and writable way to handle math expressions when interactive spreadsheets are designed and used in Economics teaching. The resulting nearly plain text can be used with few or no modifications in other numerical computing programs.

Keywords: Teaching tool; Unicode; spreadsheet; macroeconomic theory
