Layout analysis is the main component of a typical Document Image Analysis (DIA) system and plays an important role in pre-processing. However, document images in the Pashto language have not been explored so far. This research examines, for the first time, Pashto text together with graphics and proposes a deep learning-based classifier that can detect Pashto text and graphics in each document. Another notable contribution of this research is the creation of a real dataset containing more than 1,000 camera-captured images of Pashto documents. To this dataset, we applied a convolutional neural network (CNN), following a deep learning approach. Our method is based on the development of the advanced and classical variant of Faster R-CNN called Single-Shot Detector (SSD). The evaluation was performed on the 300 images of the test set. In this way, we achieved a mean average precision (mAP) of 84.90%.
ISBN: (print) 9781479947195
Security risks posed by web page content can no longer be ignored. Malicious scripts are a major challenge currently facing website security. According to data from the Google Research Centre, more than 10% of web pages are malicious. In China in particular, the proportion of malicious web pages has reached 43.21%. This paper presents a detection system for locating malicious scripts in web pages. It acquires and builds a malicious-code feature base, a base of hidden-link URLs, and similar resources from safety data published on security research websites. A web crawler collects web page source code, and a classification learning algorithm is used to train the classifier. The classification results are then evaluated and improved.
ISBN: (print) 9781728162515
Bangla is one of the world's most widely spoken languages, but few language (or "script") automation solutions have been reported for it. To build an OCR system, it is very important to detect the language and the printing style so that the appropriate character recognition and segmentation modules can be run. This paper presents a novel solution that automatically detects the language (Bangla vs. English, in terms of script) and the printing style (printed vs. handwritten) of any given bilingual scanned document using multiple deep learning models.
ISBN: (digital) 9783319398624
ISBN: (print) 9783319398624; 9783319398617
The paper explores the mechanism of following scripts in inferencing and reasoning, both in humor and, generally, in natural language, as a way of creating a base for acquisition and use of scripts in the computer.