International Conference on Advanced Technologies, Computer Engineering and Science

A Review on Web Crawlers and Ontology-Based Crawlers

Yasemin Gültepe, A.B. ÖNCÜL, E. ALTINTAŞ, F. UĞUR

Abstract

Web crawlers are programs that automatically browse the web. Their purposes include navigating pages automatically, saving the links between source and target pages, labeling pages according to the words in those links, storing and indexing content, and collecting data for uses such as personalized advertising. Although the web crawling algorithm is simple, it faces various difficulties arising from the number of pages on the web and the resulting amount of data. The semantic web aims to produce machine-readable data and is intended to cope with the quantity of data generated. Ontologies are a pivotal resource for semantic web applications. Ontology-based crawlers scan the web by focusing on pages related to a specific domain ontology. The main advantage of ontology-based web crawlers over other crawlers is that they need no relevance feedback or training procedure in order to operate intelligently. In addition, the scanning process is expected to give more effective and efficient results with respect to the number of documents retrieved. In short, an ontology-based web crawler operates intelligently and efficiently without requiring relevance feedback. In this study, traditional and ontology-based web crawler approaches and their infrastructure are examined, and the differences between ontology-based and traditional web crawlers are investigated. A brief summary of the literature on the subject is also included.
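To make the idea of focusing a crawl with an ontology concrete, the following is a minimal sketch and not the method of any system reviewed in the paper. It assumes a toy in-memory "web" (`TOY_WEB`) instead of real HTTP requests, and it reduces the domain ontology to a hypothetical dictionary of weighted concept terms (`ONTOLOGY_TERMS`). The function names, the weights, and the threshold are illustrative assumptions.

```python
import heapq
import re

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "a": ("Web crawlers index pages", ["b", "c"]),
    "b": ("An ontology defines concepts and semantic relations", ["d"]),
    "c": ("Cheap flights and hotel deals", ["e"]),
    "d": ("Semantic web ontology languages such as OWL", []),
    "e": ("More hotel deals", []),
}

# Hypothetical domain ontology, reduced to weighted concept terms.
ONTOLOGY_TERMS = {"ontology": 1.0, "semantic": 0.8, "owl": 0.6, "crawlers": 0.4}


def relevance(text, terms):
    """Sum the weights of the ontology terms that occur in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sum(weight for term, weight in terms.items() if term in words)


def focused_crawl(web, seed, terms, threshold=0.3, max_pages=10):
    """Visit pages best-first, expanding links only from relevant pages."""
    frontier = [(0.0, seed)]  # min-heap of (negated priority, url)
    seen = {seed}
    relevant = []
    while frontier and len(relevant) < max_pages:
        _, url = heapq.heappop(frontier)
        text, links = web[url]
        score = relevance(text, terms)
        if score < threshold:
            continue  # off-topic page: its links are not followed
        relevant.append((url, score))
        for link in links:
            if link in web and link not in seen:
                seen.add(link)
                # A link inherits the score of the page it was found on.
                heapq.heappush(frontier, (-score, link))
    return relevant


if __name__ == "__main__":
    for url, score in focused_crawl(TOY_WEB, "a", ONTOLOGY_TERMS):
        print(url, round(score, 2))
```

On this toy data the crawl keeps pages `a`, `b`, and `d`, rejects the off-topic page `c`, and therefore never reaches `e`. No training step or user feedback is involved: the term weights alone steer the frontier, which is the property the abstract attributes to ontology-based crawlers. A real crawler would add fetching, parsing, politeness rules, and a richer use of the ontology's concept relations.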



Conference
International Conference on Advanced Technologies, Computer Engineering and Science
Keywords
Web crawlers, Challenges, Semantic, Ontology

Language
English

Subject
Computer Science
