Analysis of Automated Modern Web Crawling and Testing Tools and Their Possible Employment for Information Extraction

Tomas Grigalis; Leonardas Marozas; Lukas Radvilavičius

doi:10.3846/mla.2012.07

Abstract

The World Wide Web has become an enormous repository of data. Extracting, integrating and reusing these data has a wide range of applications, including meta-searching, comparison shopping, business intelligence tools and security analysis of information on websites. However, reaching information in modern Web 2.0 pages is a difficult task: the HTML tree is often modified dynamically by JavaScript, new data are added through asynchronous requests to the web server, and elements are positioned with cascading style sheets. The article reviews automated web testing tools and their possible use for information extraction tasks.

Article in Lithuanian

Article published: 2012-04-23

Keywords: data extraction; automated crawling; web testing; dynamic webpages; Quick Test Pro; Sahi; Selenium; Telerik; TestComplete; Watir; Windmill

Science – Future of Lithuania / Mokslas – Lietuvos Ateitis ISSN 2029-2341, eISSN 2029-2252
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 License.