Analysis of Automated Modern Web Crawling and Testing Tools and Their Possible Employment for Information Extraction
Leonardas Marozas
Lukas Radvilavičius
Abstract
The World Wide Web has become an enormous repository of data. Extracting, integrating and reusing these data has a wide range of applications, including meta-searching, comparison shopping, business intelligence tools and security analysis of information on websites. However, reaching information in modern Web 2.0 pages is a difficult task: the HTML tree is often modified dynamically by JavaScript code, new data are added through asynchronous requests to the web server, and elements are positioned with cascading style sheets. The article reviews automated web testing tools and their suitability for information extraction tasks.
Article in Lithuanian
Keywords: data extraction; automated crawling; web testing; dynamic webpages; Quick Test Pro; Sahi; Selenium; Telerik; TestComplete; Watir; Windmill
DOI: 10.3846/mla.2012.07
Science – Future of Lithuania / Mokslas – Lietuvos Ateitis ISSN 2029-2341, eISSN 2029-2252
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 License.