Ferret – Declarative web scraping

What is it?
Ferret is a web scraping system that aims to simplify data extraction from the web for uses such as UI testing, machine learning, and analytics. With its own declarative language, Ferret abstracts away the technical details and complexity of the underlying technologies, letting you focus on the data itself. It is extremely portable, extensible, and fast.
Show me some code

The following example demonstrates how to work with dynamic pages. First, we load the main Google Search page, type the search criteria into the input box, and click the search button. The click triggers a redirect, so we wait until the navigation completes. Once the results page has loaded, we iterate over all elements in the search results and assign the output to a variable. The final FOR loop filters out empty elements, which can appear when a selector is imprecise.
LET google = DOCUMENT("https://www.google.com/", true)

INPUT(google, 'input[name="q"]', "ferret")
CLICK(google, 'input[name="btnK"]')

WAIT_NAVIGATION(google)

LET result = (
    FOR result IN ELEMENTS(google, '.g')
        RETURN {
            title:
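The filtering step described in the walkthrough can be sketched outside Ferret. The Python below only illustrates the idea. It is not Ferret's implementation, and it is not FQL syntax. The sample `results` data, the `url` field, and the `drop_empty` helper are all hypothetical, and a `None` title stands in for a selector that matched nothing.

```python
# Hypothetical stand-in for scraped search results.
# None marks a field whose selector matched nothing on the page.
results = [
    {"title": "Ferret", "url": "https://example.com/a"},
    {"title": None, "url": None},
    {"title": "Another hit", "url": "https://example.com/b"},
]


def drop_empty(rows, key="title"):
    """Keep only the rows whose `key` field was actually extracted."""
    return [row for row in rows if row.get(key) is not None]


kept = drop_empty(results)
print(len(kept))
```

Filtering after extraction, rather than tightening the selector up front, keeps the extraction loop simple while still removing rows that carry no usable data.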


Original URL: http://feedproxy.google.com/~r/feedsapi/BwPx/~3/ZBCRp9BqUyA/ferret
