Show HN: Created Pickaxe, a SQL-like DSL for web scraping

README.md

Pickaxe uses SQL statements combined with CSS selectors to pick out text from a web page. If you know SQL and a little about CSS selectors and want to capture data from the web, this is the tool for you.

Downloads

Downloads are found here. Pickaxe requires .NET Framework 4.0. Pickaxe.zip contains the GUI editor and runs only on Windows. Pickaxe-Console.zip is the command-line version, which runs on Windows and, using Mono, on non-Windows platforms. See the Command Line section below.

Quickstart

Download Pickaxe.zip from above, unzip it, and double-click Pickaxe.Studio.exe to run the GUI editor. Below is a screenshot of the editor. A full runnable example that scrapes a forum I host is found here. Others can be found here.

Download Page

The download page statement returns a table with the columns url, nodes, date, and size. The statement below downloads aviation weather information for airports in Texas.

select *
from download
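
To make the shape of that result table more concrete, here is a rough Python sketch of what one row with the columns url, nodes, date, and size might look like. This is not Pickaxe's implementation: the names DownloadRow, build_row, and download_page are hypothetical, and "nodes" is approximated here as a flat list of opening tag names rather than a real parsed document.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from html.parser import HTMLParser
from urllib.request import urlopen


class _TagCollector(HTMLParser):
    """Collects opening tag names as a simple stand-in for parsed nodes."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


@dataclass
class DownloadRow:
    """Hypothetical row mirroring the columns url, nodes, date, size."""
    url: str
    nodes: list
    date: datetime
    size: int


def build_row(url, body):
    """Build one row from a URL and the raw bytes fetched from it."""
    collector = _TagCollector()
    collector.feed(body.decode("utf-8", errors="replace"))
    return DownloadRow(
        url=url,
        nodes=collector.tags,            # assumption: tag names only
        date=datetime.now(timezone.utc), # time the row was built
        size=len(body),                  # size of the body in bytes
    )


def download_page(url):
    """Fetch a page and return a one-row 'table' (a list of rows)."""
    with urlopen(url) as response:
        return [build_row(url, response.read())]
```

Calling build_row with a URL and a bytes body avoids the network entirely, which is handy for experimenting; download_page does the actual fetch and wraps the single row in a list so it can be treated like a table.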


Original URL: http://feedproxy.google.com/~r/feedsapi/BwPx/~3/dtRvt2EmpnI/pickaxe
