Scrapekit: Python Library for Writing Web Scrapers

Scrapekit is a Python library with common functionality for writing web scrapers.

Many web sites expose a large amount of data, and scraping it can help you build useful tools, services and analyses on top of that data. This can often be done with a simple Python script that uses few external libraries.

As your script grows, however, you will want to add more advanced features, such as caching of the downloaded pages, multi-threading to fetch many pieces of content at once, and logging to get a clear sense of which data failed to parse.
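
To make these three features concrete, here is a minimal sketch using only the Python standard library. It is not Scrapekit's API: the names `fetch_cached` and `scrape_all`, the in-memory dictionary cache, and the `fetch` and `parse` callables are all assumptions made for illustration.

```python
import logging
from concurrent.futures import ThreadPoolExecutor

log = logging.getLogger("scraper_sketch")  # hypothetical logger name

_cache = {}  # in-memory cache: URL -> downloaded page body


def fetch_cached(url, fetch):
    """Return the body for url, calling fetch(url) only on a cache miss."""
    if url not in _cache:
        _cache[url] = fetch(url)
    return _cache[url]


def scrape_all(urls, fetch, parse, workers=4):
    """Fetch and parse urls in parallel threads.

    Pages that fail to download or parse are logged and left out
    of the returned {url: parsed_data} dictionary.
    """
    def task(url):
        try:
            return url, parse(fetch_cached(url, fetch))
        except Exception:
            # Log the full traceback so it is clear which page failed.
            log.exception("Failed to scrape %s", url)
            return url, None

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = dict(pool.map(task, urls))
    return {url: data for url, data in results.items() if data is not None}
```

In a real script, `fetch` might be something like `lambda url: urllib.request.urlopen(url).read()` and `parse` your own extraction function. Note that this sketch may download the same URL twice if two threads request it at the same moment, and the cache is lost when the script exits; a persistent, thread-aware cache takes more work.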

Scrapekit provides a set of tools that help with these tasks, while also offering simple ways to structure your scraper. This helps you produce fast, reliable and structured scraper scripts.

See: http://bit.ly/1uNvrhL