What is Import.io?
Import.io enables you to extract data directly from the web. This is commonly known as web scraping, but Import.io does much more. Our point-and-click interface transforms websites into data with a few simple clicks, enabling you to get the data you need, whether it requires page interaction, depends on JavaScript or lies behind a login.
How does data extraction with Import.io work?
Import.io allows you to create an extractor and give it an example URL containing the data you want to extract. Once Import.io loads the webpage, it presents you with the data it finds and gives you the option to identify the data you want to collect via point-and-click. As you select data, Import.io analyzes the underlying structure of the webpage and determines where the data elements you want reside.
All this data is laid out in a table whose columns you can design to meet your project needs.
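To make the idea of turning page structure into table columns concrete, here is a minimal sketch in Python. It is not how Import.io works internally: the sample markup, the `extract_rows` function and the column paths are all hypothetical, and it assumes well-formed markup so the standard library's XML parser can read it. Each column is described by a path to an element, and every repeated item on the page becomes one row.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page; a real page would need an HTML parser.
PAGE = """<ul>
  <li class="product"><span class="name">Kettle</span><span class="price">24.99</span></li>
  <li class="product"><span class="name">Toaster</span></li>
</ul>"""

# Column name -> path to the element holding that column's value (illustrative).
COLUMNS = {
    "name": ".//span[@class='name']",
    "price": ".//span[@class='price']",
}


def extract_rows(markup, row_path, columns):
    """Return one dict per repeated item, with one key per column."""
    root = ET.fromstring(markup)
    rows = []
    for node in root.findall(row_path):
        row = {}
        for column, path in columns.items():
            cell = node.find(path)
            # A missing element becomes an empty cell rather than an error.
            row[column] = cell.text.strip() if cell is not None and cell.text else None
        rows.append(row)
    return rows


rows = extract_rows(PAGE, ".//li[@class='product']", COLUMNS)
```

In this sketch, `rows` holds one entry per product, and the second product simply has an empty `price` cell because that element is absent from the page.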
What makes Import.io unique?
Import.io contains a built-in crawl service specifically designed to handle multiple URL queries. It uses dynamic rate limiting and contains a retry system to handle errors and restrictions. When querying multiple webpages, the crawl service queries URLs asynchronously, each from an address in a rotating IP pool, to make the process more efficient. If a URL fails, it is requeued and tried again from a different IP address. The crawl service also monitors website response time, so that extraction does not place excessive load on a website.
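The general pattern described above (asynchronous queries, a rotating address pool, requeueing failures and pacing requests by response time) can be sketched in Python as follows. This illustrates the technique only and is not Import.io's implementation: `crawl`, its parameters, the `fetch` callback and the `politeness` factor are hypothetical names, and the sketch assumes the pool contains distinct addresses.

```python
import asyncio
import time
from itertools import cycle


async def crawl(urls, fetch, proxies, max_attempts=3, workers=4, politeness=1.0):
    """Fetch all URLs concurrently; requeue failures onto a different proxy.

    `fetch(url, proxy)` is a caller-supplied coroutine (an assumption of this
    sketch) that returns the page body or raises on failure.
    """
    proxy_pool = cycle(proxies)  # round-robin rotation through the pool
    queue = asyncio.Queue()
    for url in urls:
        queue.put_nowait((url, 1, None))  # (url, attempt number, last proxy used)
    results, failed = {}, []

    async def worker():
        while True:
            url, attempt, last_proxy = await queue.get()
            proxy = next(proxy_pool)
            if proxy == last_proxy and len(proxies) > 1:
                proxy = next(proxy_pool)  # make sure a retry changes address
            try:
                started = time.monotonic()
                results[url] = await fetch(url, proxy)
                # Simple adaptive pacing: slower responses lead to longer pauses.
                await asyncio.sleep((time.monotonic() - started) * politeness)
            except Exception:
                if attempt < max_attempts:
                    queue.put_nowait((url, attempt + 1, proxy))  # requeue the URL
                else:
                    failed.append(url)
            finally:
                queue.task_done()

    tasks = [asyncio.create_task(worker()) for _ in range(workers)]
    await queue.join()  # wait until every URL has succeeded or exhausted its attempts
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return results, failed
```

Calling `asyncio.run(crawl(urls, fetch, proxies))` returns the fetched pages keyed by URL, plus a list of URLs that still failed after `max_attempts` tries.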
The result is superior performance, high-quality data extraction and reliable success.
How do I get started with Import.io?
After first reading through our terminology, sign up for an account!
We recommend you start with our tutorial on Building Your First Extractors or, if you prefer video, our Getting Started with Import.io Tutorials on YouTube.