This 6-part tutorial is designed to guide you through building two different types of extractors in Import.io and to introduce you to its key concepts quickly. By the end of the tutorial, you should be able to create chained extractors and get data from a list of URLs that were extracted with Import.io.
Note: Although this guide outlines the extraction process for Yelp, the process is virtually the same for many other websites and use cases.
Creating a New Extractor
Navigate to the Import.io Dashboard and click Extractors on the sidebar to view all of your extractors. Clicking New Extractor will allow you to enter a URL to extract data from. For this example, we'll enter a Yelp business page (https://www.yelp.com/biz/leanns-nails-alameda) and click Extract to load the page.
Once the page is loaded in the editor, Import.io will first attempt to identify any lists or microdata on the page. In this case, it presents a table of data specific to Leann's Nails, including the Name, Rating, and Price Range.
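If you are curious what "microdata" refers to here, the following minimal Python sketch shows the general idea of reading itemprop attributes out of a page. It is only an illustration and not how Import.io works internally: the sample markup, the business name, the values, and the names MicrodataParser and extract_microdata are all made up for this example, and the real Yelp page will be structured differently.

```python
from html.parser import HTMLParser

# Hypothetical markup with made-up values; a real page will differ.
SAMPLE_HTML = """
<div itemscope itemtype="https://schema.org/LocalBusiness">
  <h1 itemprop="name">Example Business</h1>
  <meta itemprop="ratingValue" content="4.5">
  <span itemprop="priceRange">$$</span>
</div>
"""


class MicrodataParser(HTMLParser):
    """Collects itemprop name/value pairs from an HTML document."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # itemprop whose text content we are waiting for

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("itemprop")
        if prop is None:
            return
        if "content" in attrs:
            # Elements such as <meta> carry the value in a content attribute.
            self.properties[prop] = attrs["content"]
        else:
            # Otherwise, take the value from the element's text.
            self._current = prop

    def handle_data(self, data):
        if self._current and data.strip():
            self.properties[self._current] = data.strip()
            self._current = None


def extract_microdata(html):
    """Return a dict of itemprop names to values found in the HTML."""
    parser = MicrodataParser()
    parser.feed(html)
    return parser.properties


print(extract_microdata(SAMPLE_HTML))
```

Each itemprop becomes a candidate column, which is roughly why a page with microdata can be presented as a ready-made table of named fields.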
Editing the Extractor
To begin editing the extractor, click Add column, and then click on the page to select the data point you want in that column. To name the column, double-click the column name in the data panel and edit the title.
Since this is a details extractor, you might want the selected data to be returned as one row per page extracted, rather than as a list of rows. To do this, reveal the Advanced options and make sure the extractor is set to Single Row.
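To make the difference between a single row and a list of rows concrete, here is a small Python sketch of the two output shapes. It is an assumption-laden illustration only: the function to_rows, the sample column names and values, and the choice to join multiple values with "; " are all hypothetical, and Import.io may combine values differently.

```python
from itertools import zip_longest


def to_rows(matches, single_row):
    """Shape per-column matches into output rows.

    matches maps a column name to the list of values selected on one page.
    The "; " separator for single-row output is an assumption for this sketch.
    """
    if single_row:
        # One row per page: every column's values collapse into one cell.
        return [{col: "; ".join(values) for col, values in matches.items()}]
    # List output: one row per matched value, padding shorter columns.
    columns = list(matches)
    return [
        dict(zip(columns, combo))
        for combo in zip_longest(*matches.values(), fillvalue="")
    ]


# Made-up selections from a single details page.
page_matches = {
    "Name": ["Example Business"],
    "Hours": ["Mon 9-5", "Tue 9-5"],
}

print(to_rows(page_matches, single_row=True))
print(to_rows(page_matches, single_row=False))
```

With single_row=True, the page yields exactly one row, which is usually what you want when each URL describes one business. With single_row=False, the same selections spread across several rows.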
Saving an Extractor
With the data points selected, click the Save Data button to save your extractor. For now, name your extractor, skip the additional configuration settings, and click Save and run.
Once the extractor is saved, you will be redirected to the Dashboard, where you can preview or download the results of the run once it has completed.
With your first extractor created, you can now move on to Part 2: Editing a Details Extractor.