Turning Off CSS (Styling)
When you load a webpage into Import.io, some of the data points available to extract might not show up because they're hidden by other elements.
Take for example a product page like https://www.uniqlo.com/us/en/men-ultra-light-down-jacket-400504.html where the materials are listed on the second tab.
When this page first loads into an Import.io extractor, it won't show the materials on the page.
In order to reveal this, you can disable CSS. If your Advanced options aren't already showing, click the toggle to reveal them, then select the Page option before disabling CSS.
This will remove all of the styling from the page, allowing you to see all of the content present on the page. With all of the text showing, you can scroll down and select all of the listed materials for that product.
Once you've selected the data points you want, you can turn CSS back on to continue training with styling on if you'd like.
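To illustrate why disabling CSS reveals the data, here is a minimal Python sketch. It assumes a made-up page where the second tab is hidden by a stylesheet rule; the markup, class names, and product text are hypothetical and not taken from the Uniqlo page or from Import.io itself. The hidden text is still present in the HTML, so reading the page without applying any styles returns it.

```python
from html.parser import HTMLParser

# Hypothetical markup: the second tab panel is hidden by a CSS rule,
# but its text is still part of the HTML document.
PAGE = """
<style>.tab-panel.inactive { display: none; }</style>
<div class="tab-panel">Description: Ultra light down jacket</div>
<div class="tab-panel inactive">Materials: sample fabric text</div>
"""


class TextCollector(HTMLParser):
    """Collects all text in the page, ignoring styling entirely."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = False  # True while inside <style> or <script>

    def handle_starttag(self, tag, attrs):
        if tag in ("style", "script"):
            self._skip = True

    def handle_endtag(self, tag):
        if tag in ("style", "script"):
            self._skip = False

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip:
            self.chunks.append(text)


def text_without_css(html):
    """Return every text chunk in the markup, as if CSS were turned off."""
    collector = TextCollector()
    collector.feed(html)
    return collector.chunks


texts = text_without_css(PAGE)
for line in texts:
    print(line)
```

Both panels are printed, including the one a browser would hide, which is the same effect as turning off CSS in the extractor.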
What if the data point still isn't showing?
Sometimes, even after turning off CSS, the data point might not show up on the page. In that case, you can use either Manual XPath or Interactive Extractors.
Manual XPath
Use this option when data is embedded in the page's HTML but never displayed. An example is an average rating that is shown only as stars on the page while its numeric value is stored in the HTML.
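As a rough illustration of the rating example, the sketch below uses Python's standard library to pull a numeric rating out of markup that only displays stars. The snippet, the itemprop attribute, and the value 4.3 are assumptions made up for this example; real pages may store the value differently, so inspect the page source first.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the page shows stars, while the numeric
# rating lives in a <meta> tag that is never rendered.
SNIPPET = """
<div class="rating">
  <span class="stars">★★★★☆</span>
  <meta itemprop="ratingValue" content="4.3" />
</div>
""".strip()

root = ET.fromstring(SNIPPET)

# ElementTree supports a subset of XPath: find the <meta> element by
# its attribute, then read the "content" attribute from it.
node = root.find(".//meta[@itemprop='ratingValue']")
rating = float(node.get("content")) if node is not None else None

print(rating)
```

In a full XPath engine, the same lookup can be written as a single expression: //meta[@itemprop='ratingValue']/@content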
You can learn about creating XPaths here.
Interactive Extractors
Interactive extractors are needed when a series of interactions with the page is required to reveal the data. Example use cases include clicking to reveal content or entering values to reveal more data.
You can learn more about creating Interactive Extractors here.