Managing an Extractor's URL List
From the Settings tab of an Extractor, you can manage the list of URLs that are extracted when the Extractor starts a crawl run. You can add URLs manually, import them from a file, extract them from other pages with Chained Extractors, or add similar URLs using URL Discovery.
Elements of the Inputs View
- Input source: Dropdown that sets whether the Extractor uses an explicit list of URLs that you provide or URLs extracted by another Extractor.
- Clear All: Removes all the URLs from the list to start over.
- Remove Duplicate Rows: Removes any duplicate URLs from the list.
- Cleanup URLs: Removes invalid URLs and empty rows from the list.
- Download Inputs: Downloads the list of URLs in CSV, Excel, JSON, or NDJSON format.
- Import Inputs: Imports a list of URLs from a CSV or Excel (XLSX) file.
- Add input row: Adds a blank row to the list of inputs.
- Reset to saved inputs: Resets the URL list to the last saved inputs, discarding unsaved changes.
- List view: Shows all of the URLs currently added.
- Save: Saves any changes made to the URL list. When you add, remove, or update URLs using the URLs Input, the changes are not saved until you click Save.
- Run Inputs: Starts a new crawl run. If you have unsaved changes, this button is disabled until you save them.
- Total Inputs: Displays a count of the URLs in the list. This is also the number of queries a crawl run will use with that list of URLs (if screen capture is enabled, the total number of queries is doubled).
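When preparing a list outside the product, the effect of Cleanup URLs and Remove Duplicate Rows, and the query count shown under Total Inputs, can be approximated locally. The Python sketch below is an illustration only: the function names, the example URLs, and the rule for what counts as an invalid URL (anything without an http or https scheme and a host) are assumptions, not the product's actual implementation.

```python
from urllib.parse import urlparse


def cleanup_urls(rows):
    """Drop empty rows and rows that do not look like http(s) URLs.

    Assumption: "invalid" means missing an http/https scheme or a host.
    The product's own validation rules may differ.
    """
    cleaned = []
    for row in rows:
        url = row.strip()
        if not url:
            continue  # empty row
        parts = urlparse(url)
        if parts.scheme in ("http", "https") and parts.netloc:
            cleaned.append(url)
    return cleaned


def remove_duplicates(urls):
    """Remove repeated URLs while keeping the first occurrence of each."""
    return list(dict.fromkeys(urls))


def estimate_queries(urls, screen_capture=False):
    """One query per URL, doubled when screen capture is enabled."""
    return len(urls) * (2 if screen_capture else 1)


rows = [
    "https://example.com/page/1",
    "",
    "not a url",
    "https://example.com/page/2",
    "https://example.com/page/1",
]
urls = remove_duplicates(cleanup_urls(rows))
print(urls)
print(estimate_queries(urls))
print(estimate_queries(urls, screen_capture=True))
```

With the sample rows above, two URLs remain after cleanup and de-duplication, so the estimate is 2 queries, or 4 with screen capture enabled.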
Importing URLs from a File
Clicking Import URLs reveals the Import URLs view, which allows you to add URLs from a CSV or Excel file. The imported URLs can either replace your current list of URLs or be added to it.
Elements of the Import URLs View
- Browse: Opens a file browser to select the file to import URLs from.
- Include column headers: Sets whether the file includes column headers. If selected, the first row is not imported.
- Select header: Selects which column the URLs are stored in.
- Append/Replace: Chooses whether the URLs from the file are added to the current list of URLs or replace (overwrite) the current list.
- Preview list: Shows a preview of the URLs from the column selected.
- Cancel: Closes the Import URLs view and returns to the Extractor's settings.
- Upload URL list: Adds the URLs selected for import to the Extractor's URL list.
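The import options described above (skipping a header row, choosing the URL column, and appending versus replacing) can be mimicked locally to check a CSV file before uploading it. This is a minimal Python sketch under stated assumptions: `read_url_column`, `merge_urls`, the column names, and the sample data are all hypothetical and do not reflect how the product processes files internally.

```python
import csv
import io


def read_url_column(csv_text, column, has_headers=True):
    """Return the non-empty values of one column from CSV text.

    `column` is a header name when has_headers is True, otherwise a
    zero-based column index. A header row, if present, is not returned.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return []
    if has_headers:
        index = rows[0].index(column)  # raises ValueError if the header is missing
        rows = rows[1:]
    else:
        index = column
    return [row[index] for row in rows if len(row) > index and row[index].strip()]


def merge_urls(current, imported, mode="append"):
    """Append imported URLs to the current list, or replace it entirely."""
    if mode == "append":
        return current + imported
    if mode == "replace":
        return list(imported)
    raise ValueError("mode must be 'append' or 'replace'")


# Hypothetical file contents: a header row followed by two data rows.
csv_text = "name,url\nFirst,https://example.com/a\nSecond,https://example.com/b\n"
current = ["https://example.com/existing"]

imported = read_url_column(csv_text, "url", has_headers=True)
print(merge_urls(current, imported, mode="append"))
print(merge_urls(current, imported, mode="replace"))
```

In append mode the existing URL is kept and the two imported URLs follow it; in replace mode only the two imported URLs remain.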