From the Settings tab of an Extractor, you can manage the list of inputs that an Interactive Extractor uses when it starts a crawl run. You can either add inputs manually or import them from a file.
Elements of the Interactive Inputs View
- Clear all: Removes all the URLs from the list to start over.
- Remove duplicate rows: Removes any duplicate inputs from the list.
- Cleanup: Removes any rows with missing input values.
- Download Inputs: Downloads the list of inputs in CSV, Excel, JSON, or NDJSON format.
- Import Inputs: Imports a list of inputs from a CSV or Excel (XLSX) file.
- Generate URLs: Generates a list of URLs using a 'list of values' or a 'range of numbers'.
- Add button: Adds another row to the inputs table.
- Reset button: Discards any changes made to the inputs table since the last save.
- Inputs table: Displays the inputs for the extractor. The start URL is always displayed, and cells can be edited as in an Excel sheet. You can also copy and paste tables into this view, and double-click or drag to fill down.
- Save: Saves any changes made to the inputs. When you add, remove, or update inputs using the Import Inputs function, the changes are not saved until you click Save.
- Run Inputs: Starts a new crawl run. If you have unsaved changes, this button will be disabled until you save your changes.
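If you prepare an input list outside the tool before using Import Inputs, the effect of Generate URLs, Remove duplicate rows, and Cleanup can be approximated offline. The Python sketch below is illustrative only: the function names, the `url` column header, the `{}` placeholder, and the example.com address are assumptions made for this example and are not part of the product.

```python
import csv
import io


def generate_urls(template, values=None, start=None, stop=None, step=1):
    """Fill the '{}' placeholder in a URL template (hypothetical helper).

    Pass either a list of values, or an inclusive numeric range via start/stop.
    """
    if values is None:
        values = range(start, stop + 1, step)
    return [template.format(v) for v in values]


def clean_inputs(rows):
    """Drop rows with a missing value, then drop duplicate rows.

    The first occurrence of each row is kept and the original order is preserved.
    """
    seen = set()
    cleaned = []
    for row in rows:
        # "Cleanup": skip rows where any cell is empty or whitespace only.
        if any(v is None or str(v).strip() == "" for v in row.values()):
            continue
        # "Remove duplicate rows": compare rows by their full contents.
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


def to_csv(rows, fieldnames):
    """Serialize the rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    urls = generate_urls("https://example.com/page/{}", start=1, stop=3)
    # Add one duplicate and one empty row to show the cleaning step.
    rows = [{"url": u} for u in urls] + [{"url": urls[0]}, {"url": ""}]
    print(to_csv(clean_inputs(rows), fieldnames=["url"]))
```

The resulting CSV text can be written to a file and then loaded with Import Inputs. Remember that imported changes are not stored until you click Save.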
Training with Additional Inputs
While in the editor, clicking Train With Additional URLs takes you to the Inputs table. From here, you can add or edit inputs and add them to the training by clicking Train. When you add an input to the training, the extractor replays the new set of inputs and adds an additional page to the training.