An extraction pattern is a set of data fields that define the positions of the text and images on the web page.
First, you should specify the URL of the page that will be used to create the pattern.
You can type the URL manually or you can click the button and open the necessary page using the built-in browser.
By default, the URL is equal to the URL of the start page, but you can change it.
If you want to allow scripts in the web browser control, turn on the "Enable JavaScript" option.
To add new data fields, click the button.
The "Select Data Fields" window appears. Wait until the page has loaded, then click, one by one, each page element you want to extract.
When you click a page element, the program highlights it and opens the data field window, which allows you to specify the data field parameters.
If the "Use Text Labels" option is enabled, the program tries to find a text label associated with the selected element
and, if one exists, adds a transformation script with the "sub_string" function to the data field.
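The idea behind a text label can be illustrated with a short Python sketch that returns the part of an element's text following a label. This is not the program's "sub_string" function, whose parameters are not described here; the helper name `text_after_label` and the sample strings are hypothetical.

```python
def text_after_label(text, label):
    """Return the text that follows `label`, or None if the label is absent."""
    index = text.find(label)
    if index == -1:
        return None
    # Skip past the label itself and trim surrounding whitespace.
    return text[index + len(label):].strip()


# Hypothetical element text with a "Price:" label in front of the value.
print(text_after_label("Price: $19.99", "Price:"))
```

Returning None when the label is missing keeps a changed page layout from silently producing a wrong value.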
You can change the data field parameters later. To do so, select the corresponding field in the "Data Fields" list and click the button.
You can also delete, duplicate and move data fields by clicking the corresponding buttons on the right.
A data field has the following parameters:
If data on the page is presented as a set of rows, you should specify a loop for each row.
The program automatically analyzes the structure of the page and, if it finds a set of rows, it creates a loop for each of these rows.
You can specify or change the loop manually.
To do so, enable the "Extract multiple set of data" option, click the button, and
select the parent element that contains all the data of the first row. The elements for the remaining rows should follow it as similar elements in the HTML structure (see fig. 1).
Figure 1.
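As a rough illustration of what a loop over rows means, the Python sketch below applies the same set of fields to every row element of a small page fragment. It only mirrors the concept and is not how the program works internally; the fragment, the function name `extract_rows`, and the field names are hypothetical, and the fragment is assumed to be well-formed so that the standard XML parser can read it.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment: each <tr> holds one row of data.
PAGE = """
<table>
  <tr><td>Apple</td><td>1.20</td></tr>
  <tr><td>Pear</td><td>0.95</td></tr>
  <tr><td>Plum</td><td>2.10</td></tr>
</table>
"""


def extract_rows(fragment, row_tag, field_names):
    """Apply the same list of fields to every row element in the fragment."""
    root = ET.fromstring(fragment.strip())
    rows = []
    for row in root.iter(row_tag):
        # Each child of the row element supplies the value of one field.
        cells = [(cell.text or "").strip() for cell in row]
        rows.append(dict(zip(field_names, cells)))
    return rows


records = extract_rows(PAGE, "tr", ["name", "price"])
for record in records:
    print(record)
```

Here the `<tr>` element plays the role of the parent element selected for the first row, and the following `<tr>` elements are the similar elements that the loop visits.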