1. In the Browser window, enter the URL of the web page that you want to parse. The Yahoo home page is opened in the picture below. Alternatively, you can open a local file.
2. Suppose we want to extract the list of Yahoo categories (I don't know why we would need it, though…). To find this place in the HTML source, select the first category in the Browser window.
3. Then click the Source button to open the Source window. The corresponding HTML source is already selected in this window.
4. To make the beginning of the block that contains the categories easy to find in the text, highlight it by clicking the Mark button.
5. In the same way, find the last category and mark it too.
6. Now we can extract the HTML code where the categories are listed. To parse this block, type a regex into the regex field and click the Parse button. The group matched by the regex is highlighted.
7. Next we need a second regex that parses each category out of this block. Click the Level++ button to add a second regex to the regex chain, type the second regex into the field, and click the Parse button. The regex groups are highlighted in their respective colors.
8. To make sure the regex chain extracts exactly what we want, click the Browser button. The matches of the last regex are highlighted there.
9. Finally, click the Save button in the Source window to save the regex collection to a file. That's all.
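
For readers who want to see what a two-level regex chain does outside the tool, here is a minimal Python sketch of the idea in steps 6 and 7. It is not how the program itself is implemented. The HTML fragment, both regexes, and the names BLOCK_RE, ITEM_RE and parse_categories are assumptions made up for illustration; the real Yahoo markup and the regexes used in the screenshots are not given in the text.

```python
import re

# Hypothetical page fragment (an assumption; not the real Yahoo markup).
HTML = """
<html><body>
<div id="header">Welcome</div>
<ul id="categories">
  <li><a href="/autos">Autos</a></li>
  <li><a href="/finance">Finance</a></li>
  <li><a href="/games">Games</a></li>
</ul>
<div id="footer"><a href="/help">Help</a></div>
</body></html>
"""

# Level 1 (step 6): capture the block that contains the categories.
# DOTALL lets "." match newlines, and ".*?" stops at the first </ul>.
BLOCK_RE = re.compile(r'<ul id="categories">(?P<block>.*?)</ul>', re.DOTALL)

# Level 2 (step 7): capture each category, searching only inside the block.
ITEM_RE = re.compile(r'<a href="(?P<url>[^"]*)">(?P<name>[^<]*)</a>')


def parse_categories(html):
    """Run the second regex only on the text captured by the first one."""
    results = []
    for block in BLOCK_RE.finditer(html):
        for item in ITEM_RE.finditer(block.group("block")):
            results.append((item.group("name"), item.group("url")))
    return results


categories = parse_categories(HTML)
print(categories)
```

Because the second regex only sees the text captured by the first one, the Help link in the footer is not reported even though it matches the second regex on its own. This is the reason for chaining regexes instead of writing a single large one.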
Enjoy!
Don't hesitate to report bugs via the bug report form.
Copyright © 2006 CliverSoft