Example :  web page word watcher

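This sample, Web-Watch-Words.dss, monitors a list of websites for keyword occurrences. The user specifies the URLs to watch, the words to watch for, the start tags of the HTML elements whose content is to be watched, and a timer that controls when the scans run.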

Two HTML output files are generated :

"Latest results" contains the results of the last scan performed. It displays the time of the scan, the URLs of the pages containing the search words, and the text on those pages containing the search words.

"All results" contains an accumulation of the latest results, oldest at the top, newest at the bottom. This file can become quite large after a while; you may need to archive or delete it periodically.

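The difference between the two output files can be illustrated with a short Python sketch. This is not how Data Splitter writes its output; the function names, the HTML layout, and the file paths are all assumptions made for illustration. The point is only that "Latest results" is overwritten on each scan while "All results" is appended to.

```python
from html import escape
from pathlib import Path


def render_scan(scan_time, hits):
    """Render one scan as an HTML fragment (hypothetical layout).

    scan_time is a datetime; hits is a list of (url, matching_text) pairs.
    """
    stamp = scan_time.isoformat(sep=" ", timespec="seconds")
    parts = [f"<h2>Scan at {escape(stamp)}</h2>"]
    for url, text in hits:
        parts.append(
            f'<p><a href="{escape(url)}">{escape(url)}</a><br>{escape(text)}</p>'
        )
    return "\n".join(parts) + "\n"


def write_results(latest_path, all_path, scan_time, hits):
    """Write one scan to both output files (paths are assumptions)."""
    fragment = render_scan(scan_time, hits)
    # "Latest results" holds only the most recent scan, so overwrite it.
    Path(latest_path).write_text(fragment, encoding="utf-8")
    # "All results" accumulates scans, oldest first, so append at the end.
    with open(all_path, "a", encoding="utf-8") as f:
        f.write(fragment)
```

Calling write_results once per scan leaves the newest scan alone in the first file and at the bottom of the second, which is also why the second file keeps growing until it is archived or deleted.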
Sample output :

Specify the URLs to watch :

screen shot: web page watcher - URL dialog

The words to watch for :

screen shot: web page watcher - search word dialog

The user can also specify the start tags of the HTML elements whose content is to be watched :

screen shot: web page watcher - HTML start tag dialog

In this example, HTML heading elements (levels 1 to 3) and paragraph elements will be watched.

The user can also configure a timer to control when the watch scans occur :

screen shot: timer configuration dialog

In this example the interval is 1 hour (3600 seconds). The first scan will run at 12:30 pm on the current day. Subsequent scans will run every hour thereafter, until the run is canceled or Data Splitter is exited. If the configured start time has already passed, an error dialog will appear when the user attempts to run.
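The schedule described above can be sketched in a few lines of Python. This only illustrates the timing rule (a start time plus a fixed interval, with a past start time rejected); the function name scan_times and the dates used are hypothetical and are not part of Data Splitter.

```python
from datetime import datetime, timedelta


def scan_times(start, interval_seconds, now, count):
    """Return the first `count` scheduled scan times.

    Raises ValueError if `start` is already in the past, which stands in
    for the error dialog shown when the user attempts to run.
    """
    if start < now:
        raise ValueError("start time has already passed")
    step = timedelta(seconds=interval_seconds)
    return [start + i * step for i in range(count)]


now = datetime(2024, 1, 1, 9, 0)      # hypothetical current time
start = datetime(2024, 1, 1, 12, 30)  # first scan at 12:30 pm
for t in scan_times(start, 3600, now, 3):
    print(t.strftime("%H:%M"))
# prints 12:30, 13:30 and 14:30 on separate lines
```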

How it works

The top-level parser extracts the desired HTML elements :

screen shot: top-level HTML element parser

After the top-level parser extracts an HTML element it hands it off to another parser, "CheckContent" :

screen shot: HTML element content checker

CheckContent simply checks its input (the HTML element content) against the search word list. If any one of the search words is present, the content is sent to the output by the action group "ShowContent".
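For readers who want to see the same two-step idea outside Data Splitter, here is a rough Python equivalent built on the standard html.parser module. It is a sketch under several assumptions: the class name ElementWatcher is invented, matching is a case-insensitive substring test (the sample's actual matching rules may differ), and watched elements are assumed to be properly closed and not nested inside each other.

```python
from html.parser import HTMLParser


class ElementWatcher(HTMLParser):
    """Collect the text of watched elements that contains a search word."""

    def __init__(self, watched_tags, search_words):
        super().__init__()
        self.watched_tags = set(watched_tags)
        self.search_words = [w.lower() for w in search_words]
        self.current_tag = None  # watched element currently open, if any
        self.buffer = []         # text collected for that element
        self.matches = []        # contents that contained a search word

    def handle_starttag(self, tag, attrs):
        # Top-level step: start collecting when a watched element opens.
        if self.current_tag is None and tag in self.watched_tags:
            self.current_tag = tag
            self.buffer = []

    def handle_data(self, data):
        if self.current_tag is not None:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        # When the watched element closes, hand its content to the checker.
        if tag == self.current_tag:
            content = " ".join("".join(self.buffer).split())
            self.current_tag = None
            self.check_content(content)

    def check_content(self, content):
        # Plays the role of "CheckContent": keep the content if any
        # search word occurs in it (case-insensitive, an assumption).
        lowered = content.lower()
        if any(word in lowered for word in self.search_words):
            self.matches.append(content)


page = "<h1>Release notes</h1><p>No news.</p><p>New <b>parser</b> released.</p>"
watcher = ElementWatcher(["h1", "h2", "h3", "p"], ["parser"])
watcher.feed(page)
print(watcher.matches)  # ['New parser released.']
```

Appending to matches stands in for the "ShowContent" action group; a fuller version would write each match to the output files instead.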