Email parsing + database generation

Many generated emails use a

   label:  data

or similar format that makes it easy to extract data items and transmit them to database fields corresponding to the labels.

Data Splitter string sets map the labels to database destinations. For example:

Text                Target
Customer:           table.custname
CustomerID:         table.custid
TransactionID:      table.transid
Transaction date:   table.transdate
Amount:             table.amt
....                ...
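
The label-to-target mapping above can be sketched in Python. This is an illustration of the idea only, not how Data Splitter is implemented: the names LABEL_TO_FIELD and extract_fields and the sample email body are hypothetical, and the sketch assumes one "label: data" pair per line.

```python
import re

# Hypothetical mapping mirroring the string set above: label text -> target field.
LABEL_TO_FIELD = {
    "Customer": "table.custname",
    "CustomerID": "table.custid",
    "TransactionID": "table.transid",
    "Transaction date": "table.transdate",
    "Amount": "table.amt",
}

# One "label: data" pair per line; the label is everything before the first colon.
LINE_PATTERN = re.compile(r"^\s*([^:]+?)\s*:\s*(.*?)\s*$")


def extract_fields(body):
    """Return {target field: value} for every recognised label in the body."""
    record = {}
    for line in body.splitlines():
        match = LINE_PATTERN.match(line)
        if match is None:
            continue  # not a "label: data" line
        label, value = match.groups()
        target = LABEL_TO_FIELD.get(label)
        if target is not None:
            record[target] = value
    return record


# Made-up email body used only to exercise the sketch.
sample_body = """Thank you for your order.

Customer: Jane Doe
CustomerID: 1042
TransactionID: T-77
Transaction date: 2024-03-01
Amount: 19.99
"""

record = extract_fields(sample_body)
```

Lines without a recognised label are skipped, so free text around the labelled lines does not affect the extracted record.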

The "extracting data from emails" topic demonstrates how this works. The sample solution EMail-To-Database.dss installs with Data Splitter and can be modified to work with other email formats. See the "email parsing" topic for more information.

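The general approach to setting up an email parser:

1. Analyze the emails
2. Configure the database
3. Download, install and start Data Splitter
4. Select the appropriate email parsing template
5. Set the input
6. Set the output
7. Map the input to the output
8. Modify other configuration data as necessary
9. Run

In detail: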

Analyze the emails

Consider the following questions when analyzing the emails to be parsed:

The sample EMail-To-Database.dss demonstrates filtering emails based on the subject (SubjectFilter). Data Splitter has a "New messages only" checkbox under Message Options that restricts parsing to messages that have not yet been scanned.
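
Subject filtering can be pictured with a short Python sketch. The matching rules that SubjectFilter applies are not described here, so the case-insensitive substring test below is an assumption, and matches_subject and the sample messages are hypothetical.

```python
from email import message_from_string


def matches_subject(raw_message, subject_filter):
    """True when the message subject contains the filter text (case-insensitive)."""
    message = message_from_string(raw_message)
    subject = message.get("Subject", "")
    return subject_filter.lower() in subject.lower()


# Made-up raw messages: a header block, a blank line, then the body.
raw_messages = [
    "Subject: Order confirmation 1042\n\nCustomer: Jane Doe\n",
    "Subject: Weekly newsletter\n\nHello!\n",
]

# Keep only the messages whose subject passes the filter.
selected = [m for m in raw_messages if matches_subject(m, "order confirmation")]
```

Filtering before parsing keeps unrelated mail, such as newsletters, out of the target database.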

Configure the database

Use a DBMS (Database Management System) such as Microsoft Access to create the "target" database. Questions to consider when designing the tables:

Visit the Database + ODBC page for information regarding database creation and configuring the ODBC (Open Database Connectivity) connection.
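
As a rough picture of a target table, the sketch below creates one and stores a single record. It swaps in SQLite through Python's standard sqlite3 module so that it is self-contained; it does not use Microsoft Access or ODBC. The column names follow the example mapping near the top of this page, while the table name "transactions", the column types, the key choice and the sample row are assumptions.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
connection = sqlite3.connect(":memory:")

# Hypothetical table layout; types and the primary key are assumptions.
connection.execute(
    """
    CREATE TABLE transactions (
        transid   TEXT PRIMARY KEY,
        custid    TEXT NOT NULL,
        custname  TEXT,
        transdate TEXT,
        amt       REAL
    )
    """
)

# Made-up record of the kind a parsed email might yield.
row = {
    "transid": "T-77",
    "custid": "1042",
    "custname": "Jane Doe",
    "transdate": "2024-03-01",
    "amt": 19.99,
}

# Named placeholders keep the values separate from the SQL text.
connection.execute(
    "INSERT INTO transactions (transid, custid, custname, transdate, amt) "
    "VALUES (:transid, :custid, :custname, :transdate, :amt)",
    row,
)
connection.commit()

stored = connection.execute("SELECT custname, amt FROM transactions").fetchall()
```

Using the transaction ID as the primary key means a second insert of the same transaction is rejected, which is one way to guard against an email being processed twice.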

Download, install and start Data Splitter

Download the Data Splitter self-installing executable (.EXE) file.

Install Data Splitter by running the downloaded .EXE file on your computer. Accepting the defaults by pressing "Enter" at each prompt until the installation has finished is recommended. Do take time to read the license agreement, though.

Start Data Splitter. The installation creates a Start menu entry and icons to run the program.

Select the appropriate email parsing template

Data Splitter installs with several sample email parsing configurations. The best way to create an email parser is to adapt an existing parser that closely matches the desired task.

EMail-To-Database.dss is a good starting point for many email formats. Modify the string sets and the action group "NewEMail" as needed.

Contact Data Splitter support for assistance with email parser configuration.

Set the input

The input is specified as one or more message folders. Select Input/Output | Input email folders to define the folders to be scanned. When the Run | Email input (Ctrl+E) command is selected, all emails in the specified folders are scanned.

Set the output

Data Splitter can output to databases or files. Select Input/Output | Output files to specify the output file(s), and Input/Output | Database to specify the target database. Consult Data Splitter help for information on output destinations. Visit the Database and ODBC topic for more information on databases.

Map the input to the output

Modify the Data Splitter string set(s) using Definitions | String sets. String sets are also accessible via the Quick Start menu.

Modify other configuration data as necessary

To perform the desired task, the sample configurations may require changes beyond the input, output and string sets. Consult Data Splitter help and the tutorial for more information.

Run

Use the Run | Email input (Ctrl+E) command to parse the emails. See the "run" topic for more information.


Obtaining the desired output typically requires a few iterations of three basic steps:



MAPI email / message interface

Data Splitter uses the Messaging Application Programming Interface (MAPI) to interact with the Windows messaging system. This provides access to messages stored by MAPI-enabled email clients such as Microsoft Outlook.

To determine whether it has already seen a message, Data Splitter marks each message with a time stamp. This enables Data Splitter to run in "New messages only" mode. Apart from this time stamp, Data Splitter's access to the messaging system is read-only.
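
The time-stamp approach can be sketched in Python. The sketch makes no MAPI calls: plain dictionaries stand in for stored messages, and select_messages and the "scanned_at" key are hypothetical names chosen for illustration.

```python
from datetime import datetime, timezone


def select_messages(messages, new_only):
    """Return the messages to parse and stamp each one that is returned."""
    selected = []
    for message in messages:
        if new_only and message.get("scanned_at") is not None:
            continue  # already seen in an earlier run
        # The time stamp is the only change made to the message.
        message["scanned_at"] = datetime.now(timezone.utc)
        selected.append(message)
    return selected


# Made-up folder contents.
inbox = [{"subject": "Order 1"}, {"subject": "Order 2"}]

first_run = select_messages(inbox, new_only=True)   # both messages are new
inbox.append({"subject": "Order 3"})
second_run = select_messages(inbox, new_only=True)  # only the new arrival
```

With new_only set to False every message is returned and re-stamped, which corresponds to scanning all emails in the folder regardless of earlier runs.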