ETL+ for MS Excel

ETL+ reads and extracts data from Microsoft Excel XLS, XLSX, and XLSM files.

ETL Sources & ETL Source Tables

ETL Source. An ETL Source refers to the connection to a data source. Connections to Excel files are maintained in the ETL Source panel with help from the Microsoft Excel page. The screenshot below shows two ETL Sources named Excel Workbook and zData Warehouse.

ETL Source Table. A ETL Source table refers to a table that ETL+ can extract data from. In the screenshot below Sheet1$ is a ETL Source table belonging to the ETL Source named Excel Workbook.

Excel Workbook is the name of the ETL Source and Sheet1$ is a ETL Source Table in Excel Workbook.

ETL Source Panel

Buttons on the ETL Source panel.

Add a New MS Excel Source

  1. Open and Login to ETL+ for the ETL+ entity. For more see https://dataself.atlassian.net/wiki/spaces/ETL202208/pages/2032239112.

  2. Open the https://dataself.atlassian.net/wiki/spaces/ETL202208/pages/2032239184.

  3. Click Add Source > MS Excel on the ETL Source panel.

  4. Configure the Microsoft Excel page.

Notes

Maintain a MS Excel Source

  1. Select the Excel data source’s name from the ETL Source panel.

  2. Click Properties.

  3. Configure the Microsoft Excel page.


Microsoft Excel page

The Microsoft Excel page adds and maintains metadata about connections to ETL Sources.

  1. Configure the fields shown below.

  2. Click Connect or Save.

Microsoft Excel Page Settings

  • Alias. A alias name will be automatically populated, but you can customize it.

  • File Path. Options are:

    1. Enter the path and file name of a local Excel file.

    2. Select Browse to find the file.

    3. Enter a URL address for the file on your local network or internet.


View Excel Sheets (Tables) Available from a Excel Data Source

Each Excel workbook shown in the ETL Source panel How to show each Excel sheet Excel workbooks shown as a data source in the ETL Source panel is known to ETL+ as a source table.

  1. Select the Excel data source’s name from the ETL Source panel.

  2. Click Refresh.

Mapping Excel Sheet to Data Warehouse

How to add a ETL source table in the ETL Source panel to a target/destination data warehouse table in the ETL Objects panel.

  1. Double-click the Excel data source’s table name from the ETL Source panel.

    • In the example above the data source table name is Sheet 1$.

    • Alternative. Right click the data source’s table > Add Table.

  2. Click the name of the ETL source table in the ETL Objects panel and click Design, Preview or Load.
    See the headings below for more about verifying data and scheduling ETL+.

Data Type Issues

Warning – Unexpected Results Extracting Data from Excel

The driver that opens and reads the Excel files controls how the data is read and extracted from Excel. Warning: The driver does not pay attention to the data type assigned to the columns in Excel! This can lead to unexpected results.
The driver looks at the data in the first few rows in each column to determine the data type (float, integer, varchar, etc) and length of character strings the driver will use to extract and data from Excel.

You may need to change the data type and re-cast the data in the Design page.
Recommendation: Read https://dataself.atlassian.net/wiki/spaces/ETL202208/pages/2032243557.

Preview Data

Preview shows you the data in a ETL source table without (up)loading the data into the destination table in the target data warehouse. For Excel workbooks each sheet in the workbook can be a ETL source table.

  1. Click the name of the data source’s table to preview in the ETL Objects panel.

  2. Click Preview.

  3. Click Close to return to the https://dataself.atlassian.net/wiki/spaces/ETL202208/pages/2032239184.

Load Now

How to load data from a data source to the destination data warehouse.

  1. Right-click the name of the table on the ETL Objects panel > Load Now.

  2. Click Log to look at the log file entries for the load process.

Examine the Log Records

How to see he log entries from recent ETL+ processes.

  1. Click Log.

Log page help here.

Schedule the Data Load

  • On the ETL main page, click Refresh Batch (top right icon on the center panel).

  • Select a Refresh Batch or create a new one to load your Excel data on a schedule (help here) → on the right panel, be sure the Excel tables to load have their boxes checked and in the desired load order.

  • To load them now, click the Refresh Batch Now (play icon) on the left panel for the desired Refresh Batch.

  • To schedule the data load:

    • Click WTS Settings -> enter the WTS parameters → Apply.

    • This step might require you to run ETL+ app as “Run as administrator” in Windows.

  • Click Close -> Close.

Testing

After ETL+ has loaded your Excel files how to verify that the data extraction process finished successfully and number of rows extracted and loaded matches your Excel table.

Steps

  1. Login to ETL+ (help here).

  2. On the ETL main page, click the top-right Log icon. Log page help here.

  3. Scroll down on the log page and double-check that no table load failed, and the row count matches with the Excel tables. Following is an example highlighting two tables and their loaded row count.

Related Pages