ETL+ Issues Importing from Excel

ETL+ uses a Microsoft driver to read and extract data from Excel data sources. The driver determines the data type assigned to the data extracted from every column.

Windows and Security Issues

Changes to Microsoft Windows security can impact ETL+'s ability to read the Excel file.
In some cases rebooting the computer fixes the problem.

Software Requirements

See https://dataself.atlassian.net/wiki/spaces/DS/pages/1920270337

Microsoft Excel Files

Causes of failure include

  • Incorrect path, or moved file

  • The Excel file is currently open

  • Problems with network file access. Recommend testing access with a locally hosted file.

Column Header Names & Column Order

  • Column Headers should not have any leading spaces in the column name. SQL does not recognize leading spaces and will reject that column.

  • Changes to the columns in a Excel sheet. Issues may arise when the tables and/or their columns inside of the Excel file are not exactly the same as originally mapped.
    Resolution: See https://dataself.atlassian.net/wiki/spaces/ETL202208/pages/2032242696 to reconfigure.

Character Type Issues

  • The driver determines the length (size) of the string extracted from the Excel column by reading the first few rows in Excel. The driver alone determines the maximum length of the character string it will extract. The driver does not look at the size of the destination string in the ETL SQL Statement panel or ETL Objects panel.

    • If, for instance, the driver determines the maximum length of a string in a Excel column is 30 it will only extract the first 30 characters from that column even if the ETL SQL Statement panel and ETL Objects panel show a destination field that is varchar(255).

    • Another example is if the driver determines the data type of the column is float because the first few rows are integers, when, in fact, the column’s data type should be varchar(255), any row that has a value that cannot be interpreted as a float will be replaced with a NULL value.

    • Changing the string size in the ETL SQL Statement panel or in the destination table in the ETL Objects panel will not prevent truncation.

  • The workaround for these issues is to add a dummy first row with a value(s) in the column(s) you want to force the driver to interpret correctly. For example, if the requirement is for a column to have a data type of varchar(max), then add a value greater than 255 characters in the dummy row column.

Numeric Data Type Issues

  • Columns with integer values end up as data type float in the database and may cause issues with joins and performance unless the data type of that column is changed in the database schema.

Mixed Data Types, Bad Data. etc.

  • Bad data in Excel tables such as invalid characters, or characters in columns with mostly numbers.

  • Mixed types in a column may result in a loss of values.

Resolution

If unable to diagnose the problems, send the Excel file to support@dataself.com.