ETL+ Data Warehouse

The ETL+ data warehouse architecture is based on MS SQL Server.

Schemas

See the Schema and Multi-Tenant settings below.

Tables in SSMS

SQL Server Management Studio (SSMS)

After ETL+ loads a table, the results of the refresh can be seen in the Create Date and Row Count of the just-refreshed table in the data warehouse.

Load type = Load All: If a table is set to Load All in ETL+, the Create Date shown in SSMS is the date of the last refresh, because ETL+'s Load All refresh process drops the existing table and creates a new one in its place.

Load type = Replace, Upsert, or Append: If a table is set to a delta refresh such as Upsert in ETL+, the refresh process doesn’t drop and re-create the table, so the Create Date doesn’t change.

See also: https://dataself.atlassian.net/wiki/spaces/ETL202208/pages/2032239737

Example from the Screenshot Below

  • The following screenshot is from SSMS → View → Object Explorer Details, then select the Tables section in your data warehouse database. If you don't see some of the columns on the Object Explorer Details panel, right-click on this panel’s column header and select the desired columns.

  • In this example, DataSelf ETL+ started refreshing the tables on 4/28/2022. The Address table is set to Load All, and its Create Date shows that the latest load was on 9/12/2022. The other 3 tables use Upsert, so their Create Date is the date of the first load.
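
The Create Date behavior described above can be modeled in a few lines of Python. This is only an illustrative sketch, not ETL+ code: `WarehouseTable`, `refresh`, the `Orders` table, and the row counts are hypothetical, and only the two dates come from the example.

```python
from dataclasses import dataclass, replace
from datetime import date

# Delta load types keep the existing table in place.
DELTA_LOAD_TYPES = {"Replace", "Upsert", "Append"}


@dataclass(frozen=True)
class WarehouseTable:
    name: str
    create_date: date  # what SSMS shows in the Create Date column
    row_count: int     # what SSMS shows in the Row Count column


def refresh(table, load_type, refresh_date, rows_after):
    """Return the table metadata as SSMS would show it after a refresh."""
    if load_type == "Load All":
        # The table is dropped and re-created, so Create Date moves forward.
        return WarehouseTable(table.name, refresh_date, rows_after)
    if load_type in DELTA_LOAD_TYPES:
        # The table is kept, so only the row count can change.
        return replace(table, row_count=rows_after)
    raise ValueError(f"Unknown load type: {load_type}")


first_load = date(2022, 4, 28)
latest_load = date(2022, 9, 12)

address = refresh(WarehouseTable("Address", first_load, 100), "Load All", latest_load, 120)
orders = refresh(WarehouseTable("Orders", first_load, 1000), "Upsert", latest_load, 1050)

print(address.create_date)  # 2022-09-12
print(orders.create_date)   # 2022-04-28
```

For a delta load table, the Row Count is therefore the column to watch for evidence of a refresh, since the Create Date stays at the first load.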

Logging

More at: https://dataself.atlassian.net/wiki/spaces/ETL202208/pages/2032242479

Performance

  • Large Tables with Delta Loads: If you need to join large SQL tables that use a delta load (Replace, Upsert, or Append) and the join is slow, we recommend adding indexes via SSMS. ETL+ does not drop delta load tables, so they keep their indexes. If you force a Load All on such tables, remember to recreate the indexes, since Load All drops the table. More at: https://dataself.atlassian.net/wiki/spaces/ETL202208/pages/2032239737

  • We recommend giving MS SQL at least 50% more RAM than the size of the largest data warehouse table (MS SQL data space used). For best performance, provide MS SQL with more RAM than the size of all tables combined.

  • Use SSDs for the MS SQL data warehouse server.

  • MS SQL 2019 is currently the most recommended version.
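
As a worked example of the RAM guidance above, the following sketch turns a set of table sizes into the two thresholds. The function name `ram_guidance_gb` and the table sizes are hypothetical; in practice the sizes would be the "data space used" figures reported by MS SQL.

```python
def ram_guidance_gb(table_sizes_gb):
    """Return the RAM thresholds implied by the guidance above, in GB."""
    if not table_sizes_gb:
        raise ValueError("At least one table size is required")
    largest = max(table_sizes_gb.values())
    return {
        # At least 50% more RAM than the largest table.
        "minimum_gb": largest * 1.5,
        # Best performance: more RAM than all tables combined.
        "best_performance_above_gb": sum(table_sizes_gb.values()),
    }


# Hypothetical data space used per table, in GB.
sizes = {"Orders": 40.0, "Invoices": 30.0, "Address": 2.0}
guidance = ram_guidance_gb(sizes)
print(guidance)  # {'minimum_gb': 60.0, 'best_performance_above_gb': 72.0}
```

With these sizes, 60 GB is the recommended minimum, and more than 72 GB gives the best performance.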

MS SQL Server Configuration

Database naming convention: Dw_<company-name>_<main-source> (example: Dw_AbcInc_NSuite)
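
A minimal sketch of the naming convention follows. The helper `build_dw_name` is hypothetical, and stripping spaces and punctuation from each part is an assumption made here so the result is a simple identifier; the convention itself only specifies the `Dw_<company-name>_<main-source>` pattern.

```python
import re


def build_dw_name(company_name, main_source):
    """Build a database name of the form Dw_<company-name>_<main-source>."""

    def clean(part):
        # Assumption: keep only letters and digits in each part.
        return re.sub(r"[^A-Za-z0-9]", "", part)

    company, source = clean(company_name), clean(main_source)
    if not company or not source:
        raise ValueError("Company name and main source must not be empty")
    return f"Dw_{company}_{source}"


print(build_dw_name("AbcInc", "NSuite"))  # Dw_AbcInc_NSuite
```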

ETL+

Default recommended SQL Server properties:

For recommendations and more information about properties see:


Data Warehouse List

Only available for super users.

From the ETL+ main page, click Settings → Data Warehouse.

To select an existing data warehouse, select the line from the list below and click Save.

Important:

  • When you change the data warehouse of an Entity, close and reopen ETL+ to re-load all its settings.

  • ETL+ needs the following MS SQL database roles on the data warehouse: db_datareader and db_datawriter.
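
One way a database administrator might grant these roles is with T-SQL `ALTER ROLE ... ADD MEMBER` statements. The sketch below only builds the statements as text; it does not connect to a server. The function name and the `etl_user` database user are hypothetical, and it assumes that user already exists in the data warehouse database.

```python
REQUIRED_ROLES = ("db_datareader", "db_datawriter")


def role_grant_statements(database, user):
    """Return T-SQL statements that add `user` to the roles ETL+ needs."""

    def quote(identifier):
        # Bracket-quote an identifier, doubling any closing bracket.
        return "[" + identifier.replace("]", "]]") + "]"

    statements = [f"USE {quote(database)};"]
    for role in REQUIRED_ROLES:
        statements.append(f"ALTER ROLE {role} ADD MEMBER {quote(user)};")
    return statements


for statement in role_grant_statements("Dw_AbcInc_NSuite", "etl_user"):
    print(statement)
# USE [Dw_AbcInc_NSuite];
# ALTER ROLE db_datareader ADD MEMBER [etl_user];
# ALTER ROLE db_datawriter ADD MEMBER [etl_user];
```

The printed statements can be reviewed and then run in an SSMS query window by someone with permission to manage role membership.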

Modify or Add New Data Warehouse

Only available for super users.

  1. Select a data warehouse from the list → Modify.

  2. Or click Add New to create a new one.

Server name: IP, URL or MS SQL Server instance name. Ex.: (local).

Authentication: Windows or MS SQL Server.

Login and Password.

Schema:

  • Non-Multi-Tenant data warehouses: This is the default MS SQL schema in the data warehouse for this ETL+ entity. For instance, if it is set to dbo, all tables are loaded into the dbo SQL schema. You can configure Sources to have their own SQL schemas. Learn more at https://dataself.atlassian.net/wiki/spaces/ETL202208/pages/2032242696.

  • Multi-Tenant data warehouses: All tables are loaded into an MS SQL schema named <ETL+ EntityID>.

Database name: Name of the SQL database.

Multi-Tenant: Defines whether the data warehouse hosts a single organization’s data or serves a multi-tenant platform.

  • Multi-Tenant checkbox not checked:

  • Multi-Tenant checkbox checked: The user only has access to the following data warehouse schemas:

    • dbo: read only

    • <ETL+ EntityID>: read and write. All tables are loaded into this SQL schema.
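
The schema rules above can be summarized in a short sketch. The function names and the `Tenant42` and `crm` values are hypothetical, and the sketch assumes that a Source-level schema, when configured, takes precedence over the default schema in a non-multi-tenant data warehouse.

```python
def target_schema(multi_tenant, entity_id, default_schema="dbo", source_schema=None):
    """Return the SQL schema a table is loaded into."""
    if multi_tenant:
        # Multi-tenant: every table goes to the schema named after the EntityID.
        return entity_id
    # Non-multi-tenant: a Source-specific schema if configured, else the default.
    return source_schema or default_schema


def multi_tenant_access(entity_id):
    """Return the schema access a user has when Multi-Tenant is checked."""
    return {"dbo": "read only", entity_id: "read and write"}


print(target_schema(True, "Tenant42"))                         # Tenant42
print(target_schema(False, "Tenant42"))                        # dbo
print(target_schema(False, "Tenant42", source_schema="crm"))   # crm
print(multi_tenant_access("Tenant42"))
```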

 

Related Pages

v2022.08