User guide page for the zData Warehouse Source Object. The zData Warehouse Source Object is a specialized data source managed by ETL+ that serves two primary purposes. First, it looks up and retrieves data from static reference tables, such as the ETL+ Time Series tables provided by DataSelf. Second, it allows reimporting data from existing target tables in the data warehouse for further transformation.
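As a rough conceptual illustration of both purposes, the following Python sketch uses an in-memory SQLite database as a stand-in for the data warehouse. It reads an existing target table, looks up a value in a static reference table, and loads the result into a new target table. The table and column names (`sales`, `date_ref`, `sales_by_period`, `fiscal_period`) and both functions are hypothetical and do not reflect the actual ETL+ schema or Time Series tables. ETL+ is configured through its own panels, not through code like this.

```python
import sqlite3


def build_demo_warehouse():
    """Create a stand-in warehouse with one reference table and one target table.

    All names here are hypothetical and chosen only for illustration.
    """
    conn = sqlite3.connect(":memory:")
    # Static reference table: maps calendar dates to fiscal periods.
    conn.execute(
        "CREATE TABLE date_ref (calendar_date TEXT PRIMARY KEY, fiscal_period TEXT)"
    )
    # Existing target table, as if loaded by an earlier ETL run.
    conn.execute("CREATE TABLE sales (sale_date TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO date_ref VALUES (?, ?)",
        [
            ("2024-01-15", "2024-P01"),
            ("2024-01-20", "2024-P01"),
            ("2024-02-03", "2024-P02"),
        ],
    )
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("2024-01-15", 100.0), ("2024-01-20", 50.0), ("2024-02-03", 25.0)],
    )
    conn.commit()
    return conn


def reimport_with_lookup(conn):
    """Reimport the sales target table, enrich it from the reference table,
    and load the aggregated result into a new target table."""
    conn.execute("DROP TABLE IF EXISTS sales_by_period")
    conn.execute(
        """
        CREATE TABLE sales_by_period AS
        SELECT d.fiscal_period AS fiscal_period,
               SUM(s.amount)   AS total_amount
        FROM sales AS s
        JOIN date_ref AS d ON d.calendar_date = s.sale_date
        GROUP BY d.fiscal_period
        """
    )
    conn.commit()
    return conn.execute(
        "SELECT fiscal_period, total_amount FROM sales_by_period "
        "ORDER BY fiscal_period"
    ).fetchall()
```

In this sketch the source and the target are the same database, which mirrors the idea that the zData Warehouse source object reads from the warehouse that the target tables are written to.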

...

Internal note:
  • The Properties button on the Source Objects panel does not apply to the zData Warehouse source object. Its data source is the data warehouse itself, which is controlled from the Target Objects panel.

In the context of ETL (Extract, Transform, Load) software, "reimporting" means extracting data again from a source system or file after that data has already been extracted, possibly transformed, and loaded into a data warehouse or another destination.

Reimporting becomes necessary in several scenarios:

  1. Incremental Updates: When dealing with large datasets, it's often inefficient to extract and process the entire dataset each time. Instead, ETL processes can be designed to identify and extract only the new or changed data since the last extraction. This is known as incremental or delta extraction, and it involves reimporting only the relevant data.

  2. Data Corrections or Updates: If errors are discovered in the previously loaded data or if updates need to be applied retroactively, the ETL process may involve reimporting specific data records or subsets of data to make corrections or updates.

  3. Historical Data: In some cases, historical data may be added or corrected, and reimporting is necessary to ensure that the data warehouse or destination system reflects these changes accurately.

  4. Data Integration: In complex ETL workflows, data from different source systems or files may be integrated into a unified dataset. Reimporting may be required when the source data changes or when additional source systems are introduced.

  5. Periodic Refresh: ETL processes may be scheduled to refresh data at regular intervals (e.g., daily, weekly). Reimporting occurs during each refresh cycle to ensure that the data in the destination system is up to date.
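The incremental update and data correction scenarios above can be sketched with a simple watermark pattern. This is a minimal illustration in Python using SQLite, not a description of how ETL+ implements incremental loads. The `orders` table, its `updated_at` column, and the functions `extract_delta` and `load_delta` are assumptions made for the example. It also assumes timestamps are stored as ISO 8601 text, so that string comparison matches chronological order.

```python
import sqlite3


def extract_delta(source, last_watermark):
    """Return rows changed since last_watermark, plus the new watermark.

    Assumes a hypothetical orders table whose updated_at column holds
    ISO 8601 timestamps as text.
    """
    rows = source.execute(
        "SELECT id, amount, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_watermark,),
    ).fetchall()
    # If nothing changed, keep the previous watermark.
    new_watermark = rows[-1][2] if rows else last_watermark
    return rows, new_watermark


def load_delta(target, rows):
    """Insert new rows and overwrite existing ones, matched on the primary key."""
    target.executemany(
        "INSERT OR REPLACE INTO orders (id, amount, updated_at) VALUES (?, ?, ?)",
        rows,
    )
    target.commit()
```

Each refresh cycle would call `extract_delta` with the watermark saved from the previous run, pass the rows to `load_delta`, and then store the returned watermark. Because the load replaces rows by primary key, a corrected source record overwrites the earlier copy in the target instead of creating a duplicate.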

In all these cases, reimporting involves extracting data from the source, possibly applying transformations, and then loading the updated or additional data into the destination, while ensuring data consistency and accuracy.

Efficient handling of reimporting is a crucial aspect of ETL design, as it can impact the performance, data integrity, and timeliness of the data integration process. ETL developers and administrators need to plan for and manage reimporting scenarios effectively to keep data up to date and in sync with source systems.