Generic Import Code Load
Prerequisite
The following are the prerequisite conditions for Generic Import Code Load:
-
To configure Reltio tenant details, see Configure Connection.
-
To configure notification channels such as an email ID, MS Teams channel, or Slack channel, see Adding Alert Notification. Configure at least one notification channel and subscribe to the task group pipeline.
-
Keep GIF file sets ready as per the layout defined in the GIF DID of Generic Import Format in Data Interface Documents.
Generic Import Code Load Overview
In MDM 2.0, the Generic code layout is defined as a single file containing canonical codes, external codes, dependencies, and locales. This file is transformed into four tables in the Canonical schema and pushed into the Reltio RDM tenant as per the configuration. The process flow is as shown in the image below.
Import Pipeline Template
To import the pipeline template:
-
Connect to the IDP default S3 bucket and go to the folder <bucket_name>/templates/product.
-
Download the MDM_Generic_RDM_Code_Import_<version>.json template file to your local machine.
Note:
If there are multiple files prefixed MDM_Generic_RDM_Code_Import_* with different versions, download the latest version.
-
Open the downloaded template JSON file in any text editor, then find the placeholders below and replace them with appropriate values.
Placeholder | Replaceable String |
---|---|
**rektioconnection** | Reltio Connection name configured in Entity Collection, for example, RELTIO_MDM |
Note:
The search strings given above are case-sensitive. Replace them as they are, without enclosing the values in any additional characters.
-
Log in to the IDP platform with valid credentials.
-
Go to Data Pipeline, click Task Group from Template, and import the template using the updated template file.
Note:
After a successful import, the process creates the pipeline task group. Use this task group for execution.
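The find-and-replace step above can also be scripted. The following is a minimal sketch, not part of the product: the function name `fill_template` is hypothetical, and the placeholder string is passed in exactly as it appears in the downloaded template. The sketch performs a case-sensitive, literal replacement and then checks that the result is still valid JSON before it is imported.

```python
import json


def fill_template(template_text: str, placeholder: str, value: str) -> str:
    """Replace a placeholder literally (case-sensitive) and validate the JSON.

    `placeholder` must be given exactly as it appears in the template file;
    `value` is inserted as-is, without any enclosing characters.
    """
    if placeholder not in template_text:
        # A missing placeholder usually means a case mismatch or a wrong file.
        raise ValueError(f"placeholder {placeholder!r} not found in template")
    updated = template_text.replace(placeholder, value)
    json.loads(updated)  # raises json.JSONDecodeError if the edit broke the file
    return updated


if __name__ == "__main__":
    # Hypothetical template fragment, used only for illustration.
    sample = '{"connection": "**rektioconnection**"}'
    print(fill_template(sample, "**rektioconnection**", "RELTIO_MDM"))
```

To apply this to a real file, read the downloaded template into a string, call `fill_template`, and write the result to a new file that is then imported through Task Group from Template.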
Generic Import Code Load Pipeline Steps
The following are the steps for the Generic Import Code Load Pipeline:
-
MDM_GenericCode File - The S3 connector plug-in loads the complete code file (<S3_Bucket>/<Client_Folder>/input/MDM_Generic_Import*.*) from S3 to the Landing and Staging tables using the SCD Type 1 Full Refresh method.
-
Using WinSCP or an S3 browser, log in to the S3 bucket path and go to the root folder. Place the GIF files inside the root folder.
-
Data present in each flat file is loaded to the landing table. Before each load, the previous data present in the landing tables is truncated.
-
Data present in the landing tables is loaded to the staging tables. For each load, data in staging is always either inserted or updated with reference to history.
-
Set Current Date
The sqlExecutor plugin sets the current timestamp in the log table so that the process can identify the incremental data from the last run date to the current timestamp.
-
Transform codes from Staging to Canonical
Data from the code table is transformed and copied to the MDM Canonical schema RDM tables based on a key generated as SHA1(CONCAT(LOOKUPTYPE, '^', CanonicalCode)). The data present in the canonical tables is the data posted to the Reltio tenant.
Task | Description |
---|---|
Move Codes to Canonical | Transform and move the Staging MDM_GenericCode table to R_CODE_ITEMS, R_CODE_EXTERNAL_ITEMS, R_LOCALE_ITEMS, and R_DEPENDENCY_ITEMS |
-
Set Last RunDate
The sqlExecutor plugin sets the processing time captured in step 3 as the last_run_date in the log table so that the process can identify the incremental data.
-
RDM Load - Inbound Plugin Connector
-
Plugin-Inbound builds JSON messages from the Canonical schema RDM tables and pushes them into the Reltio RDM tenant as per the configuration.
-
FileName Patterns |
---|
Generic_Code_Import..dat |
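The key used in the Staging to Canonical step, SHA1(CONCAT(LOOKUPTYPE, '^', CanonicalCode)), can be reproduced outside the warehouse for spot checks. The sketch below is an illustration only: the function name `canonical_key` is hypothetical, and it assumes the inputs are non-null strings, that the text is hashed as UTF-8, and that the warehouse returns the digest as a lowercase hexadecimal string. Verify these assumptions against a known row before relying on the result.

```python
import hashlib


def canonical_key(lookup_type: str, canonical_code: str) -> str:
    """Return the SHA-1 hex digest of LOOKUPTYPE + '^' + CanonicalCode.

    Assumes UTF-8 encoding and non-null inputs (both are assumptions,
    not confirmed by the pipeline documentation).
    """
    raw = f"{lookup_type}^{canonical_code}"
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Hypothetical lookup type and code, used only for illustration.
    print(canonical_key("COUNTRY", "US"))
```

Because the '^' separator sits between the two values, ("AB", "C") and ("A", "BC") produce different keys, which a plain concatenation without a separator would not guarantee.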
Operation
The following are the operations:
-
Open the MDM_Generic_RDM_Code_Import Pipeline and go to the MDM_GenericImport_Files task to find the S3 path.
-
Using WinSCP or an S3 browser, log in to the S3 bucket path mentioned and go to the root folder. Place the GIF files inside the <root folder>.
-
Once the files are placed, click Run.
-
Click Show Execution Logs, next to the Run button, to verify the task status.
-
Monitor jobs
-
Both Data Pipeline jobs and API call jobs used to post data to Reltio can be monitored through Spring Cloud Data Flow (SCDF).
-
To monitor, log in to the IDP portal, go to Data Pipeline, and click Data Pipeline.
-
Identify the Task Group/Pipeline name for which monitoring is required (for example, OK_US_Data_Load). Click Show Execution Logs.
-
The Spring Cloud Data Flow page opens in a new tab. Click Jobs.
-
Once the tasks have completed successfully, execute the query below to get the counts of HCP, HCO, Affiliation and Merge from the canonical table present in the Snowflake data warehouse under the schema name MDM_CANONICAL. These counts are used to validate against the counts of data posted to Reltio.
-
SELECT COUNT(0) FROM MDM_CANONICAL.R_CODE_ITEMS WHERE LOAD_STATUS = 'SUCCESSFUL';
-
Codes can be invalidated for the reasons below
Codes that fall under these rules are invalidated and ingested into the odp_core_staging.Generic_Invalid_codes table for review.
-
when LOOKUPTYPE is null
-
when DATASOURCE is null
-
when SOURCECODE is null
-
when the DATASOURCE, SOURCECODE, LOOKUPTYPE combination is duplicated with a different CANONICAL_CODE
-
when the DATASOURCE, SOURCECODE, LOOKUPTYPE combination is duplicated with a different CANONICAL_DESC
-
when the DATASOURCE, SOURCECODE, LOOKUPTYPE combination is duplicated with a different SOURCEDESC
To review the codes that are invalidated, run select * from ODP_CORE_STAGING.Generic_Invalid_codes;
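The invalidation rules above can be applied to a GIF file before it is uploaded, to catch problems early. The following is a rough sketch, not the pipeline's own logic: the function name `split_codes` is hypothetical, rows are assumed to be dictionaries keyed by the column names used in the rules, and it is an assumption that every row in a conflicting group is invalidated (the rules do not say whether one row of the group is kept).

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Row = Dict[str, Optional[str]]

KEY_COLUMNS = ("DATASOURCE", "SOURCECODE", "LOOKUPTYPE")
CONFLICT_COLUMNS = ("CANONICAL_CODE", "CANONICAL_DESC", "SOURCEDESC")


def split_codes(rows: List[Row]) -> Tuple[List[Row], List[Row]]:
    """Split rows into (valid, invalid) using the documented rules."""
    valid: List[Row] = []
    invalid: List[Row] = []
    groups: Dict[Tuple[str, ...], List[Row]] = defaultdict(list)

    for row in rows:
        # Rule: LOOKUPTYPE, DATASOURCE, or SOURCECODE is null.
        if any(row.get(col) is None for col in KEY_COLUMNS):
            invalid.append(row)
        else:
            groups[tuple(row[col] for col in KEY_COLUMNS)].append(row)

    for group in groups.values():
        # Rule: the same key combination appears with differing values
        # in CANONICAL_CODE, CANONICAL_DESC, or SOURCEDESC.
        conflicting = any(
            len({row.get(col) for row in group}) > 1 for col in CONFLICT_COLUMNS
        )
        # Assumption: all rows of a conflicting group are invalidated.
        (invalid if conflicting else valid).extend(group)

    return valid, invalid
```

Rows that repeat a key combination with identical values in all three compared columns are treated as valid here, because the rules only mention duplicates that differ.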
Troubleshooting
-
Incorrect codes are pushed into the ODP_CORE_STAGING.Generic_Invalid_codes table. After the load, the delivery team must check this table and make sure that no incorrect codes are present.
-
If the pipeline fails in any task, fix the error and restart the pipeline from the failed task through to the end.
See DID of Generic Import Format in Data Interface Documents.