Recently on a data quality project I stumbled across an anomoly that I thought I share with the data quality / Informatica community. It involves the use of Informatica Data Quality (IDQ) and the use of certain types of queries. I discovered this anomoly when I recieved the following error message:
Image 1: Failure to initialise plan
One of the root causes of this error is a “behind-the-scenes” process that involves the import of data into an IDQ plan. When importing data into an IDQ plan, via SQL inside the source component, IDQ wraps a select count () around the SQL in the original query. At times this causes IDQ to fail while initializing and returns the error message above. Most commonly this error occurs when using a query that is beyond the standard select statement. I was using a union in my query when I got this error.
If you are using Oracle you have the option of editing your SQL or running the plan from the command line. However if you are using Microsoft SQL Server, you’ll need to get the to command line.
The SQL edit in Oracle is simple and straight-forward so let’s address that first. If you get an error where the IDQ plan fails to initialise try wrapping the SQL as follows:
As for Microsoft SQL Server the command line is the only option but it is very straight-forward. You could even argue that the command line is more enterprise robust since the jobs could then be scheduled and thus integrated into existing data services more easily.
Regardless of the perspective here is how you deploy the plan in realtime using the command line. An IDQ plan can be executed from one of two locations; the IDQ repository or the file location of the XML version of the plan. On Windows, the executable file for implementing runtime functionality is Athanor-RT.exe, located in the bin folder of the Data Quality Server installation. It is also a requirement to specify the location of the plan in the command line utility.
To deploy a plan from an XML file the -f switch is used. So the command to deploy an XML version of a plan, call it myplan, in the C:/idqplans directory would be as follows:
To deploy a plan from the repository the -p switch is used. Consequently the command to deploy the myplan dq plan from the repository would be as follows:
There are other useful switches to utilize while deploying your IDQ plans at the command line.
For instance to run an IDQ plan along with a parameter file, which allows you to override plan variables with those in the parameter file, use the -c switch. The command for executing a plan along with a parameter file would be as follows:
Another helpful switchs allows you to receive feedback as to the status of the plan execution. This switch is -i.
The -i switch will provide feedback in the command line window. The feedback occurs at the specified interval in the command line. For instance, the command -i 200 will provide feedback every 200 records for each field in the process. The following is an example of the feedback provided in the command line with the -i switch:
With these basic switches you can deploy any IDQ plan regardless of the query required to source the data. I hope this post helps someone avoid hours of debugging!
Check back later for more details about scheduling IDQ plans using the command line.