We are using Data Integrator XI 3.0. Our local repository is an Oracle 10g database.
I have a dataflow that has been running fine for the past few years, but it failed in this quarter's execution.
The dataflow is doing the following:
1. Join table1 (join rank 0, cache, 8M rows) with table2 (join rank 0, cache, 200K rows); select distinct; uses ifthenelse(), cast(), ltrim(), rtrim(), and constant $parameter1; where table1.person_id = table2.person_id and load_date > $parameter2. Both tables are in the same datastore1.
2. Split query1 into query_location1 and query_location2
3. Merge query_location1 and query_location2
4. Load to target table3, which is in a different datastore2 (datastores 1 and 2 are hosted on the same server, in different schemas)
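For reference, I believe the logic of the dataflow corresponds roughly to the SQL below (column names are simplified and hypothetical; the split into query_location1/query_location2 and the merge are modelled as a location filter):

```sql
-- Rough equivalent of the dataflow logic (hypothetical column names).
SELECT DISTINCT
       t1.person_id,
       CAST(LTRIM(RTRIM(t1.some_col)) AS VARCHAR2(50)) AS some_col,   -- cast()/ltrim()/rtrim()
       CASE WHEN t1.loc = 'L1' THEN 'A' ELSE 'B' END   AS derived_col, -- ifthenelse()
       :parameter1                                     AS constant_col -- $parameter1
  FROM table1 t1
  JOIN table2 t2
    ON t1.person_id = t2.person_id
 WHERE t1.load_date > :parameter2;
```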
I have tried a partial pushdown as follows, with no success.
1. Join table1 (join rank 2, no cache, 8M rows) with table2 (join rank 0, cache, 200K rows); uses cast(), ltrim(), rtrim(), and constant $parameter1; where table1.person_id = table2.person_id and load_date > $parameter2. Both tables are in the same datastore1.
2. Split query1 into query_location1 and query_location2
3. Merge query_location1 and query_location2
4. select distinct; use of ifthenelse()
5. Load to target table3, which is in a different datastore2 (datastores 1 and 2 are hosted on the same server, in different schemas)
When I check Display Optimized SQL..., it shows only a SELECT statement with where table1.person_id = table2.person_id and load_date > $parameter2.
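What I was hoping to see is a full pushdown across the two schemas on the same server, i.e. a single INSERT ... SELECT executed entirely in the database. A sketch of the hoped-for generated SQL (schema and column names are assumptions for illustration):

```sql
-- Hoped-for full pushdown: one statement runs inside Oracle, nothing flows
-- through the Job Server, since datastore1 and datastore2 are on the same server.
INSERT INTO schema2.table3 (person_id, some_col, derived_col, constant_col)
SELECT DISTINCT
       t1.person_id,
       CAST(LTRIM(RTRIM(t1.some_col)) AS VARCHAR2(50)),
       CASE WHEN t1.loc = 'L1' THEN 'A' ELSE 'B' END,
       :parameter1
  FROM schema1.table1 t1
  JOIN schema1.table2 t2
    ON t1.person_id = t2.person_id
 WHERE t1.load_date > :parameter2;
```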
What in the current DF design prevents it from being fully pushed down? Is it the use of a parameter in the where clause, the Merge transform, or select distinct?
How do I redesign this DF to improve performance?
Thank you.