Background: I am using SSIS / BIDS 2005 with multiple Sybase ASE 15.0 databases as the "source" for my ETL and SQL Server 2005 as the target where the transformed results will go. I am converting an Access-based ETL app I built in the past to SSIS.
I have a bunch of tables that need to be joined together to create a large result set. Many of these tables are reference tables. Creating a giant query that hits Sybase to pull all the data in one go causes Sybase to bog down. The way I handled this in the past (in Access) was to create smaller queries with 2-3 table joins each, put the results in staging tables, and then join those staging tables with another query to get to my end result. This worked pretty well.
In SSIS I have one DFT (Data Flow Task) for this particular query. In the DFT I have a series of OLE DB data sources, some of which contain queries to Sybase that join 2-3 tables at a time. I also have a series of raw files that I created as outputs from other DFTs for later use (in lieu of staging tables). I am now using Sort and Merge Join transformations to join all this data together in this single DFT to get to my desired end result.
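To make clear what I mean by sorting and then merge-joining, here is a rough Python sketch of the idea. It is not SSIS code, only an illustration of why both inputs have to be sorted on the join key before they can be merged. The function name, the row data, and the column names (`cust_id`, `order_ref`, `name`) are all made up for the example.

```python
def merge_join(left, right, key):
    """Inner-join two lists of dicts that are ALREADY sorted by `key`."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1          # left key has no match yet, advance left
        elif lk > rk:
            j += 1          # right key has no match yet, advance right
        else:
            # Find the run of rows on the right that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            # Pair every left row with this key against that run.
            while i < len(left) and left[i][key] == lk:
                for r in right[j:j_end]:
                    out.append({**left[i], **r})
                i += 1
            j = j_end
    return out


# Hypothetical data standing in for two small source queries.
orders = [
    {"cust_id": 2, "order_ref": "B"},
    {"cust_id": 1, "order_ref": "A"},
    {"cust_id": 4, "order_ref": "D"},
    {"cust_id": 2, "order_ref": "C"},
]
customers = [
    {"cust_id": 3, "name": "Gamma"},
    {"cust_id": 1, "name": "Alpha"},
    {"cust_id": 2, "name": "Beta"},
]

# The "sort" step: each input is sorted on the join key first.
orders_sorted = sorted(orders, key=lambda r: r["cust_id"])
customers_sorted = sorted(customers, key=lambda r: r["cust_id"])

# The "merge join" step: a single pass over both sorted inputs.
joined = merge_join(orders_sorted, customers_sorted, "cust_id")
```

Each sort has to see its whole input before it can emit anything, while the merge itself is a single forward pass, which is why the arrangement of the sorts and merges matters to me.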
With all of the above said, here comes my question.
For the merge joins and sorting, I can basically use one of two approaches, which I will describe as "serial" and "parallel".