Currently I'm using SSIS packages, C#,Powershell, BulkInserts technique for ETL and I’ve one database with different schemas.
for example core.table1, core.table2, trans.table1, trans.table2 in SQL-2008.
I would like to create 3 different SQL physical nodes for load balancing, performance and to handle large scale of data.
Two simple questions:
1) create same table structure in all 3 node, and load data parallel,
for example if I’m receiving 1000 records- distribute 1000 records in 3 nodes (300 + 300 + 400),
To retrieve the data I should have same algorithms, three separate thread process parallels and get me the result form 3 nodes.
2) Create one schema on one node, another in second node and dump the data accordingly.
for example all core. tables are under node1, trans. tables are under node2 and so on..
please let me know the pros & cons of above approaches?
Is that Store procedure adviseable in distributed SQL environmnet?
I feel LINQ-SQL is more convineant as application get the data by using DBML Schema? Please advise
Which would be preferable approch on ETL,
1) Extract > Transform > Load
2) Extract > Load (Flat file Load) > Trnasform > Load
View Complete Post