.NET Tutorials, Forums, Interview Questions And Answers

Performance issue populating fact table

Posted Date: October 25, 2010    Points: 0    Category: Sql Server

I have an ETL job that runs fine until it gets to the step to populate the fact table. It runs for ages, without adding the new data. It used to run in 2 minutes, but I have not changed the definition of the fact table or the dimension tables referenced in the joins.

If I copy the FROM clause from the table definition and use it to perform a SELECT COUNT(*), I get the total back in 5 seconds. There are only 1.3 million rows at present.

Does anyone have any ideas as to why the count runs so quickly yet the actual data upload seems to stall?

Thanks in advance.
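One thing worth noting: a SELECT COUNT(*) can be satisfied from a narrow index, while the actual load must read every column and maintain all indexes on the fact table, so the two timings aren't directly comparable. While the load appears stalled, a query like the following sketch against the standard DMVs (run from a second session) shows whether the load is blocked or what it is waiting on:

```sql
-- Diagnostic sketch: run in a second session while the load appears stalled.
-- Shows each active request, whether it is blocked, and its current wait.
SELECT r.session_id,
       r.status,
       r.command,
       r.wait_type,
       r.wait_time,
       r.blocking_session_id,   -- non-zero means another session is blocking it
       t.text AS running_sql
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.session_id <> @@SPID;
```

A non-zero blocking_session_id points at a lock conflict; a long-lived wait such as PAGEIOLATCH or a log-related wait points at I/O or index maintenance instead.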


More Related Resource Links

Updating multiple Indexes on a table - Performance issue



I have a performance issue when trying to update a table with multiple indexes. The table itself has about 280 million rows. The selection of the records is fast, about 160 ms, as it has a suitable non-clustered index. However, the update itself takes over 10 minutes.

When I look at the execution plan, it shows the update to the clustered index as well as to 5 other non-clustered indexes which are affected by the statement. Unfortunately it doesn't show me how those indexes are being accessed. I suspect that the update statement is performing a full index scan against each of the non-clustered indexes in order to do the update.

So my question is this: if I add the key columns of the other non-clustered indexes as included columns on the index used to select the records for update, will SQL Server use them to access the additional non-clustered indexes?

Any advice greatly appreciated.
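The included-columns idea looks like this in DDL. This is only a sketch with made-up table, column, and index names; INCLUDE adds the columns at the leaf level without widening the index key:

```sql
-- Hypothetical example: dbo.BigTable, FilterCol, and the Key* columns
-- are invented names. The key columns of the other non-clustered
-- indexes are carried as INCLUDEd columns so the rows selected for
-- update can be located without extra lookups.
CREATE NONCLUSTERED INDEX IX_BigTable_Filter
ON dbo.BigTable (FilterCol)
INCLUDE (KeyColA, KeyColB, KeyColC)
WITH (DROP_EXISTING = ON);   -- rebuild the existing index in place
```

Caveat: included columns help the read side of the plan, but each non-clustered index whose columns actually change must still be maintained row by row during the update, so they cannot eliminate that maintenance cost.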


Under the Table: How Data Access Code Affects Database Performance


In this article, the author delves into some commonly used ways of writing data access code and looks at the effect they can have on performance.

Bob Beauchemin

MSDN Magazine August 2009

Fact table in DSV vs partitions pointed to a different table

I am seeing an issue in my cube with a partition that is based on a separate table from the fact table in the DSV. I have 8 partitions, all from different physical tables. In the DSV I used 1 of those 8 partition tables as the "source" of the DSV so I could model the relationships between the fact and the dimensions.

On 1 of the 8 it loads over 1 million rows from the partition into the cube, but when I use the browser to show the count in that particular partition, it shows the exact same number of records that are in the table that was used in the DSV. The strange thing is all the other partitions work fine except this 1. I have deleted the partition and added it back multiple times and can't get it to work right. Has someone seen this problem before?

I have run into this a couple of times; one way of fixing it was to recreate the entire project: create a new project, copy all objects from the old project, and rebuild. I can't seem to figure out another way of fixing this.

Craig

Building Fact and Dim table

How do I build fact and dimension tables for the below requirement? Fact records (approx. 800k) have to be analyzed every day with respect to age groups. Every record in the fact table will have a DOBSID, and the age/age group should be calculated every day based on getdate().
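One common approach is to keep only the date of birth in the dimension and derive the age (and hence the age group) at query time, so nothing needs daily reprocessing. A sketch, assuming a hypothetical DimDOB table keyed by DOBSID with a BirthDate column:

```sql
-- Sketch with assumed names (dbo.FactRecords, dbo.DimDOB, BirthDate).
-- Age is computed from GETDATE() each time the query runs; the CASE
-- subtracts 1 when this year's birthday has not happened yet.
SELECT f.FactKey,
       DATEDIFF(YEAR, d.BirthDate, GETDATE())
         - CASE WHEN DATEADD(YEAR,
                             DATEDIFF(YEAR, d.BirthDate, GETDATE()),
                             d.BirthDate) > GETDATE()
                THEN 1 ELSE 0 END AS Age
FROM dbo.FactRecords AS f
JOIN dbo.DimDOB      AS d ON d.DOBSID = f.DOBSID;
```

Wrapping this in a view (and banding Age into age groups with a further CASE or a lookup table) gives a result that is always current as of getdate() without touching the 800k fact rows.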

Getting counts by 2nd Date Dimension Attribute with Snapshot Style Fact Table

I have an MDX question I am finding hard to solve. I have a snapshot fact table with a snapshot of the records in the source system for each batch date. All records in the fact table are assigned the batch date via the batch date key. There are many records for each day, and each batch date is an entire copy of the source records. So, the grain of the fact table is one record per batch date that exists in the source system.

These fact rows have another date in them for when the record was entered. This date is different from the batch date in that the batch date is based on the day the batch was processed and the entered date is based on when the record was entered. If a record was entered many days before, its batch date will be today but its entered date will be several days ago. Therefore each day contains a copy of all the records entered by the previous batch date plus all the records added on today's batch date.

Fact table:
- FactSnaphshotKey (surrogate for easier administration)
- BatchDateKey (link to batch date dimension - date dimension; first in the dimension list so it is used for semi-aggregate measures)
- EnteredDateKey (link to entered date dimension - date dimension)
- Facts Count - measure for fact table - default measure from Analysis Services cube

2 Dim
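For reference, the relational equivalent of the count being asked for is simple: fix one batch date (one snapshot) and group by the entered date key. A sketch in T-SQL, with table and key names guessed from the post's description:

```sql
-- Sketch: dbo.FactSnapshot and the yyyymmdd-style date keys are
-- assumptions based on the schema described above.
-- For a single snapshot (one batch date), count the rows that were
-- entered on each entered date.
SELECT f.EnteredDateKey,
       COUNT(*) AS FactCount
FROM dbo.FactSnapshot AS f
WHERE f.BatchDateKey = 20101025   -- hypothetical batch-date key
GROUP BY f.EnteredDateKey
ORDER BY f.EnteredDateKey;
```

The MDX version needs the same shape: slice the cube to one member of the batch date dimension so only a single snapshot is counted, then break the Count measure out by the entered date dimension's attribute.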

Update an accumulating snapshot fact table

This is my first time implementing an accumulating snapshot fact table and I require some guidance. Accumulating snapshot fact tables show the status at any given moment. They are useful for tracking items with a certain lifetime, for example the status of order lines: every time there is a new piece of information about a particular purchase, we update the fact table record. We only insert a new record in the fact table when there is a new purchase requisition.

What I really need to know is how best to handle the updates. This feels very similar to managing SCD-1s in dimension processing! Anyone able to advise? Thanks in advance.

Here is a perfect example we can use: http://blog.oaktonsoftware.com/2007/03/accumulating-snapshot-use-accumulating.html

Figure 1, below, shows an accumulating snapshot for the mortgage application process. The grain of this fact table is an application. Each application will be represented by a single row in the fact table. The major milestones are represented by multiple foreign key references to the Day dimension: the date of submission, the date approved by a mortgage officer, the date all supporting documentation was complete, the date approved by an underwriter, and the date of closing.
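One common way to handle the update/insert mix in a single pass is a MERGE keyed on the application's natural key: new applications are inserted, and existing rows have their milestone date keys overwritten in place (much like SCD type 1). A sketch, with hypothetical staging and fact table names modeled on the mortgage example:

```sql
-- Sketch: all table and column names are invented, following the
-- mortgage-application example linked above.
MERGE dbo.FactMortgageApplication AS tgt
USING stg.MortgageApplication     AS src
    ON tgt.ApplicationKey = src.ApplicationKey
WHEN MATCHED THEN
    -- milestones reached since the last load overwrite the placeholders
    UPDATE SET tgt.ApprovedDateKey     = src.ApprovedDateKey,
               tgt.DocsCompleteDateKey = src.DocsCompleteDateKey,
               tgt.UnderwriterDateKey  = src.UnderwriterDateKey,
               tgt.ClosingDateKey      = src.ClosingDateKey
WHEN NOT MATCHED BY TARGET THEN
    -- a brand-new application gets its single fact row
    INSERT (ApplicationKey, SubmittedDateKey, ApprovedDateKey,
            DocsCompleteDateKey, UnderwriterDateKey, ClosingDateKey)
    VALUES (src.ApplicationKey, src.SubmittedDateKey, src.ApprovedDateKey,
            src.DocsCompleteDateKey, src.UnderwriterDateKey, src.ClosingDateKey);
```

Milestones not yet reached are typically pointed at an "unknown/not yet" member of the Day dimension rather than left NULL, so the foreign keys always resolve.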

Performance Issue on SQL 2008 Box

Will SSRS and SSIS running on the same machine hinder the performance of the database engine?
1. If so, to what extent?
2. How do you tackle this kind of issue?
3. Is there a way to separate these services from running on the same machine?
The machine also has an OLTP database running on it.

Issue with Job schedule table?

Hi there, I'm trying to create a table/sheet that will display an employer's job schedule for the week. Most of the information will come from several tables within a database. This table/sheet will consist of the following columns:

- Job ID, Description, Comments -> these 3 columns/fields represent actual columns within the database.
- 7 columns representing days of the week, with the first column representing today's date, the next column representing the next day, and so on -> these 7 date columns will not come from the database itself (as opposed to the other columns). They will be updated daily to represent the employer's schedule for the week.

MY ISSUE IS THIS: one of the DB tables holds a column "StartDate" (which represents the start date of a phase within a particular job). I need to check whether this "StartDate" is equal to the date column (the image will make this more clear). Obviously, 1 job has MANY phases. An example of such a table would be as follows (for the employee whose D_ID is = '555'):

Any ideas how I can create such a table? Should it be done in recursive HTML? Or a gridlist? Repeater? Thanks for any info.

BTW, I posted this in the "Web forms" section, but I guess it is more suited to be posted here, as it is a databound issue.
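The StartDate-versus-day comparison can be done on the database side before any binding: build the 7-day window once and cross join it against the phases. A sketch, with table and column names guessed from the post:

```sql
-- Sketch: dbo.JobPhases and its columns are assumptions based on the
-- post. A recursive CTE generates today plus the next six days, and
-- each phase is flagged per day when its StartDate falls on that day.
;WITH Days AS (
    SELECT CAST(GETDATE() AS DATE) AS Day, 0 AS n
    UNION ALL
    SELECT DATEADD(DAY, 1, Day), n + 1
    FROM Days
    WHERE n < 6
)
SELECT p.JobID,
       d.Day,
       CASE WHEN CAST(p.StartDate AS DATE) = d.Day
            THEN 1 ELSE 0 END AS StartsOnThisDay
FROM dbo.JobPhases AS p
CROSS JOIN Days    AS d
WHERE p.D_ID = '555'
ORDER BY p.JobID, d.Day;
```

The result set (one row per phase per day) can then be pivoted into the 7 date columns in SQL or in the Repeater/GridView markup, whichever is easier to maintain.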

daily complete cube rebuild four dimensions and fact table including remapping of all surrogate keys

Hi SSIS Engineers: Forgive me if this is a multi-forum question. Our primary activity in the next week is to automate the processing in SSIS, where I led the team to create complete processing flows for Full and Add in the order of Dimension, Measure Group, Partition, Cube, Database. These work.

The problem occurs in a complete refresh of the ERP database, which caused me manual effort inside SSAS that I plan to find a way to automate in SSIS. I performed a complete refresh of our cube from the ERP source from a time perspective. We are automating this process in SSIS. In SSAS, I had to manually delete the four dimensions from the UDM view via the Solution Explorer.

Since the complete refresh increased the surrogate keys in the dimensions and since the names were the same, I couldn't just drop the partition and reprocess the dimensions, since, in effect, new fact rows would have to be mapped to the new keys. SSAS held on to the old keys even with Full Processing of the dimensions first, then the cube. Until I dropped (deleted) the dimension tables from the Solution Explorer and the UDM and later re-added the dimensions with the new surrogate keys (add, update, and delete of dimension attribute changes in the full refresh) via the Add Dimension wizard, the cube kept the old surrogate keys and failed in measure group, fact, database, and partition processing.

How to calculate SQL Server performance of a query based upon table schema, table size, and available hardware

Hi, what is the best way to calculate (without actual access to a SQL Server) the processing speed of a query (self-inner-join) based upon table schema, table size, and hardware (CPU, RAM, drives)? Thanks, Jeff in Seattle

XOML only workflow performance issue while creating

In our application we are using a XOML workflow and create workflow instances using workflowRuntime.CreateWorkflow(workflow, ruleReader, workflowParameters), where workflow and ruleReader are XmlReader objects created using XmlReader.Create(). For one of our business process automations we have developed a XOML-only workflow which has 35 states and 95 events. When we try creating an instance of the workflow it takes a long time: the time taken to create the instance is 3 min. Can anyone suggest another way to create a workflow instance of XOML-only workflows other than workflowRuntime.CreateWorkflow(workflow, ruleReader, workflowParameters)? Your help is highly appreciated.

Environment:
- VS 2008
- ASP.NET 3.5
- WF 3.5
- XP

Gridview Performance Issue

Hi all, I have an ASP.NET application (VS.NET 2005). In it I have one scenario where I have to display 500 records per page in the GridView (not less than that, because the client's requirement is to display 500 records per page). This functionality is working fine, but the application is becoming very slow. Can anybody tell me the solution for increasing performance in this case? Very urgent, please reply soon. Thanks, Biswajit

Avoiding a SELECT distinct query generated by SSAS when using dimension derived from fact table

Hi, I am using a dimension derived from a fact table, and the fact table primary key is the dimension key. The issue is, there are a large number of rows and many attributes. SSAS issues a DISTINCT query and it takes a large amount of time. Without the DISTINCT statement, the query takes only 3 min for 4 million rows. With the DISTINCT, it takes 20 min. Because the fact primary key is the dimension key, there is no need for a DISTINCT statement. I know there is an option in the dimension to say "By Table" to avoid this. But unfortunately, I breach the 4 GB limit for strings. Any suggestions for optimization? Thanks, Sambath

[SCD] fact table and SCD



I've read several articles about SCD, most of them explaining the standard SCD type 2 rules, like a customer address change.

My problem is about PurchaseOrder changes:

I have a fact table "Internet Sales", with a Total property for each sale.

Imagine that this Total property can change (e.g. a missing product cannot be delivered) --> the order is recalculated, and its total changes. Of course, I need to keep both pieces of information for the order, because if I request the sum of the order totals, for example, I won't get the same results if the request is done with data as they were BEFORE or AFTER the Total change... hope that's clear.

I wanted to add "start", "end" and "current" columns to my fact table, but I read that this is not the right way to do it.

Can you help me with that?

Thank you


PS: I didn't consider the option of setting the order total in a dimension... hope that's not the right way to do it!
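Rather than SCD-style start/end/current columns on the fact, a common warehouse pattern for this is to keep the fact table insert-only and post an adjusting delta row when a total changes, stamped with a load date key. SUM(Total) then gives the current total, and filtering on the load date reconstructs the total "as of" any earlier date. A sketch with hypothetical names:

```sql
-- Sketch: dbo.FactInternetSales, stg.ChangedOrders, and @TodayDateKey
-- are invented names. When an order's total changes, insert a delta
-- row instead of updating the original fact row in place.
INSERT INTO dbo.FactInternetSales (OrderKey, LoadDateKey, Total)
SELECT c.OrderKey,
       @TodayDateKey,
       c.NewTotal - t.TotalSoFar     -- the adjustment amount
FROM stg.ChangedOrders AS c
JOIN (SELECT OrderKey, SUM(Total) AS TotalSoFar
      FROM dbo.FactInternetSales
      GROUP BY OrderKey) AS t
  ON t.OrderKey = c.OrderKey;
```

With this design, SUM(Total) WHERE LoadDateKey <= some date yields the BEFORE picture, and the unfiltered sum yields the AFTER picture, with no row updates at all.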

Need 2 Measure Groups for One Fact table



I have one fact table that contains all the measures.  The problem is that I want to have two measure groups that point to this one fact table.  Some measures would be in measure group A and some in measure group B but the underlying source still comes from the one fact table.  I haven't found a way to do this.

My current solution is that I have created another fact table that is an exact copy of the main one; that way I can create two measure groups. The issue is performance: it takes 4 minutes to build, and if I took one of those fact tables out it would be cut in half.

I would have thought there would be a way to create a measure group and drag what you want in there... but it seems that this isn't the case: you can only create measure groups based on how many fact tables you have.

Any help would be appreciated.


.NET 4 Performance Issue String IndexOf



I have a performance issue with the new .NET 4.0 framework.

When I use the function String.IndexOf(string, StringComparison.OrdinalIgnoreCase), my execution time is much higher than on framework 3.5 (I had 30 ms in a loop on 3.5 and 210 ms on framework 4).

All other StringComparison modes are really fast in framework 4.

Is there an explanation?

Bridge table: dimension or fact? Updating from snapshots



Scenario: bank accounts and customers. One account can have many customers and many customers can share one joint account, so it's a many-to-many relationship.

Special scenario: the bank provides us a daily snapshot of all their dimensions and facts. Every night their ETL runs and a new snapshot becomes available; the previous one is gone.

I am using SCD Transformation to update the dimensions, Type 1 for all the columns.

Table 1: DimAccounts (AccountsID (PK))

Table 2: DimCustomers (CustomerID (PK))

Table 3: DimBridge (AccountID (FK), CustomerID (FK), RelationShip (varchar(10)))

Question 1: Are we supposed to treat bridge tables as dimensions or facts?

Question 2: If it is to be treated as a dimension, how would I apply the SCD Wizard to it, since there are two business keys involved?

Question 3: Do I need a surrogate key in the bridge table, like I have in the other dimensions?
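On question 2: since the bridge's business key is the pair (AccountID, CustomerID) and the SCD Transformation expects a single business key, one common alternative for a type-1 refresh is a MERGE keyed on both columns. A sketch using the table names from the post (the stg.Bridge staging table is an assumption):

```sql
-- Type-1 style refresh of the bridge, keyed on both business keys.
MERGE dbo.DimBridge AS tgt
USING stg.Bridge    AS src
   ON tgt.AccountID  = src.AccountID
  AND tgt.CustomerID = src.CustomerID
WHEN MATCHED AND tgt.RelationShip <> src.RelationShip THEN
    UPDATE SET tgt.RelationShip = src.RelationShip   -- overwrite in place
WHEN NOT MATCHED BY TARGET THEN
    INSERT (AccountID, CustomerID, RelationShip)
    VALUES (src.AccountID, src.CustomerID, src.RelationShip)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;   -- the daily snapshot fully replaces the previous one
```

The DELETE branch fits the snapshot scenario described above, where each night's snapshot fully replaces the previous day's data; drop it if departed pairs should instead be kept and flagged.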

Thank You

