I am a fresher in analytics.
Presently, I am working on a system that tracks data from networks like Bing, Google, Yahoo, and Facebook, depending upon the profiles of the customers using our system.
We are tracking around 5-6 million rows of data per day, and this is expected to grow beyond 15 million rows per day. Apart from that, dimensions such as Search Keywords contain around 50-60 million rows. Daily additions to the dimension tables are low, but they will keep increasing as the number of customers grows.
To provide analytics, I must have a minimum of 3 months of data in a cube. To provide trend analysis, this should increase to around 1-4 years (on our future road map).
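From what I have read so far, partitioning the measure group by month and processing only the newest partition seems to be the usual way to keep processing manageable at these volumes, though I have not confirmed this for our workload. A minimal XMLA sketch of what I am considering (all object IDs below are placeholders, not our real names):

<Process xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>TrackingDB</DatabaseID>
    <CubeID>TrackingCube</CubeID>
    <MeasureGroupID>TrackingFacts</MeasureGroupID>
    <!-- Placeholder ID: one partition per month, so only the current month gets reprocessed -->
    <PartitionID>TrackingFacts_201401</PartitionID>
  </Object>
  <Type>ProcessFull</Type>
</Process>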
Presently, I am running into fatal errors while processing the cube, such as "The operation cannot be completed because the memory quota estimate xxxx exceeds the available system memory yyyy" while processing dimensions.
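One mitigation I am evaluating (based on general SSAS guidance, not something I have verified yet) is to process the large dimensions one at a time with parallelism limited to 1, so the server does not have to hold several dimension builds in memory at once. A sketch, again with placeholder IDs:

<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <!-- MaxParallel="1" forces serial processing to keep peak memory down -->
  <Parallel MaxParallel="1">
    <Process>
      <Object>
        <DatabaseID>TrackingDB</DatabaseID>
        <DimensionID>Search Keywords</DimensionID>
      </Object>
      <!-- ProcessUpdate picks up new/changed members without unprocessing the cube's partitions -->
      <Type>ProcessUpdate</Type>
    </Process>
  </Parallel>
</Batch>

Is this the right direction, or should I be looking at the server memory limit properties instead?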
Please guide me on a good thought process for managing such huge volumes of data and providing analytics over it. Also, please throw some light on the approach to follow for testing cubes and SSIS data-loading packages.
Please provide links to articles/videos/blogs and names of books that discuss designing analytics for data-intensive systems. Any kind of reading material would help.