What is a typical size of DataWarehouse and Data Marts?

Moderators: chulett, rschirm

chandankambli
Participant
Posts: 14
Joined: Sun Jun 11, 2006 2:16 pm

What is a typical size of DataWarehouse and Data Marts?

Post by chandankambli »

Dear Experts:

What is a typical size of DataWarehouse and Data Marts?
Thanks experts.
datastage_learner
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Welcome aboard. :D

Your question is in the "how long is a piece of string?" category.

The only valid answer is something like "big enough, and growing over time".

I have worked with star schemas as small as a few Megabytes and as large as a few Petabytes.

It's all driven by the business needs of the owners of the data.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Hi,
And btw, pls avoid interview question in this forum.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Welcome Aboard
Answer to such questions can be found by a simple google. Try doing some research.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

DH Architects are, in my experience, governed by consultation with Business Analysts who prepare target-from-source mapping specifications which, along with row counts from source, allows the size of the target to be estimated.

But there's no way I'm going to put a figure on it, which your original question asks. There is no "typical size".

Growth is an important factor too. It is often more economical to buy larger hardware up front that to increase it later, but try explaining that to the average bean counter. By its very nature (being periodic snapshots of business data, possibly enriched with other data) a DW or DM will grow.

Too simplistically consider the following scenario. Each week you capture 5% of the business data. Ignoring the initial load and any load of historical/legacy data, in only 20 weeks you have captured 100% of the business data volume; in one year 260%, in five years more than 1000%. Don't get caught under-sizing!
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chandankambli
Participant
Posts: 14
Joined: Sun Jun 11, 2006 2:16 pm

Post by chandankambli »

ray.wurlod wrote:DH Architects are, in my experience, governed by consultation with Business Analysts who prepare target-from-source mapping specifications which, along with row counts from source, allows the size of ...
Ray, a perfect understanding of the question and a perfect response that's what I can appreciate.
Thanks experts.
datastage_learner
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Planning has its own weightage. As mentioned, its purely depends on your source data and the business needs. The business requirements will be in such a way that, the DM1 will required only 5% of the operational data. Ray has mentioned, how even the 5% grows to 1000%.
Calculate the capacity of individual datamart, excluding the confirmed facts, you ll get the DWH size. (If you are on bottom up.)
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

... of course, I meant DWH Architects, not DH Architects! :oops:
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

chandankambli wrote:Ray, a perfect understanding of the question and a perfect response that's what I can appreciate.
So say we all. Nothing like shooting for the moon in your first thread here, however. :wink:

I would think you would need to start with the perfect question before you could see perfect understanding and receive a perfect answer... and that only in a perfect world. :cry:
-craig

"You can never have too many knives" -- Logan Nine Fingers
narasimha
Charter Member
Charter Member
Posts: 1236
Joined: Fri Oct 22, 2004 8:59 am
Location: Staten Island, NY

Post by narasimha »

Like everybody stated above there is no typical size for a Datawarehouse.
It depends on your intended business needs.

You can determine the size, growth of your DWH with the help of the Business Analysts, Domain experts who are supposed to have knowledge of how much of data you get on a daily, monthly and yearly basis.

It is not a easy task. You need to have a complete knowledge of your Data Model.
You can do it yourself or take the help of your DBA's to estimate the size of individual tables.
You can calculate the Average Row length of your tables, multiply it with your estimated data to get an estimate of the space required for that table.

Your estimates may not match the actual numbers all the time.
It is better to ask for more space before hand rather than getting caught with space issues later.
(With the cost of disk space decreasing day by day, the fight for disk space is not a major issue in most of the companies nowadays)
Narasimha Kade

Finding answers is simple, all you need to do is come up with the correct questions.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Once you've done all those calculations, add 42% as a safety margin. Document it as a safety margin. When challenged, mutter something about the "ultimate answer (to life, the universe, and everything)".
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

42% With any specific formula you arrived this number :?:
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

The reference to 'Life, the Universe and Everything' was supposed to clue you into a Hitchhiker's Guide to the Galaxy reference - and 42 was a pivotal number from the series and was, in fact as Ray notes, the answer to that very question. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

So that was an ultimate answer to the amateur question.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Concurrent knowledge of the answer and the question is not permitted.

-- Douglas Adams, in one of the five books in the inaccurately named Hitchhiker's Guide to the Galaxy trilogy.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply