Difference between cardinality and uniqueness

This forum contains ProfileStage posts and now focuses on newer versions of InfoSphere Information Analyzer.

Moderators: chulett, rschirm

datisaq
Participant
Posts: 154
Joined: Wed May 14, 2008 4:34 am

Post by datisaq »

Can anyone please tell me the exact difference between cardinality and uniqueness?

I'm doing column analysis in Information Analyzer. For one column, the Cardinality and Uniqueness percentages are different, but as I understand the definition, cardinality represents "uniqueness".

Please help me out.
IBM Certified - Information Server 8.1
tcj
Premium Member
Posts: 98
Joined: Tue Sep 07, 2004 6:57 pm
Location: QLD, Australia

Post by tcj »

Think of cardinality as the count of distinct values within the column.

For a field like transaction date you would expect a low number of distinct values.

If you had the values:
01-01-2009
01-01-2009
02-01-2009
02-01-2009
03-01-2009

You would expect a cardinality of 3.

For a field that is a primary key, you would expect a high number of distinct values.

The uniqueness count is the count of unique values, that is, values that occur exactly once in the column.

For a field like transaction date you would expect a low number. You wouldn't expect to find many dates that have occurred only once.

If you had the values:
01-01-2009
01-01-2009
02-01-2009
02-01-2009
03-01-2009

You would expect a uniqueness count of 1.

For a field that is a key you would expect a high number. For such a field, the uniqueness count should equal the total actual row count.
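
The two counts described above can be sketched in a few lines of Python. This is only an illustration of the definitions given in this post, not how Information Analyzer computes them internally; the function names `cardinality` and `uniqueness_count` are made up for the example.

```python
from collections import Counter

def cardinality(values):
    """Count of distinct values in the column (hypothetical helper)."""
    return len(set(values))

def uniqueness_count(values):
    """Count of values that occur exactly once in the column (hypothetical helper)."""
    counts = Counter(values)
    return sum(1 for n in counts.values() if n == 1)

# The transaction date example from this post.
dates = ["01-01-2009", "01-01-2009", "02-01-2009", "02-01-2009", "03-01-2009"]

print(cardinality(dates))       # 3 distinct dates
print(uniqueness_count(dates))  # 1 date (03-01-2009) occurs only once
```

Both counts start from the same frequency information; cardinality counts every distinct value, while the uniqueness count keeps only the values whose frequency is 1.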
datisaq
Participant
Posts: 154
Joined: Wed May 14, 2008 4:34 am

Post by datisaq »

Thanks for your reply, tcj.

As I understand it from the transaction date example, the cardinality is 3 because the column has 3 distinct values: 01-01-2009, 02-01-2009 and 03-01-2009.

But what about the uniqueness count for the given example?

You said the uniqueness count is the same as the actual record count.

In that case:

Transaction Date   Uniqueness
01-01-2009         2
02-01-2009         2
03-01-2009         1

Total = 2 + 2 + 1 = 5 (actual record count)

Please correct me if I'm wrong.
IBM Certified - Information Server 8.1
tcj
Premium Member
Posts: 98
Joined: Tue Sep 07, 2004 6:57 pm
Location: QLD, Australia

Post by tcj »

The uniqueness count should be the same as the total actual row count only if you are running column analysis on a primary key field.

For example:

1
2
3
4
5
6
7
8

The uniqueness count would be 8, as there are 8 values and each occurs only once.

In the example I supplied, the uniqueness count is 1.

There is only one date that occurs once: 03-01-2009.
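
To make the distinction concrete, here is a small Python sketch that builds the per-value frequencies and derives both numbers from them. It only illustrates the definitions used in this thread; the `profile` function and its dictionary keys are hypothetical and are not part of Information Analyzer.

```python
from collections import Counter

def profile(values):
    """Summarise a column: row count, per-value frequencies,
    cardinality (distinct values) and uniqueness (values seen exactly once)."""
    freq = Counter(values)
    return {
        "rows": len(values),
        "frequencies": dict(freq),
        "cardinality": len(freq),
        "uniqueness": sum(1 for n in freq.values() if n == 1),
    }

dates = ["01-01-2009", "01-01-2009", "02-01-2009", "02-01-2009", "03-01-2009"]
keys = [1, 2, 3, 4, 5, 6, 7, 8]

date_profile = profile(dates)
key_profile = profile(keys)

# The frequencies (2, 2, 1) add up to the row count, but that sum is
# not the uniqueness count; only the value with frequency 1 is unique.
print(date_profile)
# For the key column every value has frequency 1, so uniqueness equals rows.
print(key_profile["uniqueness"], key_profile["rows"])
```

The per-value numbers 2, 2 and 1 are frequencies. Summing them always gives the row count, whereas the uniqueness count only counts the values whose frequency is 1.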

Hope that clears that up.