I read the documentation about these two methods, but I still don't understand the difference.
Can someone explain the difference with an example?
Thank you.
Unduplicate Independent vs Transitive
The way I understand it (and I could be off base...):
* Unduplicate Independent finds 'master' records within groups (blocks). It then looks for records to group. For example, if A is the selected master record, and:
A matches C
B matches C
A does not match B
Then QualityStage would group only A and C. Remember that A was the master, so every record is compared to A. B would have to be brought into the group in another pass. Furthermore, the fact that B matched C is not remembered outside this pass.
* Unduplicate Transitive matches every record to every record in the group (block). There is no concept of a 'master' in this case. So:
A matches C
B matches C
A does not match B
Since A matches C and C matches B, the records A, B, and C are all grouped together, even though A does not match B directly.
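To make the A/B/C example concrete, here is a rough Python sketch of the two grouping ideas as described above. It is only a toy model of my understanding, not how QualityStage actually implements either match type; the function names `group_with_master` and `group_transitive` and the set-of-pairs representation of matches are made up for illustration.

```python
from itertools import combinations


def group_with_master(records, matches, master):
    """Independent-style grouping (toy model): keep only records
    that match the chosen master directly."""
    return {master} | {
        r for r in records
        if r != master and frozenset((master, r)) in matches
    }


def group_transitive(records, matches):
    """Transitive-style grouping (toy model): merge records linked by
    any chain of pairwise matches, using a small union-find."""
    parent = {r: r for r in records}

    def find(r):
        # Follow parent links to the root, compressing the path as we go.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    for a, b in combinations(records, 2):
        if frozenset((a, b)) in matches:
            parent[find(a)] = find(b)

    groups = {}
    for r in records:
        groups.setdefault(find(r), set()).add(r)
    return list(groups.values())


# The example from this post: A matches C, B matches C, A does not match B.
records = ["A", "B", "C"]
matches = {frozenset(("A", "C")), frozenset(("B", "C"))}

master_group = group_with_master(records, matches, "A")
transitive_groups = group_transitive(records, matches)

print(master_group)       # A and C only; B is left for another pass
print(transitive_groups)  # one group containing A, B, and C
```

With A as the master, B is left out because it never matches A directly. In the transitive version, C acts as the link that pulls A and B into the same group.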
I hope this helps!
Regards,
Robert