right tool(s) for data classification request

This forum contains ProfileStage posts and now focuses on newer versions of InfoSphere Information Analyzer.

Moderators: chulett, rschirm

qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

right tool(s) for data classification request

Post by qt_ky »

We are already licensed for IA and use it for traditional data profiling against relational databases. Recently some interesting questions have arisen.

A customer needs to crawl a large number of servers (web servers, file servers, database servers, application servers, etc.) to find where sensitive data resides (like PII), which is one of the features that IA advertises. With the wide variety of servers and file types, I assume they cannot be predefined as importable metadata.

This sounds a bit like what an antivirus product does except that it would try to classify the data.

Does IA have any file-crawling capabilities that could be used to find where PII data resides in this scenario?

Is there another tool or utility that could possibly be used to bridge the gap to help IA to find the sensitive data?

Or is there a more appropriate tool for the job?
Choose a job you love, and you will never have to work a day in your life. - Confucius
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

That is not so much a DataStage question as a Unix security scan question.

Not sure which forum that would be.

IA is not the tool to crawl your network, since it cannot dynamically create connections, schemas, etc.
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

It is an IA question right now because "data classification / find PII" is an IA feature that the customer is quite excited about.

Yes, it may be similar to a UNIX security scan function although I would wager most of the servers or virtual servers are Windows and some are likely to be Linux-flavored.
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

To my knowledge, IA doesn't have the ability to crawl across hosts looking for files that may or may not contain an SSN. That is his fundamental issue. Once he finds the file he can scan it for SSNs, but I believe the fact that he found it implies SSN detection of some sort.

And how is IA supposed to know the schema layout of the file?
UCDI
Premium Member
Posts: 383
Joined: Mon Mar 21, 2016 2:00 pm

Post by UCDI »

Text files might be doable, but in a binary file an SSN, for example, can be almost any grouping of 8 bytes anywhere on the disk. And that is for uncompressed files; compression or encryption would make those impossible as well.
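To illustrate the "text files might be doable" case, here is a minimal sketch in Python. It is not an IA feature and not a complete PII scanner. It assumes SSNs are written as plain text in the form 123-45-6789, and it simply skips any file that looks binary (a NUL byte near the start), which is exactly the gap described above. The names scan_tree, scan_file and looks_binary are made up for this example.

```python
import os
import re

# Assumption: SSNs appear as plain text in the form 123-45-6789.
# The lookarounds reject matches embedded in longer digit runs.
SSN_PATTERN = re.compile(r"(?<!\d)\d{3}-\d{2}-\d{4}(?!\d)")


def looks_binary(chunk):
    """Crude heuristic: a NUL byte in the sample suggests a binary file."""
    return b"\x00" in chunk


def scan_file(path, max_bytes=1_000_000):
    """Return SSN-like strings found in the first max_bytes of a text file."""
    try:
        with open(path, "rb") as fh:
            data = fh.read(max_bytes)
    except OSError:
        # Unreadable file (permissions, locks, etc.): nothing to report.
        return []
    if looks_binary(data[:4096]):
        # Binary, compressed or encrypted content is out of scope here.
        return []
    text = data.decode("utf-8", errors="ignore")
    return SSN_PATTERN.findall(text)


def scan_tree(root):
    """Walk a directory tree and map each file path to its count of hits."""
    hits = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            matches = scan_file(path)
            if matches:
                hits[path] = len(matches)
    return hits
```

Calling scan_tree on a mounted share or local directory returns a dictionary of file paths and match counts. A pattern match is only a candidate, so expect false positives (order numbers, part codes) and expect to miss anything stored unformatted, packed, compressed or encrypted.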
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

The Discovery tool's functionality is being incorporated into the Information Analyzer thin client, if indeed it hasn't been already (I haven't looked at FP2 yet).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.