How to read the space delimited file??

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
SURA
Premium Member
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

How to read the space delimited file??

Post by SURA »

Hi there


I need some suggestions to read this file using datastage. This is a log file data which need to be loaded into a table.

These all are the columns i am after:

DATE
TIME
Event
Username
Application
Created_By
IP
Details
ClientId
Log_Type

see the below sample records from two different files, though it contains same event type. I thought to use the space delimiter, but it is not working. I am thinking to use to string match and then find a way to load the data by using field function.

Code: Select all

2017-04-03 12:56:43  [http-bio-aaaa-exec-10] INFO a.b.c.d.AppUserManagementController -  Event=User_Created Username=ABCDl8 Application=XYZ Created_By=Admin IP=0:0:0:0:0:0:0:1 Details="Roles_Added=ZZZZZZZZ Office_Added=OrganizationProfile{orgId:12323112, orgName:ASDASDDS}" ClientId=useradmin Log_Type=audit

2017-04-05 11:37:29  [WebContainer : 0] INFO a.b.c.d.AppUserManagementController -  Event=User_Created Username=12345 Application=ABCD Created_By=Admin IP=0:0:0:0:0:0:0:1 Details="Roles_Added=SDFDS SDS SS Office_Added=OrganizationProfile{orgId:45455465, orgName:ASDASFDFDSF}" ClientId=SDFSFD Log_Type=audit
Is this is the best way to load the data \ any other better way?

Please throw some light.

Notes: In the same file, i may end up getting 33 different event. In such case i may need to find 33 different ways to fetch the relevant data.
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Have you tried using a Sequential File stage with space (or 0x20) specified as the field delimiter character? If so, what was the result?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Re: How to read the space delimited file??

Post by chulett »

SURA wrote:I am thinking to use to string match and then find a way to load the data by using field function.
Or that string match followed by a substring. Once you find the position of an identifying string, like "Event=", seems to me you could find the next space after that position and then pull out the event name. Lather, rinse, repeat.
-craig

"You can never have too many knives" -- Logan Nine Fingers
FranklinE
Premium Member
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

Your samples indicate bad design for the extract source. There's no two ways about that.

I often harp on such things. I "grew up" in a Cobol environment, where delimiters are avoided and fixed-length fields make reading and writing an exact science. For this, I would point to how some columns are tagged and others are not. This is worse than inconsistent. It's lazy.

Unless you have variable length columns on your data, I suggest fixed-length column definitions. It's the only way to be sure. The ideal "solution" is to push this back on whoever is responsible for the extract and make them create a format that is consistent. You really shouldn't have to customize your read at the column level.
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I've seen this before mining Apache logs and as noted, being a log file I doubt there's any wiggle room here for improvement. Can't hurt to ask, though. :wink:

There used to be a Perl-based module in DataStage specifically for handling data of this nature. That might be another option to consider - Perl.
-craig

"You can never have too many knives" -- Logan Nine Fingers
SURA
Premium Member
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Post by SURA »

ray.wurlod wrote:Have you tried using a Sequential File stage with space (or 0x20) specified as the field delimiter character? If so, what was the result? ...
Ray

Yes i tried that option . The issue I have here is, a single file which contains 33 different events. Each events have different number of columns.
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.
SURA
Premium Member
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Re: How to read the space delimited file??

Post by SURA »

chulett wrote:
SURA wrote:I am thinking to use to string match and then find a way to load the data by using field function.
Or that string match followed by a substring. Once you find the position of an ...
Yes that exactly i did after i post this query.
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.
SURA
Premium Member
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Post by SURA »

FranklinE wrote:The ideal "solution" is to push this back on whoever is responsible for .
You are 100% right.

The issue are...

1. When they gave the sample file, it wasn't that bad. Though the work is messy , still it was manageable.

2. When the scope was extended, then few new files came, then we started to get the trouble. (At the time of writing this tread, i saw further changes!!!)

3. Not involved at the right time.

So at this stage, it like catching a tigers tail!!

As i pointed earlier , i used grep the details based on the STRING and finishing this task.

To me personally i wont prefer this way, but .....
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I don't see any point where you mentioned grep or that you "finished the task"... should we mark this as a "workaround"?
-craig

"You can never have too many knives" -- Logan Nine Fingers
SURA
Premium Member
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Post by SURA »

Yes , i am marking this thread as Workaround.

Thanks to all for your time and help.
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.
Post Reply