SURA

Group memberships: Premium Members
Joined: 14 Jul 2007
Posts: 1229
Location: Sydney
Points: 9005
DataStage® Release: 8x
Job Type: Parallel
OS: Windows
Hi there
I need some suggestions for reading this file using DataStage. It is log file data that needs to be loaded into a table.
These are the columns I am after:
DATE
TIME
Event
Username
Application
Created_By
IP
Details
ClientId
Log_Type
See the sample records below, taken from two different files, though they contain the same event type. I thought to use a space delimiter, but it is not working. I am thinking of using a string match and then finding a way to load the data using the Field function.
Code:
2017-04-03 12:56:43 [http-bio-aaaa-exec-10] INFO a.b.c.d.AppUserManagementController - Event=User_Created Username=ABCDl8 Application=XYZ Created_By=Admin IP=0:0:0:0:0:0:0:1 Details="Roles_Added=ZZZZZZZZ Office_Added=OrganizationProfile{orgId:12323112, orgName:ASDASDDS}" ClientId=useradmin Log_Type=audit
2017-04-05 11:37:29 [WebContainer : 0] INFO a.b.c.d.AppUserManagementController - Event=User_Created Username=12345 Application=ABCD Created_By=Admin IP=0:0:0:0:0:0:0:1 Details="Roles_Added=SDFDS SDS SS Office_Added=OrganizationProfile{orgId:45455465, orgName:ASDASFDFDSF}" ClientId=SDFSFD Log_Type=audit
Is this the best way to load the data, or is there a better way?
Please throw some light.
Note: in the same file I may end up getting 33 different events. In that case I may need to find 33 different ways to fetch the relevant data.
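To make the string-match idea concrete, here is a minimal sketch in Python (just for illustration, not DataStage itself; the function name `parse_audit_line` is mine). It assumes every value after the dash is either space-free or double-quoted, as in the samples above:

```python
import re

def parse_audit_line(line):
    """Split one audit log line into a dict of the target columns."""
    # The first two space-separated tokens are the date and time.
    date, time_, rest = line.split(' ', 2)
    record = {'DATE': date, 'TIME': time_}
    # Every remaining column is a Key=Value pair; a double-quoted value
    # (like Details="...") may contain spaces and nested Key=Value text,
    # so the quoted alternative must be tried first.
    for key, value in re.findall(r'(\w+)=("[^"]*"|\S+)', rest):
        record[key] = value.strip('"')
    return record

sample = ('2017-04-03 12:56:43 [http-bio-aaaa-exec-10] INFO '
          'a.b.c.d.AppUserManagementController - Event=User_Created '
          'Username=ABCDl8 Application=XYZ Created_By=Admin IP=0:0:0:0:0:0:0:1 '
          'Details="Roles_Added=ZZZZZZZZ Office_Added=OrganizationProfile'
          '{orgId:12323112, orgName:ASDASDDS}" ClientId=useradmin Log_Type=audit')

rec = parse_audit_line(sample)
# rec['Event'] is 'User_Created', rec['Username'] is 'ABCDl8'
```

Because the result is a dict keyed by tag name, the same parse works even when different events carry different tags.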
_________________ Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.
ray.wurlod
Participant
Group memberships: Premium Members, Inner Circle, Australia Usergroup, Server to Parallel Transition Group
Joined: 23 Oct 2002
Posts: 54166
Location: Sydney, Australia
Points: 293772
Have you tried using a Sequential File stage with space (or 0x20) specified as the field delimiter character? If so, what was the result?
_________________ RXP Services Ltd
Melbourne | Canberra | Sydney | Hong Kong | Hobart | Brisbane
currently hiring: Canberra, Sydney and Melbourne
chulett
 since January 2006
Group memberships: Premium Members, Inner Circle, Server to Parallel Transition Group
Joined: 12 Nov 2002
Posts: 42479
Location: Denver, CO
Points: 218511
SURA wrote:
I am thinking to use a string match and then find a way to load the data by using the Field function.
Or that string match followed by a substring. Once you find the position of an ...
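The find-the-position-then-substring approach sketched here, again in Python for illustration (the helper name `tag_value` is mine); note that a quoted value containing spaces, like Details="...", would be cut short by this simple version:

```python
def tag_value(line, tag):
    """Find 'tag=' in the line and return everything up to the next space.

    Mirrors the find-position-then-substring idea. A quoted value with
    embedded spaces (e.g. Details="...") would be truncated at the
    first space, so it suits the simple single-token tags.
    """
    pos = line.find(tag + '=')
    if pos == -1:
        return ''
    start = pos + len(tag) + 1
    end = line.find(' ', start)
    return line[start:] if end == -1 else line[start:end]

# Fragment of the first sample record:
sample = ('Event=User_Created Username=ABCDl8 Application=XYZ '
          'Created_By=Admin ClientId=useradmin Log_Type=audit')

value = tag_value(sample, 'Username')  # → 'ABCDl8'
```

A missing tag simply yields an empty string, which maps naturally onto a nullable column.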
_________________ -craig
I know I don't say this enough, but I like when you talk to me. It's much better than when nobody talks to me. Or when people that I don't like will not stop talking to me.
FranklinE

Group memberships: Premium Members
Joined: 25 Nov 2008
Posts: 662
Location: Malvern, PA
Points: 6287
Your samples indicate bad design for the extract source. There's no two ways about that.
I often harp on such things. I "grew up" in a COBOL environment, where delimiters are avoided and fixed-length fields make reading and writing an exact science. Here, I would point to how some columns are tagged and others are not. That is worse than inconsistent; it's lazy.
Unless you have variable-length columns in your data, I suggest fixed-length column definitions. It's the only way to be sure. The ideal "solution" is to push this back on whoever is responsible for the extract and make them create a consistent format. You really shouldn't have to customize your read at the column level.
_________________ Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson
Using mainframe data FAQ: http://www.dsxchange.com/viewtopic.php?t=143596 Using CFF FAQ: http://www.dsxchange.com/viewtopic.php?t=157872
chulett
I've seen this before mining Apache logs and, as noted, this being a log file I doubt there's any wiggle room here for improvement. Can't hurt to ask, though.
There used to be a Perl-based module i ...
SURA

ray.wurlod wrote:
Have you tried using a Sequential File stage with space (or 0x20) specified as the field delimiter character? If so, what was the result? ...
Ray
Yes, I tried that option. The issue I have here is that a single file contains 33 different events, and each event has a different number of columns.
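One way around the 33-different-layouts problem, sketched here in Python rather than DataStage (the helper name `event_of` is mine): pull out only the Event tag first and route each line into a per-event bucket, then apply the column layout appropriate to each event. In DataStage terms this would correspond to reading each line as a single long VarChar and splitting downstream, rather than forcing one delimited layout over the whole file.

```python
import re
from collections import defaultdict

def event_of(line):
    """Extract the Event=... tag; lines without one fall into 'UNKNOWN'."""
    m = re.search(r'\bEvent=(\S+)', line)
    return m.group(1) if m else 'UNKNOWN'

# Fragments of the two sample records:
lines = [
    'Event=User_Created Username=ABCDl8 Application=XYZ Log_Type=audit',
    'Event=User_Created Username=12345 Application=ABCD Log_Type=audit',
]

buckets = defaultdict(list)
for line in lines:
    buckets[event_of(line)].append(line)
# Each bucket can now be parsed with the column layout for that event type.
```

Only the routing key is parsed up front, so adding a 34th event type means adding one more bucket handler, not reworking the read.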
SURA

chulett wrote:
SURA wrote:
I am thinking to use a string match and then find a way to load the data by using the Field function.
Or that string match followed by a substring. Once you find the position of an ...
Yes, that's exactly what I did after I posted this query.
SURA

FranklinE wrote:
The ideal "solution" is to push this back on whoever is responsible for ...
You are 100% right.
The issues are:
1. When they gave the sample file, it wasn't that bad. Though the work was messy, it was still manageable.
2. When the scope was extended, a few new files came, and then we started to get into trouble. (At the time of writing this thread, I saw further changes!!!)
3. We were not involved at the right time.
So at this stage, it's like catching a tiger's tail!!
As I pointed out earlier, I grepped the details based on the STRING and am finishing this task.
Personally I would not prefer this approach, but .....
chulett
I don't see any point where you mentioned grep or that you "finished the task"... should we mark this as a "workaround"?
SURA

Yes, I am marking this thread as a Workaround.
Thanks to all for your time and help.