DSXchange: DataStage and IBM Websphere Data Integration Forum
This topic is not resolved, but there is a WORKAROUND.
SURA



Group memberships:
Premium Members

Joined: 14 Jul 2007
Posts: 1229
Location: Sydney
Points: 9004

Posted: Mon Apr 17, 2017 11:34 pm

DataStage® Release: 8x
Job Type: Parallel
OS: Windows
Hi there


I need some suggestions on how to read this file using DataStage. It is log file data that needs to be loaded into a table.

These are the columns I am after:

DATE
TIME
Event
Username
Application
Created_By
IP
Details
ClientId
Log_Type

See the sample records below from two different files; both contain the same event type. I thought of using space as the delimiter, but that is not working. I am now thinking of using a string match and then finding a way to load the data with the Field function.

Code:
2017-04-03 12:56:43  [http-bio-aaaa-exec-10] INFO a.b.c.d.AppUserManagementController -  Event=User_Created Username=ABCDl8 Application=XYZ Created_By=Admin IP=0:0:0:0:0:0:0:1 Details="Roles_Added=ZZZZZZZZ Office_Added=OrganizationProfile{orgId:12323112, orgName:ASDASDDS}" ClientId=useradmin Log_Type=audit

2017-04-05 11:37:29  [WebContainer : 0] INFO a.b.c.d.AppUserManagementController -  Event=User_Created Username=12345 Application=ABCD Created_By=Admin IP=0:0:0:0:0:0:0:1 Details="Roles_Added=SDFDS SDS SS Office_Added=OrganizationProfile{orgId:45455465, orgName:ASDASFDFDSF}" ClientId=SDFSFD Log_Type=audit


Is this the best way to load the data, or is there a better way?

Please shed some light.

Note: In the same file, I may end up getting 33 different events. In that case I may need to find 33 different ways to fetch the relevant data.
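
One way to avoid writing 33 separate extractions is to treat everything after the logger name as generic tag=value pairs. The following is a minimal Python sketch of that idea, outside DataStage. It assumes every event follows the layout of the two samples above: date, time, a bracketed thread name, level, logger, a dash, then pairs, with double quotes around values that contain spaces. The names PREFIX_RE, PAIR_RE and parse_line are illustrative only.

```python
import re

# Assumed layout of the untagged prefix, inferred from the two sample records:
# date, time, [thread], level, logger, " - ", then the tagged part.
PREFIX_RE = re.compile(
    r'^(?P<DATE>\d{4}-\d{2}-\d{2})\s+(?P<TIME>\d{2}:\d{2}:\d{2})\s+'
    r'\[[^\]]*\]\s+\w+\s+\S+\s+-\s+(?P<rest>.*)$'
)

# key="quoted value" or key=bare_value (assumed; not a documented format).
PAIR_RE = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def parse_line(line):
    """Return a dict of DATE, TIME and every tag found, or None if no match."""
    match = PREFIX_RE.match(line.strip())
    if match is None:
        return None
    record = {"DATE": match.group("DATE"), "TIME": match.group("TIME")}
    for key, quoted, bare in PAIR_RE.findall(match.group("rest")):
        # Exactly one of quoted/bare is non-empty for a given pair.
        record[key] = quoted or bare
    return record


SAMPLE = (
    '2017-04-05 11:37:29  [WebContainer : 0] INFO '
    'a.b.c.d.AppUserManagementController -  Event=User_Created '
    'Username=12345 Application=ABCD Created_By=Admin IP=0:0:0:0:0:0:0:1 '
    'Details="Roles_Added=SDFDS SDS SS Office_Added=OrganizationProfile'
    '{orgId:45455465, orgName:ASDASFDFDSF}" ClientId=SDFSFD Log_Type=audit'
)

if __name__ == "__main__":
    for name, value in parse_line(SAMPLE).items():
        print(name, "=>", value)
```

Tags that a given event does not carry are simply absent from the dictionary, so a single routine could cover every event type that keeps this layout.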

_________________
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.
ray.wurlod

Premium Poster
Participant

Group memberships:
Premium Members, Inner Circle, Australia Usergroup, Server to Parallel Transition Group

Joined: 23 Oct 2002
Posts: 53940
Location: Sydney, Australia
Points: 292679

Posted: Tue Apr 18, 2017 12:45 am

Have you tried using a Sequential File stage with space (or 0x20) specified as the field delimiter character? If so, what was the result?

_________________
RXP Services Ltd
Melbourne | Canberra | Sydney | Hong Kong | Hobart | Brisbane
currently hiring: Canberra, Sydney and Melbourne
chulett

Premium Poster


since January 2006

Group memberships:
Premium Members, Inner Circle, Server to Parallel Transition Group

Joined: 12 Nov 2002
Posts: 41970
Location: Denver, CO
Points: 215411

Posted: Tue Apr 18, 2017 6:50 am

SURA wrote:
I am thinking to use to string match and then find a way to load the data by using field function.

Or that string match followed by a substring. Once you find the position of an ...
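
The "string match followed by a substring" idea can be sketched in Python as a position lookup plus a slice. This is only an illustration of the technique, not a DataStage expression; extract_tag is a hypothetical helper, and the handling of quoted values is an assumption based on the Details column in the sample records.

```python
SAMPLE = (
    '2017-04-03 12:56:43  [http-bio-aaaa-exec-10] INFO '
    'a.b.c.d.AppUserManagementController -  Event=User_Created '
    'Username=ABCDl8 Application=XYZ Created_By=Admin IP=0:0:0:0:0:0:0:1 '
    'Details="Roles_Added=ZZZZZZZZ Office_Added=OrganizationProfile'
    '{orgId:12323112, orgName:ASDASDDS}" ClientId=useradmin Log_Type=audit'
)


def extract_tag(line, tag):
    """Find ' tag=' in the line and return the value that follows, or None."""
    # The leading space stops 'Id=' from matching inside 'ClientId='.
    marker = " " + tag + "="
    pos = line.find(marker)
    if pos == -1:
        return None
    start = pos + len(marker)
    if line[start:start + 1] == '"':
        # Quoted value: take everything up to the closing quote.
        end = line.find('"', start + 1)
        return line[start + 1:] if end == -1 else line[start + 1:end]
    # Bare value: take everything up to the next space (or end of line).
    end = line.find(" ", start)
    return line[start:] if end == -1 else line[start:end]


if __name__ == "__main__":
    for tag in ("Event", "Username", "Details", "Log_Type"):
        print(tag, "=>", extract_tag(SAMPLE, tag))
```

One caveat with this approach: a tag name that also appears inside a quoted value (Office_Added inside Details, for example) would be matched as well, so the lookup is only safe for the top-level tags.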

_________________
-craig

<this space for rent>
FranklinE



Group memberships:
Premium Members

Joined: 25 Nov 2008
Posts: 588
Location: Malvern, PA
Points: 5494

Posted: Tue Apr 18, 2017 7:43 am

Your samples indicate bad design for the extract source. There's no two ways about that.

I often harp on such things. I "grew up" in a COBOL environment, where delimiters are avoided and fixed-length fields make reading and writing an exact science. For this, I would point to how some columns are tagged and others are not. This is worse than inconsistent. It's lazy.

Unless you have variable-length columns in your data, I suggest fixed-length column definitions. It's the only way to be sure. The ideal "solution" is to push this back on whoever is responsible for the extract and make them create a format that is consistent. You really shouldn't have to customize your read at the column level.
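
In the posted samples, only the leading date and time appear to sit at fixed positions, so a fixed-length read could cover at least those two columns. A small Python sketch of that, assuming the offsets seen in the two sample records hold for every line (read_fixed_prefix is a hypothetical name):

```python
from datetime import datetime


def read_fixed_prefix(line):
    """Slice DATE and TIME by position and validate them as a timestamp."""
    date_text = line[0:10]   # e.g. 2017-04-03
    time_text = line[11:19]  # e.g. 12:56:43
    # strptime raises ValueError if the slices are not a valid date and time,
    # which flags lines that do not follow the assumed layout.
    stamp = datetime.strptime(date_text + " " + time_text, "%Y-%m-%d %H:%M:%S")
    return date_text, time_text, stamp


if __name__ == "__main__":
    line = "2017-04-03 12:56:43  [http-bio-aaaa-exec-10] INFO ..."
    print(read_fixed_prefix(line))
```

Everything after the timestamp is variable in length in the samples, so it would still need a different treatment.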

_________________
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: http://www.dsxchange.com/viewtopic.php?t=143596
chulett


Posted: Tue Apr 18, 2017 9:34 am

I've seen this before mining Apache logs and as noted, being a log file I doubt there's any wiggle room here for improvement. Can't hurt to ask, though. There used to be a Perl-based module i ...

SURA




Posted: Tue Apr 18, 2017 5:47 pm

ray.wurlod wrote:
Have you tried using a Sequential File stage with space (or 0x20) specified as the field delimiter character? If so, what was the result? ...


Ray

Yes, I tried that option. The issue I have here is that a single file contains 33 different events, and each event has a different number of columns.
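
Even the two sample records from the first post, which share one event type, do not split into the same number of fields on spaces, because the thread name and the quoted Details value contain spaces of their own. A short Python check illustrates this. Note that str.split() collapses runs of whitespace, which a strict single-space delimiter would not, so the counts in a Sequential File stage could differ from these.

```python
SAMPLE_1 = (
    '2017-04-03 12:56:43  [http-bio-aaaa-exec-10] INFO '
    'a.b.c.d.AppUserManagementController -  Event=User_Created '
    'Username=ABCDl8 Application=XYZ Created_By=Admin IP=0:0:0:0:0:0:0:1 '
    'Details="Roles_Added=ZZZZZZZZ Office_Added=OrganizationProfile'
    '{orgId:12323112, orgName:ASDASDDS}" ClientId=useradmin Log_Type=audit'
)
SAMPLE_2 = (
    '2017-04-05 11:37:29  [WebContainer : 0] INFO '
    'a.b.c.d.AppUserManagementController -  Event=User_Created '
    'Username=12345 Application=ABCD Created_By=Admin IP=0:0:0:0:0:0:0:1 '
    'Details="Roles_Added=SDFDS SDS SS Office_Added=OrganizationProfile'
    '{orgId:45455465, orgName:ASDASFDFDSF}" ClientId=SDFSFD Log_Type=audit'
)


def field_count(line):
    """Number of whitespace-separated tokens in the line."""
    return len(line.split())


if __name__ == "__main__":
    for sample in (SAMPLE_1, SAMPLE_2):
        print(field_count(sample), "fields")
```

Since the counts disagree within a single event type, a positional column layout built on the space delimiter cannot line up, regardless of how many event types the file holds.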

SURA




Posted: Tue Apr 18, 2017 5:48 pm

chulett wrote:
SURA wrote:
I am thinking to use to string match and then find a way to load the data by using field function.

Or that string match followed by a substring. Once you find the position of an ...


Yes, that is exactly what I did after I posted this query.

SURA




Posted: Tue Apr 18, 2017 5:57 pm

FranklinE wrote:
The ideal "solution" is to push this back on whoever is responsible for .


You are 100% right.

The issues are:

1. When they gave us the sample file, it wasn't that bad. Though the work was messy, it was still manageable.

2. When the scope was extended, a few new files came in, and that is when the trouble started. (While writing this thread, I saw further changes!)

3. We were not involved at the right time.

So at this stage, it is like catching a tiger's tail!

As I pointed out earlier, I am grepping the details based on the string and finishing the task that way.

Personally, I would not prefer this approach, but .....
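
The grep-style workaround could look roughly like the following Python sketch, which buckets lines by the value of their Event tag so that each event type can then be handled separately. The thread does not show the actual implementation; split_by_event is a hypothetical helper, and User_Deleted is an invented event name used only for illustration.

```python
from collections import defaultdict


def split_by_event(lines):
    """Group log lines by the value that follows 'Event='."""
    marker = "Event="
    buckets = defaultdict(list)
    for line in lines:
        pos = line.find(marker)
        if pos == -1:
            # Lines without an Event tag are kept aside rather than dropped.
            buckets["UNKNOWN"].append(line)
            continue
        # The event name runs from the marker up to the next space.
        event = line[pos + len(marker):].split(" ", 1)[0]
        buckets[event].append(line)
    return dict(buckets)


if __name__ == "__main__":
    lines = [
        "2017-04-03 12:56:43  [t1] INFO a.b.C -  Event=User_Created Username=ABCDl8",
        "2017-04-03 12:57:01  [t1] INFO a.b.C -  Event=User_Deleted Username=ABCDl8",
        "2017-04-03 12:58:10  [t2] INFO a.b.C -  startup complete",
    ]
    for event, members in split_by_event(lines).items():
        print(event, len(members))
```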

chulett


Posted: Tue Apr 18, 2017 7:12 pm

I don't see any point where you mentioned grep or that you "finished the task"... should we mark this as a "workaround"?

SURA




Posted: Tue Apr 18, 2017 9:15 pm

Yes, I am marking this thread as a workaround.

Thanks to all for your time and help.

All times are GMT - 6 Hours