Automatic Job Generator - does one exist??

A forum for discussing DataStage® basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

jdmiceli
Premium Member
Posts: 309
Joined: Wed Feb 22, 2006 10:03 am
Location: Urbandale, IA

Automatic Job Generator - does one exist??

Post by jdmiceli »

Hi all!

One of the features of Informatica's PowerMart (v5.x) was the rather simple ability to create jobs en masse to do a raw copy of one database to another. It was extremely limited, but it suited our needs fairly well because it didn't have to be very efficient at the time.

I'm looking for any similar utilities that might exist for DataStage. Basically, I want to be able to give it a source and a target, point the utility at my metadata, and have it build two jobs for each table: 1) extract the data to a sequential file, and 2) load the sequential file into the target. I just want it to build a generic .dsx file to import into my project.

If one doesn't exist, I can't think of a better way to get more into the inner workings of DataStage than to take on a challenge and go play! :shock:

If it doesn't exist, I guess I'm going to need to know whether there is a command-line utility for extracting and/or loading metadata from source databases into DataStage. I'm sure I'll think of other questions as I go along; I'll just post them as I run across them.

One of my primary concerns is to write it in a way that makes it useful for both Unix and Windows environments. I know I'm just running on AIX Unix, but I want it to work on Windows as well so that if anyone else can use it, I can make it available. I'll be writing it on my own time, so we'll see how long this takes. :roll:

As always, any input or existing solutions would be greatly appreciated,

Bestest!

John Miceli
System Specialist, MCP, MCDBA
Berkley Technology Services


"Good Morning. This is God. I will be handling all your problems today. I will not need your help. So have a great day!"
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

There is no such code generator, but if your metadata has been loaded into the repository, creating a job like the one you described should take no more than a minute. Drop the source stage type on the canvas, ditto for the target, fill out the source/target path and file information, load the metadata, and link the stages. Presto, job finished.

Just as with 4GL generators, the devil lies in the details, and it seldom goes that quickly, either with an automated generation program or with a simple DS job.

If you ensure that all OS objects are referred to using UNIX-style paths and/or are held as full paths in the job's parameters, your DS jobs are fully transferable between different UNIX flavors and Windows. If I remember correctly, the object code is now also auto-FNUXI'd, so even a recompile of object code is no longer necessary for server jobs.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

They do exist, and are called Intelligent Assistants. Essentially a wizard that asks questions about source and target, and builds a simple (or a slightly more complex) job design.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

I wrote one a few years ago. I think Craig has written one too. Mine takes a DSX containing 4 jobs and replaces the column metadata with the new columns. It basically assumes a straight table copy from one table to another, specifically a source table to a dimension table. The table names are handled as global find-and-replace.

Jobs:
1. Extract
2. BuildXrefHash
3. Load
4. Sequence to run above jobs

I believe this concept could be applied to any DSX, so you could build your own sample jobs and I could use the same approach to rip the DSX apart and replace the metadata. I believe ProfileStage does something similar. Mine does server jobs, but I think the concept would work on PX jobs as well. I forced the user to import the metadata ahead of time.
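
In rough outline, the find-and-replace idea looks something like the sketch below (Python just for illustration, not the actual tool; the placeholder tokens, file names and column block are all invented, and a real DSX export is more involved than a couple of tokens):

Code:

# Sketch only: stamp out a per-table DSX from a hand-built template export.
# The __SRC_TABLE__ / __TGT_TABLE__ / __COLUMN_METADATA__ tokens are
# assumptions for illustration - you would put them into your own template.
from pathlib import Path

def render_dsx(template_path, out_path, src_table, tgt_table, column_block):
    """Copy the template DSX with the table names and the column-metadata
    block swapped in (a global find-and-replace)."""
    text = Path(template_path).read_text()
    text = (text.replace("__SRC_TABLE__", src_table)
                .replace("__TGT_TABLE__", tgt_table)
                .replace("__COLUMN_METADATA__", column_block))
    Path(out_path).write_text(text)

if __name__ == "__main__":
    # Hypothetical column block; in practice built from imported metadata.
    cols = "CUST_ID Integer 10, CUST_NAME VarChar 50"
    render_dsx("template_4jobs.dsx", "CUSTOMER_copy.dsx",
               "SRC.CUSTOMER", "DW.D_CUSTOMER", cols)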

For a price I might generate your jobs.
Mamu Kim
Sreenivasulu
Premium Member
Posts: 892
Joined: Thu Oct 16, 2003 5:18 am

Re: Automatic Job Generator - does one exist??

Post by Sreenivasulu »

Hi,

There is a template feature as well, where you can store the standard base designs of your project as templates.

Regards
Sreeni
jdmiceli wrote: Hi all! [...]
120267
Participant
Posts: 30
Joined: Tue Jun 07, 2005 12:27 am

Post by 120267 »

Hi Sreeni,

Based on the requirement, we can write VB code which takes the template as input and generates the DSX as output. For this we have to provide configuration files listing the source tables, target tables, etc. Based on these, the code connects to the database, fetches the metadata and places it appropriately.

In our project we are using one template for SCD2. Initially we found it difficult to use the template: it took a minimum of half an hour to develop a job, and if there was any metadata change it was very difficult to modify the job. So we wrote the VB code to automate the process, and now we are able to generate the DSX for 100 jobs in a minute.
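
To give a flavour of the flow, here is a rough sketch in Python rather than the VB we use; the config file layout, DSN name and placeholder tokens are made up for illustration, and it assumes the pyodbc library is available:

Code:

# Sketch of the config-driven flow described above, in Python instead of VB.
# The config layout, DSN name and template tokens are assumptions, not
# anything DataStage itself defines.
import csv
import pyodbc
from pathlib import Path

def column_metadata(cursor, table):
    """Build a simple column list from the ODBC catalog for one table."""
    return ", ".join(f"{row.column_name} {row.type_name}({row.column_size})"
                     for row in cursor.columns(table=table))

def generate_all(config_csv, template, dsn):
    tmpl = Path(template).read_text()
    with pyodbc.connect(f"DSN={dsn}") as conn:
        cur = conn.cursor()
        # each config row: source_table,target_table
        for src, tgt in csv.reader(open(config_csv)):
            dsx = (tmpl.replace("__SRC_TABLE__", src)
                       .replace("__TGT_TABLE__", tgt)
                       .replace("__COLUMNS__", column_metadata(cur, src)))
            Path(f"{tgt}.dsx").write_text(dsx)

if __name__ == "__main__":
    generate_all("scd2_tables.csv", "scd2_template.dsx", "SOURCE_DB")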
With Love,

«·´`·.(*·.¸(`·.¸ ¸.·´)¸.·*).·´`·»
«.......>>>> Siva.G<<<<......»
«·´`·.(¸.·*(¸.·´ `·.¸)*·.¸).·´`·»
Sreenivasulu
Premium Member
Posts: 892
Joined: Thu Oct 16, 2003 5:18 am

Post by Sreenivasulu »

Hi,

Can you send sample code showing how you are using the VB code with the DataStage template?

Regards
Sreeni
120267 wrote: Hi Sreeni, [...]
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

In Designer, select New > Data Migration Job
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

I doubt anyone is going to share these types of tools. First of all, they are difficult to support. Second, having tricks like these is a competitive advantage. If you do want to share yours, I could post it on my web site until the upload feature is released on this site.

I would like to see this code too. I'm not sure I would want to share my jobs, because I do not want to answer questions about them.
Mamu Kim
Sreenivasulu
Premium Member
Posts: 892
Joined: Thu Oct 16, 2003 5:18 am

Post by Sreenivasulu »

Hi Duke,

Thanks for the input.

I understand the competitive constraints. I was only curious whether the suggested approach was possible or not.



Regards
Sreeni
kduke wrote: I doubt if anyone is going to share these types of tools. [...]
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

This functionality was apparently in ProfileStage - the ability to generate DataStage jobs based on target mappings. I never saw it in action. It is supposed to be in the IBM Information Server release via the Rational Data Architect product: in Data Architect you import the source data model, create or import the target data model, and then map source to target. You then generate either Federation Server queries or DataStage jobs from the mapping. The Federation Server generation is available now, but I don't know whether the DataStage generation will be available in the first IBM Information Server release or whether we will be waiting for a new version of Rational Data Architect.

Intelligent Assistants are in the Information Server release, and they may be better integrated with templates than in previous versions, but the Data Migration Assistant still looks like the only Intelligent Assistant provided with the release:
Creating Jobs Using Intelligent Assistants: WebSphere DataStage provides intelligent assistants which guide you through basic WebSphere DataStage tasks. Specifically they allow you to:
- Create a template from a server, parallel, or mainframe job. You can subsequently use this template to create new jobs. New jobs will be copies of the original job.
- Create a new job from a previously created template.
- Create a simple parallel data migration job. This extracts data from a source and writes it to a target.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

In Designer, select New > Data Migration Job - it's there and works in 7.5x2 at least.

ProfileStage allows you to generate a source to target mapping specification (including DataStage expressions) based on its results and/or imported metadata and, from that, can generate the DDL to create the target and a DataStage DSX file that can be imported. I have done this, and it does work. You can always tweak the performance of the generated jobs.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
jdmiceli
Premium Member
Posts: 309
Joined: Wed Feb 22, 2006 10:03 am
Location: Urbandale, IA

Question about Ray's response

Post by jdmiceli »

Hi Ray,

Thanks for getting back to me. When I go to New to start a new job, the only options I have are:
1. Job Sequence
2. Server Job
3. Server Shared Container
4. New Job from Template
5. New Template from Job

Am I just looking in the wrong place? Is this a new feature in something above Server 7.5.1a?

Thanks!
Bestest!

John Miceli
System Specialist, MCP, MCDBA
Berkley Technology Services


"Good Morning. This is God. I will be handling all your problems today. I will not need your help. So have a great day!"
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

It is only valid for Parallel jobs, from what I recall.
-craig

"You can never have too many knives" -- Logan Nine Fingers
jdmiceli
Premium Member
Posts: 309
Joined: Wed Feb 22, 2006 10:03 am
Location: Urbandale, IA

Post by jdmiceli »

That would explain it :?

The basic functionality I'm looking for is the ability to give the program a DSN, have it go get the user-defined table definitions, and build two jobs for each table (a rough sketch follows this list):
1. Source table to sequential file
2. Sequential file to target table
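
Roughly, the driver loop I'm picturing looks something like this (just a Python sketch to show the shape of it - I haven't written any of it yet; the DSN, template file names and placeholder tokens are made up, and it assumes something like pyodbc for reading the table list):

Code:

# Rough sketch of the driver loop: enumerate user tables from a DSN and
# stamp out two jobs per table (extract to sequential file, load from it).
# DSN, template file names and placeholder tokens are hypothetical.
import pyodbc
from pathlib import Path

def build_jobs(dsn, out_dir="generated"):
    templates = {"Extract": Path("extract_template.dsx").read_text(),
                 "Load": Path("load_template.dsx").read_text()}
    Path(out_dir).mkdir(exist_ok=True)
    with pyodbc.connect(f"DSN={dsn}") as conn:
        for row in conn.cursor().tables(tableType="TABLE"):
            table = row.table_name
            for kind, tmpl in templates.items():
                body = (tmpl.replace("__TABLE__", table)
                            .replace("__SEQFILE__", f"{table}.txt")
                            .replace("__JOBNAME__", f"{kind}{table}"))
                Path(out_dir, f"{kind}{table}.dsx").write_text(body)

if __name__ == "__main__":
    build_jobs("MIGRATION_SRC")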

To keep it simple at first, I'm planning on using the following assumptions:
1. Target db exists
2. Target tables exist with or without data
3. Source and target table structures match
4. Metadata will be the responsibility of the user to pull into DataStage
5. The program will build a .dsx file to be imported into DataStage
6. The initial defaults will be insert only

I'm using these assumptions because they fit the needs of one of my cow-workers (Dilbert alert :lol: ), who has a database migration coming up where this exact functionality is needed. The database he will be working with has over 1300 tables.

Though Arnd's post is absolutely correct that it would take less than a minute to build a single-table job going db to db, our business standard is to land the data so the processing can be split into multiple threads if needed for large/wide tables. Also, particular to his project, his data will be coming from different parts of the country (USA) plus Hong Kong, and going to the east coast to our NOC for processing and storage. Since going database-to-database would limit data flow to the slowest connection and bandwidth, we opt to land the data and process it all on the DataStage server against the SAN, which is also where the DataStage disks are.

As I'm looking at this I want to make sure I leave it extensible to add more functionality like Kim has in his, but for now I only need insert capability. If nothing else this would provide the bulk of what was needed with maybe a touch to each job to add a little functionality or derivation stuff if needed.

Either way it sounds like fun! (I know, I'm warped :roll: )

One question: is there a way to build a metadata file or format that could be directly inserted into the Universe database? I've already mapped all the datatypes (I think) and I'm just wondering if there is a way to automate the insertion of the metadata as well.

Thanks again!
Bestest!

John Miceli
System Specialist, MCP, MCDBA
Berkley Technology Services


"Good Morning. This is God. I will be handling all your problems today. I will not need your help. So have a great day!"
Post Reply