ftping an excel file

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
datastagedw
Participant
Posts: 53
Joined: Fri Mar 07, 2008 1:17 am

ftping an excel file

Post by datastagedw »

Hello All,

I have an excel file(.xls) lying on a remote m/c. i am trying to use FTP plug in stage to bring it to my server which is a linux box. Now this remote m/c is a windows based application. How shall i define the format and metadata in the ftp stage to be able to ftp the file from the remote. Shall i use tab as the column delimeter? and what abt the column definition. shall i give the columns in the file or shall i read it as a single column?

Please suggest

Thanks
ETL DEVELOPER
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

You cannot FTP a remote Excel file from inside DataStage and be able to parse the contents. Do you have an ODBC or other driver on your Linux server which can read excel files?
datastagedw
Participant
Posts: 53
Joined: Fri Mar 07, 2008 1:17 am

Post by datastagedw »

ArndW wrote:You cannot FTP a remote Excel file from inside DataStage and be able to parse the contents. Do you have an ODBC or other driver on your Linux server which can read excel files? ...
Thanks for the quick reply. Do you mean to say that with ODBC connector we can connect to a remote m/c without actually using an ftp stage? Because the first activity to happen is to bring the file from that server to our server.

Please clarify
ETL DEVELOPER
jgreve
Premium Member
Premium Member
Posts: 107
Joined: Mon Sep 25, 2006 4:25 pm

what kind of Excel data are you working with?

Post by jgreve »

datastagedw wrote: Thanks for the quick reply. Do you mean to say that with ODBC connector we can connect to a remote m/c without actually using an ftp stage? Because the first activity to happen is to bring the file from that server to our server.
Please clarify
remote m/c = "remote machine" ?


1) what kind of Excel data are you working with?

1a) Is your data in an XLS file? That is pretty much a binary format and is unreadable by DataStage directly - one way to handle that is what Andrew said about setting up your Excel file as an ODBC data source.
You might want to start here - this isn't something I have experience with, but Google is pretty helpful for looking up reference info like this: http://support.microsoft.com/default.as ... US;Q195951&

1b) Is your data in CSV format? This is something you could process in
data stage by reading from a sequential file.

2) r.e. metadata, I don't know of a metadata importer for excel; I suspect you'll need to declare the columns on your own with 1b - but the odbc driver should import like any other table defn with 1a.

3) What kind of processing are you doing?
a one-time load?
low frequency updates?
something else?

The FTP idea suggests you want a repeatable process; for example, pull Bob's SalesPipepline.xls file into the data warehouse for every month-end.
Post Reply