Can we enable RCP in a Data Warehouse environment or not?

Posted: Mon Dec 22, 2014 10:32 am
by dgokulakrishnan
Dear All -

It's a general question.

We are developing a data warehouse environment using DataStage, Linux, and Netezza.

Most of the time our sources will be views, and the data will be mapped into the target tables without transformations.

There are a few transformations, but we handle them in the Netezza views themselves.

So we plan to create a common DataStage job and pass the following as parameters:

1) Source View Name
2) Source DB Name
3) Target Table Name
4) Target DB Name

We plan to use the common job to load around 50 tables.
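
As a rough illustration of the plan above, the sketch below builds one `dsjob -run` command line per source view / target table pair. The parameter names (`SourceDB`, `SourceView`, `TargetDB`, `TargetTable`), the project name, and the job name are assumptions made up for the example; they would have to match whatever is actually defined in the generic job. The code only builds the command and does not run it.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TableLoad:
    """One source view -> target table mapping (the four job parameters)."""
    source_db: str
    source_view: str
    target_db: str
    target_table: str


def build_dsjob_command(project: str, job: str, load: TableLoad) -> List[str]:
    """Build the dsjob argument list for a single load of the generic job."""
    # Hypothetical parameter names; they must match the parameters
    # defined in the generic DataStage job.
    params = {
        "SourceDB": load.source_db,
        "SourceView": load.source_view,
        "TargetDB": load.target_db,
        "TargetTable": load.target_table,
    }
    # -jobstatus makes dsjob wait for the job and report its finishing status.
    cmd = ["dsjob", "-run", "-jobstatus"]
    for name, value in params.items():
        cmd += ["-param", f"{name}={value}"]
    cmd += [project, job]
    return cmd


if __name__ == "__main__":
    # Example mapping; in practice the ~50 entries would come from a control
    # table or file.
    loads = [TableLoad("SRC_DB", "V_CUSTOMER", "TGT_DB", "CUSTOMER")]
    for load in loads:
        print(" ".join(build_dsjob_command("MyProject", "GenericLoad", load)))
```

Keeping the list of mappings outside the job means a new table can be added without touching the job design, which is the main attraction of the RCP approach.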

I just wanted to know whether it is good practice to enable RCP (runtime column propagation) in a data warehouse environment where the job is scheduled to run daily.

Please share your valuable thoughts.

Posted: Mon Dec 22, 2014 10:39 am
by chulett
Your "without much transformation" statement is worrisome. If you had said "without any transformations" then I would agree that RCP would be the way to go.

Posted: Mon Dec 22, 2014 10:48 am
by dgokulakrishnan
Thanks, Craig. I have modified the original post.

Posted: Tue Dec 23, 2014 2:14 pm
by cppwiz
DataStage is for building transformations. If you are only using it as a table copy tool, then it will work but you are using the wrong tool for the task. RCP will add some minor overhead, but it shouldn't be noticeable.

Can you pound in a nail with a screwdriver? Yes, but I recommend a hammer.

Can you get a screw into a piece of wood with a hammer? Yes, but I recommend a screwdriver.

Can you use DataStage as a copy/replication tool? Yes, but I recommend a bulk copy utility (nzload) or InfoSphere CDC.

Posted: Wed Dec 24, 2014 4:47 am
by priyadarshikunal
Yes, you can use RCP if the transformations are already taken care of in the Netezza views.

On another note, I do like cppwiz's comment.

Posted: Wed Dec 24, 2014 7:07 am
by qt_ky
Speaking of using the correct tool for the data movement job, have you all tried InfoSphere Data Click?

It provides users with self-service data provisioning through a simple web-based interface. The admin has to set up what the user is allowed to move, including limits, if desired.

Data Click generates and executes the DataStage jobs. Whether or not such jobs use RCP, you may choose to investigate (or not)...

Data Click is included in the newer versions of Information Server 9.x and 11.x, although I am not sure which edition(s). Maybe you will end up recommending it!

In a Data Warehouse environment, you could set up Data Click to allow a power user to provision their own data marts on demand, as subsets of the data warehouse.

Posted: Wed Dec 24, 2014 8:25 am
by chulett
And just to be anal, being in a "Data Warehouse environment" really isn't a consideration here. Regardless of environment, if the restrictions that RCP puts on your processing don't rule it out, you can certainly use it for data movement.

Posted: Wed Dec 24, 2014 8:46 am
by rschirm
From what you are describing, RCP will work just fine. Performance will not be an issue, as there is no additional overhead unless your target does not have all of the same column names. The only impact would be if you are using Metadata Workbench, as RCP is not seen in the lineage.
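
Since the point above hinges on the source and target having the same column names, a small pre-flight check can flag mismatches before a load runs. This is only a sketch: the function name is made up, the column lists are assumed to have been fetched elsewhere (for example from the database catalog), and the case-insensitive comparison is an assumption that may not suit quoted identifiers.

```python
from typing import Iterable, List


def columns_missing_in_target(source_columns: Iterable[str],
                              target_columns: Iterable[str]) -> List[str]:
    """Return source column names that have no same-named target column."""
    # Assumption: names are compared case-insensitively.
    target = {name.upper() for name in target_columns}
    return [name for name in source_columns if name.upper() not in target]


if __name__ == "__main__":
    missing = columns_missing_in_target(
        ["ID", "NAME", "LOAD_TS"],  # columns of the source view
        ["ID", "NAME"],             # columns of the target table
    )
    if missing:
        print("Columns not present in target:", ", ".join(missing))
```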

Posted: Thu Dec 25, 2014 10:27 am
by eostic
Unless you are using 11.3, where RCP is fully supported in lineage.

DataStage with RCP is great for building a single job that can perform a vast amount of table-to-table moves...

...and yes, Data Click makes extensive use of RCP.

Ernie

Posted: Mon Jan 26, 2015 7:52 am
by trenicar
eostic wrote: Unless you are using 11.3, where RCP is fully supported in lineage.
We are using 11.3.1 and I have tested this, but cannot see the columns being populated. Can you explain what we can expect from this, and is there anything different we have to do?

Posted: Wed Jan 28, 2015 1:44 am
by trenicar
We have subsequently found that there is an IGC Rollup patch no. 6 that fixes this problem, maybe not perfectly, but better.