Datastage JDBC Connector Ingest Data to Apache Kudu

olgc
Participant
Posts: 145
Joined: Tue Nov 18, 2003 9:00 am

Post by olgc »

Hello everyone, has anyone ingested data into a Hadoop platform (Hive, Impala) or Apache Kudu? How did you resolve the performance issue? We use the DataStage JDBC Connector with the Cloudera Impala JDBC driver. It works well for extraction but not for loading: insert performance is terrible, with 20,000 records taking almost 8 minutes. Has anyone here tried ingesting data into a Hadoop platform or Apache Kudu? What is your experience?

Thanks,
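
To put the reported figure in perspective, here is a small back-of-the-envelope calculation. It uses only the numbers from the post (20,000 records, roughly 8 minutes); the function name is made up for illustration. A per-row cost in the tens of milliseconds would be consistent with each row being sent as its own statement, though that is an assumption and not something the post confirms.

```python
def insert_throughput(rows, seconds):
    """Return (rows per second, milliseconds per row) for a load run."""
    rows_per_second = rows / seconds
    ms_per_row = seconds / rows * 1000
    return rows_per_second, ms_per_row


# Figures from the post: 20,000 records in almost 8 minutes (treated as 480 s).
rate, latency = insert_throughput(20_000, 8 * 60)
print(f"{rate:.1f} rows/s, {latency:.1f} ms per row")
```

That works out to about 42 rows per second, or about 24 ms per row.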
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Hi. I don't have any experience with Kudu, but in past threads here and elsewhere over the years, I've heard many people talk about loading their Hive and Hive-related tables by writing directly to the HDFS files that those tables abstract. That is much faster, and it is usually done with the File Connector.

Ernie
Ernie Ostic

olgc
Participant
Posts: 145
Joined: Tue Nov 18, 2003 9:00 am

Post by olgc »

Thanks, eostic. Yes, we tried using the File Connector to upload the result file to the HDFS file system and then loading it into Kudu from there. It's a workaround, and it doesn't look that good.
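
For readers wondering what the "from HDFS to Kudu" step might look like, here is a minimal sketch. The thread does not say how that step was done; this assumes one common pattern in Impala, where an external text table is defined over the HDFS directory and a single INSERT ... SELECT copies the rows into the Kudu table. The helper name, table names, path, and columns are all hypothetical, and the Kudu table is assumed to exist already.

```python
def staging_load_sql(staging_table, kudu_table, hdfs_dir, columns, delimiter=","):
    """Build Impala SQL that exposes a delimited HDFS file and copies it into Kudu.

    columns is a list of (name, sql_type) pairs. Every name passed in here is a
    placeholder chosen by the caller, not something taken from the thread.
    """
    col_defs = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    col_names = ", ".join(name for name, _ in columns)
    # External text table over the directory the File Connector wrote to.
    create_stmt = (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {staging_table} ({col_defs}) "
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}' "
        f"STORED AS TEXTFILE LOCATION '{hdfs_dir}'"
    )
    # Make Impala notice files that were added outside of Impala.
    refresh_stmt = f"REFRESH {staging_table}"
    # One set-based statement instead of one INSERT per row.
    insert_stmt = (
        f"INSERT INTO {kudu_table} ({col_names}) "
        f"SELECT {col_names} FROM {staging_table}"
    )
    return [create_stmt, refresh_stmt, insert_stmt]


# Hypothetical usage: print the statements so they can be run in impala-shell.
for stmt in staging_load_sql(
    "stg_orders", "orders_kudu", "/data/staging/orders",
    [("id", "BIGINT"), ("amount", "DOUBLE")],
):
    print(stmt + ";")
```

The point of the pattern is that the row-by-row work moves out of the JDBC session: the job only writes a file, and the cluster does the copy in one statement.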