Hive Connector Vs ODBC Connector

TNZL_BI
Premium Member
Posts: 24
Joined: Mon Aug 20, 2012 5:15 am
Location: NZ

Post by TNZL_BI »

Hi All,

I recently developed a job that connects to a Hive database in the Hadoop ecosystem. I have tried two methods of connecting to Hive:

1. ODBC Connector
2. Hive Connector

However, I am facing massive performance issues with the Hive Connector stage. It takes hours to load just 80k rows, whereas the ODBC Connector stage performs very well: we see the same load complete in around 5 minutes.

Does anyone have any ideas on this? Ideally the native connector stage should be faster and offer more options, but in my case its performance is really bad.

Any input here would be very helpful.
TNZL_BI
Premium Member
Posts: 24
Joined: Mon Aug 20, 2012 5:15 am
Location: NZ

Post by TNZL_BI »

I have just received some patches from IBM to install on my services/engine tier. They may improve the speed. I will install them and report back with my findings.
AnnDSX
Participant
Posts: 4
Joined: Mon Dec 04, 2017 2:12 am

Post by AnnDSX »

Hello,

Did you install the patches, and did you see any performance improvement?

Thanks
rkashyap
Premium Member
Posts: 532
Joined: Fri Dec 02, 2011 12:02 pm
Location: Richmond VA

Post by rkashyap »

The Hive Connector uses JDBC connectivity.

We use both the ODBC Connector and the Hive Connector to connect to Hive and have not seen much difference in performance between the two.
AnnDSX
Participant
Posts: 4
Joined: Mon Dec 04, 2017 2:12 am

Post by AnnDSX »

We are using the File Connector to move files to HDFS, and its performance is fair. However, the performance of the Hive Connector is dismal.

The best we could achieve was writing 1,000 records in 20 minutes.
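
To put the figures in this thread on a common footing, here is a rough throughput calculation. It is only a sketch: the helper name `rows_per_second` is made up for illustration, and the two numbers come from different sites and environments, so the ratio is indicative at best.

```python
def rows_per_second(rows: int, minutes: float) -> float:
    """Average throughput in rows per second (hypothetical helper)."""
    return rows / (minutes * 60)

# ODBC Connector figure from the first post: about 80k rows in about 5 minutes
odbc = rows_per_second(80_000, 5)

# Hive Connector figure from this post: 1,000 records in 20 minutes
hive = rows_per_second(1_000, 20)

print(f"ODBC Connector: {odbc:.1f} rows/s")
print(f"Hive Connector: {hive:.2f} rows/s")
print(f"Ratio: {odbc / hive:.0f}x")
```

That works out to under one row per second through the Hive Connector, against a few hundred rows per second reported for ODBC.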