DSXchange: DataStage and IBM Websphere Data Integration Forum
sachinshankrath
Participant



Joined: 29 Mar 2010
Posts: 7
Location: WASHINGTON, DC
Points: 78

Posted: Fri Jun 15, 2018 12:56 pm

DataStage® Release: 11x
Job Type: Parallel
OS: Unix
Hi -

We are on DataStage v11.5.2 Parallel on a Unix platform and are trying to write to Hadoop using the Hive Connector, but we are not able to establish a connection. Have you done this before and been successful in writing with the Hive Connector? Please share your tips; at this stage, any tip helps!

- Sachin
chulett

Premium Poster


since January 2006

Group memberships:
Premium Members, Inner Circle, Server to Parallel Transition Group

Joined: 12 Nov 2002
Posts: 42984
Location: Denver, CO
Points: 221720

Posted: Fri Jun 15, 2018 2:41 pm


You posted this in three places. One works just fine. I deleted the other two.

_________________
-craig

Help I'm steppin' into the twilight zone, place is a madhouse, feels like being cold
My beacon's been moved under moon and star, where am I to go now that I've gone too far?
sachinshankrath
Participant



Joined: 29 Mar 2010
Posts: 7
Location: WASHINGTON, DC
Points: 78

Posted: Sat Jun 16, 2018 6:41 pm

Sorry, I could not figure out how to post my own topic at first. Thank you for deleting the other two.
asorrell
Site Admin

Group memberships:
Premium Members, DSXchange Team, Inner Circle, Server to Parallel Transition Group

Joined: 04 Apr 2003
Posts: 1694
Location: Colleyville, Texas
Points: 23058

Posted: Thu Jun 21, 2018 11:48 am

Two questions:

1) Does your site use Kerberos for security? If so, I might be able to help (I have only worked at Kerberos sites).

2) Have you or your admin set up the config file for Hive?

https://www.ibm.com/support/knowledgecenter/en/SSZJPZ_11.7.0/com.ibm.swg.im.iis.conn.hive.usage.doc/topics/hive_config_driver.html

The file should look like this (but with the correct installation location for your site), assuming you are using the Hive library that comes with BigIntegrate:

$ pwd
/InformationServer/Server/DSEngine
$ cat isjdbc.config
CLASSPATH=/InformationServer/ASBNode/lib/java/IShive.jar
CLASS_NAMES=com.ibm.isf.jdbc.hive.HiveDriver
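As a quick sanity check (the path below is just the example location from above; use your site's actual install path), confirm the jar named in CLASSPATH actually exists:

$ ls -l /InformationServer/ASBNode/lib/java/IShive.jar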

Answer those two questions first... Then I might be able to assist with getting the stage working (assuming Kerberos is used).

_________________
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2017
skathaitrooney
Participant



Joined: 06 Jan 2015
Posts: 101

Points: 904

Posted: Fri Jun 22, 2018 6:31 am

Andy, can I ask you a question: are you able to connect to Hive Server2 using the ISHive.jar that is shipped with IIS itself?

I also have a Kerberos setup. I get this error:
java.sql.SQLException: [IBM][Hive JDBC Driver]THRIFT protocol error.



This is the JDBC URL I use:

Code:
jdbc:ibm:hive://hivehostname:2181;DataBaseName=test;AuthenticationMethod=kerberos;ServicePrincipalName=datastage@XXX.NET;loginConfigName=JDBC_DRIVER_dsadm_keytab


Here is the JDBCDriverLogin.conf that I created in the same folder as ISHive.jar:

Code:
JDBC_DRIVER_dsadm_keytab {
com.ibm.security.auth.module.Krb5LoginModule required
credsType=both
principal="datastage@XXX.NET"
useKeytab="FILE:/etc/security/keytabs/datastage.hdfs.headless.keytab";
};
JDBC_DRIVER_cache{
com.ibm.security.auth.module.Krb5LoginModule required
credsType=initiator
principal="datastage@XXX.NET"
useCcache="FILE:/tmp/krb5cc_22367";
};




The above does not work with the Hive Connector for me, although the beeline client does work. I am running this as dsadm:

Code:

kinit -kt /etc/security/keytabs/datastage.hdfs.headless.keytab datastage
beeline --verbose=true -u "jdbc:hive2://hivehostname:2181/hive;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"


And I am absolutely clueless about what I am doing wrong here...
asorrell
Site Admin

Group memberships:
Premium Members, DSXchange Team, Inner Circle, Server to Parallel Transition Group

Joined: 04 Apr 2003
Posts: 1694
Location: Colleyville, Texas
Points: 23058

Posted: Fri Jun 22, 2018 12:00 pm

Yes, we connect to Hive Server2, but our URL looks slightly different:

jdbc:ibm:hive://hiveserver.company.com:10000;AuthenticationMethod=kerberos;ServicePrincipalName=hive/hiveserver.company.com@KERBEROS_DEFAULT_REALM;loginConfigName=JDBC_DRIVER_USERID

Look in /etc/krb5.conf under [libdefaults] for the Kerberos default realm. Also, the USERID appears to be case sensitive and should be upper case.
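For reference, the relevant section of /etc/krb5.conf looks something like this (the realm name is only a placeholder):

[libdefaults]
    default_realm = KERBEROS_DEFAULT_REALM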

I also recommend debugging in "non-YARN mode" because it makes things simpler. To do that, use an APT configuration file that only runs nodes on the edge node (no dynamic hosts). Also set APT_YARN_CONFIG to the empty string and APT_YARN_MODE to 0 (zero) in your job. That should make it run edge-node only.
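A rough sketch of an edge-node-only configuration file (the host name and resource paths are placeholders for your site):

{
    node "node1"
    {
        fastname "edgenode.company.com"
        pools ""
        resource disk "/opt/IBM/InformationServer/Server/Datasets" {pools ""}
        resource scratchdisk "/opt/IBM/InformationServer/Server/Scratch" {pools ""}
    }
}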

Hopefully that will get your connection up and running on the edge node. If that works, then there are probably other steps that have to be done to get the keytab dispersed to the data nodes for "YARN mode".

_________________
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2017
sachinshankrath
Participant



Joined: 29 Mar 2010
Posts: 7
Location: WASHINGTON, DC
Points: 78

Posted: Wed Jan 23, 2019 11:47 am

Hi -

Sorry for the six-month delay, but we abandoned this project and have only now restarted work on it. The update at this stage is that we were able to establish a connection using the Hive Connector, in the sense that the "Test" connection returns a success message. We then built a simple test job to read a few records from a flat file and write to a target table in Impala using the Hive Connector. The job does not return any error messages, but we find that it does not load any rows into the target table either. What could be going on? Again, we are on v11.5 parallel DataStage running on a Linux box.
ray.wurlod

Premium Poster
Participant

Group memberships:
Premium Members, Inner Circle, Australia Usergroup, Server to Parallel Transition Group

Joined: 23 Oct 2002
Posts: 54524
Location: Sydney, Australia
Points: 295662

Posted: Wed Jan 23, 2019 1:55 pm

Just to add to this thread: I, too, have been bitten by the case-sensitive user ID, both in this case and in certain connections to/from Enterprise Search.

_________________
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
pkhadapk
Participant



Joined: 28 Aug 2009
Posts: 1
Location: Mumbai
Points: 6

Posted: Wed Feb 27, 2019 6:54 am

I am using this format:

jdbc:ibm:hive://hiveserver.company.com:10000;AuthenticationMethod=kerberos;ServicePrincipalName=hive/hiveserver.company.com@KERBEROS_DEFAULT_REALM;loginConfigName=JDBC_DRIVER_USERID

in the JDBC Connector and am able to get through for a limited number of records using the default queue. Is there any way to set the queue in the above connection string?