DSXchange: DataStage and IBM Websphere Data Integration Forum
View next topic
View previous topic
Add To Favorites
Author Message
satheesh_color
Participant



Joined: 16 Jun 2005
Posts: 146

Points: 1542

Post Posted: Wed Jan 18, 2017 3:47 am Reply with quote    Back to top    

DataStage® Release: 11x
Job Type: Parallel
OS: Unix
Hi All,

We are using JDBC Connector to connect to the remote Database using oracle thin client using Type 4 connection.

When we try to extract the data from the Database....the fetch or rows/sec is very low(800-1000 rows/sec).

Properties:
FetchSize:2000
HeapSize:256

Kindly let me know your thoughts to increase the performance of the data extraction.

Thanks,
Satheesh.R
chulett

Premium Poster


since January 2006

Group memberships:
Premium Members, Inner Circle, Server to Parallel Transition Group

Joined: 12 Nov 2002
Posts: 41779
Location: Denver, CO
Points: 214289

Post Posted: Wed Jan 18, 2017 7:49 am Reply with quote    Back to top    

You need to determine the bottleneck. Rather than a magic setting in the connector it could be the source database, complexity of the source SQL, network issues or just how long it takes to write to the target. Eliminate each one by one.

_________________
-craig

The black queen chants the funeral march, the cracked brass bells will ring
To summon back the fire witch to the court of the crimson king
Rate this response:  
PaulVL



Group memberships:
Premium Members

Joined: 17 Dec 2010
Posts: 1039

Points: 6935

Post Posted: Wed Jan 18, 2017 2:42 pm Reply with quote    Back to top    

Also, are you doing a partitioned read?

Rows per second is misleading if you do not indicate what your row width is. 800 rows per second when you have 100 bytes per row is terrible. But if you have 100K bytes per row then 800 is pretty darn good.
Rate this response:  
satheesh_color
Participant



Joined: 16 Jun 2005
Posts: 146

Points: 1542

Post Posted: Thu Feb 16, 2017 5:18 am Reply with quote    Back to top    

Hi paul,

Can we use partition read for non-partitioned tables.When I try to read data on partitioned read, the count shows a huge variation and it fetches much faster 25k rows/sec. But..i am really worried about the huge count mismatch.



Kindly let me know your thoughts on the same.


Regards,
Satheesh.R
Rate this response:  
leandrohmvieira
Participant



Joined: 02 Sep 2015
Posts: 36
Location: Brasilia, Brazil
Points: 371

Post Posted: Thu Feb 16, 2017 6:18 am Reply with quote    Back to top    

Hello Satheesh, you can use partition reads on non-partitioned tables for sure, there are partions methods like round-robin which divides the table evenly.

You just need to be aware of your ETL design, some stages need specific partition types to work, so if you have a complex ETL design, you data may need be repartitioned, leading to new bottlenecks.

But if its a simple source->target design, it will be fine. Just make a validation/count on target based on some key after the operation to check.

_________________
Leandro Vieira

Kyros Tecnologia- Uberlândia, Brazil
Rate this response:  
satheesh_color
Participant



Joined: 16 Jun 2005
Posts: 146

Points: 1542

Post Posted: Thu Feb 16, 2017 7:46 pm Reply with quote    Back to top    

Hi,

We were using simple extraction jobs from JDBC to a Dataset(Using Hash Partitioning) then it will be used in another jobs for join/lookup. Still i could see the count mismatch(huge difference) when i try to run the job with Partitioned read=YES.


Appreciate for your help!!!



Regards,
Satheesh.R
Rate this response:  
Not yet rated
PaulVL



Group memberships:
Premium Members

Joined: 17 Dec 2010
Posts: 1039

Points: 6935

Post Posted: Fri Feb 17, 2017 8:22 am Reply with quote    Back to top    

What is the count mismatch you are talking about? Are you talking about the end result in the quantity of rows obtained? Or simply the rate at which you are extracting from Oracle?
Rate this response:  
Not yet rated
satheesh_color
Participant



Joined: 16 Jun 2005
Posts: 146

Points: 1542

Post Posted: Sat Feb 18, 2017 12:07 am Reply with quote    Back to top    

Hi Paul,

Count mismatch is i could see the huge difference in extracting as well as Dataset rows.

Eg: My source table has the count of 1000, when i try to run with 4 nodes with partition read enabled...the extraction count exceeds more than 5000 rows.


Regards,
Satheesh.R
Rate this response:  
Not yet rated
Mike



Group memberships:
Premium Members

Joined: 03 Mar 2002
Posts: 970
Location: Austin, TX
Points: 6243

Post Posted: Sat Feb 18, 2017 9:17 am Reply with quote    Back to top    

Check with your support provider to determine if there are any known issues with partitioned read in the JDBC Connector.

Add $CC_MSG_LVL to your job parameters and set it to 1, 2, or 3 to enable some connector tracing to be recorded in the job log. I usually start with a setting of 2. You may have to add it as a user defined project environment variable if you have not done so previously (leave the default empty so as not to generate excessive logging across your entire project).

In the job log you will see the actual query that was passed to the database. A partitioned read usually injects some additional logic in the WHERE clause. If you have anything more complex than a single table SELECT, the partitioned read may not work properly. You can determine that by evaluating the generated query.

Mike
Rate this response:  
Not yet rated
PaulVL



Group memberships:
Premium Members

Joined: 17 Dec 2010
Posts: 1039

Points: 6935

Post Posted: Mon Feb 20, 2017 9:33 am Reply with quote    Back to top    

Start to debug in a phased approach.

chop your job up into pieces.

Oracle Connector -> dataset
Oracle Connector -> sequential file

With 1000 rows in your source table, I would have expected maybe 4000 rows on your output if your partitioning was entire, but ... 5000ish?!? that is odd.

When faced with an issue, I like to break it down into it's smallest components. That way you remove "noise" from the equation. If it's your Oracle read that you suspect is wrong, then just do a read in your job, no other transforms, etc...
Rate this response:  
Not yet rated
Display posts from previous:       

Add To Favorites
View next topic
View previous topic
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum



Powered by phpBB © 2001, 2002 phpBB Group
Theme & Graphics by Daz :: Portal by Smartor
All times are GMT - 6 Hours