Oracle GroupBy wrong output
Moderators: chulett, rschirm, roy
Oracle GroupBy wrong output
Hi.
I'm working with Datastage for years and I've never faced a problem like that.
I'm loading a dwh aggregatate table. The job is very simple
OCI (User Defied Sql) ---> seq file --->OCI (Truncate / Insert)
select field1, field2, field3, sum(field4), sum(field5)
from table
group by field1, field2, field3
The SQL launched by Datastage produces a wrong output for some group, extracting couple record for the same group by fields:
field1 field2 field3
a b c 10000 15000
a b c 20000 40000
The same query launched with SQL Developer has the right output:
field1 field2 field3
a b c 30000, 45000
Oracle db version is 10g (10.2.0.5.0).
I've read about problem related to Oracle Algorithm HASH GROUP BY.
Anyway the question is why SQL Developer gets the right result and OCI not ?
Many thanks for your help.
Andrea
I'm working with Datastage for years and I've never faced a problem like that.
I'm loading a dwh aggregatate table. The job is very simple
OCI (User Defied Sql) ---> seq file --->OCI (Truncate / Insert)
select field1, field2, field3, sum(field4), sum(field5)
from table
group by field1, field2, field3
The SQL launched by Datastage produces a wrong output for some group, extracting couple record for the same group by fields:
field1 field2 field3
a b c 10000 15000
a b c 20000 40000
The same query launched with SQL Developer has the right output:
field1 field2 field3
a b c 30000, 45000
Oracle db version is 10g (10.2.0.5.0).
I've read about problem related to Oracle Algorithm HASH GROUP BY.
Anyway the question is why SQL Developer gets the right result and OCI not ?
Many thanks for your help.
Andrea
What Oracle client version are you using? While your target DB might be 10gR2 there can be two other clients involved - the one on your ETL server (assuming the target instance is located elsewhere) and the one on your PC that you used with SQL Developer. What versions are they? They could all be different.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
Hi Craig
I've mentioned SQL Developer but it's not pertinent because according to more recent tests the problem seems related to the output dimension.
Moreover the source and target db are the same and the Oracle Client on Datastage Server is the same for years...so I don't think it's a client version problem....
In SQL Developer I've always launched filtered query reducing the output dimension in a way.
At the same way when I launch the query with Datastage OCI using filters on the values causing the problem, this doesn't occur.
But as soon as I launch the OCI query on the full output I get the error.
Maybe the problem could be related to Oracle temporary areas ...
I've recently added a second file to TEMP tablespace and maybe not in the rigth way.
This is the script:
I've also read a problem with HASH GROUP BY Oracle algorithm ...
Thanks,
Andrea
I've mentioned SQL Developer but it's not pertinent because according to more recent tests the problem seems related to the output dimension.
Moreover the source and target db are the same and the Oracle Client on Datastage Server is the same for years...so I don't think it's a client version problem....
In SQL Developer I've always launched filtered query reducing the output dimension in a way.
At the same way when I launch the query with Datastage OCI using filters on the values causing the problem, this doesn't occur.
But as soon as I launch the OCI query on the full output I get the error.
Maybe the problem could be related to Oracle temporary areas ...
I've recently added a second file to TEMP tablespace and maybe not in the rigth way.
This is the script:
When the OCI query is running Oracle fills first the second file TEMP02.DBF (5 GB added few days ago) and then the the orginal TEMP file TEM01.DBF (32 GB).-- 21/7/2014
alter TABLESPACE "TEMP"
add tempfile 'D:\ORACLE\BISE1\ORADATA\BISE1DB\TEMP02.DBF'
SIZE 4096M REUSE AUTOEXTEND
ON NEXT 1024M MAXSIZE 32767M
;
I've also read a problem with HASH GROUP BY Oracle algorithm ...
Thanks,
Andrea
I mentioned the client issue because I have seen it cause problems so I thought it was worth checking. You've said they haven't changed for years but that doesn't really answer the question as to the version number of each, up to you if you want to double-check on that. And can't really help with the DBA questions on the temp space but don't really think that would be an issue either.
Not really sure what this means. However, seems like you'll need to involve support of some kind for this. Probably DataStage at first and then perhaps Oracle as well seeing as how DS is just a client and really just uses OCI and SQLNet like any other tool... other than the binding of the output columns into the data flow, of course.ponzio wrote:according to more recent tests the problem seems related to the output dimension.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
You're right. I need to contact support.chulett wrote:However, seems like you'll need to involve support of some kind for this. Probably DataStage at first and then perhaps Oracle as well seeing as how DS is just a client and really just uses OCI and SQLNet like any other tool... other than the binding of the output columns into the data flow, of course.
But first of all I'll try to launch the full query with Sql Plus and with another etl tool to confirm it's not a client problem.
Thanks.
Andrea
Just a note to say that I've seen "client issues" with Oracle where everything works great for years and then one specific thing doesn't work properly - something similar to what you might be seeing. Oddly enough, there can be bugs in Oracle releases and you may find that there are known issues with the exact version you are on and may need to update to a higher maintenance release.
For whatever that is worth.
For whatever that is worth.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Hi Ray
Anyway, the problem is not in Datastage because it occurs also in SQL Developer , SQL*Plus and using other ETL tool.
Oracle gives a bigger numer of record than the correct one, in fact Aggregator Stage (after OCI or Connector) can reduce the number of rows from 4022708 to 3966088 rows (aggregating on the same query group by columns).
This could be a work around but it's not acseptable in my opinion.
I've requested Oracle support.
Many thanks
Andrea
sorry for the question but how can do it ?ray.wurlod wrote:Try forcing the Oracle SELECT stage to execute in sequential mode. ...
Anyway, the problem is not in Datastage because it occurs also in SQL Developer , SQL*Plus and using other ETL tool.
Oracle gives a bigger numer of record than the correct one, in fact Aggregator Stage (after OCI or Connector) can reduce the number of rows from 4022708 to 3966088 rows (aggregating on the same query group by columns).
This could be a work around but it's not acseptable in my opinion.
I've requested Oracle support.
Many thanks
Andrea