Failed to create FreqDist Summary

This forum contains ProfileStage posts and now focuses on newer versions of InfoSphere Information Analyzer.

Moderators: chulett, rschirm

chesterkchu
Participant
Posts: 35
Joined: Mon Nov 15, 2004 9:46 am

Failed to create FreqDist Summary

Post by chesterkchu »

I am trying to run column analysis in Information Analyzer on one column, but I am getting the error "Failed to create FreqDist Summary for Error in post-process for DIM_TEMP". I have attached the SystemOut.log and SystemErr.log messages. Any ideas on what is happening? Thanks!

SystemErr.log:
"[3/27/12 15:47:32:633 PDT] 00000041 SystemErr R com.ascential.investigate.exception.CreateFrequencyDistributionException: Failed to create FreqDist Summary for Error in post-process for DIM_TEMP
[3/27/12 15:47:32:633 PDT] 00000041 SystemErr R at com.ascential.investigate.ca.job.BaseProfileJob.doPostProcess(BaseProfileJob.java:834)
[3/27/12 15:47:32:633 PDT] 00000041 SystemErr R at com.ascential.investigate.utils.jobs.JobProcessor.execute(JobProcessor.java:156)
[3/27/12 15:47:32:633 PDT] 00000041 SystemErr R at com.ascential.investigate.auth.engine.Task.execute(Task.java:93)
[3/27/12 15:47:32:633 PDT] 00000041 SystemErr R at com.ascential.investigate.auth.engine.Worker.run(Worker.java:84)
[3/27/12 15:47:32:633 PDT] 00000041 SystemErr R at java.lang.Thread.run(Thread.java:736)"


SystemOut.log:
[3/27/12 15:47:03:321 PDT] 0000002f SystemOut O Schedule: id = d70c6594.80cb2b5c.qji3clin9.a07u9rc.1dt0mu.kbp1h5sablkbt1vh55; get_xmeta_created_by_user()= testuser ; getCreator=testuser --- isCreatorSystemUser=false
stuartjvnorton
Participant
Posts: 527
Joined: Thu Apr 19, 2007 1:25 am
Location: Melbourne

Post by stuartjvnorton »

Is this the first time you have attempted to run column analysis?

If you've run it successfully before, check your tablespace for IADB. If you have separate tablespaces for data and indexes, best to check both. They can get pretty big if you do a lot of profiling.

If it is the first time, it could be a few things. To start with, make sure you have a DB user that can connect to IADB and that it has been set up and tested in the project properties.
chesterkchu
Participant
Posts: 35
Joined: Mon Nov 15, 2004 9:46 am

Post by chesterkchu »

This is the first time attempting to run column analysis.

In the project properties, the Analysis Engine tab validated successfully, and on the Analysis Database tab the Engine and Client connections both validated successfully.
stuartjvnorton
Participant
Posts: 527
Joined: Thu Apr 19, 2007 1:25 am
Location: Melbourne

Post by stuartjvnorton »

It might still be worth checking the DB manually with a standard DB tool, to make sure the assigned user can do more than just connect (create tables, insert rows and so on), in case the connection test only tests the ability to connect.
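
A minimal sketch of such a manual check, assuming a Python DB-API connection to the analysis database is available. The in-memory SQLite connection is only a stand-in so the example runs anywhere; the function name and the table name IA_PRIV_CHECK are hypothetical, and the SQL may need adjusting for the actual IADB platform.

```python
import sqlite3


def check_privileges(conn, table="IA_PRIV_CHECK"):
    """Try CREATE, INSERT, SELECT and DROP on a scratch table.

    Returns a dict mapping each step to True if it worked,
    or to the error text if it failed.
    The table name is a hypothetical scratch name, not an IA object.
    """
    steps = [
        ("create", f"CREATE TABLE {table} (id INTEGER)"),
        ("insert", f"INSERT INTO {table} (id) VALUES (1)"),
        ("select", f"SELECT COUNT(*) FROM {table}"),
        ("drop", f"DROP TABLE {table}"),
    ]
    results = {}
    cur = conn.cursor()
    for name, sql in steps:
        try:
            cur.execute(sql)
            if name == "select":
                cur.fetchall()  # consume the result before the next statement
            results[name] = True
        except Exception as exc:  # keep going so every step is reported
            results[name] = str(exc)
    conn.commit()
    return results


# Stand-in connection; replace with a connection made as the IADB user.
print(check_privileges(sqlite3.connect(":memory:")))
```

Any step that reports an error text instead of True points at a privilege the connection test did not cover.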

Did you also check the size of the IA DB tablespace[s]?
I once hit this straight off the bat: the tablespace had some ridiculous size limit that was exceeded on our first table.

Are there any native database errors in the log?
chesterkchu
Participant
Posts: 35
Joined: Mon Nov 15, 2004 9:46 am

Post by chesterkchu »

Hi Stuart,

I did check the user and it does have the correct privileges on the DB (admin rights).

I did not check the tablespace size, but I am only running analysis of one column on a table that has only 10 records in it.

There were no native database errors in the log.

Thanks for your help!
chesterkchu
Participant
Posts: 35
Joined: Mon Nov 15, 2004 9:46 am

Post by chesterkchu »

I checked the tablespace and there seems to be sufficient space allocated.

Running the analysis with the retain scripts option on, I can see that the job completes successfully. However, the log says that it drops the table, creates the table and inserts 1 record successfully, yet when I check the actual IADB database, no table has been created. When I extract the create statement from the OSH script and run it manually, it works and physically creates the table. When I then run the analysis again, it is successful! No changes to the options, just re-running the analysis. What is going on?
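
A small sketch of how the "log says created, but no table exists" check could be scripted, assuming a Python DB-API connection made with the same user and schema the job uses. SQLite is a stand-in here, and the table name HYPOTHETICAL_FD_TABLE is made up for illustration; the real name would come from the create statement in the OSH script.

```python
import sqlite3


def table_exists(conn, table):
    """Return True if a zero-row SELECT against `table` succeeds.

    Any error (missing table, wrong schema, missing privilege) counts
    as "not visible to this user", which is what matters for the job.
    """
    cur = conn.cursor()
    try:
        cur.execute(f"SELECT 1 FROM {table} WHERE 1 = 0")
        cur.fetchall()
        return True
    except Exception:
        return False


# Stand-in connection; replace with a connection to IADB.
conn = sqlite3.connect(":memory:")
print(table_exists(conn, "HYPOTHETICAL_FD_TABLE"))   # before the create
conn.execute("CREATE TABLE HYPOTHETICAL_FD_TABLE (v TEXT)")
print(table_exists(conn, "HYPOTHETICAL_FD_TABLE"))   # after the create
```

Running it once right after the job finishes and once after the manual create would show whether the job's create ever became visible to this user and schema.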

Successful run scenario steps:
-run column analysis on one column in one table with sample of 1
-fails with "Failed to create FreqDist Summary"
-get create table statement from OSH script
-run create table statement manually from DB editor
-re-run column analysis for same column
-SUCCESSFUL

Another scenario:
-run column analysis on one column in one table with sample of 1
-fails with "Failed to create FreqDist Summary"
-get create table statement from OSH script
-run create table statement manually from DB editor
-re-run column analysis for same column with no sample
-FAILED with same error "Failed to create FreqDist Summary"
-re-run column analysis for same column with sample of 1
-SUCCESSFUL

Two questions now are:
-Is it a bug in IA that the job says the table was created when it clearly was not? Always creating the tables manually is not a viable option.
-Why does using a sample create a successful run when there is already data in the table?

Thanks guys!
chesterkchu
Participant
Posts: 35
Joined: Mon Nov 15, 2004 9:46 am

Post by chesterkchu »

After a closer look, the column analysis shows as Analyzed after the steps I took in my previous post, but when I open the analysis I see that there are 0 total rows. I tried running once with no Data Sampling and once with Data Sampling of 2000. The table that I am analyzing has ~1,000,000 records. The connection credentials for this DB were tested.

Any ideas on why the analysis could not see any of the records? Much appreciated!
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Are you really connecting to the source you THINK you're connecting to?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chesterkchu
Participant
Posts: 35
Joined: Mon Nov 15, 2004 9:46 am

Post by chesterkchu »

ray.wurlod wrote:Are you really connecting to the source you THINK you're connecting to? ...
On the Connections tab when running the column analysis, the Data Source Name is listed correctly.

In the Director log, the OSH script of the BaseProfile_ColumnAnalysisTask88423806 job shows the correct TableName, Database and Username. The select SQL also looks correct, in that it selects from the right schema.

Any ideas on where else I can check for the connection strings?
chesterkchu
Participant
Posts: 35
Joined: Mon Nov 15, 2004 9:46 am

Post by chesterkchu »

In this job log, I see the following messages that might help:

pxbridge: No design time SQL datatype provided for field CITY
pxbridge: The length of WVARCHAR column CITY cannot be validated because the database column is VARCHAR and character set conversion is involved. Inadequate column lengths can lead to data truncation or unexpected errors.
pxbridge,0: The number of records returned (20,005) has met the specified limit
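
For anyone scanning a longer job log for the same messages, here is a rough sketch that pulls out the fields with no design time datatype and the record limit that was hit. The regular expressions are based only on the three lines quoted above, and the function name is hypothetical.

```python
import re

# Patterns taken from the pxbridge messages quoted in this post.
NO_TYPE_RE = re.compile(r"No design time SQL datatype provided for field (\w+)")
LIMIT_RE = re.compile(
    r"The number of records returned \(([\d,]+)\) has met the specified limit"
)


def summarize_pxbridge(lines):
    """Collect untyped field names and the record limit from log lines."""
    summary = {"untyped_fields": [], "record_limit_hit": None}
    for line in lines:
        match = NO_TYPE_RE.search(line)
        if match:
            summary["untyped_fields"].append(match.group(1))
            continue
        match = LIMIT_RE.search(line)
        if match:
            # Strip the thousands separator before converting.
            summary["record_limit_hit"] = int(match.group(1).replace(",", ""))
    return summary


log_lines = [
    "pxbridge: No design time SQL datatype provided for field CITY",
    "pxbridge,0: The number of records returned (20,005) has met the specified limit",
]
print(summarize_pxbridge(log_lines))
```

A non-empty record limit in the summary means the job log reports rows coming back from the source query, which is worth comparing against the 0 total rows shown in the analysis.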
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

In your design make CITY a VarChar with Unicode extended property enabled.
chesterkchu
Participant
Posts: 35
Joined: Mon Nov 15, 2004 9:46 am

Post by chesterkchu »

ray.wurlod wrote:In your design make CITY a VarChar with Unicode extended property enabled. ...
What do you mean by design?

When I run column analysis with a smallint datatype, I get this message only:
pxbridge: No design time SQL datatype provided for field ACCT_PD_STATUS_CD

Thanks Ray!
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Have you imported the table definition for this source? I guess you must have. What are the defined data types for these columns?
chesterkchu
Participant
Posts: 35
Joined: Mon Nov 15, 2004 9:46 am

Post by chesterkchu »

ray.wurlod wrote:Have you imported the table definition for this source? I guess you must have. What are the defined data types for these columns? ...
Yes, this was an import of a physical data model from Sybase PowerDesigner. In the model, the defined data type for CITY is VARCHAR(50) and for ACCT_PD_STATUS_CD is SMALLINT(2).

In Metadata Workbench, the Design Column and Database Column definition of CITY is Native Type VARCHAR(50), ODBC Type VARCHAR, Data Type STRING, Length 50. The Design Column and Database Column definition of ACCT_PD_STATUS_CD is Native Type SMALLINT, ODBC Type SMALLINT, Data Type INT16, Length None.