IOD EMEA Day Two

ray.wurlod · Post by **ray.wurlod** » Thu May 20, 2010 3:24 am

It's a very busy conference, with 560 presentations over the three days. Obviously no one person can get to very many of them at all.

Alas (even though it's now day three) I have not been able to assemble my notes to post here. I will get around to doing so.

No roadmap presentations thus far.

Please stay tuned. Off to another presentation!

(Incidentally, the main keynote session had to be delayed this morning because, due to Rome traffic, ten coaches full of attendees staying at other hotels had not arrived!)

The other reason I haven't posted is that I refuse to pay the exorbitant internet charges levied by the hotel.

ray.wurlod · Post by **ray.wurlod** » Sat May 22, 2010 2:21 am

I have not forgotten you all, but it's been a hectic three days. Right now I'm posting from Rome airport, but there's only 30 minutes before my flight departs.

Essentially the message builds on previous IBM direction. "Building a smarter planet" has morphed, at least for customers and business partners, into "the decade of smart". (I have a good name for a product for the decade of smart, namely AGENT86, but no product to go with the name - maybe a data quality monitor of some kind.) You will also hear IBM talking about the Information Server platform from now on, as a basis on which to build the Information on Demand strategy.

There were some roadmap sessions - they deliberately held them till later in the conference - and I will post about these separately.

The big news for Information Server users is that they're having an open (as in "everyone's invited") beta for the next version, beginning June 1st. So sign up if you're interested in seriously trying to break the new version, or at least to break it in and play with the new functionality (including a much easier installer - at least allegedly, as I didn't get a chance to play with it). There will be three drops in the beta program, the second mid-July and the third probably September/October.

Well, the plane's just been called, stay tuned (or come back) for more later.

chulett · Post by **chulett** » Sat May 22, 2010 6:08 am

Thanks Ray! :D

ray.wurlod · Post by **ray.wurlod** » Sat May 22, 2010 9:28 am

The usual caveat from IBM to begin with - what follows is an indication only of what's on plan - nothing is promised for sure, and there is no legal obligation incurred, blah, blah, blah.

Second, there's a long open beta to happen before the next version of Information Server (Stewart Hanna said he's not being coy about the version number - he's simply not going to tell me what it is) goes GA. If you want to get into the beta program - in which you WILL undertake to test stuff and report back - but don't have a local IBM contact, get in touch with Sue Cofer (scofer@us.ibm.com) who is co-ordinating this beta program. The beta starts on June 1st, so don't procrastinate!

What follows is my synthesis of a number of sessions. Any errors in it are mine, unless erroneous information was given in the first place by the IBM presenters.

What they're trying to do with the next version is to make things even more seamless across the Information Server suite, as well as adding some new functionality of course. The term "integration acceleration" was bandied about a bit, to put across the idea that these tools assist the collaborative effort in building a complete information integration strategy for an organization, leading to faster delivery and faster ROI (if done right).

Some of the items were just (or seemed to be) "throwaway lines", for example one of the goals is "improve operational management". Not really sure what this is, unless it's some of the new command line interfaces and integration with source code control.

Many of what follow were mentioned at the IOD 2009 conference, but in some cases more concrete information has been added.

Information Server

High availability and clustering support will be available for the services and repository tiers.

Performance improvements for larger teams of developers - fewer contentions in XMETA primarily.

Improved end-to-end XA support (whatever that is).

A new grid toolkit. Users on grid will more easily be able to tweak the environment and use the grid management software.

Optimizations to balanced optimization (this is the part where logic is pushed out from DataStage into database servers based on a cost-based optimizer).

Support for DB2 version 9.7, Oracle version 11g for Repository.

Operational Management

Reverse engineering in FastTrack (that is, the ability to generate mappings from a DataStage job design).

Direct tooling integration with ClearCase and CVS (use REST API or Eclipse plug-in for other SCCS systems) from Information Server Manager. Note that there is no support planned any time soon for direct check-out/check-in from DataStage and QualityStage Designer.

Simplified installation, configuration tools. This includes a fully integrated prerequisite check in the installer, the ability to effect incremental install or uninstall of suite products and checkpoint/resume capability in the installer.

Command Line Interfaces

Recognizing that some places use third party schedulers for all kinds of things, we may (probably will) see the following.

iaJob - command line for running Information Analyzer tasks.

ProcessEnvVariables.sh and RunWorkbenchLinkage.sh scripts for Metadata Workbench

DSXImportServices (server side import).

Command line interface to new migration utility (not sure whether this is istool or something else).

No further details available on these (sorry).

DataStage

Distributed Transaction Stage (released in 8.1.1?) uses MQ transaction manager to effect distributed transactions, two-phase commit, and so on. Available for DB2, Teradata and Oracle. More capabilities including being able to use MQ and ODBC stages as target. (There's a lot of code in this stage that also exists in the MQ Connector stage.)

Transformer Stage

Looping (loop variables separate from stage variables). This allows, for example, multiple output rows to be generated on one output link from a single input row.
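To make the looping idea concrete, here is a conceptual sketch in Python (not DataStage syntax) of one input row producing several output rows on a single output. The function name `expand_row`, the column names and the `iteration` counter are hypothetical, chosen only for illustration; the actual Transformer syntax was not shown in the sessions.

```python
from typing import Dict, Iterator


def expand_row(row: Dict[str, object], list_column: str, sep: str = ",") -> Iterator[Dict[str, object]]:
    """Yield one output row per item found in a delimited column.

    Conceptual illustration only: the loop counter plays a role similar
    to an iteration counter in a looping transformer.
    """
    items = str(row[list_column]).split(sep)
    for iteration, item in enumerate(items, start=1):
        out = dict(row)                # copy the input row
        out[list_column] = item.strip()  # one item per output row
        out["iteration"] = iteration   # hypothetical loop counter column
        yield out


# One input row with two phone numbers becomes two output rows.
rows = list(expand_row({"id": 7, "phones": "555-0100, 555-0101"}, "phones"))
for r in rows:
    print(r)
```

The number of output rows is bounded by the data in the input row, which is the kind of discipline a loop condition would need in practice.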

End of data detection.

New system variables and functions for these: @EOD, @ITERATION, LastRowInGroup()

Input cache - the ability to cache input rows for comparative processing. Functions SaveInputRecord(), GetSavedInputRecord()
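As a rough illustration of what "comparative processing" with an input cache might look like, the following Python sketch (again, not DataStage syntax) holds rows until the last row of a key group arrives, then emits each held row together with the group total. It assumes the input is already sorted by the key; the name `with_group_totals` and the column names are hypothetical, and the real semantics of SaveInputRecord(), GetSavedInputRecord() and LastRowInGroup() may differ.

```python
from typing import Dict, Iterable, Iterator, List


def with_group_totals(rows: Iterable[Dict], key: str, value: str) -> Iterator[Dict]:
    """Emit every row with the total of `value` for its key group.

    Assumes rows arrive sorted by `key`. Rows are cached until the last
    row of the group is seen, then replayed with the total attached.
    """
    all_rows = list(rows)
    cache: List[Dict] = []
    for i, row in enumerate(all_rows):
        cache.append(row)  # analogous to saving the input record
        # Analogous to detecting the last row in a group (or end of data).
        last_in_group = i == len(all_rows) - 1 or all_rows[i + 1][key] != row[key]
        if last_in_group:
            total = sum(r[value] for r in cache)
            for saved in cache:  # analogous to retrieving saved records in a loop
                yield {**saved, "group_total": total}
            cache.clear()


data = [
    {"k": "a", "v": 1},
    {"k": "a", "v": 3},
    {"k": "b", "v": 5},
]
result = list(with_group_totals(data, "k", "v"))
for r in result:
    print(r)
```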

Stage variables to be optionally nullable. More options for null handling. Most in-built functions to be tolerant of null.

Other new functions such as IsValidTime(), NthWeekdayFromDate(), DecimalToTimestamp(), based on customer requests. See, they DO listen.

Data Set Stage

New operational behaviour to handle input records that may be missing columns defined in record schema:
- ignore
- fail
- default non-nullable
- default nullable
- default all

Sequential File Stage

New option to set Null Field Value globally rather than having to do it column by column. (Existing restrictions still apply for fixed-width data.)

PX Pivot Stage

New ability to perform vertical pivot. (Uses same code base as that used to allow looping in Transformer stage.)
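For anyone unfamiliar with the term, a vertical pivot is taken here to mean collapsing a group of input rows into columns of a single output row. A minimal Python sketch under that assumption follows; the function name `vertical_pivot`, the `width` parameter and the generated column names are hypothetical and say nothing about how the stage itself is configured.

```python
from collections import defaultdict
from typing import Dict, List


def vertical_pivot(rows: List[Dict], key: str, value: str, width: int) -> List[Dict]:
    """Turn groups of rows sharing `key` into one row with `width` value columns."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key]].append(row[value])

    out = []
    for k, values in grouped.items():
        rec = {key: k}
        for i in range(width):
            # Pad with None when a group has fewer rows than output columns.
            rec[f"{value}_{i + 1}"] = values[i] if i < len(values) else None
        out.append(rec)
    return out


pivoted = vertical_pivot(
    [{"id": 1, "q": 10}, {"id": 1, "q": 20}, {"id": 2, "q": 30}],
    key="id",
    value="q",
    width=2,
)
print(pivoted)
```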

z/OS File Stage (NEW)

Read (and sometimes write) directly to mainframe files, leveraging Classic Federation [which means that you have to license the file capability of Classic Federation]. Read/write for sequential file VSAM structures (KSDS, ESDS, RRDS), sequential QSAM, sequential read-only BDAM/BSAM files. Single or multiple record types. Automatically handles conversion from EBCDIC to ASCII, unpacking of packed data.

Connectors

Connectors supported in server jobs as well as in parallel jobs.

Migration tool automatically converts "old" database stage types to connectors during upgrade.

Support for multiple input links to Connector stages, including transaction grouping, error codes per link. Support for multiple reject links for Connector stages.

XML Stage

Support for more schemas (XSD XML schema 1.0, WSDL 1.1).

Complex transformations without shredding (hierarchical join, relational join, filter, switch, sort, union, regroup, row to columns, column to rows, aggregates, distinct).

Hierarchical transformation editor.

Web services as a transformation step.

Multiple input and output links including reject and reference links.

Support for partitioning.

Parallel Debugger

Supports all topologies (SMP, MPP, cluster, grid).

Multiple breakpoints including conditional logic (and node, if required).

Works like server debugger, but per-node or overall. [I asked whether one could trace a record to another node as it's repartitioned; the answer was that Stewart would expect it to but did not know that particular detail.]

QualityStage

Standardization quality assessment report:
- summary
- percentage of records that populate categories
- composition of standard output patterns

Standardize stage (operator) to be combinable.

Classification table to allow mixed data types. (NEW)

Pattern Action Language to allow tokenizable locale. (NEW)

Match Designer

All the match types they removed between 7.5 and 8.0 have been reinstated.

Enhanced weight viewer - lots of the weights that weren't exposed in the Match Designer now will be, which gives greater control, visibility of data contributing to the weights, and statistical details (which should result in improved analysis).

Ability to override match and clerical cutoffs via parameters.
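The reason parameterized cutoffs matter is that the two thresholds decide how each candidate pair is classified. A small Python sketch of that decision is below; the function name and the exact boundary handling (greater-than-or-equal) are assumptions made for illustration, not a statement of how QualityStage treats weights that fall exactly on a cutoff.

```python
def classify_pair(weight: float, match_cutoff: float, clerical_cutoff: float) -> str:
    """Classify a candidate pair by its composite weight.

    Pairs at or above the match cutoff are matches, pairs between the two
    cutoffs go to clerical review, and everything else is a non-match.
    """
    if clerical_cutoff > match_cutoff:
        raise ValueError("clerical cutoff must not exceed match cutoff")
    if weight >= match_cutoff:
        return "match"
    if weight >= clerical_cutoff:
        return "clerical"
    return "nonmatch"


# Cutoffs supplied as parameters, so a run can tighten or loosen them.
for w in (12.5, 8.0, 3.2):
    print(w, classify_pair(w, match_cutoff=10.0, clerical_cutoff=5.0))
```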

Match specification report available through Reporting console. (NEW)

New rule sets:
- Korea
- Argentina
- Brazil
- Chile
- Mexico
- Peru
- Netherlands
- product rule sample
- rules development sample

Global geocoding (for example ability to add spatial data such as latitude and longitude to address data).

Information Services Director

More bindings:
- REST
- XML/JSON
- RSS
- TEXT/HTTP

Common administration

Command line administration (stop/disable/enable/start services).

Ability to get lineage on services via metadata layer (for example using Metadata Workbench).

Business Glossary

Expand the stack of consumers.

Workflow control of term management.

Custom properties (expanded on what's in 8.1.2).

Discovery

Expand modelling capabilities. (UML?)

FastTrack

Continue expansion of native data source support.

Reverse engineering of DataStage jobs to mapping specifications.

Information Analyzer

Integrate name recognition (through IBM's Global Name Recognition software).

Generate data rules directly from data review and decisions.

Continue expansion of native data source support.

Generate data rules that can be consumed by DataStage/QualityStage processing.

More methods for sharing data rules.

Notification mechanism and exception management. For example if number/percentage of records violating rule or meeting rule or number of rules violated by one record exceeds threshold, generate notification.
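The threshold idea can be sketched in a few lines of Python. This is purely illustrative: the `RuleResult` class, the `check_threshold` function and the message format are hypothetical, and no detail of the actual Information Analyzer mechanism was given.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RuleResult:
    rule_name: str
    records_checked: int
    records_violating: int


def check_threshold(result: RuleResult, max_violation_pct: float) -> Optional[str]:
    """Return a notification message if the violation percentage exceeds the threshold."""
    if result.records_checked == 0:
        return None  # nothing was checked, so nothing to report
    pct = 100.0 * result.records_violating / result.records_checked
    if pct > max_violation_pct:
        return (
            f"{result.rule_name}: {pct:.1f}% of records violate the rule "
            f"(threshold {max_violation_pct:.1f}%)"
        )
    return None


message = check_threshold(RuleResult("valid_postcode", 200, 30), max_violation_pct=10.0)
if message:
    print(message)
```

The same shape would apply to the other conditions mentioned (a count rather than a percentage, or the number of rules violated by a single record).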

New APIs for accessing analysis results.

Metadata Workbench

Improve extended metadata import. (No idea what this involves. Sorry.)

Information Flow Monitoring (NEW)

Manages the "information supply chain". (Again few details, looks like a non-technical user's Metadata Workbench.)

OK, digest that!

chulett · Post by **chulett** » Sat May 22, 2010 10:01 am

Yum! However, I think I have a tummy ache now.

tcj · Post by **tcj** » Wed May 26, 2010 6:07 pm

This line scares me.

Looping (loop variables separate from stage variables). This allows, for example, multiple output rows to be generated on one output link from a single input row.

That in the hands of a bad dev is dangerous.

tcj · Post by **tcj** » Wed May 26, 2010 6:09 pm

Ray,

Any word on when this will be put in?

Migration tool automatically converts "old" database stage types to connectors during upgrade.

Going into the next version of DataStage? It will be quite handy for the migration I have to do.

Tim

ray.wurlod · Post by **ray.wurlod** » Wed May 26, 2010 6:30 pm

The "next version".

Note that the next version will almost certainly make it to GA in 2010; it will spend months in a long, open beta program.

I guess IBM want you all to detect/fix the problems before it goes GA, rather than afterwards.

vmcburney · Post by **vmcburney** » Wed May 26, 2010 7:42 pm

tcj wrote: Ray,

Any word on when this will be put in?

Migration tool automatically converts "old" database stage types to connectors during upgrade.

Going into the next version of DataStage? It will be quite handy for the migration I have to do.

Tim

It's already available in 8.1.x. It's a utility for converting enterprise database stages to connectors. I've had it running. Cannot remember what stages it supports - I know it does enterprise stages but not sure if it does API stages. It won't do Server jobs because Connectors are not in server jobs (yet).

ray.wurlod · Post by **ray.wurlod** » Wed May 26, 2010 9:45 pm

Connectors in server jobs - in "version next" (see above). And the migration tool will handle server jobs also.

I got the feeling that, if you use this tool for your upgrades, you don't get given a choice about whether or not to change to Connectors - it happens.

muralisankarr · Post by **muralisankarr** » Tue Aug 17, 2010 10:17 pm

Hi Ray,

Did you get the beta version to test? Is it available for IBM customers?

Regards
MSR

ray.wurlod · Post by **ray.wurlod** » Wed Aug 18, 2010 12:04 am

It's an "open beta" - that means available for all IBM Information Server customers. But you've missed much of it.
