How DataStage handles streaming events (MQ / Kafka messages)



sensiva
Premium Member
Posts: 21
Joined: Tue Aug 22, 2017 10:39 am

How DataStage handles streaming events (MQ / Kafka messages)

Post by sensiva »

Hello All,

I am experiencing the error "The record is too big to fit in a block" in one of my jobs, which uses the MQ Connector to read messages from a queue.

Scenario
  • The MQ queue has around 15,000 messages; each message is only a few KB in size, definitely less than 0.5 MB
  • DataStage connects to MQ in client mode and reads the messages
  • The message quantity is set to -1 so as to read all messages from the queue
I would like to better understand how DataStage works for streaming events, since it says the record is too big to fit in a block. How does DataStage split MQ messages into blocks, and how does this mapping happen internally?
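
For what it's worth, here is how I currently picture the constraint, as a small Python sketch. This is only my assumption of the rule (that a single record must fit entirely inside one transport block), not anything taken from the documentation, and the sizes are the ones from our runs:

  KB = 1024

  default_block = 128 * KB                  # our default transport block size
  reported_needs = [750 * KB, 1200 * KB]    # sizes the failed runs reported

  def fits(record_bytes, block_bytes):
      # A block must hold at least one complete record, so a record
      # larger than the block can never be shipped between stages.
      return record_bytes <= block_bytes

  for needed in reported_needs:
      status = "ok" if fits(needed, default_block) else "too big to fit in a block"
      print(f"{needed // KB} KB record vs {default_block // KB} KB block: {status}")

If that mental model is right, the block size only needs to grow when a single message (plus per-record overhead) outgrows it, which would explain why small test messages never triggered the error.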

I understand that the MQ Connector reads message by message and not line by line. In this case, does the connector wait until it has read all the messages in the queue before propagating them to the next stage, or does it propagate each message to the next stage as it is read?

Our transport block size was set to the default of 128 KB, and DataStage initially reported that it required around 750 KB. Since that was below the recommended level of 1 MB, I increased it to 900 KB, but on the next run the job failed again and said it needed 1.2 MB. Although that is a little above the recommended level, I could increase it to 1.2 MB, but I would like to understand whether 1.2 MB will be enough, or whether it will need more later.
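
For reference, these are the environment variables I believe govern the transport block size (names as I recall them from the Information Server documentation; please correct me if I have any of them wrong):

  APT_DEFAULT_TRANSPORT_BLOCK_SIZE   (the starting block size; ours is the 128 KB default)
  APT_MAX_TRANSPORT_BLOCK_SIZE       (the ceiling; I assume this is the 1 MB "recommended level")
  APT_MIN_TRANSPORT_BLOCK_SIZE       (the corresponding floor)
  APT_AUTO_TRANSPORT_BLOCK_SIZE      (in some versions, lets the engine size blocks from the record schema)

If the largest message we will ever see is around 1.2 MB, my inclination would be to set the default (and the maximum) with some headroom, say 2 MB, rather than chase the exact value run by run.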

During functional and unit testing, we were testing with fewer than 10 messages in the queue, and we did not face any issues at that time.

Any pointers would be of great help.

Thanks
Sen
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

It's been a really long time since I played with these settings (probably pre-11.3, but this is a feature that I would expect to be quite stable), and I've forgotten their exact names, but it's the "apt transport block size" named value. The default is really small. I recall doing a series of tests and having my settings approach 2 GB for really large payloads. It sounds like your values are a whole lot less than that.

Ernie
Ernie Ostic

blogit!
Open IGC is Here! - https://dsrealtime.wordpress.com/2015/0 ... ere/