Hope you are all doing well.
I would like to request your expertise on the scenario below.
I am trying to create a custom parallel stage type (Wrapped) and need your help with the following queries.
1. Is it possible to design such a stage so that it runs for each and every record from the source?
2. Is it possible to make it work on one specific column rather than on the whole record? For example, say the source has four columns, of which only one needs to be encoded, and all four columns then need to be passed to the output for further processing.
3. Is there a sample design or clear documentation for creating the stage? I have been through the forum and the Advanced Developer's Guide, but I am not able to understand the examples listed, such as sort -r... and tr, or the options on the Input, Output and Properties tabs. Any clearer documentation or a sample dsx would be much appreciated.
Scenario:
An Oracle table has 4 columns, including a BLOB column that holds PNG images. I am trying to extract the rows and load them into a Hive table or a sequential file.
The downstream systems are asking for the PNG files as Base64-encoded text because they are not able to read the binary data. I am not sure where the trouble lies, i.e. in the load process or the read process. Hence I was thinking of creating a custom stage type that could call a command-line tool to encode the column as Base64 text (I first thought of xxd, but that produces a hex dump rather than Base64, so the base64 command looks like the better fit). The stage should work on each row read from the source.
Oracle -> Custom Wrapped Stage -> Target (File/Hive Table)
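To make the per-row requirement concrete, here is a minimal Python sketch of the transformation I want the middle stage to perform. It is only an illustration, not DataStage code: the row layout, the column names and the helper names encode_blob and decode_blob are all made up for the example.

```python
import base64


def encode_blob(blob: bytes) -> str:
    """Return the Base64 text for one row's BLOB value."""
    return base64.b64encode(blob).decode("ascii")


def decode_blob(text: str) -> bytes:
    """Reverse step, as a downstream reader would apply it."""
    return base64.b64decode(text)


# Hypothetical row: three ordinary columns plus the image bytes.
# Only the first 8 bytes (the PNG file signature) are real; the rest is filler.
row = {
    "id": 1,
    "name": "logo",
    "kind": "png",
    "image": b"\x89PNG\r\n\x1a\n" + b"\x00\x00\x00\x0d",
}

# Encode the one binary column and pass the other three through unchanged.
encoded_row = {**row, "image": encode_blob(row["image"])}
```

The point of the sketch is that three columns pass through untouched and only the image column changes, which is the behaviour I am hoping a wrapped stage can give me.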
Also, when a BLOB column is read as LongVarBinary, is the data automatically converted to hex format?
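One way I could check this myself is to look at the first bytes of whatever lands in the target. Every PNG file starts with the same 8-byte signature, so the sketch below guesses whether a value is raw binary, hex text or Base64 text. The function name classify_payload is hypothetical, and the sketch assumes I can get one column value out of the target as a byte string.

```python
import base64

# Fixed 8-byte signature at the start of every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def classify_payload(value: bytes) -> str:
    """Guess how a PNG column value is represented: raw, hex, base64 or unknown."""
    if value.startswith(PNG_SIGNATURE):
        return "raw"
    try:
        # 16 hex digits encode the 8 signature bytes.
        if bytes.fromhex(value[:16].decode("ascii")) == PNG_SIGNATURE:
            return "hex"
    except (UnicodeDecodeError, ValueError):
        pass
    try:
        # 12 Base64 characters decode to 9 bytes, enough to cover the signature.
        if base64.b64decode(value[:12], validate=True).startswith(PNG_SIGNATURE):
            return "base64"
    except ValueError:
        pass
    return "unknown"
```

If this reports "unknown" for the data written with LongVarBinary, the value was probably altered somewhere between the read and the load.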
Can an External Filter stage be used for this requirement? Will it act on each record from the source? If so, could I have a sample of the command?
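For reference, this is roughly the kind of filter program I have in mind, sketched in Python. It rests on assumptions I have not verified: that the stage streams records to the command as delimited text lines on stdin and reads the result from stdout, that the delimiter is a pipe, that the image is the fourth column, and that the image arrives as hex text. The names DELIMITER, BLOB_COLUMN, encode_line and run_filter are mine.

```python
import base64
import io

DELIMITER = "|"   # assumption: field delimiter used between the stages
BLOB_COLUMN = 3   # assumption: zero-based position of the hex-encoded image


def encode_line(line: str) -> str:
    """Re-encode the image column of one record from hex text to Base64 text."""
    fields = line.rstrip("\n").split(DELIMITER)
    raw = bytes.fromhex(fields[BLOB_COLUMN])
    fields[BLOB_COLUMN] = base64.b64encode(raw).decode("ascii")
    return DELIMITER.join(fields)


def run_filter(source, sink) -> None:
    """Read records line by line, leaving every other column untouched."""
    for line in source:
        sink.write(encode_line(line) + "\n")


# In-memory demonstration; as a real filter the call would be
# run_filter(sys.stdin, sys.stdout) after importing sys.
demo_in = io.StringIO("1|logo|png|89504e47\n2|icon|png|0d0a1a0a\n")
demo_out = io.StringIO()
run_filter(demo_in, demo_out)
```

Because the loop handles one line at a time, a single invocation of the command would still touch every record, provided the data really is passed as one record per line.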
Options considered:
1. Reading the BLOB column as LongVarBinary and writing it to Hive. The downstream Java systems are unable to read the data and have reported it as corrupted.
2. Parallel routines. The support team may be reluctant to support C++ code.
3. BASIC routines. I was unable to find one that works; most of them return an error saying that an unrecognised character was encountered.
Thanks in advance for all the help. I would be happy to provide any additional information on the scenario if needed.