Dynamic arrays/or pivot

dj · Post by dj » Mon Aug 22, 2016 1:21 pm

Hi All,
We have got requirement to compare 2 datasets and generate reports for the mismatches/ detail records and counts.

And also we are trying to keep it generic for any file metadata.(to enable rcp later)

1. After passing thr diff stage, the below layout is achieved

Code: Select all

 
Custid, File1name, File2Name, Matchflg, File1address, File2Address, matchflg_addr
1, a, aa, n, x, x, y
2, b, b, y, x, x, y

2.to get the non match records and counts, we did horizontal pivot.
Custid, file1value, file2value, match
1, a, aa, n
1, x, x, y
2, b, b, y
2, x x, y

How to add the columnnames to above layout, so we can filter n records and group by column name to get the counts.
ie..
Custid, columnname, file1value, file2value, match
1, name, a, aa, n
1, address, x, x, y
2, name, b, b, y
2, address, x, x, y

The file has got 50+ columns.

Instead of pivot, are there any dynamic arrays that can be build and read the values form arrays ? Thanks

ray.wurlod · Post by **ray.wurlod** » Mon Aug 22, 2016 4:36 pm

Dynamic array structures are natively supported in server jobs. I'd recommend looking at using an approach based on server jobs.

ray.wurlod · Post by **ray.wurlod** » Mon Aug 22, 2016 11:26 pm

Where do you get the column names (name, address) from in your design?

ray.wurlod · Post by **ray.wurlod** » Mon Aug 22, 2016 11:28 pm

You might also consider reducing the 50+ non-key columns to one using a Column Output stage, which generates a delimited string. You should then be easily able to compare those delimited strings to fulfil your requirement.

dj · Post by dj » Wed Aug 24, 2016 9:41 am

Thanks for the details.

The column names "name" and "address" are the file field names.File1_name and File2_name.

For the comparision part, even after using column export how to get the fieldnames?

how to get the field names over there without hardcoding to make it generic

custid, columnname , file1value,file2value, match
1, name ,a, aa, n
1, address,x, x, y
2, name,b, b, y
2, address,x x, y

so that later the report can be generated as

columname mismatch count
name 1
address 1

Thanks