Hi,
I have a lookup that is set to FAIL on lookup failure. However, any records written before the failure are not reverted.
For example, if I have 10 records and the 6th record fails, the job does abort, but the 5 previously written records stay in the database.
TIA!
How to rollback all written records if lookup fails
The easy way is to break it up into two jobs.
The first job writes to a dataset; if the lookup fails, it aborts.
Job #2 runs only if job #1 didn't abort.
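The two-job pattern above can be sketched generically. This is a minimal illustration, not DataStage code: the function names and in-memory "dataset" are assumptions standing in for the real job 1 (validate and stage) and job 2 (load), and the point is that nothing reaches the target unless every lookup succeeded.

```python
# Hypothetical sketch of the two-job pattern. Job 1 validates every record
# against the lookup and stages it; job 2 loads the staged data only if
# job 1 completed without aborting.

def job1_stage(records, lookup):
    """Validate all records; raise on the first lookup failure so the
    target is never touched. The returned list stands in for the
    intermediate dataset."""
    staged = []
    for rec in records:
        if rec["key"] not in lookup:
            raise ValueError(f"lookup failed for key {rec['key']!r}")
        staged.append(rec)
    return staged

def job2_load(staged, target):
    """Load the fully validated dataset into the target."""
    target.extend(staged)

lookup = {1, 2, 3}
records = [{"key": k} for k in (1, 2, 3)]
target = []

try:
    staged = job1_stage(records, lookup)
except ValueError:
    staged = None  # job 1 "aborted"; job 2 never runs

if staged is not None:
    job2_load(staged, target)

print(len(target))  # all-or-nothing: 3 here, 0 if any lookup had failed
```

The key design point is that the abort happens before the target is opened at all, so there is nothing to roll back.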
=============
Write to a temp table and do an after-job upsert into your main table if the job didn't abort.
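The staging-table idea can be sketched as follows. SQLite is used here purely as a stand-in for the real target database, and the table names and the `job_aborted` flag are illustrative assumptions; the point is that the job writes only to the staging table, and the upsert into the main table runs as a single after-job step.

```python
import sqlite3

# Hypothetical sketch: load into a staging table, and upsert into the main
# table only after the job finished cleanly. SQLite stands in for the real
# database; table names are illustrative.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE main_tbl (id INTEGER PRIMARY KEY, val TEXT)")
con.execute("CREATE TEMP TABLE stage_tbl (id INTEGER PRIMARY KEY, val TEXT)")
con.execute("INSERT INTO main_tbl VALUES (1, 'old')")

# The job writes only to the staging table...
con.executemany("INSERT INTO stage_tbl VALUES (?, ?)",
                [(1, "new"), (2, "fresh")])

job_aborted = False  # would be True if the lookup had failed mid-run

if not job_aborted:
    # After-job step: upsert staged rows into the main table.
    # (SQLite needs the "WHERE true" to parse an upsert with a SELECT source.)
    con.execute("""
        INSERT INTO main_tbl (id, val)
        SELECT id, val FROM stage_tbl
        WHERE true
        ON CONFLICT(id) DO UPDATE SET val = excluded.val
    """)
    con.commit()

print(con.execute("SELECT id, val FROM main_tbl ORDER BY id").fetchall())
```

If the job aborts, the staging table is simply discarded and the main table is untouched.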
================
There are other tactics you can use as well, like including a watermark in your table load; then you just delete the entries you inserted with that watermark. You would have to know whether each row was an insert or an update, though...
I would simply go for the dataset approach, but your data VOLUME will dictate the best approach for you to use.
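The watermark tactic can be sketched like this. Again SQLite stands in for the real database, and the batch id value and table names are assumptions. Note the caveat from the reply above: deleting by watermark only cleans up inserts; rows that were updated would need their old values saved somewhere to be restored.

```python
import sqlite3

# Hypothetical sketch of the watermark tactic: tag every row written by
# this run with a batch id, then delete those rows if the run fails.
BATCH = "run-001"  # illustrative watermark value

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tgt (id INTEGER, val TEXT, batch TEXT)")

def load(rows, lookup):
    for rid, val in rows:
        if rid not in lookup:
            raise ValueError(f"lookup failed for id {rid}")
        con.execute("INSERT INTO tgt VALUES (?, ?, ?)", (rid, val, BATCH))
        con.commit()  # rows are committed as the job runs

lookup = {1, 2}
try:
    load([(1, "a"), (2, "b"), (99, "c")], lookup)
except ValueError:
    # Job "aborted": remove everything this run inserted.
    con.execute("DELETE FROM tgt WHERE batch = ?", (BATCH,))
    con.commit()

print(con.execute("SELECT COUNT(*) FROM tgt").fetchone()[0])  # 0
```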
Okay... at a high level, if it only issued a commit after 2000 records or at end-of-job, then saying that the previous 5 records are still in the target when record 6 fails didn't make sense to me, hence the question. Something else is going on; perhaps partitioning is affecting the behavior, with one node completing and committing its work while another fails? Not sure. Running on a single node might help make the behavior less... unpredictable.
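The expected commit-interval behavior can be demonstrated with a small sketch, using SQLite as a stand-in and an assumed interval of 2000 rows: a failure at row 6 rolls back the whole uncommitted batch, so zero rows should survive, not five. That the poster sees five is what suggests something else (such as per-node commits) is happening.

```python
import sqlite3

# Hypothetical demo of commit-interval semantics: with a 2000-row commit
# interval, nothing has been committed when row 6 fails, so the rollback
# discards all of the work done so far.
COMMIT_INTERVAL = 2000

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tgt (id INTEGER)")
con.commit()

lookup = {1, 2, 3, 4, 5}  # id 6 is missing, so the 6th row fails the lookup
try:
    for i, rid in enumerate([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], start=1):
        if rid not in lookup:
            raise ValueError(f"lookup failed for id {rid}")
        con.execute("INSERT INTO tgt VALUES (?)", (rid,))
        if i % COMMIT_INTERVAL == 0:
            con.commit()
except ValueError:
    con.rollback()  # discard the entire uncommitted batch

print(con.execute("SELECT COUNT(*) FROM tgt").fetchone()[0])  # 0
```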
To have full control over it, I would follow what Paul is suggesting... use two jobs to make a validation pass through the data first and only if it is fully blessed, load the data.
-craig
"You can never have too many knives" -- Logan Nine Fingers