I have a challenge trying to deduplicate in the transforn and cleanse process.
Couple of question on this:-
-
What is the order of removing duplicates in transform and cleanse.
-
Is it from top top bottom or bottom to top or arbitrarly?
My issue is that I have loan numbers with mulitple records due to activities and task performed on the same loan number over time. I will want to remove the duplicates and keep the record with the most current completed task by date. In excel I will sort by New to Old and de-dupe. This keeps the most current date and removes the duplicates below . How can I achieve same with transform and cleanse feature.
Your help will be appreciated
Chris
Hi…I responded in the Beta forum that your answer did not provide the solution I was looking for. Can you please review and see if there is any solution. I need to use it in setting up a reporting automation.
Your help would be appreciated!!
Chris
I had some time today to look at this. I get the same ending result set as you do but in a different order. I start the Transform and Cleanse session by ordering my result set this way.
SELECT accountNumber
, createdDate
FROM AutoRange_DeDup_test
ORDER BY accountNumber ASC
, createdDate DESC
I then debup by account number. This gives me the same result set as you have. Have you started your T & C with a ordered query?
You can also just do this in code:
SELECT ACCOUNTNUMBER, CREATEDDATE
FROM AUTORANGE_DEDUP_TEST
WHERE (ACCOUNTNUMBER, CREATEDDATE) IN
( SELECT ACCOUNTNUMBER, MIN (CREATEDDATE)
FROM AUTORANGE_DEDUP_TEST
GROUP BY ACCOUNTNUMBER)
ORDER BY ACCOUNTNUMBER ASC, CREATEDDATE DESC