Toad World® Forums

Deduplicate issue in Transform and Cleanse

I have a challenge trying to deduplicate in the transforn and cleanse process.

Couple of question on this:-

  1. What is the order of removing duplicates in transform and cleanse.

  2. Is it from top top bottom or bottom to top or arbitrarly?

My issue is that I have loan numbers with mulitple records due to activities and task performed on the same loan number over time. I will want to remove the duplicates and keep the record with the most current completed task by date. In excel I will sort by New to Old and de-dupe. This keeps the most current date and removes the duplicates below . How can I achieve same with transform and cleanse feature.

Your help will be appreciated

Chris

I answered a similar post in beta forum, http://www.toadworld.com/products/toad-data-point/toad_data_point_beta_program/f/91/t/31791

Hope that helps.

Hi…I responded in the Beta forum that your answer did not provide the solution I was looking for. Can you please review and see if there is any solution. I need to use it in setting up a reporting automation.

Your help would be appreciated!!

Chris

I had some time today to look at this. I get the same ending result set as you do but in a different order. I start the Transform and Cleanse session by ordering my result set this way.

SELECT accountNumber
, createdDate
FROM AutoRange_DeDup_test
ORDER BY accountNumber ASC
, createdDate DESC

I then debup by account number. This gives me the same result set as you have. Have you started your T & C with a ordered query?

You can also just do this in code:

SELECT ACCOUNTNUMBER, CREATEDDATE
FROM AUTORANGE_DEDUP_TEST
WHERE (ACCOUNTNUMBER, CREATEDDATE) IN
( SELECT ACCOUNTNUMBER, MIN (CREATEDDATE)
FROM AUTORANGE_DEDUP_TEST
GROUP BY ACCOUNTNUMBER)
ORDER BY ACCOUNTNUMBER ASC, CREATEDDATE DESC