How to use toad to identify duplicate data in tables

how can I use toad to identify duplicate data? please I need help.

Database menu > Compare > Data Dupliates

how about in a solution where I do not know the exact table which is duplicate how can I go about it?

In that case, you’d just have to start looking in tables one by one. So, in Data Duplicates, pick the first table, click on the “Summary” tab. If there are duplicates, you’ll see them. Then go back and pick the next table and repeat.

okay thanks. so what do duplicate (summary) and duplicate data (editable) do?

what does duplicate data (editable) does?

You can edit the grid in the tab that says “editable”, but the query can be a lot slower than query that the “Summary” tab runs.

right now I just click on the duplicate data (editable) the it still running but now my question is what does duplicate data (editable ) do? thank you.

Well, I did tell you it might be slow…

If it comes back with a result in the grid, you can click in the grid and change the data by typing something else over it.

okay so now what is the next step because it still running.

You have to wait for it to finish. If it’s a big table, maybe you should read a book, take a jog, play with your kids, etc.

If you don’t want to do that, you can click the 2nd toolbar button to cancel it.

if there is any duplicate data in the table it will display something on the grid right? if so what can I do next to eliminate the those duplicate data? I don’t know if I am making sense to you.

Yes, it will display duplicates in the grid if it finds any.

If it is just a few duplicates, you can use the grid toolbar buttons on the bottom to delete the rows you don’t want. Or you can change the data by typing in the grid.

If there are a lot of duplicates and you want to eliminate all but one, click the first toolbar button. It will show you the query with a comment at the top like this:

/* To show all but one of each duplicate row, change to “<>” to:
‘>’ (keeps first duplicate)
or ‘<’ (keeps last duplicate) * /
Select A1.rowid, A1.*
from SCOTT.EMP A1
where exists (Select ‘x’ from SCOTT.EMP A2
where A1.ROWID <> A2.ROWID

So, you can copy/paste that whole thing into the Editor, and change it to something like

Delete
from SCOTT.EMP A1
where exists (Select ‘x’ from SCOTT.EMP A2
where A1.ROWID < A2.ROWID <<<<<<<<<<<<<-- notice here I changed <> to <
and ((A1.EMPNO = A2.EMPNO) or ((A1.EMPNO is null) and (A2.EMPNO is null)))
and ((A1.ENAME = A2.ENAME) or ((A1.ENAME is null) and (A2.ENAME is null)))
and ((A1.JOB = A2.JOB) or ((A1.JOB is null) and (A2.JOB is null)))
and ((A1.MGR = A2.MGR) or ((A1.MGR is null) and (A2.MGR is null)))
and ((A1.HIREDATE = A2.HIREDATE) or ((A1.HIREDATE is null) and (A2.HIREDATE is null)))
and ((A1.SAL = A2.SAL) or ((A1.SAL is null) and (A2.SAL is null)))
and ((A1.COMM = A2.COMM) or ((A1.COMM is null) and (A2.COMM is null)))
and ((A1.DEPTNO = A2.DEPTNO) or ((A1.DEPTNO is null) and (A2.DEPTNO is null))));

then run the DELETE query in the editor. It will delete all duplicates except one.

thank you very much Jdorlon I really appreciate it.

if I have any request again I will let u kno.

1 Like