Removing duplicate records from script?

Hello,

I have the following script and instead of bringing back duplicate records I just want to bring back one of the records for each Image Link. This is an examaple for just two Supliers (Oracle Canada/Ceragem) but the script is used for hundrds of Suppliers.

SELECT

docs.supplier_name, docs.invoice_num, docs.invoice_sum_calc, 'caepayables.pwcinternal.com/…/ + docs.doc_id + ‘&SelectedView=ALL’ AS ImageLink,

flow_current.recipient_name, eflow_users.active

FROM

docs INNER JOIN

flow_current ON docs.doc_id = flow_current.doc_id INNER JOIN

eflow_users ON flow_current.recipient_name = eflow_users.user_name

WHERE

(docs.status_index = 1) AND (NOT (flow_current.recipient_name = N’MASTERUSER’)) AND (docs.supplier_name = N’ORACLE CANADA ULC’) OR

(docs.supplier_name = N’CERAGEM INTERNATIONAL INC.’)

Supplier Name

Invoice #

Invoice Funct. $

ImageLink

recipient_name

Status

ORACLE CANADA ULC

1129085

543.54

caepayables.pwcinternal.com/…/

Karen Gerry|L4

1

ORACLE CANADA ULC

1129085

543.54

caepayables.pwcinternal.com/…/

Linda Dhanushan|L4

1

ORACLE CANADA ULC

1129085

543.54

caepayables.pwcinternal.com/…/

Shira Gurel|L3

1

ORACLE CANADA ULC

1129085

543.54

caepayables.pwcinternal.com/…/

Karen J. Weeks|L4

1

CERAGEM INTERNATIONAL INC.

DIC826569201540173

496.78

caepayables.pwcinternal.com/…/

Melina Garcia|TCS

1

CERAGEM INTERNATIONAL INC.

DIC826569201540173

496.78

caepayables.pwcinternal.com/…/

Eythan Sabban|TCS

1

Thanks,

Hi Paul,

UNIQUE or DISTINCT are your friends here. They are both synonymous.

Change your SELECT .... to SELECT DISTINCT .... and if there are any

rows with exactly the same values in ALL SELECTED COLUMNS, then only one

will be displayed.

Some vendors I've had to deal with in the past thought that this syntax:

SELECT DISTINCT (A), B, C, D FROM SOME_TABLE ;

returned rows where only the A column was unique. While Oracle allows

this to be syntactically correct, it doesn't do what it appears to - and

how could it?

A row will be ignored if A, B, C and D are all identical to those

returned in other rows.

As you may imagine, a SELECT DISTINCT requires a sort to find and filter

out the duplicates.

HTH

Cheers,

Norm.

On 06/10/15 13:14, paul.sousa wrote:

*Removing duplicate records from script?

Thread created by paul.sousa

Hello,

I have the following script and instead of bringing back duplicate

records I just want to bring back one of the records for each Image

Link. This is an examaple for just two Supliers (Oracle Canada/Ceragem)

but the script is used for hundrds of Suppliers.

SELECT

docs.supplier_name,docs.invoice_num,docs.invoice_sum_calc,'caepayables.pwcinternal.com/.../

+docs.doc_id+'&SelectedView=ALL'ASImageLink,

flow_current.recipient_name,eflow_users.active

FROM

docsINNERJOIN

flow_currentONdocs.doc_id=flow_current.doc_idINNERJOIN

eflow_usersONflow_current.recipient_name=eflow_users.user_name

WHERE

(docs.status_index=1)AND(NOT(flow_current.recipient_name=N'MASTERUSER'))AND(docs.supplier_name=N'ORACLE

CANADA ULC')OR

(docs.supplier_name=N'CERAGEM INTERNATIONAL INC.')

...

Thanks Norm for your usual quick turn around!!

I changed the select to select distinct but this still brought back all the records since as you mentioned this will only work if ALL the columns are the same…in this case the recipient name is different so it would not remove any of the records. So are you saying that this won’t work …returning only one record for each of the unique image links? Sorry but I’m new to SQL

On 06/10/15 13:14, paul.sousa wrote:

*Removing duplicate records from script?

Thread created by paul.sousa

Hello,

I have the following script and instead of bringing back duplicate

records I just want to bring back one of the records for each Image

Link. This is an examaple for just two Supliers (Oracle Canada/Ceragem)

but the script is used for hundrds of Suppliers.

SELECT

docs.supplier_name,docs.invoice_num,docs.invoice_sum_calc,'caepayables.pwcinternal.com/.../

+docs.doc_id+'&SelectedView=ALL'ASImageLink,

flow_current.recipient_name,eflow_users.active

FROM

docsINNERJOIN

flow_currentONdocs.doc_id=flow_current.doc_idINNERJOIN

eflow_usersONflow_current.recipient_name=eflow_users.user_name

WHERE

(docs.status_index=1)AND(NOT(flow_current.recipient_name=N'MASTERUSER'))AND(docs.supplier_name=N'ORACLE

CANADA ULC')OR

(docs.supplier_name=N'CERAGEM INTERNATIONAL INC.')

I may have read that wrong, so just in case ...

Assuming you want to bring back on row only, for each imagelink, and you

don't care which one it is, nor what is in the other selected columns?

You could do this with max():

SELECT

max(docs.supplier_name,docs.invoice_num),

max(docs.invoice_sum_calc),

'caepayables.pwcinternal.com/.../>

+docs.doc_id+'&SelectedView=ALL'

AS ImageLink,

max(flow_current.recipient_name),

max(eflow_users.active)

FROM ...

WHERE ...

GROUP BY ImageLink

ORDER by ;

You could use max(),min(), whatever you like as long as it applies to

the data type of the column.

Alternatively, you can probably use the analytical function RANK() or

DENSE_RANK() - but someone well versed in their use might be willing to

step in! I'm not all that used to them I have to confess.

HTH

--

Cheers,

Norm. [TeamT]

Hi Paul,

On 06/10/15 13:41, paul.sousa wrote:

I changed the select to select distinct but this still brought back all

the records since as you mentioned this will only work if ALL the

columns are the same....in this case the recipient name is different so

it would not remove any of the records. So are you saying that this

won't work

yes, if the recipient name is different in each row, then no two rows

will be the same, so no duplicates are being removed. See my "just

posted" email on the subject which might get you what you want using

max() on everything except the image - but check that it brings back all

the rows that you need of course! You know your test data - it worked

fine with my simple set of data that I used.

--

Cheers,

Norm. [TeamT]

Just added max() to all the fields except for the image and it worked as expected …thanks so much Norm