Self join, date values, unique records, generate a flag, Teradata

wpoling_953 · November 28, 2012, 9:15am

I have a large data set with admission dates and discharge dates.
There is one unique field value we can call Uniq1.
Other fields can be called MId, DxCD, DxDl, PrId, PrDl, PsId, PsDl.
Uniq1, MId, DxCD, PrId, PsId are Integer.
Each record has a Uniq1.

What I need to do is output a dataset (thinking SelfJoin) from this dataset that has flagged each record with a yes or no, (1 or 0), where a 1 = a record whose discharge date is followed by another record’s admission date within 30 days of the discharge date.

This would be specific to the MId, so that if MId 1 had 6 records chronologically and the second record was an admission within 30 days of the previous record’s discharge date the second record would be flagged = 1 while the initial record would be flagged 0 and among the 4 subsequent records for MId 1 there are no other similar 30 day occurrences, so they are flagged = 0 as well.

Ideally, I would like the following output, for example:

Uniq1 MId Admit Disch #Days 30DyReAdFLG PrDl, PsDl

123 1 1/24/2010 2/4/2010 0 CharValue CharValue

321 1 2/25/2010 3/5/2010 21 1 CharValue CharValue

789 1 6/21/2010 6/28/2012 116 0 CharValue CharValue

Notice that the # days between Uniq1-123 and Uniq1-321 are21 and therefore flag =1, however, between Uniq1-321 and Uniq1-789 the # of days are 116 and the flag = 0.

Any advice would be most appreciated, thank you

jacquesrk · November 29, 2012, 1:51pm

I am assuming an Oracle database.

This is a fairly simple query once you become familiar with analytic functions. In this case, all we need to know is the discharge_date from the row immediately preceding the current row. we can use the LAG () function for that.

Here is my example:

drop table large ;
create table large
(uniq1 number, mid number, admit_date date, discharge_date date,
prdi varchar2 (16)) ;
insert into large (uniq1, mid, admit_date, discharge_date, prdi)
values (123, 1, DATE ‘2010-01-24’, DATE ‘2010-02-04’, ‘PRDI-SOMETHING’) ;
insert into large (uniq1, mid, admit_date, discharge_date, prdi)
values (321, 1, DATE ‘2010-02-25’, DATE ‘2010-03-05’, ‘PRDI-OTHER’) ;
insert into large (uniq1, mid, admit_date, discharge_date, prdi)
values (789, 1, DATE ‘2010-06-21’, DATE ‘2012-06-28’, ‘PRDI-DIFFERENT’) ;
insert into large (uniq1, mid, admit_date, discharge_date, prdi)
values (1011, 2, DATE ‘2012-05-15’, DATE ‘2012-05-16’, ‘PRDI-UNUSUAL’) ;
insert into large (uniq1, mid, admit_date, discharge_date, prdi)
values (1291, 2, DATE ‘2012-09-25’, DATE ‘2012-10-05’, ‘PRDI-OFFBEAT’) ;
insert into large (uniq1, mid, admit_date, discharge_date, prdi)
values (1374, 2, DATE ‘2012-10-30’, DATE ‘2012-11-02’, ‘PRDI-DISTINCTIVE’) ;
commit ;

select uniq1, mid, admit_date as admit, discharge_date as discharge,
lag (discharge_date, 1, null)
over (partition by mid order by mid, admit_date) as “Prev. Disch.”,
admit_date - lag (discharge_date, 1, null)
over (partition by mid order by mid, admit_date) as “Day Diff”,
case
when
admit_date - lag (discharge_date, 1, null)
over (partition by mid order by mid, admit_date) <= 30
then 1 else 0
end as “30DyReAdFLG”,
prdi
from large
order by mid, admit_date ;

Query output:

UNIQ1 MID ADMIT DISCHARGE Prev. Disc Day Diff 30DyReAdFLG PRDI

123 1 2010/01/24 2010/02/04 0 PRDI-SOMETHING
321 1 2010/02/25 2010/03/05 2010/02/04 21 1 PRDI-OTHER
789 1 2010/06/21 2012/06/28 2010/03/05 108 0 PRDI-DIFFERENT
1011 2 2012/05/15 2012/05/16 0 PRDI-UNUSUAL
1291 2 2012/09/25 2012/10/05 2012/05/16 132 0 PRDI-OFFBEAT
1374 2 2012/10/30 2012/11/02 2012/10/05 25 1 PRDI-DISTINCTIVE

wpoling_953 · November 29, 2012, 2:13pm

Hello Jacque, terrific, thank you for the quick support, this is helpful, however, I am in Teradata DB, and I am wondering if this will still work, I will try and leave question open for the moment.

Have had problems trying to use Oracle & other SQL logic, Teradata finicky that way. I will keep you informed though, thanks again.

wpoling_953 · November 29, 2012, 3:26pm

Hello again Jacque.

I replaced the example fields with the true field names from the true database using your logic, and I am getting the following:

[Teradata][ODBC Teradata Driver][Teradata Database] Syntax error: Data Type “AdmitDte_Disch_Id2”

does not match a defined type

The field value is in 'date format" as I used in my illustration, however, it looks like your example is different?

Mine 08/12/1999
Your ‘2012-09-25’

Do you think that might be what this means?

Thanks again for any support

Oh, DAH, I don’t know what I was thinking, my date values are the same as yours, now I am more confused, not sure why it does not run?

Thanks for your patience!

Message was edited by: wpoling_953

jacquesrk · November 29, 2012, 5:40pm

Can you post your SQL statement in here?

wpoling_953 · November 30, 2012, 6:10am

Dte_SvcFrom_id BETWEEN ‘2008-07-01’ and ‘2011-06-30’

select

Hi Jacque:

Here is the script below.

I have validated that my data looks like 12/08/2012, however, in my where clause filter further upstream, I am using something like which is the format you are using.

Thanks for your help, kind of new to this and appreciate it.

AdmitClaim_id2, AdmitMem_id2, AdmitDte_Admit_id2 as admit, AdmitDte_Disch_id2 as discharge,

lag (AdmitDte_Disch_id2,

1, null)

over (partition

by AdmitMem_id2 order by AdmitMem_id2, AdmitDte_Admit_id2) as “Prev. Disch.”,

AdmitDte_Admit_id2

lag (AdmitDte_Disch_id2, 1, null)

over (partition

by AdmitMem_id2 order by AdmitMem_id2, AdmitDte_Admit_id2) as “Day Diff”,

case

when

AdmitDte_Admit_id2

lag (AdmitDte_Disch_id2, 1, null)

over (partition

by AdmitMem_id2 order by AdmitMem_id2, AdmitDte_Admit_id2e) <= 30

then 1 else 0

end as “30DyReAdFLG”, Admits1.AdmitPatStat_dl, Admits1.AdmitPatStat_id,

Admits1.AdmitProvRend_id, Admits1.AdmitProvRend_LastName

from

ar_asrh_dm_etl_beacon.Flg_ReAdAlgo1a1

order

by AdmitMem_id2, AdmitDte_Admit_id2;

/*Getting Database Error Dialog Box "[Teradata][ODBC Teradata Driver][Teradata Database] Syntax error: Data Type “AdmitDte_Disch_Id2”

does not match a defined type*/

wpoling_953 · November 30, 2012, 7:48am

Interestingly enough, I get same error when I just run your generic script (doc attached).

Thanks
Doc1.docx (263 KB)

AlekseyBazhenov · November 30, 2012, 12:51pm

Hi ,
You should remember that every SQL provider is different. That’s why the Jacques’ Oracle scripts give you this error.
You should change the script according to Teradata syntax.I see the following issue with the script.
Teradata does not have type “number “ so you should replace it.
create table test_2 (id integer, d date);
To insert into Data type column you could use the following syntax (I assume the format for Date column is ‘yyyy-mm-dd’)
insert into test_2 values (1, ‘2012-11-30’);

I hope it will help you.

wpoling_953 · November 30, 2012, 1:14pm

Hello Aleksey, and thank you, I understand better now.

However, this does not solve my real life problem.

I can see how this would work in the create “large” generic demonstration Jacque provided, but I have a data set already and do not need to create new, just need to understand how to develop Teradata specific syntax to accomplish what I outlined above in my initial posting.

But very helpful to understand better what you described Aleksey,thanks!

jacquesrk · November 30, 2012, 1:18pm

I see what Aleksey is trying to say here, but you don’t need to recreate my tables to try my query (unless you absolutely want to.)

Maybe Teradata doesn’t support the LAG function? Do you have any documentation you can look at to see if the LAG function is described? I found some documentation online that shows a LAG function in Teradata Aster, but then Aleksey showed me some other documentation that doesn’t include the lag function.

If you don’t have the LAG function we will have to rewrite the query differently. I can do that but you’ll have to wait until I come out of my next meeting.

wpoling_953 · November 30, 2012, 1:54pm

Hi Jacque, wow, I had no idea this could be so convoluted, Teradata, Oracle, I figured SQL was SQL, my apologies.

I was trying your generic “large” script just for my own edification, teaching moment; it helps me to understand if I do just the basics as well as I go along.

You guys must be darn good at this if you can go in and out of meetings and still help a novice like myself on line, lots of laughs!

I did find an old reference on line related to lag() in Teradata if it helps, thanks so much to you both!

http://datawarehouse.ittoolbox.com/groups/technical-functional/teradata-l/analytical-functions-in-teradata-1666797

AlekseyBazhenov · November 30, 2012, 1:54pm

Also you have to replace the LAG function (unfortinatly Teradata does not have it) with the following expression

Max(discharge_date) over (partition by mid order by mid, admit_date rows between 1 preceding and 1 preceding )

So the whole query becomes

select uniq1, mid, admit_date as admit,
discharge_date as discharge,
Max(discharge_date) over (partition by mid order by mid, admit_date rows between 1 preceding and 1 preceding) as “Prev. Disch.”,

admit_date - Max(discharge_date) over (partition by mid order by mid, admit_date rows between 1 preceding and 1 preceding ) as “Day Diff”,
case when
admit_date - Max(discharge_date) over
(partition by mid order by mid, admit_date rows between 1 preceding and 1 preceding ) <= 30
then 1 else 0 end as “30DyReAdFLG”,

prdi
from large
order by mid, admit_date ;

jacquesrk · November 30, 2012, 2:46pm

wpoling_953 wrote:

Hi Jacque, wow, I had no idea this could be so convoluted, Teradata, Oracle, I figured SQL was SQL, my apologies.

I was trying your generic “large” script just for my own edification, teaching moment; it helps me to understand if I do just the basics as well as I go along.

You guys must be darn good at this if you can go in and out of meetings and still help a novice like myself on line, lots of laughs!

I did find an old reference on line related to lag() in Teradata if it helps, thanks so much to you both!

http://datawarehouse.ittoolbox.com/groups/technical-functional/teradata-l/analytical-functions-in-teradata-1666797

Yes, you would need to rewrite the LAG function to something similar to what you see in the webpage you linked to. Aleksey has alread done this in the post immediately preceding.

wpoling_953 · November 30, 2012, 3:25pm

Thank you Aleksey, I will try this, have a great weekend!

wpoling_953 · November 30, 2012, 3:27pm

Thank you Jacque, I will try Aleksey’s script, have a great weekend!

wpoling_953 · December 3, 2012, 7:22am

Good morning Aleksey & Jacque. This latest version is working as I had intended, thank you both for your help, my question is now answered. I am sure I will have others though, lots of laughs.
Thanks Again!

AlekseyBazhenov · December 3, 2012, 9:58am

You are welcome.

wpoling_953 · December 11, 2012, 2:03pm

CREATE

Hello Aleksey.
I hope I am allowed to ask another question in this string even though I marked it as question answered, not sure?

Anyway,

I am stuck, I need to modify this script so that the flag = 1 actually populates the previous record based on discharge date within 30 days of the new admit date I suppose.

So that where appropriate, the previous record maintains the 30 day flag, not the following record, and in this case then theflag forthe last record for the admitMem_ID2 will always be 0

Here is what script currently looks like:

TABLE ar_asrh_dm_etl_beacon.Flg_ReAdAlgo1a4 AS

(

SELECT AdmitClaim_id2, AdmitMem_id2, ADte AS admit, DDte AS discharge,

MAX(DDte) over (partition BY AdmitMem_id2 ORDER BY AdmitMem_id2, ADte

ROWS between 1 preceding and 1 preceding) AS “Prev. Disch.”,

ADte

MAX(DDte) over (partition by AdmitMem_id2 ORDER BY AdmitMem_id2, ADte

ROWS between 1 preceding and 1 preceding ) AS “Day Diff”,

CASE WHEN

ADte

MAX(DDte) over (partition BY AdmitMem_id2 ORDER BY AdmitMem_id2, ADte

ROWS between 1 preceding and 1 preceding ) <= 30

THEN 1 ELSE 0 END AS “30DyReAdFLG”, AdmitPatStat_dl2, AdmitPatStat_id2,

AdmitProvRend_id2, AdmitProvRend_LastNamed2

FROM ar_asrh_dm_etl_beacon.Flg_ReAdAlgo1a3

GROUP BY AdmitMem_id2, ADte, AdmitClaim_id2, DDte, AdmitPatStat_dl2, AdmitPatStat_id2,

AdmitProvRend_id2, AdmitProvRend_LastNamed2)

WITH DATA;

AdmitClaim_id2 AdmitMem_id2 admit discharge Prev. Disch. Day Diff 30DyReAdFLG
1005821 3057 1/24/2010 2/4/2010 1
2304418 3057 2/25/2010 3/5/2010 2/4/2010 21 0
3398955 3057 5/5/2010 5/11/2010 3/5/2010 61 0
5117714 3057 6/18/2010 6/29/2010 5/11/2010 38 1
5812207 3057 7/17/2010 7/30/2010 6/29/2010 18 1
6092480 3057 8/18/2010 8/24/2010 7/30/2010 19 1
7161403 3057 9/4/2010 9/22/2010 8/24/2010 11 1
7162225 3057 9/24/2010 9/30/2010 9/22/2010 2 1
7450820 3057 10/13/2010 10/14/2010 9/30/2010 13 1
7503170 3057 10/15/2010 10/16/2010 10/14/2010 1 0

Message was edited by: wpoling_953Message was edited by: wpoling_953

Actually after thinking this through a little further, due to necessary changes, I think I will need my output to look more like the example attached please, thank you for any support.

Message was edited by: wpoling_953Message was edited by: wpoling_953Message was edited by: wpoling_953

Cannot seem to get attachment function to work, keeps closing the edit without uploading?

wpoling_953 · December 11, 2012, 2:02pm

I have some follow-up changes and need to reopen this question if possible. If not please advise appropriate alternative.ThanksMessage was edited by: wpoling_953

Message was edited by: wpoling_953

jacquesrk · December 13, 2012, 4:05am

Sorry for the delay in answering, I was out of the office. I will go post the answer in the other thread you opened on this topic.

Topic		Replies	Views
Preceding versus Following using TOAD in Teradata Toad Data Point & Toad Intelligence Central	4	2592	December 13, 2012
Toad, Teradata, "following" script another question please. Toad Data Point & Toad Intelligence Central	0	676	January 11, 2013
"FOLLOWING" statement with Flag necessary for more than occurance Toad Data Point & Toad Intelligence Central	0	695	December 18, 2012
Toad, Teradata, SQL, CASE Statements and multiple "following" statement Toad Data Point & Toad Intelligence Central	1	780	January 25, 2013
Data duplication issue Toad Data Point & Toad Intelligence Central	2	1083	December 14, 2012

Self join, date values, unique records, generate a flag, Teradata

Related topics