Extracting words from multiple columns

Hi, I have two tables that I need to join to enable me to add a country code to a list of addresses. I’m trying to search through 4 columns to find a country to match against the country table.

table1

column1,column2,column3,column4 all contain address fields where the country could be the whole column or part of:

row1

23 HIGH ST, SOMEWHERE, FREETOWN, XX1 2XX

row2

HAPPY HOME, 17 HIGH ST, SOMEWHERE, ANYWHERE FREETOWN

table2

column1,column2

row1

FREETOWN,FT

So a simple join between column 3 of table 1 and column 1 of table 2 would return FT for row1, but would not find a result for row2.

I’d quite like a simple way of searching the column (or a combination of columns) for a match to another table, especially when the columns can contain many different words.

Is this even possible?

Beth:

I think you need to use the country field in your country table and INSTR on each of the four columns ORed together as a means of joining the two tables to get your results.

Hope this helps,

Dave

From: bethmccann [mailto:bounce-bethmccann@toadworld.com]
Sent: Wednesday, July 31, 2013 8:24 AM
To: oracle@toadworld.com
Subject: [Toad World - Oracle Discussion Forum] Extracting words from multiple columns

Extracting words from multiple columns

Thread created by bethmccann

Hi, I have two tables that I need to join to enable me to add a country code to a list of addresses. I’m trying to search through 4 columns to find a country to match against the country table.

table1

column1,column2,column3,column4 all contain address fields where the country could be the whole column or part of:

row1

23 HIGH ST, SOMEWHERE, FREETOWN, XX1 2XX

row2

HAPPY HOME, 17 HIGH ST, SOMEWHERE, ANYWHERE FREETOWN

table2

column1,column2

row1

FREETOWN,FT

So a simple join between column 3 of table 1 and column 1 of table 2 would return FT for row1, but would not find a result for row2.

I’d quite like a simple way of searching the column (or a combination of columns) for a match to another table, especially when the columns can contain many different words.

Is this even possible?

To reply, please reply-all to this email.

Stop receiving emails on this subject.
Or Unsubscribe from Oracle notifications altogether.
Toad World - Oracle Discussion Forum

Flag this post as spam/abuse.

That will work – but join performance on expression and hence non-indexed data is going to stink. You could create virtual columns in Oracle (see my blog on
toad world) then index the virtual columns – this is allowed but little known – then join on virtual columns. That should give decent performance …

From: reedda [mailto:bounce-reedda@toadworld.com]

Sent: Wednesday, July 31, 2013 9:01 AM

To: oracle@toadworld.com

Subject: RE: [Toad World - Oracle Discussion Forum] Extracting words from multiple columns

RE: Extracting words from multiple columns

Reply by reedda

Beth:

I think you need to use the country field in your country table and INSTR on each of the four columns ORed together as a means of joining the two tables to get your results.

Hope this helps,

Dave

From: bethmccann [mailto:bounce-bethmccann@toadworld.com]

Sent: Wednesday, July 31, 2013 8:24 AM

To: oracle@toadworld.com

Subject: [Toad World - Oracle Discussion Forum] Extracting words from multiple columns

Extracting
words from multiple columns

Thread created by bethmccann

Hi, I have two tables that I need to join to enable me to add a country code to a list of addresses. I’m trying to search through 4 columns to find a country to match against the country table.

table1

column1,column2,column3,column4 all contain address fields where the country could be the whole column or part of:

row1

23 HIGH ST, SOMEWHERE, FREETOWN, XX1 2XX

row2

HAPPY HOME, 17 HIGH ST, SOMEWHERE, ANYWHERE FREETOWN

table2

column1,column2

row1

FREETOWN,FT

So a simple join between column 3 of table 1 and column 1 of table 2 would return FT for row1, but would not find a result for row2.

I’d quite like a simple way of searching the column (or a combination of columns) for a match to another table, especially when the columns can contain many different words.

Is this even possible?

To reply, please reply-all to this email.

Stop receiving emails on this subject.

Or
Unsubscribe from Oracle
notifications altogether.

Toad World - Oracle Discussion Forum

Flag
this post as spam/abuse.

To reply, please reply-all to this email.

Stop receiving emails on this subject.

Or
Unsubscribe from Oracle
notifications altogether.

Toad World - Oracle Discussion Forum

Flag
this post as spam/abuse.

Thanks both, I’d tried the substr/instr across each of the 4 columns but it takes a dogs age to run, I also have the problem that a country may be made up of more than one word so if I substr/instr then || for every possible option, I’ll end up with a million rows of code for less than 100k rows of data :slight_smile:

I’ll have a read of your blog Bert and see if I can create virtual columns.

Thanks again

Hope this helps – it’s a cool oracle feature that’s very useful when working with poor relational design databases …

http://www.toadworld.com/platforms/oracle/b/weblog/archive/2007/08/15/oracle-11g-adds-virtual-columns-amp-indexes.aspx

From: bethmccann [mailto:bounce-bethmccann@toadworld.com]

Sent: Wednesday, July 31, 2013 9:13 AM

To: oracle@toadworld.com

Subject: RE: [Toad World - Oracle Discussion Forum] Extracting words from multiple columns

RE: Extracting words from multiple columns

Reply by bethmccann

Thanks both, I’d tried the substr/instr across each of the 4 columns but it takes a dogs age to run, I also have the problem that a country may be made up of more than one word so if I substr/instr
then || for every possible option, I’ll end up with a million rows of code for less than 100k rows of data :slight_smile:

I’ll have a read of your blog Bert and see if I can create virtual columns.

Thanks again

To reply, please reply-all to this email.

Stop receiving emails on this subject.

Or
Unsubscribe from Oracle
notifications altogether.

Toad World - Oracle Discussion Forum

Flag
this post as spam/abuse.

User_Data

Col1 | Col2 | Col3 | Col4
23 HIGH ST | SOMEWHERE | FREETOWN | XX1 2XX
HAPPY HOME | 17 HIGH ST | SOMEWHERE | ANYWHERE FREETOWN

Country_Code_Lookup

Country | Code
FREETOWN | FT
  select ccl.code, ud.*
from User_Data ud,
Country_Code_Lookup ccl
where regexp_like(col2, lower(ccl.country), 'i')
or regexp_like(col3, lower(ccl.country), 'i')
or regexp_like(col4, lower(ccl.country), 'i');

I believe this would work. It should do a hash join, which shouldn’t take too long.