Performance of multiple database connection query

abhinav.gargafgonlin · January 1, 2008, 8:36pm

Hello,

I have tried running a query using Toad for DA’s multiple database connection feature for the first time. I’ve got two data sources, a local Access database and an Oracle database. I’m running a query using a simple inner join on two tables based on row ID.

There are only about 100 rows in the Access table, but the Oracle database table I’m joining to has about a million rows.

The performance of the query is VERY slow. It is hogging a lot of memory on my computer. For such a simple query I would normally have expected a near-instantaneous result. I am guessing that it is trying to load both database tables into the local computer’s memory - is there any way to tweak the performance of such queries?

Debbie_Peabody · January 2, 2008, 2:12pm

There is no surprise that this query is slow. The query is retrieving and joining data from two completely different database types. Retrieval time will not be the same comparing the execution time of querying the one Oracle table. The fact that you are able to do the query is the benefit here. If it is something you need to do often, you may consider automating this query so it is run in off-hours and the result set ready for you when you come into the office.

We are continuing to research improvements to cross-platform query execution, but for now it is substantially slower than single database execution. Other solutions to this would be to bring the data from Oracle locally into Access or visa-versa, Access table into Oracle.

jacquesrk · January 2, 2008, 2:18pm

Another question on your query - is the column you are joining on an indexed column in the Oracle table? You say you are joining on “row ID” - is that the name of one of the columns in the Oracle and Access table?

MissM · January 11, 2008, 5:02am

Hey!

I also have encounterd similar problem. I run a query from, let say, connection1 via dblink on database that is on connection2. I got the resoults quite fast - like couple of mins. Then I wanted to create a table with resoults of a query in database on connection1 and it took about 2 hrs!
The main reason for creating new table was to keep original data types. When I exported resoults of a query, all data types were VARCHAR2, and I wanted to keep original data types. So, this seemed the easiest way - to create a table. It wasn’t big table - aprox. 65000 rows.
Is there a way to speed things up a bit ? Or some other work-around?

M

jacquesrk · January 11, 2008, 12:45pm

Hello, I am assuming that your situation is something like this:

connect to DB1

run a query that uses a table in DB! and a table in DB@, something like
SELECT * FROM TABLE1 T1, TABLE2@DB2 T2 WHERE T1.COL1 = T2.COL2 ;

And then, using the results of this query, you were tryijng to create a table in DB1. How did you do this? Was it using CREATE TABLE … AS SELECT in the Editor window of Toad for Data Analysis? Or using some other method?

MissM · January 12, 2008, 11:04am

Hello,

this was the situation: connected on connection1 I did something like:

create table tableD1 as
select …

from table1@connection2 t1,
table2@connection2 t2,
table3@connection2 t3
where t1.col1_id=t2.col1_id
and t2.col1_id=t3.col1_id

I was creating table on connection1, but I was selecting data only from tables on connection2.
If I run only a query, it was done rather quick, but creating table took very long time.

jacquesrk · January 14, 2008, 12:07pm

Hello,

Did you run the CREATE TABLE statement in the SQL Editor window of Toad for Data Analysis?
If so, I will mention that all we do in the editor window is send the SQL statement to the database and wait for the database to return a “statement completed” message. I would assume that the CREATE TABLE statement would take just as long in SQL*Plus for example.
Is the SELECT statement in the “CREATE TABLE … AS …” command identical to the SELECT statement that you ran, and which was much faster?

thomas.hamiltonus.in · May 15, 2008, 10:09am

Hi Debbie.

Can you tell us more about the optimizations for cross-database joins? e.g., is there any analysis of table size when determining the sequence of queries to execute?

Also, you indicated that this is slow because “The query is retrieving and joining data from two completely different database types.” If the query were against two separate databases of the same type (e.g., UDB), would that make a difference? Thanks.

jacquesrk · May 15, 2008, 3:36pm

Heterogeneous queries can be slow even if both RDBMS systems used in the heterogeneous query are the same (e.g. two Oracle databases.)

The engine used for heterogeneous queries works by “vertically” splitting the query by table.

Example

ORDER_DATES (Oracle table) with columns ORDER_ID (primary key), ORDER_DATE

ORDER_AMTS (SQL Server table) with columns ORDER_ID (primary key), ORDER_AMT

Query:
select
od.order_id, od.order_date, oa.order_amt
from
[ODBC connection to Oracle.ORDER_DATES] od
inner join [ODBC connection to SQL Server.ORDER_AMTS] oa
on (od.order_id = oa.order_id) ;

The Oracle database engine will see the query
“select order_id, order_date from order_dates”

The SQL Server database will see the query
“select order_id, order_amt from order_amts”

Both result sets (notice that this is a full table scan in both databases) are returned to the client and the heterogenous query engine on the client will do the joining and eliminating rows that don’t match.

Example 2:
select
od.order_id, od.order_date, oa.order_amt
from
[ODBC connection to Oracle.ORDER_DATES] od
inner join [ODBC connection to SQL Server.ORDER_AMTS] oa
on (od.order_id = oa.order_id)
where
od.order_id = 1 ;

The Oracle database engine will see the query
“select order_id, order_date from order_dates where order_id = 1”

The SQL Server database will see the query
“select order_id, order_amt from order_amts”

Only one row is returned from Oracle, but all the rows are returned to the client from SQL server and the heterogenous query engine on the client will do the joining and eliminating rows that don’t match.

Example 3:
select
od.order_id, od.order_date, oa.order_amt
from
[ODBC connection to Oracle.ORDER_DATES] od
inner join [ODBC connection to SQL Server.ORDER_AMTS] oa
on (od.order_id = oa.order_id)
where
od.order_id = 1 and oa.order_id = 1 ;

The Oracle database engine will see the query
“select order_id, order_date from order_dates where order_id = 1”

The SQL Server database will see the query
“select order_id, order_amt from order_amts where order_id = 1”

Only one row is returned from Oracle, and only one row is returned from SQL server. (The heterogenous query engine on theclient will do the joining and eliminating rows that don’t match.) This will run fast because both databases are reading using an index and returning one row. So even though the where condition looks redundant it is a good thing to put both conditions in: “od.order_id = 1 AND oa.order_id = 1”