Toad World® Forums

Best way to test contents of datasets?


#1

Testing the contents of tables, views, etc. is tricky due to several issues:

  • These are persistent data structures defined in a namespace in Oracle. We generally don't want to (or cannot) change the names of those objects in our code, so we have to manage them very carefully.

  • The structures we will be testing must be populated with the right test data, and the datasets against which these structures will be compared must contain the expected data.

  • It is usually the case that you cannot compare all columns of a row, because some values like primary keys and audit columns will be generated as unique values for new rows.

For small tables and queries with a small number of rows returned, a reasonable approach to take is to make a local copy of the structure before the program is run, make changes to that copy and then use it as a control against which the table/query can be compared after the program is run.

We will generate the code to do this for you in the Expected Results edit window.

Over time I would like to add generation support for a variety of approaches. What would you like to see us do?


#2

Hello Steven,

You can also use a pair of materialized views with complete refresh: one for the before image and one for the after image. That way the filtering clause for the subset is stored in an Oracle structure rather than a Qute structure, which would be easier to implement, is already supported, etc.
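
Roughly along these lines (table name and filter are just placeholders for illustration):

-- Before image: complete-refresh snapshot of the filtered subset.
CREATE MATERIALIZED VIEW emp_before
   BUILD IMMEDIATE
   REFRESH COMPLETE ON DEMAND
AS
   SELECT * FROM employees WHERE department_id = 20;

-- After image: same defining query.
CREATE MATERIALIZED VIEW emp_after
   BUILD IMMEDIATE
   REFRESH COMPLETE ON DEMAND
AS
   SELECT * FROM employees WHERE department_id = 20;

-- Refresh the before image, run the program under test, refresh the after image,
-- then compare the two snapshots (e.g. with MINUS in both directions).
BEGIN
   DBMS_MVIEW.REFRESH ('emp_before', 'C');
   -- run the program being tested here
   DBMS_MVIEW.REFRESH ('emp_after', 'C');
END;
/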

Regards,


#3

"For small tables and queries with a small number of rows returned, a reasonable approach to take is to make a local copy of the structure before the program is run, make changes to that copy and then use it as a control against which the table/query can be compared after the program is run. "

I’m not sure that I understand you correctly. Are you attempting to define the expected results of a test by saving a copy of the correct results?

I think you may be talking about something else, and if so, why not use Oracle's read-consistency mechanism and the flashback query "AS OF SCN" SELECT syntax to compare the before/after datasets?
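
For example (just a sketch - it assumes you grab the SCN up front and that the undo is still available when you compare; column names are made up):

-- Capture the current SCN before running the program under test (SQL*Plus style).
VARIABLE before_scn NUMBER
BEGIN
   :before_scn := DBMS_FLASHBACK.GET_SYSTEM_CHANGE_NUMBER;
END;
/

-- ... run the program under test ...

-- Rows present before the run but missing (or changed) afterwards.
SELECT employee_id, last_name, salary
  FROM employees AS OF SCN :before_scn
MINUS
SELECT employee_id, last_name, salary
  FROM employees;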

Just an idea.

Mathew Butler


#4

That’s a fine idea, Mathew. I don’t know much about those mechanisms, but I cannot rely on anything unless it is available from 9iR2 forward. Is that the case for AS OF :SCN?

Also, in terms of local copy:

Suppose my program is supposed to remove a row from the employee table. One way to test whether the program worked is to make a copy of the employee table into, say, COPY_EMP before the program is run, and remove from that copy the row I want to see gone from employee. Then run the program; afterwards, employee should match COPY_EMP.
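
In rough SQL terms (column names and the call to the tested program are made up for illustration):

-- Take the control copy before the test.
CREATE TABLE copy_emp AS SELECT * FROM employee;

-- Apply the expected change to the copy: the row the program should remove.
DELETE FROM copy_emp WHERE employee_id = 123;

-- Run the program under test, e.g. remove_employee (123);

-- If the program worked, both differences are empty (leaving out columns
-- like audit timestamps that are generated per run).
SELECT employee_id, last_name FROM employee
MINUS
SELECT employee_id, last_name FROM copy_emp;

SELECT employee_id, last_name FROM copy_emp
MINUS
SELECT employee_id, last_name FROM employee;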

Does that make sense? Sound useful?


#5

Hi Steven,

Apologies for the tardy response.

Have a look at the flashback clause of SELECT. Yes, it is available from 9iR2 onwards:
http://download-uk.oracle.com/docs/cd/B10501_01/server.920/a96540/statements_103a.htm#2065648

I agree that what you are suggesting is useful. There is a definite need to verify the state of the data after a process has run. This is particularly difficult where the process is a batch process and the volume of data is "large". Your results verification may also need to be aware of which columns in the stored result set are allowed to differ across tests, e.g. date/time columns and columns populated by sequences.

I've seen this done in the past by associating a test id with the copied result set. When a test is run, the test knows its own id, and as the results are also tagged, it can compare its results with the expected results (and it knows which parts of the result to ignore or only "soft check" - a soft check might ignore the actual values but, say, check the data format). A rough sketch of this comparison follows the list below.

With this approach, you need:

  • A copy of the application schema to store the results (each table would need an additional column added).
  • A known starting dataset, and a simple means to get back to it to restart your testing (all the results must be defined based on that starting point). [Flashback database might be the easiest way to get back to the starting dataset, but I believe that feature is 10gR1.]
  • An easily configurable mechanism for checking the contents of a set of tables based on the test id for the test being run.
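
A crude sketch of the comparison step, assuming a results-schema copy of the table with the extra test id column (all names are made up):

-- Expected rows for this test that are missing from (or differ in) the live data.
-- Volatile columns - sequence-populated keys, date/time stamps - are left out of
-- the select lists; a "soft check" on them (format only) could be run separately.
SELECT department_id, last_name, salary
  FROM results_copy.employees
 WHERE test_id = :test_id
MINUS
SELECT department_id, last_name, salary
  FROM app_schema.employees;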

Just some thoughts…


#6

Thanks, I will definitely review your ideas and see what we can do - and when!


#7

Hi,

I use dynamically generated triggers for tracking change traffic, both in test and in production.

It's all controlled by a simple table with these columns:
Chd_Key: Sequence. Also used as part of trigger name
Table_Name: Ex: My_Table
Text: A string that defines what I wish to track.
Ex: Destination changed from [old.destination] to [new.destination] by {a_function(new.userid)}
Trigger_Type: Before/After: Insert, Update, Delete (or combination thereof)
Trigger_Condition: As needed

A trigger on this table creates a trigger on the table that I wish to monitor. The trigger body contains only a procedure call, with the parsed-out content of the Text column as its parameter.

Ex: Log_Change('My_Table', 'Destination has changed from '||:old.destination||
' to '||:new.destination||' by '||a_function(:new.userid));
In my case, all such Log_Change procedures write the resulting strings to the same log table. There's also some standard info to ID the data, a timestamp, etc.
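
Hand-written, the generated trigger might end up looking something like this (the trigger name is built from Chd_Key and Table_Name - the 42 is a made-up key value - and the WHEN condition is just an example of a Trigger_Condition):

CREATE OR REPLACE TRIGGER chd_42_my_table
   AFTER UPDATE OF destination ON my_table
   FOR EACH ROW
   WHEN (old.destination <> new.destination)  -- Trigger_Condition, if one was given
BEGIN
   -- Body is just the procedure call with the parsed-out Text column.
   Log_Change ('My_Table',
               'Destination has changed from ' || :old.destination ||
               ' to ' || :new.destination ||
               ' by ' || a_function (:new.userid));
END;
/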

In the CodeTester, I could imagine some use for this concept. I enter which columns I wish to monitor and the conditions, add values that I definitely don't want to see and others that should be represented, etc., and CodeTester builds one or more triggers and attaches them to my table(s). Results are written to some standardized, parsable log table, and when I'm done testing it drops the triggers again.

The only problem with this is when other triggers already exist on the tables. Due to the lack of a controllable firing order, multiple same-type triggers are a real pain in the rear end. (Thanks, Bryn, for fixing that in 11g.) Until then, the only option is to amend the existing trigger rather than create a new one. Our CM guys get multiple heart attacks, though, when hearing this kind of "crazy talk".

Admittedly, I haven’t tried the CodeTester yet. But seeing it demo’ed at OpenWorld definitely got my attention.

Mike


#8

Thanks, Mike. This looks very interesting. I will add it to our list of ideas to consider post-February release!

SF


#9

We usually restrict our test runs to small (<10 rows) datasets - the aim here is to test the quality of the code in question, and to be able to run your test suite quickly, so that you get into the habit of running it frequently and spot errors as you code them, rather than a day or (worse) even longer afterwards. Do anything to force the results of the batch to be small:

  • Ensure that the input tables/lists are small for this run.
  • If necessary, add an optional input parameter to the job which defaults to % but can be a specific id or range of ids (see the sketch below).
  • Change date ranges - anything you can do to keep the changes small and manageable.
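
For example, a hypothetical batch procedure with that kind of optional filter might look like (all names are invented):

-- Optional filter defaults to '%' (process everything); tests pass a narrow value.
CREATE OR REPLACE PROCEDURE process_invoices (p_customer_filter IN VARCHAR2 DEFAULT '%')
IS
BEGIN
   FOR rec IN (SELECT invoice_id
                 FROM invoices
                WHERE customer_code LIKE p_customer_filter)
   LOOP
      NULL;  -- real processing goes here
   END LOOP;
END process_invoices;
/

-- Production run: process_invoices;
-- Test run:       process_invoices ('TESTCUST%');   -- small, known subset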


#10

I contend that the amount of available data should have no influence on a test case. For one, restricting the data means filtering it for some particular purpose, and possibly storing it outside of its usual environment. How do you guarantee that an unbiased filter mechanism was applied? It can also take a considerable amount of work to design the filter. And how do you know that the otherwise successful code has no adverse effects on all the data that wasn't tested?

I maintain that only database triggers can provide an honest answer. In other words, the test case does whatever it was designed to do, and the triggers capture any modifications to the data. As a positive side effect, independently created test cases (business logic vs. data changes) allow the additional check of whether the business logic did anything at all, as well as whether it did anything it should or should not have done.
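
To make that concrete: after the test case has run, the captured change traffic can be checked directly, e.g. (assuming a log table like the one in my earlier post; table and column names are made up):

-- Everything the triggers caught for this table during the test window.
SELECT log_time, change_text
  FROM change_log
 WHERE table_name = 'MY_TABLE'
   AND log_time >= :test_start_time
 ORDER BY log_time;

-- The test then asserts that only the expected modifications show up here,
-- and that nothing outside the tested rows was touched.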

Mike


#11

As an aside to testing datasets, does anyone know what happened to Quest's Data Factory? From my reading about testing methodology years ago, wouldn't you want to manufacture several different sets of data for each test case to test all possible execution paths? Data Factory would create a number of rows based on your creation criteria, using its own manufacturing for standard types of data, and then store these as reloadable sets.

I’m dealing with a very large legacy database where the data is batch loaded/manipulated and the rows are in the millions. Also, this db deals with data that could easily be used for identity theft so we have to obfuscate any production data we use for testing. So testing with raw production data is not very viable.

While looking at a way to test the effects of code changes on different execution paths, I decided to look at data manufacturing tools. It seems to me that Data Factory and Code Tester should go well together. But now Data Factory has disappeared!


#12

Data Factory has just entered a sunset phase at Quest Software. It’s no longer available for purchase and as of December 31, 2007, it will no longer be supported by Quest Support.

While the product is no longer available, we are evaluating the possibility of making it a feature in one of the Toad family line of products.

-Jeff