Data compare results export issues

I've been working in TOAD data point 5.0+ 64 bit on Windows 10 with 64 bit Office 365, and while the data compare is working great within the application, I am having some problems when exporting the data compare results. This is a problem as we would like to use the exported Excel files to share compare results with users who do not have TOAD Data Point, and save them as documentation.

There are currently 2 major issues with the data compare export files:

1.) Data compare exports (Excel files) do not function well if larger than 50-100k rows, and freeze Excel if larger than 400-500k rows.
I tried using the data compare on a table that has 500k+ rows X 80 columns and the compare works fine in TDP, however if I export this to Excel it locks up my excel application upon opening this file in Excel. RAM use will climb to about 8gb until Excel just becomes unresponsive. It appears to be a problem with the formatting causing a lot of strain on Excel as if I take a smaller file and clear formatting , it becomes much more responsive. I was also able to get the 500+k row file to allow me to select all data , and paste values to a new workbook. This new workbook was working fine until I then tried to paste formatting from the original data compare export , at which point both Excel instances froze. So it would appear the formatting may be causing a large strain on Excel for files with a lot of rows. Clearing the formatting would render the compare files almost useless as being able to filter by color is what makes these so very useful.

2.) When doing a data compare, if either the option for "Ignore whitespace" or "treat empty values as nulls" is enabled, the data compare export to excel causes EVERY string to be highlighted in purple as a "difference". This is not seen in the GUI, and the compare results are displayed correctly in TDP. It only occurs when exporting the compare with one of these options enabled. This essentially makes exporting a data compare result using these options unusable.

Aside from the above issues, it would perhaps be an item for the idea pond that TOAD data point offers other options for exporting data compare results. Excel is good for smaller comparisons but doesn't work (currently) for large comparisons. It will also have a limit to 1 million rows making it impossible to save a comparison result of 1+ million rows.

Thanks,
Dan

Hi Dan,
thanks for your question.
ad 1) Maybe you've already browsed for some performance advice, this thread seems to discuss a few Excel performance points. From our point view, you could consider how to optimize your query - by lowering the number of exported columns, rows.
How much RAM have you got? 16 GB is recommended. Which version of TDP do you use? There were a couple of performance adjustments in .0.6 so that may help too.
ad 2) I need to test this point.

Which format do you suggest? Excel or CSV files in 64bit Excel are usually large enough for most of our users.

Hi Dan,

Thanks for your response.
1.) I just read through this, and here are my responses;

  • I can rule out most of the suggestions dealing with external links, custom formulas etc as this file is directly exported from TOAD data point, I have not made any modifications to it.
  • Lowering the exported rows does help, but in this particular instance this is already the reduced row count (500k rows) . The full comparison row count was 1.4 million but excel cannot support over 1 mm rows.
  • I've tried the other suggestions but it did not improve the performance.
  • I have tried this on 2 machines. My laptop with a 4 core (8 hyperthreaded) Intel Core i5-8359U with 16 gb RAM. The excel sheet will freeze immediately on this PC. I also have a Virtual PC that has enhanced resources, this has an Intel Xeon Gold 6148 x 8 cores , and 32 gb RAM. This VPC is able to open the file, and sometimes I can do a few small actions before it freezes, but eventually excel just becomes unresponsive here as well. It typically happens once excel RAM use gets to about 8gb.
  • I was previously trying this on TDP 5.06.97, and now am trying on 5.07.106 .

2.) The format for your input file may not matter, as when I'm doing the compare I am doing it directly from a results tab within a query. So it seems to happen regardless if the source data for my query was an excel file, Hadoop table, or csv. But if you wanted to replicate the examples I have done , I am importing data sets from Excel sheets into Local Storage Connection as their own tables with matching schema. I set most columns to varchar to prevent any loss of data from conversion.

I will attach some screenshots shortly .

Thanks,
Dan

before_opening

After 15 minutes I am able to see the data and click on some things, although there is a 5 second delay with everything I do. Most operations will cause the screen to go white and excel stops responding for 1-5 minutes.

Eventually Excel will just become completely unresponsive. RAM use stopped at 5gb this time. This screenshot is after 1 hour of waiting. The last thing I tried to do was click on a cell and scroll down.

FYI the original file as exported from TDP is 204mb.

I was able to copy & paste values from the workbook into a new workbook. By pasting values and not carrying over any formatting, this pasted values workbook works fine, even on my laptop. So it doesn't appear to be the number of rows / cols of data, but something with formatting or something else that TDP is adding into the sheets.

I have made some progress with the very slow / unresponsive excel exports. Item number 1 above.
It seems to be related to Word Wrap being enabled by default.

I took an unresponsive export workbook and tried copying the data to a new workbook. It was fine, so then I tried replicating formatting from the toad export over one component at a time. I added cell borders and cell shading to match the TOAD export and this drove memory use up, but the workbook was still very responsive.
I then added word wrapping and boom, excel went into meltdown mode again.

So now I am experimenting with turning word wrapping off on exported excel sheets, and it's taking a very long time for Excel to process (~30-90 minutes per workbook). So if this does "fix" the exported workbooks, it would indicate the option to disable word wrap on the exports would be a huge improvement.

Update: I took a 560k row by 100 column data compare export and opened it on a machine with 32gb ram. If I am careful not to interact with the worksheet much, I am able to select all columns and widen the column lengths to 150. I can then turn off word wrapping for all cells. This takes a long time but eventually finishes.

After this, if I save the document and close out, then reopen it, it is still using a lot of memory but it responds slightly better. If I then delete the first column containing the arrow images, then save and reopen again, the workbook will now become much more responsive and use much less memory.

Finally I can delete all empty "spacer" columns and save again, at which point the file is able to be opened and used even by someone with 32 bit excel and 16gb ram.

So it would seem the ability to disable the following will vastly improve the performance of data compare exports:

-Disable word wrap
-Disable the arrow images in column A
-Disable export of spacer columns

Thanks,
Dan

Hi Dan,
really thank you for your investigation steps. I already created task in our system TMB-1713 and we will try to go over and implement some of your suggestions.

In other hand. I would like to know why do you need to export all of your results to Excel file? Usually users use it as a report only for Diff results which is not so big so formatting is helpful. Isn't there any better workaround how to present your results?

Filip

Hi Filip,

Thank you for the update, and I look forward to hearing more about the solution implementation!

As far as reducing the exported rows, that is currently what I am doing - reducing the export to only show differences. For example, the 60k row export was only differences from a 100+ k data sets, and the 560k export was differences from a 1.4 million row data set.

I have tried doing a comparison of only a few columns at a time, thus reducing the amount of data differences exported, however this makes reviewing the results much more cumbersome as I have to split it into 5+ workbooks, and even then each workbook is still very slow.

Thanks,
Dan

As an update this is still an issue, running TDP 5.1.0.142 64 bit, and multiple users are seeing the slowness issues and excel freezing when working with large export files. I have not yet found a workaround other than painstakingly removing word wrapping which can take 1-4 hours for excel to process on larger files.

Hi,
we will start to work on this item in the next few weeks. If you are interested I can provide you some pre-release build for test.
Filip

1 Like

That is great news Filip. And yes please keep me in the loop and I can help with testing.

Thanks,
Dan