Just How Difficult *IS* Index Tuning?

Here's the Problem

In a recent blog post I discuss some difficulties that one of our customers had when attempting to index-tune a poorly performing report query: adding one new index to one table caused the report query to blaze , but it messed up a bunch of other SQL operations that the DBA team had to fix. Has that ever happened to you?

Index tuning is best done not by looking at an individual SQL statement or two, but by looking at the concept of a SQL workload (many SQL statements). Ah, but just finding new indexes and how they can be grouped—let alone testing which ones are the best—can be very difficult and time-consuming.

If you haven’t read the blog, I encourage you to read it, but even if you haven’t, it's ok...the quiz questions below might give you an appreciation for how difficult it can be to consider new indexes and their combinations.

Here's the scenario.

Suppose you have a SQL workload (say, a list of SQL statements). Those SQL statements only access five small tables. Only five. Each table has only three columns. Only three. Your goal? Find the best combination of indexes that maximize the performance of SQL in the workload against those five tables and their columns.

Sounds easy, right? You may be surprised. See if you can guess the answers to the questions below. To keep it simple, let’s consider only “vanilla” column-based indexes. Composite indexes are ok, but don’t consider clustered-type indexes, function-based, or other index types.

Here's the Quiz

Post your replies here in the forum by July 17, and we’ll send you some cool Toad stickers. Answers will be included in part 3 of my Turnpike blog post in July. Stay tuned!

  1. What is the total number of possible indexes on just one table? How about on all five tables?
    (Two questions in one because they’re easy!)

  2. What is the total number of different combinations of these indexes, choosing up to, say, three indexes at a time?

  3. How about the total number of index sets choosing up to, say, 10 indexes at a time?

  4. (Extra Credit) What is the total number of index sets, regardless of the number of indexes in each set?

3 Likes

Hey everyone, I'm not going to "grade" your answers here, so don't be shy about responding... just a bit of fun to lighten your day.

And i'll give you a head start... answer to question 1a: number of possible indexes for just one table is 15.

1 Like

Hi Gary,

ok. I'll bite....

1. One Table, How many indexes possible with three columns?

Single columns: A, B and C = 3 indexes.
Two columns: AB, AC, BC, BA, CA and CB = 6 indexes.
Three columns: ABC, ACB, BCA, BAC, CAB and CBA = 6 indexes.
Total = 3 + 6 + 6 = 15.

For 5 tables, all with 3 columns, 15 * 5 = 75 total indexes.

2. How many combinations of three indexes?

C(N,R)  = N! / (R! * (N - R)!)
C(75,3) = 75! / (3! * (75 - 3)!)
        = 75! / (6 * (72)!)

My calculator doesn't do 75! or 73!, so the anwer was cheatingly retrieved from an online calculator at https://www.omnicalculator.com/math/factorial.

Without duplicates, 67,525 different combinations. With duplicates 73,150 combinations.

(75! = 24,809,140,811,395,398,091,946,477,116,594,033,660,926,243,886,570,122,837,795,894,510,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000)

3. How many combinations of 10 indexes?

C(N,R)   = N! / (R! * (N - R)!)
C(75,10) = 75! / (10! * (75 - 10)!)
         = 75! / (10 * (65)!)

Online again, gives 828,931,106,355 unique combinations or 2,761,025,887,620 with duplicates. (That's a lot!)

4. Total number of index sets, regardless of how manuy indexes in the set?

Sorry, no idea. It was at this point my aging brain decided it had had enough and escaped through my ears, ran across the road and was last seen catcing a bus! :slight_smile:

Cheers,
Norm. [TeamT]

1 Like

Just discovered, I don't need the C(N,R) function/formula really. To pick 3 indexes from 75 options, it's simplified to:

(75*74*73)/(3*2*1)
= 405150/6
= 67,525

And the pick 10 from 75 is similar:

(75*74*73*72*71*70*69*68*67*66) / (10*9*8*7*6*5*43*2*1)
= 3.0080251e18 / 3628800
= 8.2893107e11

Cheers,
Norm. [TeamT]

2 Likes

I knew there's got to be folks out there who might want to take a crack at this! On the right track, sir... remember: answers coming up in the next blog!

1 Like

1: 6, 30
2: 30 * 29 * 28 = 24,360
3: 30* 29 * ... * 22 * 21 = 109,027,350,432,000‬
4: 30! = 2.6525285981219105863630848e+32‬
(I would argue you would never index all columns or you would just use an index organized table...)

So, the first quiz question I tried to keep simple, and even furnished the correct answer. However, I was wondering if anyone would catch the subtlety regarding practical implementation (IOTs aside). So kudos to you, Lawrence, for recognizing that it doesn't buy us anything to create a composite index on all three columns of each table. e.g. once you define a composite index on two of the three columns, then there's only one choice for the third column.

Responders, you're definitely on the right track here, although, Lawrence, 3+6= 9.

D'oh! Always get nailed by the simple things.

1 Like

So did I. I indexed all three columns. Heavens above! (As almost nobody says these days!)

Cheers,
Norm. [TeamT]

Well, Gromit's owner, Wallace, says that all the time... :upside_down_face:
(for those who actually know who I'm referring to...)

OK, so here’s the answer to the quiz questions, folks, with some dialog about being practical in all this.
Good job by those who contributed on this thread, and those who contacted me directly. We’ll send some Toad goodies to y’all shortly.

  1. What is the total number of possible indexes on just one table? How about on all five tables?
    Answer:
    Assuming you can define one-, two-, or three-column indexes, then 15 indexes per table.
    5 x 15 = 75 :wink: total possible indexes on the five tables.

    Let’s be practical:
    Indexes that include the third (last) column of the table offer no performance advantage over its two-column index. There are 6 such 3-column indexes. So for practical purposes, we can safely ignore those six for each table. That brings us to 9 practical indexes per table.
    9 x 5 = 45 :wink: total possible practical indexes.

  2. What is the total number of different combinations of these indexes, choosing up to, say, three indexes at a time?
    Answer:
    If we’re considering 75 total indexes on the five tables, then there are 70,375 :wink: such index combination sets.
    (Original 75 indexes, plus the combination of those 75 indexes taken 2 at a time, plus the number of indexes taken 3 at a time.)

    Let’s be practical:
    If we’re considering 45 practice indexes, then there are 15,225 :wink: such index sets.
    (45 indexes, plus the combination of those 45 indexes taken 2 at a time, plus the number of indexes taken 3 at a time.)

  3. How about the total number of index sets choosing up to, say, 10 indexes at a time?
    Answer:
    If we’re considering 75 possible indexes on the five tables, then there are just under one trillion such index sets (973,602,516,870 :dizzy_face: to be exact).

    Let’s be practical:
    If we’re considering 45 useful indexes in practice, then there are over four billion such index sets (4,337,283,236 :dizzy_face: to be exact).

  4. (Extra Credit) What is the total number of index sets, regardless of the number of indexes in each set?
    Answer:
    2 to the power 75, if we’re considering 75 total indexes for all 5 tables.
    2 to the power 45, if we’re considering the 45 practical indexes for all 5 tables.

These numbers, by the way, are enormous. Consider the “smaller” of these two (2 to the 45th power). This number is so large that it would take your fastest computer over one year of no-downtime continuous operation to simply identify all the index sets here, assuming it could generate them at one million sets per second! :woozy_face: :woozy_face: :woozy_face: And that’s not counting the time it would take to test the performance of any of these index combinations!

But really, let’s be practical:
While these numbers are mathematically interesting, an overwhelming percentage of these index combinations are definitely not feasible or practical. In fact, any performance consultant would argue (and rightfully so) that one should never (ever) over index database tables.

Ok, time for a different tact. Let’s say I only want to consider 2 indexes per table. That sounds much more reasonable, right? But even here, we’re running into largeness. Nine possible practical indexes choosing two at a time for each of the 5 tables gives us 36 to the power 5, or over 60 million :roll_eyes: possible combinations.

Alright. Let's be frugal and choose just one of the nine possible indexes per table. Even here, we would need to look at 9 to the power 5 = 59,049 :sleepy: feasible sets of indexes for just those five tables.

Lastly, I would be remiss if I didn’t talk about other factors that determine which index sets are feasible, practical, optimal, etc. Factors like

  • The volume of data in the tables
    (Do we really need to index small tables ?)

  • The referential integrity between the tables
    (Might want to index those foreign key columns, etc.)

  • The SQL Workload of your applications
    (What is the mix of SQL? Are some queries more executed more than others?)

  • The datatypes of the columns to be indexed
    (Are bitmap- or function-based indexes in order? How about clustered indexes? Index-organized tables?)

  • How often columns are cited, and in which DML filters
    (Some columns are cited more frequently in WHERE clauses than others, etc.)

  • .............................................................
    (Fill in the blank here with other factors… there’s lots more)

Is your head spinning? Mine is. This is tough stuff to do. Is it any wonder why vendors of CRM, ERP, and other enterprise-level solutions---which have thousands of tables and columns, by the way---can’t quite seem to nail down an optimal set of indexes for their solutions that satisfy their customers’ performance needs? Now you know. It’s near impossible.

3 Likes