Toad World® Forums

How much TPS should I target in TPC-E?

I am running the TPC-E benchmark using the BMF trial edition and facing several problems:

  • The database created is much larger than BMF's estimate. For example, ~350 GB for SF 20 where the estimate is ~200 GB.
  • I am using the SQL Server 2014 trial edition on Windows Server 2012 R2 with BMF. Running with SF 20 gives the error below:
    Agent(WIN-J41U7IFG39Q) Error: High memory usage of 98%. Could not create thread for virtual user #1.

Test Cancelled due to errors.

The test runs fine with SF 5 and 10, but how much TPS should I expect? The BMF user guide suggests 2 TPS per 1000 customers, i.e. per SF 1.

But I saw a Dell paper stating that this TPS is per user, so if you are running with 10 or 20 virtual users the TPS gets multiplied by the same factor.

For example, for SF 5 (5,000 customer rows):

Expected TPS per virtual user = 2 TPS * 5 = 10

For 10 virtual users = 10 * 10 = 100 TPS

For 50 virtual users = 500 TPS, and so on.
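Laying the arithmetic out (this is just the multiplication under the Dell paper's per-user reading; whether that reading is correct is exactly my question):

```python
# Arithmetic under the Dell paper's reading that the nominal rate is per virtual user.
# Illustration only; SF 1 = 1,000 customer rows, nominal 2 tpsE per 1,000 customers.
scale_factor = 5                      # SF 5 = 5,000 customer rows
tps_per_user = 2 * scale_factor       # 10 TPS per virtual user under this reading

for users in (1, 10, 50):
    print(users, "virtual users ->", tps_per_user * users, "TPS expected")
# 1 -> 10, 10 -> 100, 50 -> 500
```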

Link: http://i.dell.com/sites/doccontent/shared-content/data-sheets/en/Documents/Using-the-R820-with-Express-Flash-to-accelerate-the-performance-of-a-large-SQL-2012-database.pdf

Whereas in my case I only ever get a max TPS of 99.x with 100 virtual users. Please clarify.

| User Load | TPS | Avg. Response Time (ms) | Avg. Transaction Time (ms) | Total Executions | Total Rows | Total Errors |
| --- | --- | --- | --- | --- | --- | --- |
| 50 | 49.839 | 1 | 3 | 8952 | 610198 | 0 |
| 80 | 79.717 | 1 | 3 | 14331 | 962267 | 0 |
| 100 | 99.264 | 1 | 7 | 17906 | 1197926 | 0 |

The error you are seeing on the agent means that the agent does not have enough resources to function. Usually this happens when trying to drive high amounts of throughput through a single agent. Try creating more agents or perhaps moving the agents to a different machine.

For the size estimation problems, the results can vary due to block sizes or other database configuration items that BMF cannot account for.

As for TPS - it’s dependent on the database configuration. 100 users against an SSD is going to yield a higher TPS than 100 users on a fixed disk.

Hope that helps!

Hi Kevin,

Thanks for the quick reply. I have tried running with multiple remote agents. BMF connects to the remote agent but fails to start it when running the job.

I saw a similar question on the community and I am trying to implement their suggestions.

About TPS, I understand that an SSD should perform better than an HDD. However, I have tested with a single SSD as well as with 3 SSDs in RAID 5. The TPS does not increase beyond 99.8, whereas others are getting TPS in the thousands, as I see in some papers.

Can you please have a look at the Dell paper linked above? I do not follow its logic that the TPS obtained is per user. They are testing with 24,000 customer rows and hence targeting 48 TPS per user, or 4,800 TPS for 100 virtual users :open_mouth:

Thanks

Hi - I've looked over the paper and it doesn't show how many agents were used or whether think time latencies were decreased (by default, Benchmark Factory uses a 1000 ms think time latency between transactions).

This blog article might help explain some things about TPS:

www.toadworld.com/…/chasing-database-benchmark-maximum-transactions-per-second-tps.aspx
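As a rough back-of-the-envelope check (assuming the default 1000 ms think time and the millisecond response times in the table above, and using the usual closed-workload approximation of TPS ≈ users / (think time + response time)), 100 virtual users will top out right around 100 TPS no matter how fast the disks are:

```python
# Rough throughput ceiling for a closed workload with per-transaction think time.
# Assumes BMF's default 1000 ms think time and the ~1-7 ms response times above.
def max_tps(users: int, think_ms: float = 1000.0, response_ms: float = 5.0) -> float:
    return users / ((think_ms + response_ms) / 1000.0)

for users in (50, 80, 100):
    print(users, "users -> ~", round(max_tps(users), 1), "TPS")
# ~49.8, ~79.6, ~99.5 -- in line with the 49.8 / 79.7 / 99.3 observed
```

Reducing the think time latency, or adding more virtual users and agents, is what lets the storage become the limiting factor.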

Hi Kevin,

Thanks for the reply.

I read the blog post you referred to, and I understand that to get a huge TPS we need to increase the user load.

Though I am still confused about the method used in the Dell paper to calculate TPS. It contradicts my understanding of virtual users.

According to the BMF user guide:

“Each virtual user is a separate thread, acting independently of the other virtual users, with its own connection to the system-under-test. Each virtual user tracks its own statistics that includes transaction times and the number of times a transaction executes”.

So what I understand from here is that we are trying to simulate real-world users through virtual users. Also, from the TPC-E specification, the definition of tpsE is given as “The Nominal Throughput of the TPC-E benchmark is defined to be 2.00 Transactions-Per-Second-E (tpsE) for every 1000 customer rows in the Configured Customers”.

Now, this definition does not mention that tpsE is calculated per user. Increasing the user load indeed increases the tpsE, but what tpsE should ideally be expected is still a mystery to me.

If I keep the keying and thinking time at the defaults and run with 100 virtual users, I get the same performance with an HDD as I do with a 960 GB SSD, which is 99.x. So how do I see the benefit of an SSD?

So my question is, how does this Dell paper justify its logic of “Each user simulates the same TPC-E like workload, so one user performs the workload once, but running with 10 users would run the workload 10 times in parallel”, thereafter aiming for 480 tpsE with 10 users running against 24,000 customer rows? Is the logic flawed?

Please pardon my ignorance, but I am a newbie in this benchmarking field, so please bear with me :slight_smile:

Any replies on this thread? I am facing the same issue. I am new to Benchmark Factory.

1-How do virtual users in BMF map to users in TPC-E/TPC-C?

2-We are testing TPC-E in an 8-node VMware cluster with 40 vCPU cores per host. We would like to push the cluster and were planning on running multiple instances of BMF and aggregating the results. Is this feasible with the commercial version that we acquired?

3-Would also like to get clarification on the HDD vs. SSD differences above.

Thank you.

Let me start with the first set of questions:

If I keep the keying and thinking time at the defaults and run with 100 virtual users, I get the same performance with an HDD as I do with a 960 GB SSD, which is 99.x. So how do I see the benefit of an SSD?

To see the benefit of an SSD vs. an HDD you will need to drive more physical IO, which is the opposite of what an RDBMS tries to do to improve performance, especially with an OLTP-type workload. As an extreme example to illustrate the point: if you run an OLTP workload, either TPC-C or TPC-E, against a 100 GB database on a server with 256 GB of memory, the entire database will sit in the memory cache, so there is no physical IO and thus no difference between SSD and HDD. Even when the dataset is larger than the database memory, if the database is able to cache 80% of the data used, the performance improvement will not be realized. If you want to show the improvement of the SSD vs. the HDD you can do several things. One, force the database to perform more physical IO by reducing the database memory. Or you can change the workload to something that requires more IO, like TPC-H, which is a data warehouse/OLAP-type workload.
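For example, on SQL Server one way to force more physical IO is to cap 'max server memory' well below the dataset size. A minimal sketch, assuming pyodbc, a login with ALTER SETTINGS permission, and an illustrative 4096 MB cap (the connection string is a placeholder for your environment):

```python
# Sketch: shrink SQL Server's buffer pool so the benchmark must do physical IO.
# The connection string and the 4096 MB cap are illustrative values only.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;"
    "DATABASE=master;Trusted_Connection=yes;",
    autocommit=True,  # run sp_configure/RECONFIGURE outside an explicit transaction
)
cur = conn.cursor()
cur.execute("EXEC sp_configure 'show advanced options', 1; RECONFIGURE;")
cur.execute("EXEC sp_configure 'max server memory (MB)', 4096; RECONFIGURE;")
conn.close()
```

Remember to raise the cap back to its original value after the test.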

So my question is, how does this Dell paper justify its logic of “Each user simulates the same TPC-E like workload, so one user performs the workload once, but running with 10 users would run the workload 10 times in parallel”, thereafter aiming for 480 tpsE with 10 users running against 24,000 customer rows? Is the logic flawed?

It doesn't appear that the logic is flawed. For TPC-E, each virtual user runs the transaction mix. There is no direct relationship, per se, between a virtual user and the underlying TPC-E dataset. For the different queries, each virtual user randomly selects data based on what the TPC-E specification specifies. It is just that the random number generator used by each virtual user is seeded uniquely, so it generates a different number stream than the other virtual users.

Now with TPC-C, BMF does assign each virtual user a specific warehouse ID and district ID. In TPC-C each scale factor represents a warehouse, and each warehouse has 10 districts. When the scale factor does not allow each virtual user to be assigned a single warehouse ID/district ID pair, BMF gives the virtual user as many pairs as needed to cover the dataset. For example, with a TPC-C scale factor of 10 there are 10 warehouses x 10 districts/warehouse = 100 pairs; with only 10 virtual users, each virtual user is given 10 warehouse/district ID pairs.
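A small sketch of that pair distribution as described above (this is an illustration only, not BMF's actual code; the round-robin assignment order is an assumption):

```python
# Illustration of distributing warehouse/district pairs across virtual users.
# Not BMF's code; round-robin assignment is assumed for the example.
def assign_pairs(scale_factor: int, virtual_users: int):
    pairs = [(w, d) for w in range(1, scale_factor + 1) for d in range(1, 11)]
    assignment = {vu: [] for vu in range(1, virtual_users + 1)}
    for i, pair in enumerate(pairs):
        assignment[(i % virtual_users) + 1].append(pair)
    return assignment

# Scale factor 10 -> 10 warehouses x 10 districts = 100 pairs;
# with 10 virtual users, each one covers 10 pairs.
example = assign_pairs(scale_factor=10, virtual_users=10)
print(len(example[1]))  # 10
```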

I hope this makes sense but let me know if you have any further questions.

Now for the second set:

1-How do virtual users in BMF map to users in TPC-E/TPC-C?

See the post above.

2-We are testing TPC-E in an 8-node VMware cluster with 40 vCPU cores per host. We would like to push the cluster and were planning on running multiple instances of BMF and aggregating the results. Is this feasible with the commercial version that we acquired?

With the current version you can accomplish most of this, meaning you can run the TPC-E workload against the 8 different nodes at the same time. But to produce a correlated report you will need to export the results of each test to something like Excel and correlate them there.
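As a sketch of that last correlation step (the CSV file names and the "TPS" column header are assumptions about how each node's results get exported; adjust to whatever your export actually contains):

```python
# Sketch: sum per-node TPS from exported result files to get a cluster total.
# The file pattern and the "TPS" column name are assumptions about the export.
import csv, glob

total_tps = 0.0
for path in glob.glob("node*_results.csv"):
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
        total_tps += float(rows[-1]["TPS"])  # last row = highest user load

print(f"Aggregate cluster TPS: {total_tps:.1f}")
```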

3-Would also like to get clarification on the HDD vs. SSD differences above.

See above.

Thank you for the detailed explanation above, Kevin. One of the difficulties I was having was correlating virtual users with BMF's ability to push the database. I don't see a clear explanation of the relationship between system specs (CPU, RAM), virtual users, the number of agents needed to maximize transactions, etc.

In my case, as I explained above, the objective is to see what the cluster can accomplish. Architecturally it would look like the image below.

Here are my concerns:

1-We would be running multiple instances of BMF, at a minimum one on each ESX host, possibly more.

2-We have a license for 100 virtual users... Is that enough to drive this cluster so we get maximum CPU utilization (40 CPU cores/host for a total of 320 cores)?

3-Does the architecture as presented make any sense at all as a BMF use case?

The link to this blog post:
http://www.toadworld.com/products/benchmark-factory/b/weblog/archive/2014/03/08/chasing-database-benchmark-maximum-transactions-per-second-tps.aspx
seems to be dead. Does anyone have an updated link?
Thanks!