Benchmark on Mini Fish

How to Implement a Simple Load Using Sysbench

Mon, 14 Dec 2020 12:06:00 +0800

Sysbench is a tool commonly used for database testing. Since version 1.0 it has supported much more powerful customization, letting users write Lua scripts to simulate their own load. I am writing this article for two reasons. First, I wanted to explore how custom loads work in Sysbench. Second, I tried mysqlslap, the load tool that ships with MySQL, and found that it hangs easily during database performance tests. That can mislead users into thinking the database itself has a problem, which has caused trouble for many people, and I want to help others avoid this pitfall.

A Simple Example

#!/usr/bin/env sysbench

require("oltp_common")

function prepare_statements()
end

function event()
    con:query("set autocommit = 1")
end

The require line loads oltp_common, Sysbench’s built-in common library. The empty prepare_statements function is a callback that oltp_common expects, so it must be defined. The work done by a single unit of load is implemented in the event function.

Save this script as a Lua file, for example, named set.lua, and then execute it using sysbench.

sysbench --config-file=config --threads=100 set.lua --tables=1 --table_size=1000000 run

You can run it with the command above. The --tables=1 and --table_size=1000000 options have no effect on this load, so they are optional; --threads controls the concurrency.

$ cat config
time=120
db-driver=mysql
mysql-host=172.16.5.33
mysql-port=34000
mysql-user=root
mysql-db=sbtest
report-interval=10

The config file holds the parameters that rarely change, so they do not have to be repeated as a long string on the command line. All of the fields shown are required: time is the test duration in seconds, report-interval is the interval in seconds at which intermediate results are printed, and the rest describe how to connect to the database.

The running output generally looks like:

[ 10s ] thds: 100 tps: 94574.34 qps: 94574.34 (r/w/o: 0.00/0.00/94574.34) lat (ms,95%): 3.68 err/s: 0.00 reconn/s: 0.00
[ 20s ] thds: 100 tps: 77720.30 qps: 77720.30 (r/w/o: 0.00/0.00/77720.30) lat (ms,95%): 5.28 err/s: 0.00 reconn/s: 0.00
[ 30s ] thds: 100 tps: 56080.10 qps: 56080.10 (r/w/o: 0.00/0.00/56080.10) lat (ms,95%): 9.22 err/s: 0.00 reconn/s: 0.00
[ 40s ] thds: 100 tps: 93315.90 qps: 93315.90 (r/w/o: 0.00/0.00/93315.90) lat (ms,95%): 4.82 err/s: 0.00 reconn/s: 0.00
[ 50s ] thds: 100 tps: 97491.02 qps: 97491.02 (r/w/o: 0.00/0.00/97491.02) lat (ms,95%): 4.65 err/s: 0.00 reconn/s: 0.00
[ 60s ] thds: 100 tps: 94034.27 qps: 94034.27 (r/w/o: 0.00/0.00/94034.27) lat (ms,95%): 4.91 err/s: 0.00 reconn/s: 0.00
[ 70s ] thds: 100 tps: 74707.37 qps: 74707.37 (r/w/o: 0.00/0.00/74707.37) lat (ms,95%): 6.79 err/s: 0.00 reconn/s: 0.00
[ 80s ] thds: 100 tps: 89485.10 qps: 89485.10 (r/w/o: 0.00/0.00/89485.10) lat (ms,95%): 5.18 err/s: 0.00 reconn/s: 0.00
[ 90s ] thds: 100 tps: 109296.44 qps: 109296.44 (r/w/o: 0.00/0.00/109296.44) lat (ms,95%): 2.91 err/s: 0.00 reconn/s: 0.00
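
When comparing runs, it can help to pull the numbers out of these interval lines instead of reading them by eye. The following is a minimal Python sketch, assuming the line format is exactly the one shown above; the names LINE_RE and parse_interval are hypothetical helpers, not part of Sysbench.

```python
import re

# Pattern for one sysbench interval report line, matching the format shown above.
LINE_RE = re.compile(
    r"\[\s*(?P<sec>\d+)s\s*\]\s+thds:\s+(?P<thds>\d+)\s+"
    r"tps:\s+(?P<tps>[\d.]+)\s+qps:\s+(?P<qps>[\d.]+).*?"
    r"lat \(ms,95%\):\s+(?P<p95>[\d.]+)"
)

def parse_interval(line):
    """Return (second, tps, p95_latency_ms) for one report line, or None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return int(m.group("sec")), float(m.group("tps")), float(m.group("p95"))

# Two lines copied from the output above.
sample = [
    "[ 10s ] thds: 100 tps: 94574.34 qps: 94574.34 (r/w/o: 0.00/0.00/94574.34) lat (ms,95%): 3.68 err/s: 0.00 reconn/s: 0.00",
    "[ 30s ] thds: 100 tps: 56080.10 qps: 56080.10 (r/w/o: 0.00/0.00/56080.10) lat (ms,95%): 9.22 err/s: 0.00 reconn/s: 0.00",
]
rows = [parse_interval(line) for line in sample]

# The interval with the lowest throughput is usually the one worth investigating.
slowest = min(rows, key=lambda r: r[1])
print(slowest)
```

In this sample the interval with the lowest tps is also the one with the highest 95th percentile latency, which is the usual pattern when the server stalls briefly.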

Finally, there will be a summary report.

SQL statistics:
    queries performed:
        read:                            0
        write:                           0
        other:                           10424012
        total:                           10424012
    transactions:                        10424012 (86855.65 per sec.)
    queries:                             10424012 (86855.65 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

Throughput:
    events/s (eps):                      86855.6517
    time elapsed:                        120.0154s
    total number of events:              10424012

Latency (ms):
         min:                                    0.09
         avg:                                    1.15
         max:                                 1527.74
         95th percentile:                        4.91
         sum:                             11994122.49

Threads fairness:
    events (avg/stddev):           104240.1200/600.21
    execution time (avg/stddev):   119.9412/0.01
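
The figures in the summary can be cross-checked against each other. The sketch below is only an arithmetic sanity check in Python using the values printed above. It assumes each thread issues events back to back, so that by Little’s law the average number of in-flight events should be close to the thread count.

```python
# Values copied from the summary report above.
total_events = 10424012
time_elapsed_s = 120.0154
latency_sum_ms = 11994122.49
threads = 100

# Throughput: events divided by wall-clock time.
events_per_sec = total_events / time_elapsed_s

# Average latency: summed per-event latency divided by the event count.
avg_latency_ms = latency_sum_ms / total_events

# Little's law: in-flight events = throughput * average latency (in seconds).
in_flight = events_per_sec * (avg_latency_ms / 1000.0)

print(f"eps={events_per_sec:.0f} avg_ms={avg_latency_ms:.2f} in_flight={in_flight:.1f}")
```

The throughput and average latency agree with the report up to rounding, and the in-flight estimate comes out just under the 100 configured threads, which suggests the threads were busy for almost the whole run.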

How to Test CockroachDB Performance Using Benchmarksql

Fri, 06 Jul 2018 21:21:00 +0800

Why Test TPC-C

First of all, TPC-C is the de facto OLTP benchmark standard. It is a set of specifications, and any database can publish its results under this standard, which avoids arguments over which testing tool was used.

Secondly, TPC-C is closer to real-world scenarios because it is built around a transaction model. That model mixes high-frequency simple transactions with low-frequency inventory queries, so it tests the database more comprehensively and realistically.

Testing TPC-C on CockroachDB

This year, CockroachDB released its TPC-C performance results. Unfortunately, they did not test with a tool that the database industry recognizes as implementing the TPC-C standard. Instead, they used their own TPC-C implementation, whose compliance with the standard is not recognized. Their official white paper also states that these results cannot be compared with results obtained under the TPC-C standard.

Therefore, I thought of using a highly recognized tool in the industry for testing. Here, I chose Benchmarksql version 5.0.

Benchmarksql 5.0 supports the PostgreSQL protocol, Oracle protocol, and MySQL protocol (the MySQL protocol is supported in the code, but the author hasn’t fully tested it, so the official documentation doesn’t mention MySQL). Among these, the PostgreSQL protocol is supported by CockroachDB.

Test Preparation

After preparing the Benchmarksql code, don’t rush into testing. There are three main pitfalls here that need to be addressed first.

First, CockroachDB does not support adding a primary key after table creation, so the primary key has to be declared when the table is created. Specifically, in the run folder under the root directory of the Benchmarksql code, create a sql.cdb folder. Copy tableCreates.sql and indexCreates.sql from the sql.common folder at the same level into sql.cdb. Then move the primary keys from indexCreates.sql into the table creation statements in tableCreates.sql. For the syntax for declaring keys and indexes inside CREATE TABLE, refer to the CockroachDB documentation.
Second, CockroachDB is a “strongly typed” database. This is my own way of describing it. It has a rather peculiar behavior: when you add values of different data types (e.g., int + float), it reports an error saying, “InternalError: unsupported binary operator: + ”. Most databases do not behave like this; they perform implicit conversions and are, in other words, very tolerant of whoever writes the SQL. CockroachDB is unusual in that it reports an error if you do not specify the type. This greatly reduces the burden of type inference in its internal implementation.

This behavior prevents Benchmarksql from running the tests properly. The solution is to add the required type at the position where the error occurs. For example, change update t set i = i + ?; (the ? is generally filled in using prepare/execute) to update t set i = i + ?::DECIMAL;. In CockroachDB, a type is specified explicitly by appending :: followed by the type name. Strangely, though, not every addition requires a type to be specified.
Third, CockroachDB does not support SELECT FOR UPDATE (as of this writing). This is the easiest to solve: comment out all FOR UPDATE clauses in Benchmarksql. CockroachDB itself supports the serializable isolation level, so lacking FOR UPDATE does not affect consistency.
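
The first and third changes are mechanical text edits, so they can be scripted. The following Python sketch only illustrates the idea under assumptions: strip_for_update and inline_primary_key are hypothetical helpers, the table demo_table is made up, and the real Benchmarksql files may need more careful handling. Note that it removes the FOR UPDATE clause rather than commenting it out.

```python
import re

def strip_for_update(sql):
    """Remove a FOR UPDATE clause from a SELECT statement."""
    return re.sub(r"\s+FOR\s+UPDATE\b", "", sql, flags=re.IGNORECASE)

def inline_primary_key(create_stmt, pk_columns):
    """Insert a PRIMARY KEY clause before the final ')' of a CREATE TABLE."""
    head, sep, tail = create_stmt.rpartition(")")
    if not sep:
        raise ValueError("no closing parenthesis found")
    clause = ",\n  primary key (" + ", ".join(pk_columns) + ")\n"
    return head.rstrip() + clause + sep + tail

# Hypothetical statements, in the spirit of tableCreates.sql and the workload SQL.
create_stmt = "create table demo_table (\n  id integer not null,\n  v decimal(4,2)\n);"
select_stmt = "SELECT v FROM demo_table WHERE id = ? FOR UPDATE"

print(inline_primary_key(create_stmt, ["id"]))
print(strip_for_update(select_stmt))
```

The type casts from the second pitfall are harder to automate, because only some expressions need them; adding them by hand where the error appears is probably simpler.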

Starting the Test

After overcoming the pitfalls mentioned above, you can proceed with the normal testing process: creating the database, creating tables and indexes, importing data, and testing. You can refer to Benchmarksql’s HOW-TO-RUN.txt.

Test Results

On my test machine with 40 cores, 128 GB of memory, and SSD, under 100 warehouses, the tpmC is approximately 5,000. This is about one-tenth of PostgreSQL 10 on the same machine. PostgreSQL can reach around 500,000 tpmC.

How to Test CockroachDB Performance Using Sysbench

Mon, 11 Jun 2018 13:50:00 +0800

Compiling Sysbench with pgsql Support

CockroachDB uses the PostgreSQL protocol. If you want to use Sysbench for testing, you need to enable pg protocol support in Sysbench. Sysbench already supports the pg protocol, but it is not enabled by default during compilation. You can configure it with the following command:

./configure --with-pgsql

Before that, of course, you need to download the Sysbench source code and install the PostgreSQL header files required for compilation (for example, through a package manager such as yum).

Testing

The testing method is no different from testing MySQL or PostgreSQL; you can test any of the create, read, update, delete (CRUD) operations you like. The only thing to note is to set auto_inc to off.

This is because CockroachDB’s auto-increment behavior is different from PostgreSQL’s. It generates a unique id, but it does not guarantee that the ids are sequential or incremental. This is fine when inserting data. However, during delete, update, or query operations, since all SQL statements use id as the condition for these operations, you may encounter situations where data cannot be found.

That is:

When auto_inc = on (which is the default value in Sysbench)

Table Structure

CREATE TABLE sbtest1 (
   id INT NOT NULL DEFAULT unique_rowid(),
   k INTEGER NOT NULL DEFAULT 0:::INT,
   c STRING(120) NOT NULL DEFAULT '':::STRING,
   pad STRING(60) NOT NULL DEFAULT '':::STRING,
   CONSTRAINT "primary" PRIMARY KEY (id ASC),
   INDEX k_1 (k ASC),
   FAMILY "primary" (id, k, c, pad)
)

Data

root@:26257/sbtest> SELECT id FROM sbtest1 ORDER BY id LIMIT 1;
+--------------------+
|         id         |
+--------------------+
| 354033003848892419 |
+--------------------+

As you can see, the data does not start from 1, nor is it sequential. Normally, the id in a Sysbench table should be within the range [1, table_size].

SQL

UPDATE sbtest%u SET k = k + 1 WHERE id = ?

Taking the UPDATE statement as an example, id is used as the query condition. Sysbench assumes that this id should be between [1, table_size], but in reality, it’s not.
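
The effect of this mismatch can be illustrated with a small simulation. This is a Python sketch, not a model of CockroachDB: the scattered ids start from the value shown in the query output above, and the spacing between them is a made-up assumption standing in for unique_rowid().

```python
import random

def hit_rate(ids, table_size, probes=10_000, seed=0):
    """Fraction of random lookups in [1, table_size] that find an existing id."""
    rng = random.Random(seed)
    existing = set(ids)
    hits = sum(1 for _ in range(probes) if rng.randint(1, table_size) in existing)
    return hits / probes

table_size = 1000

# What Sysbench expects: ids 1..table_size.
sequential_ids = range(1, table_size + 1)

# Stand-in for unique_rowid(): large values with gaps (spacing is hypothetical).
base = 354033003848892419
scattered_ids = [base + i * 32768 for i in range(table_size)]

print(hit_rate(sequential_ids, table_size))  # prints 1.0
print(hit_rate(scattered_ids, table_size))   # prints 0.0
```

With sequential ids every lookup finds a row, while with the scattered ids none does, so statements such as the UPDATE above would touch no data at all.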

Example of Correct Testing Command Line

sysbench --db-driver=pgsql --pgsql-host=127.0.0.1 --pgsql-port=26257 --pgsql-user=root --pgsql-db=sbtest \
        --time=180 --threads=50 --report-interval=10 --tables=32 --table-size=10000000 \
        oltp_update_index \
        --sum_ranges=50 --distinct_ranges=50 --range_size=100 --simple_ranges=100 --order_ranges=100 \
        --index_updates=100 --non_index_updates=10 --auto_inc=off prepare/run/cleanup

INSERT Testing

Let’s discuss the INSERT test separately. The INSERT test refers to Sysbench’s oltp_insert. Its distinguishing feature is that when auto_inc is on, the prepare phase inserts data; otherwise, prepare only creates the tables and inserts nothing. The reason is that with auto_inc on, rows inserted during the run phase cannot conflict with the prepared rows, because the auto-increment column guarantees distinct ids. With auto_inc off, the ids inserted during the run phase are assigned randomly, so pre-populated rows could collide with them. Random ids also match some real testing scenarios.

For CockroachDB, when you test INSERT with auto_inc set to off, you can watch the monitoring metrics during the run phase (by connecting to CockroachDB’s HTTP port) under the “Distribution” section in “KV Transactions”. You will notice a large number of “Fast-path Committed” transactions. This indicates that the transactions are committed using one-phase commit (1PC). That is, the data involved in each transaction does not span CockroachDB nodes, so there is no need for two-phase commit to ensure consistency. This is an optimization in CockroachDB that is very effective in INSERT tests and can deliver excellent performance.

With auto_inc on, the results of other tests that read before they write may be inflated on CockroachDB, since their lookups by id may find no rows, but the INSERT test is still fair. If time permits, you can run it with both settings and compare the results.

How to View CMU DB Group's OLTP-Bench

Fri, 23 Feb 2018 00:00:00 +0000

Introduction to OLTP-Bench

OLTP-Bench is an open-source benchmarking tool platform for OLTP scenarios from CMU’s DB Group. It was designed to provide a simple, easy-to-use, and extensible testing platform.

It connects to databases via the JDBC interface, supporting the following test suites:

TPC-C
Wikipedia
Synthetic Resource Stresser
Twitter
Epinions.com
TATP
AuctionMark
SEATS
YCSB
JPAB (Hibernate)
CH-benCHmark
Voter (Japanese “American Idol”)
SIBench (Snapshot Isolation)
SmallBank
LinkBench

Detailed information is available on the project page and on the project’s GitHub page.

The project introduction page includes three papers published by the authors, with the one from 2013 being the most important, also linked on the GitHub page.

Judging from the GitHub page, the project does not seem to attract much attention and has not been very active recently. Most issues and pull requests come from within CMU.

OLTP-Bench: An Extensible Testbed for Benchmarking Relational Databases

The paper “OLTP-Bench: An Extensible Testbed for Benchmarking Relational Databases” can be regarded as the most detailed introduction to this project.

In the first and second chapters, the authors introduce the motivation for creating this framework, which is to integrate multiple test sets and provide features that simple benchmarking tools do not have, while offering excellent extensibility to attract developers to support more databases.

From the activity on GitHub, it is evident that this extensibility is used more for adding database support than for adding test suites. However, the number of supported test suites is already quite extensive.

Chapter three introduces the architectural design, with a focus on test suite management, load generators, SQL syntax conversion, multi-client scenarios (similar to multiple sysbench instances stressing a single MySQL), and result collection.

Chapter four discusses the supported test suites. I’m only familiar with TPC-C and YCSB. The authors classify them into three categories:

Transaction-focused, such as TPC-C and SmallBank
Internet applications, like LinkBench and Wikipedia
Specialized tests, such as YCSB and SIBench

Further details can be seen in the table: [table]

Chapter five describes the demo deployment environment, with subsequent sections introducing the demo’s features.

Chapter six uses the demo from the previous chapter to introduce features, analyzed as follows:

Rate control. It seems odd for a benchmarking tool to perform rate control, as the conventional understanding is to push performance as high as possible to gauge system limits. The paper provides an example using the Wikipedia test suite, increasing by 25 TPS every 10 seconds to observe database latency changes.
Tagging different transactions in the same test suite for separate statistics. Using TPC-C as an example, the paper reports statistics separately for transactions from different stages.
Modifying load content, like switching from read-only to write-only loads.
Changing the method for load randomness.
Monitoring server status alongside database monitoring by deploying an OLTP-Bench monitor on the server.
Running multiple test suites simultaneously, such as running TPC-C and YCSB concurrently.
Multi-client usage, mentioned in chapter three.
Repeatability. To prove OLTP-Bench results are genuine and reliable, the authors tested PG’s SSI performance using SIBench from the tool on similarly configured machines, achieving results consistent with those in PG’s SSI paper.
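
The rate control item in the list above can be pictured as a stepped ramp of the target rate. The following Python sketch is not OLTP-Bench code; it only illustrates a schedule that rises by 25 TPS every 10 seconds, as in the Wikipedia example. The starting rate of 25 TPS and the function names are assumptions.

```python
def target_rate(elapsed_s, start_tps=25, step_tps=25, step_every_s=10):
    """Target transactions per second after elapsed_s seconds of a stepped ramp."""
    return start_tps + step_tps * (int(elapsed_s) // step_every_s)

def inter_arrival_s(rate_tps):
    """Delay between request submissions needed to hold a fixed rate."""
    return 1.0 / rate_tps

# Print the schedule for the first 40 seconds.
for t in (0, 10, 20, 30):
    rate = target_rate(t)
    print(t, rate, inter_arrival_s(rate))
```

A load generator that follows such a schedule submits requests at the target rate instead of as fast as possible, which is what makes it possible to plot latency against a known offered load.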

In summary, rate control and transaction tagging stand out as novel features, while the rest are not particularly special.

Chapter seven is arguably the most valuable part of the paper. It discusses cloud environments, where users may only have access to the database and no control over the server. Users can struggle to assess the cost-effectiveness of different cloud database services or configurations, because the charges cover CPU, storage, network, and, in some architectures, asynchronous sync. Using a benchmarking tool to measure performance and then calculate cost-effectiveness is therefore particularly worthwhile. The chapter makes comparisons from several angles (different service providers, different configurations, and different databases on the same configuration) and presents the cost-effectiveness results.
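
As a rough illustration of the cost-effectiveness calculation, measured throughput can be divided by price. The Python sketch below uses invented offerings and prices; it is not data from the paper, and transactions_per_dollar is a hypothetical helper.

```python
def transactions_per_dollar(tps, hourly_cost_usd):
    """Transactions completed per dollar, given sustained TPS and an hourly price."""
    if hourly_cost_usd <= 0:
        raise ValueError("hourly cost must be positive")
    return tps * 3600 / hourly_cost_usd

# Hypothetical offerings: (name, measured TPS, price in USD per hour).
offerings = [
    ("small-instance", 400.0, 0.20),
    ("large-instance", 1500.0, 1.00),
]

# Rank offerings by how much work each dollar buys.
ranked = sorted(
    offerings,
    key=lambda o: transactions_per_dollar(o[1], o[2]),
    reverse=True,
)
for name, tps, cost in ranked:
    print(name, round(transactions_per_dollar(tps, cost)))
```

In this made-up case the smaller instance is slower in absolute terms but delivers more transactions per dollar, which is the kind of trade-off that raw throughput alone does not show.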

In chapter eight, the authors compare OLTP-Bench with other similar tools, providing a favorable self-assessment.

Chapter nine outlines the authors’ future plans, including support for pure NoSQL, additional databases’ proprietary SQL syntax, generating real-world load distributions from production data, and support for stored procedures.

In conclusion, as the authors mentioned, this is an integrative framework where ease of use and extensibility are key.

Usage Summary

OLTP-Bench is relatively simple to install and use, and deployment is especially easy. Because it is cross-platform, it offers a better user experience than traditional tools such as tpcc and sysbench. It is also straightforward to use: it ships with plenty of test configuration templates, so a test can be started after a few simple changes to a configuration file. The test results are stable, although I have not yet explored some features mentioned in the paper, such as server status monitoring.

I tested all 15 test suites on MySQL 5.7 and TiDB, obtaining the following results: [table]

Its usability is quite evident. As for the ease of secondary development, it should be relatively simple, considering the entire OLTP-Bench project is not particularly large, with around 40,000 lines of code.

Other

TPC-H: the framework’s code appears to support TPC-H, but it proved unusable in practice, probably because the implementation is incomplete, which would explain why it is not listed in the README.
Future work: of the plans listed in chapter nine of the paper, “generating load to match production data distribution” in particular remains unimplemented, judging from the codebase.