Drizzle / dbqp updates

Just wanted to blog about some of the latest updates to dbqp.  We just merged some interesting changes into Drizzle (just in time for the impending Fremont beta).  In additional to general code cleanup / reorganization, we have the following goodies:

Randgen in the Drizzle tree

One of the biggest things is that the random query generator (aka randgen) is now part of the Drizzle tree.  While I did some of the work here, the major drivers of this happening were Brian and Stewart:

  1. Brian makes a fair argument that the easier / more convenient it is to run a test, the greater the likelihood of it being run.  Additional tools to install, etc = not so much.  Having something right there and ready to go = win!
  2. Stewart is also a fan of convenience, lotsa testing, and working smarter, not harder.  As a result, he did the initial legwork on merging the randgen.  I do suspect there is still much for me to learn about properly bzr joining trees and whatnot, but we’ll get it right soon enough ; )

This doesn’t mean we won’t be contributing any changes we make back to the main randgen project / branch, it is strictly to facilitate more testing for Drizzle.  As we already have our randgen tests packaged into dbqp-runnable suites, running these tests is even easier : )


Another request fulfilled in this update is the ability to use Stewart’s libeatmydata to speed up testing.  By default, dbqp uses shared memory as a workdir, similar to mysql-test-run’s –mem option (this can be bypassed in dbqp with –no-shm, fyi).  However, this isn’t always perfect or desirable to do.

An alternative is to use libeatmydata, which disables fsync() calls.  As the name implies, you don’t want to use it if care about your data, but for general testing purposes, it can greatly speed up test execution.

If you have the library installed / on your machine, you can use it like so:  ./dbqp –libeatmydata [–libeatmydata-path ] …

By default, libeatmydata-path is /usr/local/lib/libeatmydata.so (as if you used make install)

Multiple server types

IMHO, this is one of the coolest new tricks.  dbqp can now handle more than just Drizzle servers / source!  The ultimate idea is to allow tests that utilize more than one type / version of a server to have more interesting tests : )  This will be useful for scenarios like testing Drizzledump migration as we can feed in one (or more) MySQL servers and a Drizzle tree and make sure we can migrate data from all of them.

We also intend to utilize dbqp for testing a variety of Percona products, and it is kind of handy to be able to run the code you are testing ; )  I already have the tool running Percona / MySQL servers and have some randgen tests working:

$ ./dbqp.py --default_server_type=mysql --basedir=/percona-server/Percona-Server --mode=randgen
Setting --no-secure-file-priv=True for randgen usage...
20111013-163443 INFO Linking workdir /dbqp/workdir to /dev/shm/dbqp_workdir_pcrews_9dbc7e8a-2872-45a9-8a07-f347f6184246
20111013-163443 INFO Using mysql source tree:
20111013-163443 INFO basedir: /percona-server/Percona-Server
20111013-163443 INFO clientbindir: /percona-server/Percona-Server/client
20111013-163443 INFO testdir: /dbqp
20111013-163443 INFO server_version: 5.5.16-rel21.0
20111013-163443 INFO server_compile_os: Linux
20111013-163443 INFO server_platform: x86_64
20111013-163443 INFO server_comment: (Percona Server with XtraDB (GPL), Release rel21.0, Revision 188)
20111013-163443 INFO Using default-storage-engine: innodb
20111013-163443 INFO Using testing mode: randgen
20111013-163443 INFO Processing test suites...
20111013-163443 INFO Found 5 test(s) for execution
20111013-163443 INFO Creating 1 bot(s)
20111013-163449 INFO Taking clean db snapshot...
20111013-163452 INFO bot0 server:
20111013-163452 INFO NAME: s0
20111013-163452 INFO MASTER_PORT: 9307
20111013-163452 INFO SOCKET_FILE: /dbqp/workdir/bot0/s0/var/s0.sock
20111013-163452 INFO VARDIR: /dbqp/workdir/bot0/s0/var
20111013-163452 INFO STATUS: 1
20111013-163506 ===============================================================
20111013-163506 TEST NAME [ RESULT ] TIME (ms)
20111013-163506 ===============================================================
20111013-163506 main.blob [ pass ] 8624
20111013-163516 main.create_drop [ pass ] 2862
20111013-163524 main.many_indexes [ pass ] 1429
20111013-163547 main.optimizer_subquery [ pass ] 17153
20111013-163558 main.outer_join [ pass ] 4243
20111013-163558 ===============================================================
20111013-163558 INFO Test execution complete in 69 seconds
20111013-163558 INFO Summary report:
20111013-163558 INFO Executed 5/5 test cases, 100.00 percent
20111013-163558 INFO STATUS: PASS, 5/5 test cases, 100.00 percent executed
20111013-163558 INFO Spent 34 / 69 seconds on: TEST(s)
20111013-163558 INFO Test execution complete
20111013-163558 INFO Stopping all running servers...

Expect to see this up and running tests against Percona Server in the next week or so.  I’ll be writing more about this soon.

Native / unittest mode

This hasn’t made it into the Drizzle tree yet.  To ease merging the code with Percona Server / Xtrabackup, I’ve created a separate launchpad project.  One of the things we needed was the ability to write complex tests directly.  It is currently easy to plug new tools into dbqp, but we essentially needed a new tool for certain testing needs.

Our solution for this was to allow dbqp to run python unittest modules.  We still have a bit of work to do before we have some demo tests ready, but we will be creating some expanded Xtrabackup tests using this system very soon.  So far, it is turning out to be pretty neat:

./dbqp.py --default_server_type=mysql --basedir=/percona-server/Percona-Server --mode=native
20111013-190744 INFO Killing pid 1747 from /dbqp/workdir/bot0/s0/var/run/s0.pid
20111013-190744 INFO Linking workdir /dbqp/workdir to /dev/shm/dbqp_workdir_pcrews_9dbc7e8a-2872-45a9-8a07-f347f6184246
20111013-190744 INFO Using mysql source tree:
20111013-190744 INFO basedir: /percona-server/Percona-Server
20111013-190744 INFO clientbindir: /percona-server/Percona-Server/client
20111013-190744 INFO testdir: /dbqp
20111013-190744 INFO server_version: 5.5.16-rel21.0
20111013-190744 INFO server_compile_os: Linux
20111013-190744 INFO server_platform: x86_64
20111013-190744 INFO server_comment: (Percona Server with XtraDB (GPL), Release rel21.0, Revision 188)
20111013-190744 INFO Using default-storage-engine: innodb
20111013-190744 INFO Using testing mode: native
20111013-190744 INFO Processing test suites...
20111013-190744 INFO Found 1 test(s) for execution
20111013-190744 INFO Creating 1 bot(s)
20111013-190749 INFO Taking clean db snapshot...
20111013-190750 INFO bot0 server:
20111013-190750 INFO NAME: s0
20111013-190750 INFO MASTER_PORT: 9306
20111013-190750 INFO SOCKET_FILE: /dbqp/workdir/bot0/s0/var/s0.sock
20111013-190750 INFO VARDIR: /dbqp/workdir/bot0/s0/var
20111013-190750 INFO STATUS: 1
20111013-190756 ===============================================================
20111013-190756 TEST NAME [ RESULT ] TIME (ms)
20111013-190756 ===============================================================
20111013-190756 main.example_test [ pass ] 1
20111013-190756 test_choice (example_test.TestSequenceFunctions) ... ok
20111013-190756 test_sample (example_test.TestSequenceFunctions) ... ok
20111013-190756 test_shuffle (example_test.TestSequenceFunctions) ... ok
20111013-190756 ----------------------------------------------------------------------
20111013-190756 Ran 3 tests in 0.000s
20111013-190756 OK
20111013-190756 ===============================================================
20111013-190756 INFO Test execution complete in 6 seconds
20111013-190756 INFO Summary report:
20111013-190756 INFO Executed 1/1 test cases, 100.00 percent
20111013-190756 INFO STATUS: PASS, 1/1 test cases, 100.00 percent executed
20111013-190756 INFO Spent 0 / 6 seconds on: TEST(s)
20111013-190756 INFO Test execution complete
20111013-190756 INFO Stopping all running servers...

This really only scratches the surface of what can happen, but I’ll be writing more in-depth articles on what kind of tricks we can pull off as the code gets more polished.

Three non-testing bits:

1)  Percona Live London is just around the corner and members of the Drizzle team will be there.

2)  We are *this* close to Fremont beta being ready.  The contributions and feedback have been most welcome.  Any additional testing / etc are most appreciated.

3)  Drizzle is now part of the SPI!


Drizzle multi-master testing!

So, it has been a while since I’ve blogged.  As some of you may have read, I have a new job and Stewart and I have been busy planning all kinds of testing goodness for Percona >: ) (I’ve also been recovering from trying to keep up with Stewart!)

Rest assured, gentle readers, that I have not forgotten everyone’s favorite modular, community-driven database ; )  Not by a long-shot.  I have some major improvements to dbqp getting ready for a merge (think randgen in-tree / additional testing modes / multiple basedirs of multiple types).  Additionally, I’ve been cooking up some code to test the mighty Mr. Shrews’ multi-master code (mwa ha ha!)

What I’ve done is allow for a new option to be used with a test’s .cnf file (this is a dbqp thing, won’t work with standard drizzle-test-run).  If the runner sees this request, it will generate a multi-master config file from the specified servers’ individual slave.cnf files. 

Here is a sample config:

servers = [[--innodb.replication-log],[--innodb.replication-log],[--plugin-add=slave --slave.config-file=$MASTER_SERVER_SLAVE_CONFIG]]

# we tell the system that we want
# to generate a multi-master cnf file
# for the 3rd server to use, that
# has the first two servers as masters
# the final file is written to the first
# server's general slave.cnf file
gen_multi_master_cnf= 0,1

A good rundown of the file’s contents can be found on Shrews’ blog here, but the end result looks like this:




I tried cooking up a basic test case where we spin up 3 servers – 2 masters and one slave.  One master 1, we create table t1:

CREATE TABLE t1 (a int not null auto_increment, primary key(a));

On master 2, table t2:

CREATE TABLE t2 (a int not null auto_increment, primary key(a));

We insert some records into both tables, then check that our slave has everything! Sounds simple, right?

Sigh. If only. It seems that we are running into some issues when we try to record the test – you can read the bug here

We see some interesting output in the slave’s logs before it crashes:

$ cat workdir/bot0/s2/var/log/s2.err
InnoDB: Doublewrite buffer not found: creating new
InnoDB: Doublewrite buffer created
InnoDB: 127 rollback segment(s) active.
InnoDB: Creating foreign key constraint system tables
InnoDB: Foreign key constraint system tables created
(SQLSTATE 00000) Duplicate entry '772-1' for key 'PRIMARY'
Failure while executing:
INSERT INTO `sys_replication`.`queue` (`master_id`, `trx_id`, `seg_id`, `commit_order`, `originating_server_uuid`, `originating_commit_id`, `msg`) VALUES (2, 772, 1, 1, 'ac9c8ac0-8f10-474b-9bbd-b61d2cdb2b93' , 1, 'transaction_context {
server_id: 1
transaction_id: 772
start_timestamp: 1317760732106016
end_timestamp: 1317760732106017
event {
segment_id: 1
end_segment: true

Replication slave: Unable to insert into queue.
Replication slave: drizzle_state_read:lost connection to server (EOF)
Lost connection to master. Reconnecting.
Replication slave: drizzle_state_connect:could not connect
111004 16:39:05 InnoDB: Starting shutdown...

Additionally, you can just try the setup with –start-and-exit:

$ ./dbqp --suite=slave --start-and-exit multi_master_basic
20111004-170033 INFO Using Drizzle source tree:

20111004-170033 INFO Taking clean db snapshot...
20111004-170033 INFO Taking clean db snapshot...
20111004-170033 INFO Taking clean db snapshot...
20111004-170035 INFO bot0 server:
20111004-170035 INFO NAME: s0
20111004-170035 INFO MASTER_PORT: 9306
20111004-170035 INFO DRIZZLE_TCP_PORT: 9307
20111004-170035 INFO MC_PORT: 9308
20111004-170035 INFO PBMS_PORT: 9309
20111004-170035 INFO RABBITMQ_NODE_PORT: 9310
20111004-170035 INFO VARDIR: /drizzle_mm_test/tests/workdir/bot0/s0/var
20111004-170035 INFO STATUS: 1
20111004-170035 INFO bot0 server:
20111004-170035 INFO NAME: s1
20111004-170035 INFO MASTER_PORT: 9312
20111004-170035 INFO DRIZZLE_TCP_PORT: 9313
20111004-170035 INFO MC_PORT: 9314
20111004-170035 INFO PBMS_PORT: 9315
20111004-170035 INFO RABBITMQ_NODE_PORT: 9316
20111004-170035 INFO VARDIR: /drizzle_mm_test/tests/workdir/bot0/s1/var
20111004-170035 INFO STATUS: 1
20111004-170035 INFO bot0 server:
20111004-170035 INFO NAME: s2
20111004-170035 INFO MASTER_PORT: 9318
20111004-170035 INFO DRIZZLE_TCP_PORT: 9319
20111004-170035 INFO MC_PORT: 9320
20111004-170035 INFO PBMS_PORT: 9321
20111004-170035 INFO RABBITMQ_NODE_PORT: 9322
20111004-170035 INFO VARDIR: /drizzle_mm_test/tests/workdir/bot0/s2/var
20111004-170035 INFO STATUS: 1
20111004-170035 INFO User specified --start-and-exit. dbqp.py exiting and leaving servers running...
pcrews@mister:/drizzle_mm_test/tests$ ps -al
0 S 1000 18652 1 2 80 0 - 112094 poll_s pts/2 00:00:00 lt-drizzled
0 S 1000 18688 1 3 80 0 - 112096 poll_s pts/2 00:00:00 lt-drizzled
0 S 1000 18721 1 3 80 0 - 156326 poll_s pts/2 00:00:00 lt-drizzled
0 R 1000 18780 15985 0 80 0 - 3375 - pts/2 00:00:00 ps
0 S 1000 32463 30047 0 80 0 - 11272 poll_s pts/1 00:00:01 ssh

From here, we can connect to the slave and check out sys_replication.applier_state:

$ drizzle -uroot -p9318 test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the Drizzle client.. Commands end with ; or \g.
Your Drizzle connection id is 216
Connection protocol: mysql
Server version: 2011.09.26.2427 Source distribution (drizzle_mm_test)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

drizzle> use sys_replication;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Schema changed
drizzle> show tables;
| Tables_in_sys_replication |
| applier_state |
| io_state |
| queue |
3 rows in set (0.001641 sec)

drizzle> select * from applier_state;
| master_id | last_applied_commit_id | originating_server_uuid | originating_commit_id | status | error_msg |
| 1 | 0 | f716781f-8c00-4b81-82c6-62039136d616 | 0 | RUNNING | |
| 2 | 3 | df7f2f6e-dba4-43ea-a674-fa4a3709865b | 3 | RUNNING | |
2 rows in set (0.000928 sec)

drizzle> select * from io_state;
| master_id | status | error_msg |
| 1 | STOPPED | |
| 2 | RUNNING | |
2 rows in set (0.000839 sec)


So, it looks like the slave knows about both masters, but for some reason, the applier from master 1 is stopped : (
At any rate, there is a bug open on this and it could be something in my config(?) It’s been a while since I’ve played with replication and I know there has been some tinkering under the hood since then : )

The branch with the test code can be found here:

At the very least, we can now create tests that use this feature, which will help ensure that it stays on the path of solid code in the future! How about anyone out there? Has anyone been using multi-master? If so, can you share any setups / tests? Extra information would be most appreciated : )