Tools for running and automating distributed load tests with FunkLoad
The Bench Master 3000
The purpose of this package is to help you run huge, parallelised load tests using FunkLoad across multiple servers and/or processors. If you require only moderately grunty load tests you will probably be better off using FunkLoad on its own.
This document may still provide some helpful tips and insights, even if you end up using “plain” FunkLoad, but please read the FunkLoad documentation as well. It is a very powerful and flexible tool. You certainly need to be familiar with its various scripts and services to get the most out of Bench Master.
Theory of operation
Bench Master facilitates the deployment of load tests to a number of servers (or a number of aliases for a single server, thus allowing better use of multi-core or multi-processor systems), each with its own load test runner. When all load test runners have completed their benchmarks, the results are collated back to the master, which produces a consolidated report.
There is exactly one “bench master”, and one or more “bench nodes”. The master communicates with the nodes over SSH, using the pssh library to parallelise operations. The nodes can be different physical machines, although it is possible to use the same machine more than once (in which case multiple concurrent SSH connections will be made). It is also possible for the same machine - even the same installation of Bench Master - to serve as master and one or more nodes. (In this case, the master will in effect establish one or more SSH connections to localhost.)
Load tests are created/recorded using FunkLoad as normal. This will result in two files, which must be kept in the same directory:
<Testname>.conf, which is used to configure the load test runner, the FunkLoad monitoring server, etc.
test_<Testname>.py, which contains the actual test as a subclass of FunkLoadTestCase, with a single test method test_<testName>.
There is one extension available when using Bench Master: You can get the node number (an integer greater than or equal to 0, represented in the config as a string) by doing:
node_num = self.conf_get('bench', 'node_num', '0')
This is useful for disambiguating the nodes: Since all nodes execute the test run in parallel, you need to be careful that two tests don’t lay claim to the same resource (say, a login name) at the same time.
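For example, a minimal sketch of how this might be used inside the recorded test method (the login URL and user naming scheme are hypothetical, and self.server_url is assumed to be set in setUp() as fl-record generates it):

def test_TestName(self):
    # Node number injected by Bench Master; defaults to '0' under plain FunkLoad.
    node_num = self.conf_get('bench', 'node_num', '0')
    # Hypothetical scheme: give each node its own account so that parallel
    # nodes never log in with the same credentials.
    login = 'loadtest-user-%s' % node_num
    self.get('%s/login?user=%s' % (self.server_url, login),
             description='Log in as a per-node user')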
Note that although it is possible to put FunkLoad tests in a package, FunkLoad only really considers modules. It is perfectly feasible to dump your test and .conf file pair into any directory, even without an __init__.py.
Pre-requisites
First, some requirements:
Bench Master only works on Unix-type operating systems.
You need at least FunkLoad 1.12 and PSSH 2.1.1 installed. These will be installed as dependencies of the benchmaster package.
You must have ssh and scp installed on the bench master.
You must also have gnuplot installed on the bench master, so that it can generate its reports. It is normally best to install this using your operating system’s package manager.
All bench nodes must have the SSH daemon running, accepting connections from the bench master.
Hint: The SSH daemon on your bench node(s) may accept only a limited number of inbound connections. If you are using many “virtual” nodes, you need to ensure that the later connections are not rejected. If you are using OpenSSH, set the MaxStartups and MaxSessions options in the sshd_config file to a sufficiently high number.
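For example, on an OpenSSH node you might raise both limits like this (the values are only illustrative; size them to the number of virtual nodes you plan to run):

# In /etc/ssh/sshd_config on each bench node
MaxStartups 50
MaxSessions 50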
The bench master must be able to SSH to the node without entering a password or being prompted to accept the remote host’s signature. In practice, that means that the user the bench master is running as must have its public key listed in the authorized_keys file of the user on the remote bench node(s), and that each remote node’s signature must have been recorded in the known_hosts file on the bench master.
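A typical way to set this up, assuming the bench user and the node names used later in this document, is to run the following as the bench user on the master:

$ ssh-keygen -t rsa                # generate a key pair, if one does not exist yet
$ ssh-copy-id bench@load-test-00   # repeat for each bench node
$ ssh bench@load-test-00 true      # log in once so the host key is recorded in known_hosts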
All bench nodes should be set up identically. In particular, the installation and working directories for the bench node runner should be at the same path on all hosts.
Optionally, you can also configure the SSH daemon on the bench node(s) to accept various environment variables that will be sent by the bench master. These are not required for normal operation, but can be useful in writing and debugging load tests. The variables are:
- PSSH_NODENUM
The node’s number. The first node will have number 0.
- PSSH_BENCH_RUN
The name of the current bench run. You can set this when invoking the bench master. It defaults to a unique name based on the current date and time and the test name.
- PSSH_BENCH_WORKING_DIR
The top level bench node working directory.
To use these, you need to add the following to the sshd_config file on each node:
AcceptEnv PSSH_*
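A load test can then read these variables for debugging purposes; a minimal sketch (the fallback values are only illustrative):

import os

# Only set when the node's sshd accepts them and the test is launched via
# Bench Master; fall back to sensible defaults otherwise.
node_num = os.environ.get('PSSH_NODENUM', '0')
bench_run = os.environ.get('PSSH_BENCH_RUN', 'local')
working_dir = os.environ.get('PSSH_BENCH_WORKING_DIR', '/tmp/bench-node')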
Installation
Bench Master can be installed using zc.buildout, easy_install or pip. It will install two console scripts, bench-master and bench-node, which control its execution.
Here is a buildout.cfg file that will create an isolated environment that can be used as a bench master and/or node:
[buildout]
parts = bench-tools
versions = versions

# Note: Change versions as required or remove the versions block to get
# the latest versions always
[versions]
benchmaster = 1.0b1
funkload = 1.12.0
pssh = 2.1.1

[bench-tools]
recipe = zc.recipe.egg
eggs =
    funkload
    benchmaster
Note: It is good practice to “pin” versions as shown above. However, you should check which versions are appropriate at the time you read this. Alternatively, you can skip the [versions] block entirely to get the latest versions of everything, but bear in mind that this could cause problems in the future if an incompatible version of some dependency is released to PyPI.
To keep things simple, you are recommended to create identical installations on all servers involved in the bench run. For example, you could put the buildout.cfg file above in /opt/bench, and then run:
$ buildout bootstrap --distribute  # or download a bootstrap.py and run that
$ bin/buildout
Hint: You should ensure that all bench node installations are owned by an appropriate user - usually the one that the bench master will use to log in remotely over SSH.
You should now have the scripts bench-master, bench-node and the various fl-* scripts from FunkLoad in the bin directory.
If you prefer to use pip or easy_install, you can just install the benchmaster egg. It is recommended to do so in a virtualenv, however:
$ virtualenv --no-site-packages bench
$ cd bench
$ bin/easy_install benchmaster
Recording tests
To record tests, you can use the fl-record script that’s installed with FunkLoad. To use it, you also need to install TCPWatch. The tcpwatch binary needs to be in the system path so that fl-record can find it. Alternatively, you can set the TCPWATCH environment variable to point to the binary itself.
For example, you can use a buildout like this:
[buildout]
parts = bench-tools
versions = versions

# Note: Change versions as required or remove the versions block to get
# the latest versions always
[versions]
benchmaster = 1.0b1
funkload = 1.12.0
pssh = 2.1.1
tcpwatch = 1.3.1

[bench-tools]
recipe = zc.recipe.egg:scripts
eggs =
    docutils
    funkload
    tcpwatch
    benchmaster
initialization =
    import os
    os.environ['TCPWATCH'] = "${buildout:bin-directory}/tcpwatch"
Once TCPWatch is installed, you can start the recorder with:
$ bin/fl-record TestName
Note: You should use “CamelCase” for the test name. This ensures the generated code follows the conventions expected by the bench-master program. If you use another naming convention, you may have to specify the test name as an optional command line argument to the bench-master command - see below.
Now open a web browser, and configure it to use an HTTP proxy of 127.0.0.1 port 8090 (see the fl-record output and documentation for details).
At this point, you can visit the website you want to test and perform the actions you want to load test. It helps to plan your steps in advance, since “stray” clicks or reloads can make the test less repeatable or useful.
When you’re done, go back to the terminal running fl-record and press Ctrl+C. It should generate two files: test_TestName.py and TestName.conf.
You may want to move these out of the current directory, e.g.:
$ mkdir bench-tests
$ mv test_* bench-tests/
$ mv *.conf bench-tests/
At this point, you should edit the two generated files to make sure they will be valid and useful as load tests. The .conf file contains comments where you may want to fill in details, such as a more descriptive name.
You also need to consider what will happen if your test case is run over and over, in parallel, from multiple bench nodes. If it is simply fetching a few pages, you will probably be fine with the recorded tests. However, if you are doing anything user-specific (such as logging in or submitting details) you need to ensure that the parallel tests will not interfere with one another.
To disambiguate between tests, you can use the node_num configuration variable. You can get this in a FunkLoadTestCase with:
node_num = self.conf_get('bench', 'node_num', '0')
Note also that FunkLoad comes with a credentials server (see the fl-credential-ctl script), which you can use to maintain a single database of credentials across multiple threads and multiple test runs.
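A rough sketch of how a test might use the credential server (the [credential] section, port and group name are assumptions modelled on the FunkLoad demo tests; see the fl-credential-ctl documentation for the exact setup):

from funkload.utils import xmlrpc_get_credential

# Inside a FunkLoadTestCase method, assuming the test .conf file has a
# [credential] section pointing at a running credential server.
host = self.conf_get('credential', 'host')
port = self.conf_getInt('credential', 'port')
# 'members' is a hypothetical group defined in the credential files.
login, password = xmlrpc_get_credential(host, port, 'members')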
Executing tests
You can use the fl-run-test script to execute all the tests in a given module. This is not so useful for load testing, but very useful when writing and debugging tests. It can also serve as a simple functional testing tool:
$ bin/fl-run-test test_Testname.py
This runs the test in the given module. Note that the test and the corresponding .conf file need to be in the current directory. If you’ve moved the tests to a dedicated directory, e.g. bench-tests/ as suggested above, you can do:
$ cd bench-tests
$ ../bin/fl-run-test test_Testname.py
See the FunkLoad documentation for more details.
To run a simple bench test on one node, you can use the fl-run-bench script. (This is ultimately what Bench Master does on each node):
$ bin/fl-run-bench test_Testname.py Testname.test_TestName
Note that we have to specify the test case class (Testname) and method (test_TestName), since the bench runner will only run a single test, even if you’ve put more than one test in the module.
At this point, it’s useful to start understanding the parameters that control the bench runner.
- Cycles
The bench runner will run multiple cycles of the given test. Each cycle is run at a particular level of concurrency, which translates to a number of threads being used in that cycle. This is effectively a simulation of having multiple concurrent users executing the load test scenario.
You can specify cycles with the -c command line argument. For example, -c 10:20:30 will execute three cycles with 10, 20 and 30 threads, respectively. The default cycles are configured in the .conf file for the load test.
To get useful load tests, you typically want to run multiple cycles with a fixed step increase. This makes the outputs from the test easier to compare.
- Duration
The bench runner will run each cycle for a fixed duration, given in seconds. Again, the default is found in the .conf file, but can be overridden with the -D command line argument, e.g. -D 300 to run each cycle for 5 minutes.
Each thread in a given cycle will try to execute the test as many times as it can until the cycle comes to an end (i.e. duration seconds have passed). Incomplete tests are aborted, and not counted towards the load test results.
In addition, you can set a min and max sleep time between requests (the actual sleep time is a random value between the two), a sleep time between test executions, and a thread startup delay. These help simulate some degree of randomness and delays, and can sometimes be a useful way to allow the server to reach a “steady state” before the real tests kick in. See the output of the -h option to the fl-run-bench command, and the comments in the .conf file, as well as the FunkLoad documentation for details.
Hint: Experiment with different duration and concurrency settings. Since each thread will execute the test repeatedly within the test duration, you can increase the number of tests being executed by increasing either parameter. High concurrency with a low duration simulates short bursts of high load. Longer duration simulates sustained load. The latter is often more revealing, since a short duration may cause FunkLoad to omit tests that hadn’t completed within the duration, and which would in fact never have completed due to a gateway timeout on the server.
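For example, to override both settings from the command line and run three cycles of 10, 20 and 30 threads, each lasting five minutes:

$ bin/fl-run-bench -c 10:20:30 -D 300 test_Testname.py Testname.test_TestName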
Once you have tested that your bench test works in “plain” FunkLoad, you can deploy it using Bench Master.
To illustrate such a deployment, we’ll assume that:
The pre-requisites above are installed.
You have installed Bench Master on two machines, 192.168.0.10 and 192.168.0.11, using a buildout like the one suggested above, in /opt/bench. Hence, the binaries are in /opt/bench/bin/.
The bench master is running as the user bench. This user exists on both hosts, and has ownership over /opt/bench and all children of this directory.
An SSH key has been set up so that bench can SSH from the master to the node and from the master to itself (the master machine will be used as a bench node as well) without requiring a password.
Hint: Test this manually first, and ensure that the remote host is known, i.e. gets recorded in the known_hosts file.
The test is found in /opt/bench/bench-tests/test_Testname.py, with the corresponding Testname.conf in the same directory.
First, we need to create a nodes configuration file. This tells the bench master which nodes to deploy the tests to. We’ll call this nodes.pssh, and keep it in the top level /opt/bench directory. It should contain entries like:
load-test-00 bench
load-test-01 bench
load-test-02 bench
load-test-03 bench
The first token is the host name. The second is the username to use. Note that we are using “pseudo” host names here. This is because we want to use the same physical server more than once. These can be resolved to actual IP addresses in the /etc/hosts file, for example with:
192.168.0.10 load-test-00
192.168.0.11 load-test-01
192.168.0.10 load-test-02
192.168.0.11 load-test-03
Here, we have opted to run two bench nodes on each physical machine. This type of setup is appropriate for dual core/processor machines, where one Python process can make full use of each core.
We can now execute the test:
$ mkdir bench-output
$ bin/bench-master -s /opt/bench/bin/bench-node -w bench-output -c 10:20:30 -D 300 nodes.pssh bench-tests/test_Testname.py
In this command:
The -s option gives the path to the bench-node script on the remote bench node. (It does not need to exist on the master, unless the master is also being used as a node.) It can be omitted if bench-node is in the PATH for the given user on the remote server. This is used in an ssh invocation to execute the bench-node script on each host.
The -w option is the working directory for the bench master. This will eventually contain the final bench report, as well as information recorded during the bench run. The information specific to each bench run is collated under an intermediary directory with the name of the bench run. The default is to use the current date/time and the test name.
The -c option overrides the concurrency setting from the .conf file for the bench test on each node. Here, we will run three cycles, with 10, 20 and 30 concurrent threads on each node, respectively. This means that, with four nodes, the true concurrency is 40, 80, and 120 threads, respectively.
The -D option overrides the duration in a similar manner. Hence, in each cycle, each thread on each node will attempt to execute the test as many times as it can for 5 minutes (300 seconds). Since the nodes run in parallel, this means that the overall duration for each cycle is 5 minutes.
The two positional arguments give the path to the nodes file and the test module. In this case, it is OK to give a relative or absolute path - it does not have to be in the current directory.
In addition to these options, you can specify:
-n, to give a name to the bench run, which is used as the directory name under the bench master working directory. The default is to use a name built from the current date and time, plus the test name. Note that if you use the same name in more than one invocation, temporary files may be overwritten, but all final reports will be kept.
--num-nodes, to limit the number of nodes being used. By default, all nodes in the nodes file are used. If you want to use only the first 2 nodes, say, you can specify --num-nodes=2.
-x, to indicate the path to the working directory on each node. The default is to use /tmp/bench-node, which will be created if it does not exist.
-u, to override the base URL for the test, found in the test .conf file.
-v, to see more output.
Finally, you may need to give the test name (i.e. the name given to fl-record when recording the test) as a third positional argument, after the test module .py file. This is necessary if you did not use “CamelCase” (or all lower-case) when recording the test. For example, if the test method in a module test_FooBar.py is test_foo_bar(), you can use:
$ bin/bench-master -s /opt/bench/bin/bench-node -w bench-output -c 10:20:30 -D 300 nodes.pssh bench-tests/test_FooBar.py foo_bar
See the output of bin/bench-master -h for more details.
Observing execution
While the bench master is running, it will:
Create a workspace on each node and remotely copy into it the test, plus a .conf file tailored to each node. The parent directory for the workspace on each node can be set with the -x command line option. Inside this directory, a subdirectory is created for each labelled test run.
Note: You can set the label with the -n option. This is useful for grouping tests together. The default is to create a unique label based on the current date/time and the test name.
Inside each test run subdirectory, a subdirectory is created specifically for each node. This allows multiple “virtual” nodes (i.e. multiple node processes) in the same installation on the same physical machine.
Execute the bench-node script on each node. This in turn runs the FunkLoad bench test runner.
Monitor the standard output and standard error on each node. These can be found in the bench master’s workspace in the out/ and err/ directories, in text files corresponding to each node. You can tail these files to get near-realtime output from each node.
The err/ directory will help you understand if something has gone wrong on the remote node. The out/ directory is useful for checking on the progress of each node.
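For example, assuming the bench-output working directory and a run name as used elsewhere in this document, you can follow all node output live with:

$ tail -f bench-output/<run name>/out/*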
When the tests are running, you will see various progress indicators under three headings: “Starting threads”, “Logging for <duration> seconds”, and “Stopping threads”. For each, you may see a number of “.”, “E” or “F” characters. A “.” means success, an “E” an error, and an “F” a failure.
Failures in starting and stopping threads may indicate that FunkLoad can’t launch enough threads to get the concurrency indicated in the cycle. Whether this is a problem depends on what concurrency you really need to generate an effective load test. FunkLoad will continue with the test regardless.
The “Logging for <duration> seconds” line is the most important. This is the time when FunkLoad is actually executing tests. An error here indicates a problem in the test code itself, and should be rare. An “F” means that a page could not be fetched, or that a test assertion failed. In load test scenarios, this can mean that your server has stopped responding or is timing out. The report (see below) will give you more details about what types of failures and errors were reported.
Viewing the report
Once the remote nodes have all completed, the bench master will:
Collate FunkLoad’s output XML files from each node into the results/ subdirectory of the bench master workspace.
Merge these files into a single results file.
Build an HTML report from this result.
The final HTML report will appear in the report/ directory of the bench master workspace. If this is on a remote server, you may want to serve this from a web server capable of serving static HTML files, to make it easy to view reports.
For example, you could start a simple HTTP server on port 8000 with:
$ cd bench-output
$ python -m SimpleHTTPServer 8000
Of course, you could use Apache, nginx or any other web server as well, and you can use the -w option to the bench-master script to put all results in a directory already configured to be served as web content.
Hint: If you are running several related load tests, e.g. testing the effect of a series of planned changes to your infrastructure, it can be useful to group these reports together. If you run several bench runs with the same name, as given by the -n option, Bench Master will group all reports together under a single top-level directory (<bench name>/reports/*). Note that if you do this, the files in the out/ and err/ directories will be overwritten each time.
The first time you see a FunkLoad report, you may be a bit overwhelmed by the amount of detail. Some of the more important things to look for are:
Look for any errors or failures. These are reported against each cycle, as well as for individual pages, and summarised at the end of the report. Bear in mind that an error - particularly an HTTP error like 503 or 504 - at very high concurrency can simply indicate that you have “maxed out” your server. If this occurs at a concurrency much higher than what you would expect in practice, it may not be a problem.
Look at the overall throughput for the whole bench run, and see how it changes with the cycle concurrency. FunkLoad presents this in three metrics:
- Successful tests per second
This is the number of complete tests that could be executed divided by the duration, aggregated across all cycles.
If your test simulates a user completing a particular process (e.g. purchasing a product or creating some content) over a number of steps, this can be a useful “business” metric (e.g. “we could sell X products in 10 minutes”).
- Successful pages per second
This is the number of complete web pages that could be downloaded, including resources such as stylesheets and images, per second.
Note that FunkLoad will cache resources in the same way as web browsers, so a given thread may only fetch certain resources once in a given cycle.
This is a useful metric to understand how many “page impressions” your visitors could receive when your site is operating under load. It is most meaningful if your test only fetches one page, or fetches pages that are roughly equivalent in terms of server processing time.
- Successful requests per second
This is a measure of raw throughput, i.e. how many requests were completed per second.
This metric does not necessarily relate to how visitors experience your website, since it counts a web page and its dependent resources as separate requests.
Look for overall trends. How does performance degrade as concurrency is ramped up? Are there any “hard” limits where performance markedly drops off, or the site stops responding altogether? If so, try to think of what this may be, e.g. operating system limits on the number of threads, processes or open file descriptors.
Look at the individual pages being fetched. Are any pages particularly slow? How do the different pages degrade in performance as concurrency is increased?
Please refer to the FunkLoad documentation for more details.
Server monitoring
It is important to be able to understand what is going on during a load test run. If the results are not what you expected or hoped for, you need to know where to start tweaking.
Some important considerations include:
What is the CPU, memory and disk I/O load on the server(s) hosting the application? Where is the bottleneck?
What is the capacity of the networking infrastructure between the bench nodes and the server being tested? Are you constrained by bandwidth? Are there any firewalls or other equipment performing intensive scanning of requests, or even actively throttling throughput?
What is happening on the bench nodes? Are they failing to generate enough load, or failing to cope with the volume of data coming back from the server?
In all cases, you need monitoring. This can include:
Running top, free and other “point-in-time” monitoring tools on the server.
Using tools like monit to observe multiple processes in real time.
Looking at log files from the various services involved in your application, both in real time as the tests are running, and after the test has completed.
Using FunkLoad’s built in monitoring tools.
Note: The FunkLoad monitoring server only works on Linux hosts.
The FunkLoad monitor is useful not least because its output is incorporated in the final report. You can see what happened to the load average, memory usage and network throughput on any number of servers as the load test progressed.
FunkLoad monitoring requires that a monitoring server is running on each host to be monitored. The monitor control program, fl-monitor-ctl, is installed with FunkLoad. You can use the Bench Master buildout shown above to install it, or simply install FunkLoad in a virtualenv or buildout on each host.
Once you have installed fl-monitor-ctl, you will need a configuration file on each host. For example:
[server]
# Configuration used by monitord
host = localhost
port = 8008
# Sleep time between monitoring samples, in seconds
# (note that the load average is only updated by the system every 5s)
interval = .5
# Network interface to monitor, e.g. lo or eth0
interface = eth0

[client]
# Configuration used by monitorctl
host = localhost
port = 8008
You should adjust the host name, network interface and port as required.
You can then start the monitor on each host with:
$ bin/fl-monitor-ctl monitor.conf startd
With the monitors in place, you need to tell FunkLoad which hosts to monitor. This is done in the .conf for the test, in the [monitor] section. For example:
[monitor]
hosts = bench-master.example.com bench-node.example.com www.example.com

[bench-master.example.com]
description = Bench master
port = 8008

[bench-node.example.com]
description = Bench node
port = 8008

[www.example.com]
description = Web server
port = 8008
In this example, we have opted to monitor the bench master and nodes, as well as the web server being tested. Monitoring the bench nodes is important, since it helps identify whether they are generating load as normal.
With this configuration in place, you should get monitoring in your FunkLoad reports.
Note: If you are using bench-master to run the tests, it will ensure that only the first node collects monitoring statistics. It does not make sense to simultaneously monitor a number of hosts from all nodes.
Manually generating reports
When FunkLoad executes a bench run, it will write an XML file with all relevant test results and statistics. Bench Master collects these XML files from each node and collates them into a single file. (The individual files are all kept in the results/ directory under the bench run directory inside the bench-master working directory.) It then uses the FunkLoad report generator to create an HTML report.
If you want to create a report manually, you can use the fl-build-report script.
For example, to get a plain-text version of the report, you can use:
$ bin/fl-build-report bench-output/<run name>/results/*.xml
To get an HTML version of the same, you can use the --html option:
$ bin/fl-build-report --html bench-output/<run name>/results/*.xml
The report will be placed in the current directory. Use the -o flag to indicate a different output directory.
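For example, to write the HTML report straight into a directory that is already served as web content (the target path is only an example):

$ bin/fl-build-report --html -o /var/www/bench-reports bench-output/<run name>/results/*.xml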
Differential reporting
You can use load testing to understand the performance impact of your changes. To make it easier to compare load test runs, FunkLoad provides differential reporting.
For example, let’s say you have created a “baseline” report in bench-output/baseline/reports/test_foo-20100101. You may then add a new server, and re-run the same test in bench-master, this time with the name “new-server”. The resulting report may be bench-output/new-server/reports/test_foo-20100201. You can create a differential between the two with:
$ bin/fl-build-report --diff bench-output/baseline/reports/test_foo-20100101 bench-output/new-server/reports/test_foo-20100201
Of course, this works on reports generated with “plain” FunkLoad as well.
Hint: For differential reports to be meaningful, you need to have the same test, executed with the same number of cycles, the same cycle concurrency, and the same duration in both reports.
Credits
Originally created by Julian Coyne <julian.coyne@unified.com.au>. Refactored into a separate package by Martin Aspeli <optilude@gmail.com>.
Changelog
1.0b1 - 2010-07-26
Initial release