Friday, June 27, 2014

Software Engineering: Automated Software Testing 101

My official title at Google is Software Engineer in Test Tools and Infrastructure, which means I spend most of my time writing automated test infrastructure and consulting with teams to help them test their products.  For the last couple of years that I've been working in this role, I've learned a lot about testing.  I'd like to disseminate some of this knowledge.

There are many ways to test your software--far too many to cover in one short blog post--but some of them are more important and widely applicable than others.  In this article I'll talk about the major types of automated software testing and why they are important.

Why and When to Test
Firstly, why do we need to test software?  Software testing can sometimes seem less important than other tasks.  For example, is there any reason to spend expensive engineer-hours writing tests, when you already have design reviews, code reviews, and manual code inspection to assure product quality?  Additionally, the fact that bugs can still exist in well-tested code is disheartening.

Despite the drawbacks, testing should be done early and often.  Testing your software increases product quality (and in doing so, lowers maintenance costs), aids development speed (testing takes time, but bugs are easier to fix when found earlier), proves requirements have been met, and can reveal design flaws.  It's usually worth the time to write at least a few small tests.
"The problem with quick and dirty, as some people have said, is that dirty remains long after quick has been forgotten."
— Steve McConnell, Software Project Survival Guide
(By the way, the Software Project Survival Guide is a pretty good book for those in the software field. I recommend it for engineers and especially for project managers).

This article focuses on automated tests because they're my area of expertise, because manual tests are pretty easy to figure out, and because automated tests have several advantages over manual tests: they're generally faster to run, can be run more frequently as part of a continuous build, are easier to use for unit and integration testing, and free up engineers to do other things.  If you're writing a small program for yourself, relying solely on manual system tests is fine.  For major projects, you'll need automated tests.

What to Test
So by now you've been persuaded by my convincing words that testing should be done early and often. But what are we testing?  If we test everything, at what point do we write tests for tests?

Tests should be created for most things, especially complex and critical functionality.  If you've created a function that does nothing but add a digit to the end of a string, it's not critical to write a test for that function. Most other stuff should be tested.

While tests themselves can break, tests don't usually need their own tests because of their simplicity and because a false failure will be visible to your team when the test runs.  A falsely-passing test can be a disaster, but it's an infrequent occurrence for well-written tests that were working to begin with.  The exception is test infrastructure.  Sufficiently complex tests are usually built on some sort of testing infrastructure (JUnit, WebDriver, a fake database, etc.) that can easily contain bugs of its own, and accordingly, that infrastructure should be tested.
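
For example, a fake database that many other tests depend on deserves a few tests of its own.  Here's a minimal sketch in Java with JUnit; the FakeUserStore class and its methods are hypothetical, shown only to illustrate the idea of testing the test infrastructure itself:

    import static org.junit.Assert.*;
    import org.junit.Test;

    // A tiny in-memory fake that stands in for a real user database in other tests.
    // Because many tests rely on it, it gets a test of its own.
    public class FakeUserStoreTest {
        @Test
        public void storesAndRetrievesUsers() {
            FakeUserStore store = new FakeUserStore();  // hypothetical fake
            store.put("alice", "alice@example.com");

            assertEquals("alice@example.com", store.get("alice"));
            assertNull(store.get("bob"));  // missing users should return null, not throw
        }
    }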

Who Does the Testing?
Who writes tests: software engineers or a QA/test team?  The answer is that it depends.  For small teams creating small projects, it is sufficient to have engineers test their own code.  After all, it can be difficult to obtain a test guy or gal, and there isn't much code to test, so writing tests is quick and easy.  But even for large teams creating large projects, it can be beneficial to have the programmers write tests.  They gain insight into their own code, have a better idea of what can break, and are more familiar with the issue when a test breaks.  As someone who writes tests for other teams' projects, I've often encountered teams that have no idea why a test is broken or how to fix it, even though they know the project code better than I do.  That issue diminishes when the team has a hand in writing the tests.

Alternatively, a separate test team gains a different perspective on the code.  Unlike the programmers who wrote the code, test people don't take an ego hit when a test finds a bug.  In fact, finding bugs is ego-boosting for the test team.  The drawback is that a test team may not write tests that sufficiently cover the weak points of the dev team's code.

So the answer is: either way is pretty good.  Get a test team when writing tests becomes burdensome for the project programmers.  This can happen when the software is extremely hard to test for whatever reason, or when it's complex enough to require special test infrastructure.

Test-Driven Development
So tests should be written early.  But how early?  Some programmers believe in test-driven development, where you write the test before writing the code.  I think that's a bit drastic, but not always a bad idea.  APIs and features may change slightly when you actually write the code, so your tests will likely have to change anyway.  You don't necessarily need to write a test before the code is written, but you should have a test plan.  Writing the tests should happen alongside writing the code; that's early enough to catch initial bugs and late enough to avoid overhauling the tests.

Now, on to types of tests...

Unit Testing
First and foremost, if you're going to have any automated tests (i.e. you're not writing a small program for yourself), you should have unit tests.  The reason is that unit tests are easy and quick to write and run, unlike other tests.  Unit tests are small tests, usually white-box style, that test a class, a few classes, or another small portion of code.  Using mocks or stubs is perfectly fine in unit tests.
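
To make that concrete, here's a rough sketch of a unit test with a hand-rolled stub, in Java with JUnit.  The Checkout class, PaymentGateway interface, and Order class are hypothetical, stand-ins for whatever small piece of code you're testing:

    import static org.junit.Assert.*;
    import org.junit.Test;

    public class CheckoutTest {
        // A stub standing in for the real payment service, so the test
        // stays small, fast, and deterministic.
        private static class AlwaysApprovesGateway implements PaymentGateway {
            @Override
            public boolean charge(String account, int cents) {
                return true;  // pretend every charge succeeds
            }
        }

        @Test
        public void completedOrderIsMarkedPaid() {
            Checkout checkout = new Checkout(new AlwaysApprovesGateway());
            Order order = checkout.buy("book-123", 1999);
            assertTrue(order.isPaid());
        }
    }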

Because unit tests are so lightweight and quick to run, they should be run often, preferably as part of a continuous integration process.  This allows the unit tests to catch newly introduced bugs quickly.  Several solutions for continuous integration exist, like Jenkins. (I have not used Jenkins myself).

Integration Testing
The downside of unit testing is that important parts like databases or dependencies on other binaries* are mocked out or nonexistent.  You don't get a good picture of how the software performs under real conditions.  To verify that the different subsystems of a program work together, you should write integration tests.  Integration tests are large tests that test multiple binaries, or one binary that uses external resources.  Integration tests are helpful in catching integration errors (often API or design bugs).  Even if two pieces of a project are "bug-free" on their own, they may not integrate well, resulting in miscommunication that can only be caught by integration tests.  Integration tests can also find speed or memory issues that can't be found by simple unit tests.
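
As a rough illustration, an integration test might exercise real persistence code against an actual database instead of a mock.  The sketch below assumes JUnit and the H2 in-memory database are on the classpath; the table and SQL are made up for the example:

    import static org.junit.Assert.*;
    import java.sql.*;
    import org.junit.Test;

    public class UserStorageIntegrationTest {
        @Test
        public void savedUserCanBeReadBack() throws SQLException {
            // A real (if in-memory) database, so actual SQL and driver behavior are exercised.
            try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:testdb")) {
                try (Statement stmt = conn.createStatement()) {
                    stmt.execute("CREATE TABLE users (id INT PRIMARY KEY, name VARCHAR(64))");
                    stmt.execute("INSERT INTO users VALUES (1, 'alice')");
                }
                try (Statement stmt = conn.createStatement();
                     ResultSet rs = stmt.executeQuery("SELECT name FROM users WHERE id = 1")) {
                    assertTrue(rs.next());
                    assertEquals("alice", rs.getString("name"));
                }
            }
        }
    }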

Since integration tests are large and slow, it can be difficult to find a test framework that will bring up all the resources (a.k.a. the environment) needed for the test and then tear them down after the test is done.  One workaround is to leave servers or other external resources up indefinitely and have the tests clean up any permanent effects on the environment after they finish.  This is a dangerous game, as the long-running environment can get into a bad state, giving inaccurate test results.  Sometimes the only way to run integration tests is to manually bring the environment up and down.  In this case, it might be worth it to manually run the integration tests as well.
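
When the environment can be started from the test process itself, JUnit's class-level setup and teardown hooks are one way to manage its lifecycle.  A sketch, assuming a hypothetical TestServer helper that starts and stops a local copy of the server under test:

    import org.junit.AfterClass;
    import org.junit.BeforeClass;
    import org.junit.Test;

    public class SearchServiceIntegrationTest {
        private static TestServer server;  // hypothetical helper wrapping the real binary

        @BeforeClass
        public static void bringUpEnvironment() throws Exception {
            server = new TestServer();
            server.start();  // start the server once for the whole test class
        }

        @AfterClass
        public static void tearDownEnvironment() throws Exception {
            server.stop();   // always clean up, even if tests failed
        }

        @Test
        public void queryReturnsResults() {
            // ... issue requests against the server and assert on the responses ...
        }
    }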

System Testing
To get a complete picture of how the software will actually perform, system tests are required.  System tests are large, black-box style tests that bring up an actual environment and simulate an end-user using your product.  Not only do they help reveal memory and speed issues that may only occur in a real-world environment, they may also reveal UI and usability issues.

System tests are often run manually since it's hard to find good frameworks that can both set up a large environment and give human-like input (often by manipulating a GUI).  However, many GUI-manipulating tools exist.  I've used AutoIt to manipulate Windows GUIs with excellent results, and I've used WebDriver to manipulate webpages with very good results.  Combined with integration-test-style frameworks, you can potentially automate your system tests.
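
For example, a WebDriver-based system test in Java can drive a real browser through a user-visible flow.  A sketch, assuming the Selenium Java bindings and Firefox are available; the URL, element names, and page title are placeholders for whatever your product actually serves:

    import static org.junit.Assert.*;
    import org.junit.Test;
    import org.openqa.selenium.By;
    import org.openqa.selenium.WebDriver;
    import org.openqa.selenium.firefox.FirefoxDriver;

    public class SearchSystemTest {
        @Test
        public void searchShowsResultsPage() {
            WebDriver driver = new FirefoxDriver();  // launches a real browser
            try {
                driver.get("http://localhost:8080/");                 // placeholder URL
                driver.findElement(By.name("q")).sendKeys("kittens");
                driver.findElement(By.name("submit")).click();
                assertTrue(driver.getTitle().contains("Results"));    // placeholder assertion
            } finally {
                driver.quit();  // always close the browser
            }
        }
    }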

In summation:
  • Write tests early and often
  • Run unit, integration and system tests, automating them if you can
  • Run automated tests as part of a continuous integration process if you can

Following these suggestions will help you get excellent code coverage and excellent feature coverage, which will result in stable, easy-to-maintain, and on-schedule software.

Update: Added What and Who testing sections.

*Binaries are individually-compiled programs. They may not do much alone, but work in conjunction with other programs to create useful output.  If they do useful work by themselves, they are standalone binaries, or executables.