Agile practitioners are familiar with test-driven development (TDD) at the unit test level, and many also apply TDD at the integration test level. This paper explores the benefits of taking TDD to the functional test level. What problems would this solve? How would it affect project methodology? Which roles on an Agile team would be affected? Are suitable tools available to support it?
The paper is based on my own experiences and observations at my present place of employment. The names have been changed to protect the innocent.
TDD has proven to be among the most powerful tools for delivering business value to our customers, as we have grown the Agile software development practice at our company. It is really true, as Dan North puts it, that TDD is "not about the tests, it's about seeing how little you actually need to do and how cleanly you can do it."
Alan Francis offers a simpler statement of the same wisdom when he writes, "A cleaner understanding of the behaviour you require before you start to implement it is [an advantage of TDD]. An 'executable specification' is yet another way to think about what you're doing. All of these effects are Good Things, but one major advantage of TDD is the ability to answer the question 'how will I know I am done?'. We can answer that question because we have a test that will fail if we are not done and pass if we are."
As part of a comprehensive Agile project methodology, TDD directly supports all four values listed in the Agile Manifesto. In the past two and a half years, we have completed a number of projects following the Agile approach and using the XP methodology. All the projects have been successful, but some have been better than others.
We have applied TDD unevenly during these projects. Some project teams used TDD more extensively and rigorously than others. Some teams unit tested all layers of the application while others assumed some layers were out of scope (typically the UI). Some teams extended the concept beyond the unit test level and built automated integration test suites. Each project team did certain things very well and other things less well.
While every project was different, with respect to TDD they all have one characteristic in common: The areas where the teams experienced the most problems with completing and deploying the application were the areas that were not built with a rigorous test-driven approach.
Problems did not always wait until deployment before manifesting themselves. Throughout each project, testers exercised the code for completed stories to verify that functional requirements had been met, and to look for any other problems. Invariably, they discovered a large number of defects in the UI layer. The nature of the problems indicates two general causes: (a) Requirements were missed in the hand-off from business analysts to developers, and (b) UI features and usability details were poorly implemented, since developers did not use TDD for the UI code. The result was that the defect list tended to grow from iteration to iteration, and each project ended up with a backlog of defects they had to scramble to fix in a short time. This did not make for a very smooth conclusion to each development effort.
By learning from these experiences, we can ensure future projects gain the maximum benefit possible from TDD. Ignoring the lessons will, of course, have a different and predictable effect.
The idea to take TDD to the functional level resulted from work I did on behalf of a large project that was nearing completion. The group responsible for writing executable functional test scripts was separate from the project development team. Using tools that support after-the-fact functional testing by recording user actions on a UI, they were tasked with creating a set of functional tests that would be used to verify the application met all its functional requirements prior to going into production, and that could be archived and used in future projects as a regression test suite.
I was asked to help them because the tool they were using did not work properly, and they were having a lot of difficulty automating the test scripts. Over the course of several weeks, I worked extensively with that tool and eventually tested a total of eleven different testing tools in search of one that would meet our needs. None was entirely adequate, but we ultimately settled on a tool and worked with the vendor to correct a few bugs, and the team was able to develop the functional test suite.
Some of the problems the test team encountered were not due to bugs in the tool, but to the way in which the application code had been written. The development team had made the assumption that the UI layer was out of their scope to test. They did not use a TDD approach to develop the UI code. The general quality of the UI code was below that of the rest of the application, and the UI code was not written in a way that facilitated testing.
This was a webapp for intranet use, and the UI consisted of HTML documents displayed in a web browser. The elements in the HTML documents contained no id or name attributes the testing tool could use to identify the elements whose values needed to be set and/or validated. In most cases, we were reduced to scanning an HTML document to find specific strings. This was a very fragile approach to automated testing, since a cosmetic change to the document could break a functional test.
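To make the fragility concrete, here is a minimal sketch in Python (not the language or tool used on the project) that contrasts looking up a field by its id with scanning for an exact string. The page snippets, the id "customer-name", and the helper names are hypothetical, invented only for illustration.

```python
from html.parser import HTMLParser


class FieldFinder(HTMLParser):
    """Collects the value attribute of every element that carries an id."""

    def __init__(self):
        super().__init__()
        self.values_by_id = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "id" in attributes:
            self.values_by_id[attributes["id"]] = attributes.get("value")


def field_value(html, element_id):
    """Return the value of the element with the given id (KeyError if absent)."""
    finder = FieldFinder()
    finder.feed(html)
    return finder.values_by_id[element_id]


# Two renderings of the same hypothetical page; only cosmetic details differ.
PAGE_V1 = '<form><input id="customer-name" type="text" value="Acme"></form>'
PAGE_V2 = (
    '<form><p><input type="text" class="wide" value="Acme" '
    'id="customer-name"></p></form>'
)

# A string scan tied to the exact markup of the first rendering.
FRAGILE_NEEDLE = '<input id="customer-name" type="text" value="Acme">'

# The id-based lookup survives the cosmetic change...
assert field_value(PAGE_V1, "customer-name") == "Acme"
assert field_value(PAGE_V2, "customer-name") == "Acme"

# ...while the string scan matches only the original markup.
assert FRAGILE_NEEDLE in PAGE_V1
assert FRAGILE_NEEDLE not in PAGE_V2
```

The id gives the test a stable handle that is independent of attribute order, wrapping elements, and styling, which is the property our documents lacked.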
Thus, the first problem this exercise exposed was that the quality of our UI code was poor. I say this not merely because it was difficult to write automated functional tests against the HTML documents. I noted a number of other issues that I believe were the result of carelessness in writing the UI layer. None of the HTML documents generated by the application would pass validation. Indeed, developers never bothered to run their documents through a validator. Documents contained markup styles incompatible with their own DOCTYPE specifications, stray HTML tags such as </div> and <td> outside any meaningful context, and other elementary problems. Even the basic principles of Agile development were ignored at the UI level; for instance, the corporate logo that appears at the top of each document was coded redundantly in HTML rather than being defined once in CSS, in violation of the Agile principle to "do the simplest thing that could possibly work."
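Even a crude automated check would have caught the stray tags. The following is a rough sketch, assuming strict XHTML-style nesting in which every non-void element is explicitly closed; it is not a substitute for a real validator, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class StrayTagChecker(HTMLParser):
    """Tracks open elements and records closing tags that match nothing."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_ELEMENTS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS:
            return
        if tag not in self.open_tags:
            self.problems.append(f"stray </{tag}>")
            return
        # Pop back to the matching opener, reporting anything left open inside.
        while True:
            opened = self.open_tags.pop()
            if opened == tag:
                break
            self.problems.append(f"unclosed <{opened}>")


def nesting_problems(html):
    """Return a list of nesting problems found in the document text."""
    checker = StrayTagChecker()
    checker.feed(html)
    checker.close()
    return checker.problems + [f"unclosed <{tag}>" for tag in checker.open_tags]


print(nesting_problems("<p>text</p></div>"))
print(nesting_problems("<table><tr><td>cell</tr></table>"))
```

A check like this could run alongside the other tests, so that malformed markup fails early instead of surfacing when someone tries to automate a functional test against it.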
My observation is that developers (and I include myself in this assessment) tend to work in a way that makes their own lives easy. We use programming techniques that facilitate unit testing because we know we are going to write tests first. When we apply TDD at the integration test level, we use programming techniques that facilitate it. But this team did not use TDD at the functional level, and so their code was not written to facilitate automated functional testing. They didn't bother because it would not have made their lives any easier. So, the first benefit of functional test driven development (FTDD) must be higher quality UI code, because developers will take the same care in writing the UI layer as they do in the rest of the application, in order to make their own lives easy.
I mentioned that the group responsible for writing the automated functional tests was separate from the development team. That meant a communication gap was built into the structure of the project team. The test group consisted of the same people who worked with the customer representatives to identify the functional requirements; we call them business analysts. To try to bridge this inherent communication gap, one of their tasks was to write descriptions of the functional requirements to hand off to the developers. The end result was that a story card was supplemented with a non-Agile document we called a "story narrative." As time passed, the story narratives took on more and more detail, until finally they were essentially no different from the exhaustive and confusing requirements specification documents produced on predictive projects, and they were understood about as well by developers.
It seemed to me we were slipping back into a predictive methodology because of the inherent communication gap built into the structure of the project team. To compensate, we were adding formal documentation and explicit hand-off steps in the development process. We were stepping away from adaptive development and moving back toward predictive development.
FTDD addresses this problem in a simple and elegant way. The business analysts already make notes about functional requirements. At our company, they usually use a spreadsheet to summarize the functional requirements for a story or a set of related stories. They may also have some hand-written notes, or an informal text document. By eliminating the separation between analysts and developers, we can have a developer pair with an analyst to develop an executable functional test script directly from the analyst's notes, spreadsheets, and verbal explanations. That functional test script then becomes an enforceable specification of functional requirements the developer can use as a guide when writing code. It works the same way as TDD at the unit or integration test levels, but at a slightly larger scope: When the test passes, the functional requirement has been met.
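As an illustration of how an analyst's spreadsheet might become an executable specification, here is a minimal Python sketch. The discount story, the column names, and every function name are hypothetical, and the application is called directly as a stand-in for driving a real UI.

```python
import csv
import io

# Rows an analyst might keep in a spreadsheet, exported as CSV (hypothetical).
ANALYST_NOTES = """\
order_total,customer_type,expected_discount
50.00,regular,0.00
150.00,regular,7.50
150.00,preferred,15.00
"""


def load_examples(csv_text):
    """Turn each spreadsheet row into an (inputs, expected) pair."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [
        ((float(r["order_total"]), r["customer_type"]),
         float(r["expected_discount"]))
        for r in rows
    ]


def run_functional_spec(application, examples):
    """Run every example against the application and return the failures."""
    failures = []
    for inputs, expected in examples:
        try:
            actual = application(*inputs)
        except NotImplementedError:
            actual = None  # the feature does not exist yet
        if actual is None or abs(actual - expected) > 0.005:
            failures.append((inputs, expected, actual))
    return failures


def discount_not_built_yet(order_total, customer_type):
    """State of the application when the specification is first written."""
    raise NotImplementedError("story not started")


def discount(order_total, customer_type):
    """A later implementation that satisfies the analyst's examples."""
    if order_total < 100:
        return 0.0
    rate = 0.10 if customer_type == "preferred" else 0.05
    return order_total * rate


examples = load_examples(ANALYST_NOTES)
print(len(run_functional_spec(discount_not_built_yet, examples)), "failing")
print(len(run_functional_spec(discount, examples)), "failing")
```

The analyst owns the rows and the developer owns the runner, so the pair can change a requirement by editing one line of the table and rerunning the specification.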
This approach offers several benefits:
FTDD offers benefits beyond the scope of any single development project. One of those is in the area of regression testing.
One of our organizational goals is to develop a repository of regression test suites for all production applications, so that future enhancement projects will have a solid starting point from which to begin development. When FTDD is used, the resulting executable functional test suites become de facto regression test suites from the moment they are checked into the repository. The first step in any subsequent enhancement project (or in any production fix activity) is to check everything out and run the functional test suite. That serves as the initial regression test. As development proceeds, the same functional test suite performs regression testing as changes are checked in. Regression testing becomes an automatic, seamless, and effortless part of the normal development process.
Another organizational goal is to focus our human resources directly on customer service to the fullest extent possible. Generally, this means we regard any role that directly serves a customer to be a value-producing role, and any role whose purpose is to facilitate other IT roles to be overhead. A certain amount of overhead is unavoidable, of course, but we want to reverse the historical trend in IT organizations toward excessive stratification and compartmentalization, since that tends to separate the providers of services from the consumers of services.
In most traditional IT organizations, there is a team whose job is to run tests on applications. Usually engaged after development is complete, they run a battery of tests that may include regression tests, load tests, burst tests, longevity tests, security tests, failover tests, performance tests, and coexistence tests with other applications sharing the same server, network, or database resources. Either the same group or another group may be responsible for examining the application to ensure it follows corporate standards. Much of that work is outside the scope of software development as such, but some of it can be transferred to development teams by extending the TDD concept to larger scopes than just unit and integration testing.
FTDD can alleviate the need for separate groups to validate applications against corporate standards and to perform several types of testing listed above. Corporate standards can simply be included as requirements and can be verified through automated tests. Functional and regression testing is automatically handled, as noted above. Load, burst, longevity, failover, and performance testing can also be done within the scope of a development project. This approach reduces the number of people engaged in secondary, "overhead" roles in the IT organization, and frees them to work directly on customer service activities.
In some organizations, this can result in the elimination of an entire group or department, which would otherwise be dedicated to regression testing and/or quality assurance.
Functional tests exercise the UI by mimicking end users. There are a lot of good reasons to write the UI code last, so there may be no code to test for some time after the executable functional test is first developed. Some of the reasons to write UI code last are explained pretty well by Keith Ray, among others. So, wouldn't FTDD mess up the automated build process? If development began with an executable functional test that fails because there's no code behind it, wouldn't that cause every single build to fail, until near the end of the iteration? What's the value of instantly notifying the development team of a build failure in that case? None that I can see.
The good news is it isn't necessary to run the executable functional test suite automatically with every build. Because functional tests exercise the code at a larger scope than unit or integration tests, they only need to be executed occasionally as development progresses. They serve as an indication of progress and a reminder of functionality yet to be built. In the end, they will serve as regression tests.
To gain the maximum benefit from FTDD, develop the executable functional test first and let it fail. When it fails in the expected ways, then you know you have a pretty firm basis to start development. At that point, turn your attention to TDD at the unit test level, just as you would have done ordinarily. Just as a unit test tells you when you are finished with a unit and an integration test tells you when you are finished with a related set of units, a functional test tells you when you are finished building a distinct piece of application functionality.
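One reading of "fails in the expected ways" is that the test should fail on its assertions, because the functionality is missing, and not because the test script itself is broken. The sketch below uses Python's unittest to show that distinction. The login page, its field ids, and the helper names are hypothetical stand-ins for a real application.

```python
import io
import unittest
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Gathers the id attribute of every element in a page."""

    def __init__(self):
        super().__init__()
        self.ids = set()

    def handle_starttag(self, tag, attrs):
        element_id = dict(attrs).get("id")
        if element_id:
            self.ids.add(element_id)


class LoginPageFunctionalTest(unittest.TestCase):
    # Replaced by run_functional_tests with the page renderer under test.
    render_page = staticmethod(lambda: "")

    def test_page_offers_username_and_password_fields(self):
        collector = IdCollector()
        collector.feed(self.render_page())
        self.assertIn("username", collector.ids)
        self.assertIn("password", collector.ids)


def run_functional_tests(render_page):
    """Run the functional test against the given renderer; return the result."""
    LoginPageFunctionalTest.render_page = staticmethod(render_page)
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(
        LoginPageFunctionalTest
    )
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)


def login_page_not_built():
    """The application at the start of the iteration: nothing rendered."""
    return ""


def login_page_built():
    """The application once the story is complete."""
    return (
        '<form><input id="username" type="text">'
        '<input id="password" type="password"></form>'
    )


before = run_functional_tests(login_page_not_built)
after = run_functional_tests(login_page_built)

# An assertion failure (not an error) is the expected way to fail up front.
print(len(before.failures), len(before.errors), after.wasSuccessful())
```

If the first run reported an error instead of a failure, that would point to a defect in the test script, which is worth fixing before any application code is written.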
Tests of larger scope are executed with less frequency than tests of smaller scope. Unit tests are executed frequently throughout the day by all developers; integration tests are run automatically by the continuous integration server with every check-in and build; functional tests are executed occasionally and not automatically.
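One way to keep functional tests out of the automatic build is to make them opt-in. This is a minimal sketch with Python's unittest, assuming an environment variable (the name RUN_FUNCTIONAL_TESTS is hypothetical) that a developer or analyst sets when they want the functional suite to run. The test bodies are placeholders.

```python
import io
import os
import unittest

FUNCTIONAL_FLAG = "RUN_FUNCTIONAL_TESTS"  # hypothetical variable name


class PriceCalculationUnitTest(unittest.TestCase):
    """Small-scope test: always runs."""

    def test_total_includes_tax_in_cents(self):
        self.assertEqual(10000 + 800, 10800)


class CheckoutFunctionalTest(unittest.TestCase):
    """Large-scope test: runs only when explicitly requested."""

    def setUp(self):
        if os.environ.get(FUNCTIONAL_FLAG) != "1":
            self.skipTest(f"set {FUNCTIONAL_FLAG}=1 to run functional tests")

    def test_order_confirmation_is_shown(self):
        # Placeholder for a real end-to-end check through the UI.
        self.assertTrue(True)


def run_all():
    """Run both suites quietly and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite([
        loader.loadTestsFromTestCase(PriceCalculationUnitTest),
        loader.loadTestsFromTestCase(CheckoutFunctionalTest),
    ])
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)


result = run_all()
print("skipped:", len(result.skipped))
```

With the variable unset, the build server sees the functional test as skipped and not as failed, so a not-yet-implemented story does not break every build.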
Functional tests will play a large role at the start of an iteration, during the time they are developed jointly with business analysts, and at the end of an iteration when you need to verify all the functional requirements have been satisfied. There is no need to run them with every build throughout the iteration.
If we are to use TDD at the functional level in the same way we do at the unit level, we must have tools that allow us to write executable functional tests before any application code exists. On the project that prompted this idea, the business analysts were using a tool that could only generate test scripts by recording user actions on an existing UI. The tool was suitable only for after-the-fact testing, as in a predictive development methodology. Are suitable tools available for FTDD?
The answer depends on what tools are used to develop the application. The functional tests are likely to be written in the same language and run in the same environment as the application code. This isn't absolutely necessary, but many details are simpler this way.
The project in question at our company built a webapp using Java and some of the usual Open Source suspects such as Hibernate, WebWork, and so forth. I tested eleven different testing tools to see whether they could support FTDD. None supports it especially well. Each has its own unique problems. Some tools are really designed specifically for after-the-fact testing, although their vendors like to use words like "agile" and "test driven" to sell their products. Others can support FTDD, but have one or more "gotchas" that stop the show (usually having to do with support for JavaScript).
Personally, I am not satisfied with the present state of the art in testing tools to support FTDD in the Java world. There are a few promising candidates, such as Selenium (supported by ThoughtWorks) and actiWATE (supported by actiMind), but they have their limitations. Some tools typically used for unit testing of the UI layer can be put to use for functional testing, including HtmlUnit, WebUnit, and HttpUnit. They all suffer from limitations in JavaScript support, and none supports the asynchronous XMLHttpRequest call, which has been gaining in popularity recently.
Despite their limitations, you can still use one of these products to gain the benefits of FTDD. You must expect to spend a little more development time on functional test scripts than ought to be necessary. I expect that as the products mature, this will cease to be an issue.
If you are doing Microsoft .NET development, you are in a better position with regard to tools for FTDD. Chapter 12 of the book Test-Driven Development in Microsoft .NET by James Newkirk and Alexei Vorontsov steps you through a practical example of how to write an executable functional test for a web UI component in .NET. The test script can be written before any actual code exists, and can provide a straightforward guide to the developer in writing the code.
Ruby on Rails is fast becoming a mainstream development tool for serious webapps. The Ruby language has been intelligently designed to support TDD, and Rails builds on that foundation. Unit testing is much simpler with Ruby and Rails than in most other languages. Unsurprisingly, it is straightforward to extend TDD to the functional test level. No supplemental tools beyond Ruby are needed. A Guide to Testing the Rails provides basic information about testing Rails applications, and How to Functional Test a Create Action in Ruby on Rails steps you through the process of writing a functional test for a UI component.
Not just any old program can be used for FTDD. Testing tools that can support FTDD must be designed with adaptive development in mind. Tools designed to support predictive methodologies have a different purpose, are applied at different points in the development cycle, and are used by different personnel than those designed for adaptive methodologies.
Testing tools for predictive projects are often aimed at non-technical users. Testing is performed after code has been developed to a "stable" state. Testing personnel record user actions on the UI and save the resulting scripts. Little, if any, customization of the scripts is done. The purpose of testing is to validate functionality, not to guide development.
FTDD on an adaptive project is entirely different. Tests are written before the code is written. Developers write test scripts in a programming language. The purpose of testing is to guide development and enforce requirements directly, not to validate the functionality of completed code.
If you want to use FTDD, you must be cognizant of these differences and be sure to assign tasks to the appropriate personnel. A functional test script written before any application code exists must be co-developed by a person who understands the functional requirements very well and a person who understands the programming language very well. It is unrealistic to look for a "point and click" sort of tool for FTDD.
FTDD offers solutions to a number of technical, procedural, and organizational problems commonly encountered in software development shops. To be successful with FTDD, you must understand its purpose, value, and trade-offs; assign the appropriate personnel to do the work; include the appropriate steps in the development process; choose appropriate testing tools; and rigorously apply an adaptive methodology such as XP or Scrum.
From those efforts you can reduce organizational overhead and free resources to work directly on customer service, simplify your development process, improve intra-team communication, raise the level of quality of your code, minimize the number of defects created during development, reduce the number of hidden production problems that occur after development is complete, and deliver greater value to your customers at lower cost. Last but not least, you can make your life as a developer just a little easier.