How testable is your or your team’s code?

How can you design your code so that it is testable?

Are you struggling to test code because you did not write it with testing in mind?

If so, you and your team need to design your code with testing in mind.

We all need and want code that is easy to test, so that we get faster and cheaper feedback on our changes.

So, here are some tips and techniques to help improve the testability of your code;

First, what do we mean by testability?

A quick Google search will give you lots of definitions, but a simple starting point is: testability is the degree to which, and how easily, you can test the software.

If the testability of the software is high, then testing should be easier and more efficient.

In other words, you can think of testability as “how easy is it to test?”

To gauge your software’s testability, consider how efficiently you can test it in terms of two factors: the number of tests you need to run to be confident that it functions as expected and does not exhibit any unexpected issues, and the level or layer at which these tests can be developed and executed.

Needing a large number of tests, or having the majority of your tests at a high level (e.g., UI, E2E, system-level) as opposed to the unit or integration level, would typically indicate that your code is not very testable.

So, how can we improve the testability of our code?

What are the three main factors?

  1. Observability – the extent to which we can see that the code functions as expected and is not doing anything unexpected.
  2. Controllability – the extent to which you can control your software, often by controlling the inputs, state, or data on which each component operates.
  3. Isolate-ability – the extent to which you can test your code in isolation. 

In more detail;

Observability

Can you observe the results of any decisions or state changes in your running code?

Good examples;
Code modules or functions that return clear state or status
Bad examples;
Code modules or functions that modify state or status but do not return or share anything observable with the calling code
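
As a minimal sketch of that difference, the Python below contrasts a function that changes state silently with one that returns a clear result. The Account and WithdrawalResult types and both function names are hypothetical, invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Account:
    balance: int  # balance in whole currency units (hypothetical type)


def withdraw_silently(account: Account, amount: int) -> None:
    """Hard to observe: may or may not change state, and tells the caller nothing."""
    if amount <= account.balance:
        account.balance -= amount


@dataclass(frozen=True)
class WithdrawalResult:
    succeeded: bool
    balance: int
    reason: str = ""


def withdraw(account: Account, amount: int) -> WithdrawalResult:
    """Easy to observe: the decision and the resulting state are returned."""
    if amount > account.balance:
        return WithdrawalResult(False, account.balance, "insufficient funds")
    account.balance -= amount
    return WithdrawalResult(True, account.balance)
```

A test of withdraw can assert directly on the returned result, whereas a test of withdraw_silently has to go and inspect the account afterwards to work out what, if anything, happened.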

What are you logging?

Good examples;
Effective use of logging levels for different information types, e.g., INFO, WARN, DEBUG, ERROR.
Bad examples;
Nothing is logged.
Only errors are logged.
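
To illustrate the good example, here is a small sketch using Python's standard logging module. The logger name, the process_order function, and its rules are assumptions made up for this example. Note that Python names the WARN level WARNING.

```python
import logging

logger = logging.getLogger("orders")  # hypothetical logger name


def process_order(order_id: str, quantity: int, stock: int) -> bool:
    """Return True if the order can be fulfilled, logging each decision on the way."""
    # DEBUG: detail that is only useful when diagnosing a problem
    logger.debug("processing order %s: quantity=%d stock=%d", order_id, quantity, stock)
    if quantity <= 0:
        # ERROR: the request itself is invalid
        logger.error("order %s rejected: invalid quantity %d", order_id, quantity)
        return False
    if quantity > stock:
        # WARNING: a valid request that we could not satisfy
        logger.warning("order %s cannot be fulfilled: only %d in stock", order_id, stock)
        return False
    # INFO: the normal, expected outcome
    logger.info("order %s accepted", order_id)
    return True
```

With levels used this way, a tester can turn on DEBUG while investigating, and a test can attach a handler to the logger and assert on the records that were emitted.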

Can you query state, status, or data easily?

Good examples;
State or status is stored or queryable at all times in a DB or via an API call.
Bad examples;
State or status is maintained only in the code’s running memory, requiring sophisticated runtime debugging tools to observe it.

Controllability

Can you efficiently drive different data values into your code?

Good examples;
Easy to call your code component and pass the data you want it to use
Easy to provide data via a defined interface or data source, e.g., an API or DB.
Bad examples;
No ability to pass data into your code
Data is provided by a dependency that makes it hard or impossible to control.

Can you quickly change the current state or status of your code?

Good examples;
Easy to call your code component and pass the state you want it to have at any point in time
Easy to provide state via a defined interface or source, e.g., an API or DB
Bad examples;
No ability to pass state into your code or to set state before your code is executed
The state is provided by a dependency that makes it hard or impossible to control.

Can you control your inputs?

Good examples;
Easy to change inputs or to use mock, fake, or stubbed inputs, e.g., switch out the source of input between real/live and fake/controlled ones.
Bad examples;
Inputs are hardcoded or otherwise unchangeable.
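
A minimal sketch of switching the source of input, assuming a hypothetical log file path and function names chosen only for this example. The first function fixes its input source internally. The second lets the caller decide where the lines come from.

```python
from typing import Iterable

LOG_PATH = "/var/log/app.log"  # hypothetical path, for illustration only


def count_errors_hardcoded() -> int:
    """Hard to control: the input source is fixed inside the function."""
    with open(LOG_PATH) as handle:
        return sum(1 for line in handle if line.startswith("ERROR"))


def count_errors(lines: Iterable[str]) -> int:
    """Easy to control: the caller supplies the lines, real or fake."""
    return sum(1 for line in lines if line.startswith("ERROR"))
```

In production the caller can pass an open file, while a test can pass a plain list such as ["ERROR disk full", "INFO ok"] and needs no file system at all.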

Isolate-ability

Can you isolate your code components and test them on their own?

Good examples;
Your code component can be easily called or executed on its own, with its inputs controlled and its outputs observed independently of other code components. For example, a unit of code that can be called and passed data or state, and that returns data or state that can be easily verified.
Micro-services tend to be easy to test in isolation.
Bad examples;
Code contains many dependencies that are hard/impossible to control, meaning that it has to be run in an environment where the dependencies are met/running.
Code monoliths tend to be hard to test in isolation.

How easy is it for you to mock, fake, or stub out any dependencies for your code?

Good examples;
Easy to inject data via in-memory DB, file, or substitute a URL or other resource pointer for a fake one
Bad examples;
Hardcoded dependencies.
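
One way to sketch the in-memory DB idea is with Python's built-in sqlite3 module. The table layout, function names, and email addresses below are assumptions for illustration, not taken from any real system. The point is that the code under test receives its connection from the caller rather than creating it.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) users table on whichever database is injected."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users (email TEXT PRIMARY KEY, active INTEGER NOT NULL)"
    )


def is_active_user(conn: sqlite3.Connection, email: str) -> bool:
    """Look the user up through whichever connection the caller injects."""
    row = conn.execute("SELECT active FROM users WHERE email = ?", (email,)).fetchone()
    return row is not None and row[0] == 1


# In a test, an in-memory database stands in for the real one.
test_conn = sqlite3.connect(":memory:")
create_schema(test_conn)
test_conn.execute("INSERT INTO users VALUES (?, ?)", ("admin@example.com", 1))
test_conn.execute("INSERT INTO users VALUES (?, ?)", ("former@example.com", 0))
```

Because is_active_user never opens its own connection, the same function works unchanged against a real database file or a throwaway in-memory one.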

Other attributes which tend to impact testability include;

Separation of concerns – the extent to which each code component you want to test has a single, well-defined responsibility.

Can you easily understand or define what each code component does and what its responsibility is?

Good examples;
Clear and typically singular responsibility for a piece of code
The state is changed, or a decision is made based on data or conditions assessed by the code.
Bad examples;
Code monoliths or code blocks in which you make multiple decisions or changes of state.
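
A small sketch of the same idea in code, using a hypothetical checkout example with a made-up 10% member discount. The first function both makes a decision and produces output. The other two each have a single responsibility.

```python
def checkout_mixed(total: float, is_member: bool) -> None:
    """Mixed concerns: decides the price and prints it, so a test must capture stdout."""
    if is_member:
        total = total * 0.9  # hypothetical 10% member discount
    print(f"Total due: {total:.2f}")


def apply_discount(total: float, is_member: bool) -> float:
    """One responsibility: decide the price."""
    return round(total * 0.9, 2) if is_member else total


def format_receipt(total: float) -> str:
    """One responsibility: present the price."""
    return f"Total due: {total:.2f}"
```

The two small functions can each be verified with a one-line assertion, while the mixed version can only be checked by inspecting what it printed.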

Understandability or readability – how easy it is to understand the intent of the code and how the code achieves that intent.

Can you quickly understand what the code is doing?

Good examples;
Clean code, with naming that makes intentions specific and meaningful, and comments that explain any complex or non-obvious code.
Bad examples;
Obfuscated code using short names that are not explicit or clear.
Abstractions that don’t help readability or clarity.
Enumerations that do not help readability or clarity.

Takeaway

The key takeaway here is to think about how you will test your code BEFORE you write it. Doing so will help you ensure your code is;
Observable – so that you can see the result, state change, and data that occurs when your code executes
Controllable – so that you can easily control the input data or state for your code to use or work with
Isolatable – so that you can easily verify your code in isolation, without needing to execute multiple dependencies together to test your code

I often find myself working with teams and code that was not developed with testability in mind. This means it is hard to add unit tests without first refactoring the code to make it more testable. It can be hard to know whether the code components behave as expected without employing sophisticated and often expensive tools or debuggers to observe them or to control state or data during execution. Because it is difficult to isolate the code components, you often need to exercise the code and run tests in an integrated or almost production-like environment. This means that your feedback cycle, the time between making a code change and getting any feedback on whether that change was right, is much longer.

Putting the cart before the horse

Or the solution before we understand the problem

How often do you jump to a solution or search for a tool before you really truly understand the problem you are trying to solve?

I know I do this often …

I am trying to help a friend, colleague, or someone I lead, by telling them a solution to their problem before they have finished telling me what it is.

I have gone to a meeting convinced I know what we are going to talk about and the solution I will propose.

I have also searched for test tools that can help me test something, without having really defined clearly what or how I need to test.

Put simply, we need to fully and clearly understand the problem or problems we are seeking solutions for, because we cannot possibly evaluate any solution well without understanding what the problem is, and how and why the solution will solve it for us.

Now that I can recognize when I am putting the cart before the horse, I can also see when others do it.

I often attend meetings where one or more participants are eager to share the solution they have to a problem that we do not all agree exists, or at least do not all understand the same way.

In addition, I am often asked what I think about tool X or solution Y without being asked if I have understood what problem needs to be solved by either.

Worse, I am being told a solution for a problem that I have not fully stated needs solving, or that I have not fully understood myself.

Or, I am being recommended or told to use a tool when it is not clear which part of the solution, if any, the tool will address.

Ideally the horse should come before the cart. We should clearly and fully understand the problem or problems we need to solve before we seek solutions.

So, how can we do that effectively and consistently?

Here are some simple suggestions;

Meetings:

If the meeting has no agenda – ask what the agenda is, what do we want to get out of this meeting? What are the desired outcomes?

If the meeting seems to be going off course – bring it back by asking if we are still trying to solve the problem we agreed at the beginning or if we are now solving a different one? It may be that some feel the initial problem has been solved, and we are moving on to another. Others may feel that in trying to solve the first problem we identified a second and are now busy trying to solve that. Either way, calling this out will hopefully lead to a (re)statement of the problem that everyone should be currently seeking and evaluating solutions to.

Emails:

If you are sent an email asking you to use, take part in, or help with solving something, but the problem statement is missing or unclear, ask for clarity. I will often try to help the email sender by asking convergent questions, for example;    

I think you are asking me to solve either problem a) or problem b), but I may have misunderstood and you could be asking about c) which I am unclear about, please can you let me know which it is?

If I have guessed correctly and it is either a or b then they can simply reply with a or b and do not need to take the time to explain either fully, as I am letting them know I believe I understand both well enough to proceed with whatever was asked in the original email.


In-person:

Often this will be your boss or a colleague tapping you on the shoulder, or catching you in the corridor or by the water cooler, and asking what you think of tool X or somebody’s solution or idea. If you are not 100% sure of the context or problem that the tool, solution, or idea relates to, ask for clarity. For example;

I am not sure which problem tool X, or solution or idea Y, is trying to solve. Can you please remind me, so that I can offer you an opinion?

In summary:

Make sure you clearly understand the problem before seeking or evaluating any solutions.

Wait for the person to finish explaining the problem they have, or how they see it from their perspective.

Ensure every meeting you attend has a clear agenda and desired outcomes. What are the problems we are trying to solve, what are the objectives – so that we can all understand if the meeting is on track and if we should even be there.

Don’t respond to questions about solutions or ideas unless you are sure you are clear on the problems they are addressing – you will not be able to provide good answers or evaluations if you are not clear.

Observation

With reference to my original post on The Streetlight Effect; I still see this effect today.

Recently, I was discussing the testing of a mobile app with a QA who is a very experienced black box tester. I am currently encouraging and coaching this tester to learn more about programming, code design, code architecture, etc., as I think any knowledge in these areas can make you a more effective and efficient tester.

This tester was seeing some problems when interacting with the mobile app UI and was going to raise a bug with the developer. I asked whether they were observing the mobile app’s requests and responses to and from the backend API, i.e. the requests and data being sent and received, across either wifi or cellular data, between the mobile app and our cloud-based backend server API. The tester was not sure how to look at that traffic. This is the streetlight effect: I will look here (the UI) for my keys (any issues) because I know how to look here (the UI). So I shared a how-to I had written years ago for Charles Proxy, which allows the tester to inspect the HTTPS requests and their contents, along with timings and more, by proxying all requests made from, and responses to, the mobile app through the tester’s laptop. This enabled the tester to see the requests that were failing as well as the resulting error message. The tester was then able to raise a much more detailed bug, and the developer was able to get straight to the problem and fix it quickly. (Compare that with a bug like “when I do this in the app, I don’t get the UI screen/data I was expecting to see displayed”, where the developer would first have to reproduce the problem and then watch the traffic themselves to pinpoint it, taking much longer and possibly needing some back and forth to clarify reproduction steps, etc.)


Another example is another very talented tester who is also a strong coder, and so has great white box testing skills. This tester was digging into some performance issues, trying to understand why a request for a large dataset from an API was taking so long (when the system was otherwise quiescent), and why it would often fail if there was any other activity (API GETs and POSTs to request and store data). We are using AWS, and there is a myriad of tools and monitoring capabilities to learn and get your head around. This tester was able to extract the time taken to complete the request for data and plot this against the size of the data extracted. If you think about this visually, this tester was looking at the problem from a black box perspective: making a request knowing what was requested, extracting the start time of the request from the log, extracting the completion/response time from the same log, and then plotting this against the size of the data returned. (The tester was increasing the data stored, and thus retrieved, between each test/request.) In this case, being capable of white box testing and understanding the code and system architecture, this tester knew that there were several key components involved in servicing this request, but was not observing any of them. To understand what is really going on, and in this case to pinpoint which parts of the system were taking a long time to service the request and which would fail when there was any other activity, we need to observe those components. Most performance issues are the result of some form of resource exhaustion, e.g. CPU, memory, input/output (IO), threads, connections, etc.
So, we really want to be able to see how these resources are being consumed when we interact with the system. This can lead us to understand that our CPU usage is spiking to 100% when we do something and thus cannot do more when more requests come in, or that our memory spikes up and never comes back down to the pre-request level once the request is complete. In other words, some memory is not being freed, leading to a resource leak (we will exhaust this resource over time). In this case, learning how to observe the resources of the individual service Docker containers and of the database will likely lead us fairly quickly to where the problem(s) or weak link(s) in our data chain are for the request we are making.
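
As a rough, in-process illustration of watching resources rather than only timing the request from the outside, the sketch below records both elapsed time and peak memory for increasing data sizes, using Python's standard time and tracemalloc modules. The fetch_dataset function is a hypothetical stand-in for the slow API request. In the real system you would observe the container and database metrics instead, which this sketch does not attempt.

```python
import time
import tracemalloc
from typing import Callable, List, Tuple


def fetch_dataset(size: int) -> List[int]:
    """Hypothetical stand-in for the slow API request described above."""
    return list(range(size))


def profile_request(request: Callable[[int], object], size: int) -> Tuple[float, int]:
    """Return (elapsed seconds, peak bytes allocated) for one request."""
    tracemalloc.start()
    started = time.perf_counter()
    request(size)
    elapsed = time.perf_counter() - started
    _current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return elapsed, peak


if __name__ == "__main__":
    # Increase the data size between requests, as the tester did.
    for size in (1_000, 10_000, 100_000):
        elapsed, peak = profile_request(fetch_dataset, size)
        print(f"size={size:>7} elapsed={elapsed:.4f}s peak_memory={peak} bytes")
```

Plotting peak memory alongside elapsed time for each size gives a first hint as to whether a resource, and not just the clock, is growing with the request.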

In conclusion, what we need to do more often is to ask questions like;

  • What could I watch or monitor to see, in more detail, what is happening/going wrong?
  • What is the data flow – what path does the data take through our system and what components are involved?
  • How is the system architected and how do all of the components communicate?
  • How can I observe the communications between the system components?

If any of these questions results in an “I don’t know” or similar, then ask your colleagues for help. You are likely to learn something new, even if that something is that your colleagues also don’t know the answers to some of these questions.

A BDD worked example – login page

I have used this example as a workshop to introduce BDD to a wide variety of folks at different companies. I like this example as it is deceptively simple: everyone knows how to deliver a login page, right? The reality is that we all have different ideas about what should and shouldn’t be on a login page, how it should look, etc. So it serves as a simple but very illustrative example of how using a Behaviour Driven Development approach can really help to clarify requirements and engage the thoughts, experiences, and knowledge of all the participants, to ensure that what you deliver will be what was really desired. It also helps ensure that what you deliver will be both testable and tested, as the high-level acceptance tests are defined up front.

Introduce roles and abbreviations

First of all, I want to introduce the roles that will be part of the discussion along with the abbreviations for those roles used in this example. Each role can then speak their part as an example of how the discussion could go for this example.
  • PM – Product Manager, our proxy for the customer, bringing the ‘what the customer wants or needs’ definitions to the team
  • DL – Development Lead
  • IxD – Interaction Designer, bringing the UI look and feel, the usability and customer workflow understanding to the team. Helping to ensure we have a consistent style, content, and customer workflows.
  • QA – Quality Assurance person (either QAE or SET) who will be ensuring we deliver the story with high quality, building it right and building the right thing
  • Dev – Developer(s), responsible for the actual implementation of the story, the code that will provide the desired functionality.
  • Implementation team – typically composed of a developer and a QA person, but can include Interaction designer, development or QA pairs.
  • Amigos – the group of people required to analyse a story – typically the PM as the customer proxy and the implementation team

Introducing the Story

The Login Page
Bring the story into ‘In Analysis’
What does the story look like at this point? (This is an example using a tool called Mingle)

The BDD discussion begins

As is fairly typical at this stage, the story does not contain a lot of detail and is kind of vague in its description
We start the discussion:
PM or Dev Lead presents the story
The 4 amigos (PM + Implementation team (IxD, Dev, and QA)) discuss and ask clarifying questions to understand the story in detail, exposing and discussing any business risks as they go
PM/DL: This story is to deliver a login page. Fairly standard login page, username and password fields and a submit button. (The what). This will be the login page for our administrators. (The who). Once they login here they will have access to the dashboard and all the administrator functionality. (The why)
QA/Dev: Do we have a mockup?
IxD: Yep, looks like this (I encourage mockups to be cheap, and for me nothing beats a whiteboard diagram for being cheap, flexible and efficient);
QA/Dev: So is the button text ‘login’ or ‘submit’?
IxD: I think ‘login’ is more intuitive
QA/Dev: Is it a username or an email address?
IxD: I was assuming it was an email address
PM: Yep, we will need to use an email address, we will want this to work with our single sign-on feature coming later and that will use an email address
QA: Can I assume we will use our standard code for validating an email address?
Dev: Erm, do we have a standard email address validation code?
QA: Yes, I believe the architecture team has a regular expression they standardised on
QA: Do we want to provide any client-side validation of the password? Or should we just send it to the server for validation against the username? i.e. should we ensure it is at least 8 characters long, contains at least one special character and at least one upper case character?
PM: No, we will have checked that when we set the password at admin user creation time or when they update it themselves. Let’s just have the server side validate it against the email address. Besides if we provide guidance on how a password will be composed then an attacker can update the dictionary they are using for brute forcing so that it follows the rules.
QA: How do we want to tell the user that either their email or password is not valid? Text on the page? Red? A popup? Do we want to clear the fields?
IxD: First I think we should have ghost text in the email address field to provide an example of a correctly formatted email address. For the error, I think we should have red text above both boxes, and we should leave the fields populated, let me provide an updated mockup;
QA: Do we want to show a different message for an invalid email address, i.e. one that fails email address validation rather than a check to see if that email address is a user in our system?
IxD: Yes, I think we should help the user to avoid typos, how about red text above the username field for this too. Here is an updated mockup;

QA: Does the error text and the text on the page need to be localised?
PM: Yes, we need to support the existing 14 languages for the Admin users
Dev: So, we should use the browser context to set the locale and display localised text if we support that locale and a fallback if we don’t?
PM: Yep, we might have a different setting later if we allow users to select a preferred locale that we store as part of their profile, but at login time we don’t know who they are so we should just use the browser locale.
QA: Cool, so the localised text will be supplied as part of the page render based on the initial request to the login page URL, including any text for error messages?
Dev: Yeah that’s the way we usually do it, so we can just send error codes back and the client side code can then render the appropriate message. Of course, the email address validation will be checked at both client and server so we can provide quick feedback to the user if they don’t provide a valid email address format in that field, but also guard against someone hitting the server directly with an invalid formatted email address.
QA: Nice, we should make sure we have unit tests covering the validation on both client and server then, I can add a single invalid test for each to show the error message (UI) and return code (server)
Dev: Should we have a ‘forgot password’ link and functionality?
PM: Yes, but I don’t think we have email functionality built yet, so we will defer that to a future story
QA: Should we have a timeout for responses from the server? i.e. how should we deal with the server being busy or unresponsive?
Dev: Yes, we should have a timeout value in the client code that will display a message to try again later, do we have a mockup for that?
IxD: Agreed, let’s allow 10 seconds for the timeout and I think we should show a message to try again later if we timeout or if we get a 50X back from the server. Here is a mockup for how that text should be displayed;
QA: Do we need to support logging in on mobile devices? i.e. should this page follow a responsive design pattern?
PM: Yes, we need to support tablets right now and may need to support phone devices in the future, if we go responsive now then both should work.
QA: But we will only need to test on tablets for now, right?
PM: Yep, we will add testing stories for phone device testing later if we need them.
IxD: Responsive should be easy enough, but I may need to think about the length of the fields and the text we will need to display, particularly in different languages.
QA: What about accessibility? Do we need to support a WCAG level for this?
PM: Hmm, well we should but I think we will defer that to a future story. Let’s try to keep it in mind so that we don’t have to re-design later
Dev: What about functionality to enable maintenance notifications? i.e. the ability to add text to inform admins of upcoming maintenance or outages?
PM: Again, I want to defer that to a future story, I will sync up with Production IT to understand the requirements for that
QA: Do we need to limit the number of attempts to login so we can avoid brute force security attacks?
PM: Hmm, yes I think we should allow 3 attempts and then lock the account for maybe 5 minutes?
IxD: Actually I think 5 attempts would be better
PM: OK let’s go with 5 attempts and 5 minutes wait time
Dev: Do we want to log each and every login attempt (both successful and unsuccessful) or just the ones that result in a lock on the account?
PM: Hmm, I think we may need to log all attempts along with success, failure or lock so that we can provide an audit log if the customer needs it or if we need to show anything for a security audit.
QA: How do we want to show the lock message when 5 unsuccessful attempts have been made?
IxD: I am thinking red text again, but below the 2 boxes and to the left of the button this time, here is an updated mockup;
QA: How are we determining 5 attempts to login? Attempts using the same email address? Coming from the same IP? Some form of session identifier e.g. cookie?
Dev: Well the simplest is to set a session id in a cookie when an attempt is made on the server side and then to count how many attempts are made with this session id
QA: I presume that means someone malicious could simply brute force by creating ‘cookie-less’ requests?
Dev: Yeah, maybe we need to think about that one some more or talk to the security team.
QA: What should happen if the user attempts another login when we have locked them out?
.
.

.

QA: So based on all of that, what do we need as examples to accept this story with? I am thinking something like this;
Given a valid email address and password when I select login then I should be authenticated and taken to the dashboard page
Given an invalid email address or password when I select login then I should see a message indicating that my login attempt was unsuccessful
Given a badly formatted email address when I focus outside of the email address text field then I should see a message indicating that I have entered an incorrectly formatted email address
Given I am entering an invalid email address or password for the 5th time, when I select login then I should see a message indicating that I must wait 5 minutes before trying to login again
Dev: Should we include an AT for the server busy/down error message too?
QA: Is that required for acceptance? It is not something the user is in control of or can directly impact (without removing their network connection)
PM: I agree, we should test for it but I don’t think we need to include that in the Acceptance Tests
PM: I want to be sure this will look and work well on a tablet, so can we make sure we test that?
QA: Sure, we can do desk checks on an iPad if you like? But we will automate the ATs using Selenium and test with our most popular customer browsers for the admin interface
PM: Great
IxD: I am a bit concerned with how the localised text will look, can we make sure we test that too?
QA: Sure, we will test that the locale gets set and falls back as expected, and we will do some basic checks with pseudo loc to make sure we don’t have overlap, truncation, etc. But how about we include some different languages in the desk checks, including on the iPad, to make sure you are OK with the look and feel?
IxD: Sounds good
Dev: What about an AT for the audit logging?
PM: This is not a formal requirement from our customers or security team yet, so I want it tested but it does not need to be an AT as it is not a must-have part of the specification.
QA: No need for any regression tests here as this is all new code and not dependent on anything else.
PM: Do you need a new environment to test with?
QA: We already have the pipeline setup so we can just deploy to that from the CI system and test there, so no, we should be ok.
PM: What do we think are the biggest risks?
QA: Well I think security is the biggest as this is a login page, but we will mitigate that with validation in the client and server side plus testing focused on circumventing security, including the lockout to prevent brute forcing. The next biggest is email validation, this is notoriously problematic as most email clients do not conform to the RFCs. We will mitigate this by using our standard email validation to be consistent and to have one place to change if customers complain. We can also monitor the audit logs to see if people regularly use different methods of commenting or ‘tagging’ their emails that we should allow for. I am not really concerned about performance (very little traffic between server and client) and we will do regular desk checks for usability and style including the error messages, localised text, and responsive design.
What does the story card look like now?
QA: We need to get more specific with our examples, we have captured the top level ideas and behaviours but we really want to provide examples (or specification by example) so that it is 100% clear how this will behave and so we know how we will demonstrate this to you for acceptance, so how about;
Given I have entered valid.email@mydomain.com and Password1 (valid pass) as the password
When I login
Then I should be presented with the dashboard page
Given I have entered invalid.email@mydomain.com and Password1
When I login
Then I am presented with an error message in red text saying “Invalid email or password”
Given I have entered valid.email@notopleveldomain as the email address
When I change focus from the email field
Then I am presented with an error message in red text saying “invalid email address”
Given I am entering invalid credentials for the 5th time in a row
When I login
Then I am presented with an error message in red text saying “Too many failed login attempts, please wait 5 minutes before trying to login again”
PM: OK, we know the scope now and we have Acceptance Tests defined, so what do we think the size of this story is?
At this point, we have clarified and agreed on the scope, and have a common understanding of what ‘done’ looks like in the form of some high-level acceptance tests. It is reasonable to guesstimate the size of the story at this point. But note that we are likely to guess much more accurately once we have talked through the design in a bit more detail, in the ‘how we will solve this need in code’ discussion.
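
To show how one of the agreed examples could later be turned into an automated check, here is a minimal sketch of the lockout rule (5 failed attempts, then a 5 minute wait). Everything in it is an assumption for illustration: the LoginService class, counting attempts per email address (the discussion above left the counting method undecided), returning "locked" without extending the lock when a locked-out user tries again (the discussion did not settle this either), and plain-text passwords, which are used only to keep the example short. The current time is passed in so that the test controls it.

```python
from dataclasses import dataclass, field
from typing import Dict

MAX_ATTEMPTS = 5        # agreed in the discussion above
LOCK_SECONDS = 5 * 60   # agreed in the discussion above


@dataclass
class LoginService:
    """Hypothetical in-memory service that counts failed attempts per email address."""
    passwords: Dict[str, str]  # plain text only to keep the sketch short
    failures: Dict[str, int] = field(default_factory=dict)
    locked_until: Dict[str, float] = field(default_factory=dict)

    def login(self, email: str, password: str, now: float) -> str:
        """Return 'ok', 'invalid', or 'locked'; `now` is a timestamp in seconds."""
        if now < self.locked_until.get(email, 0.0):
            return "locked"
        if self.passwords.get(email) == password:
            self.failures.pop(email, None)  # a success resets the run of failures
            return "ok"
        count = self.failures.get(email, 0) + 1
        self.failures[email] = count
        if count >= MAX_ATTEMPTS:
            self.locked_until[email] = now + LOCK_SECONDS
            self.failures.pop(email, None)
            return "locked"
        return "invalid"
```

Because the clock is an input, the "5th time in a row" example can be checked in milliseconds: make five bad attempts at chosen timestamps, assert that the fifth returns "locked", then assert that a valid login succeeds again once 300 seconds have passed.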

Don’t have time to do it right, but have time to do it twice?

Why don’t we have time to do it right, but somehow we do have time to do it twice?

My paraphrase of a quote by John Wooden (quote number 5)

How come we often think it is better to rush into something we don’t understand and hack at it than to take the time to understand what it is we are really trying to achieve and then think about how we will achieve that before we start coding?

Do we not learn from our mistakes?

Do we not see, measure or understand the cost of halting a developer, who is by now working on another story, to get them to switch context, think back to the rushed coding they did on the previous story, and diagnose the root cause of the bug we just discovered? Or see, measure or understand the time it takes to diagnose, resolve, then rebuild and re-test this change through the entire pipeline? All with a very real risk that, when the fix is handed back to the tester, she will find another issue, because the fix was rushed and we did not have enough coverage in our pipeline of automated checks to discover the regression that was introduced.

I have seen this pattern throughout my life and have been guilty of it myself. So what do I do about it?

Well I try to discipline myself, but what I do for my teams is to use BDD to ensure we have shared and common understanding of what story we are about to do, changes we are about to make, additions we are introducing. To ensure we all understand who needs these changes and why they are important to them. (Who the customer is and why they care). Then we, (the three amigos – slide 9), will be able to agree some high level examples of what ‘done’ looks like for this piece of work. These examples will be in the the form of tests that will adequately specify and thus prove we have delivered what was required. We call these the acceptance tests. They are defined before any coding is started. Hence the ‘driven development’ part of BDD. Where possible these tests are automated and are required to be passed, (via automation or manual checking), before the story is pulled by QA, (we use Kanban), to do a final exploratory test.

We are not perfect at this, but it really does mean that we can get a story accepted more often than not on its first pass through the pipeline. If it doesn’t, we all understand the work well enough to learn from the mistake(s) and improve.

My Agile QA Manifesto and Testing Principles

My Agile QA Manifesto

With reference to the original Agile manifesto I present my thoughts on an extension for agile QA or an agile testing manifesto;

  • Prevention over goalkeeping
  • Risk based test coverage over systematic test coverage
  • Tester skill over test detail
  • Automation over manual (for checking/repetition)

While there is value in the items on the right, I value the items on the left more

Testing Principles

And to follow that, a set of principles I try to follow and try to instill into those that work with me;

  • Fail fast/provide fast feedback
  • Test at the lowest layer
  • Test first (TDD/BDD)
  • Risk based testing for efficiency
  • Focus on tester skill and domain knowledge
  • Drive for automation for repeated checking (regression)
  • Learn from your mistakes – don’t repeat them

Fork And Ambush

Fork and ambush is my short way of describing the outcome of non-collaborative work on a story or the implementation of a requirement. I used to see this all the time in more waterfall like environments, but I am sad to say I still see and hear of this in more agile environments too.

The scenario is as follows;

Someone, usually the customer proxy (in my current company this is a Product Manager), provides a requirement, in our case a story along the lines of “As a …. I would like … so that I can …”
The Developer would then take this story and go back to his or her desk and start developing the code to deliver the functionality for the story.
The QA or tester would take this story and go back to his or her desk and start thinking about test cases that should be executed against the story.

This is the fork (both going their separate ways to think about the story individually)

Some time later …

When the story is developed the developer notifies the tester and testing begins.

This is effectively the ‘ambush’ part.

The tester is often trying to find fault with the developer by hunting for bugs in the developed code.
Often a bug turns out to be a difference of interpretation (of the story) between the developer and the tester


In the worst case the customer proxy (e.g. Product Manager) comes along to defuse the argument and informs them that they are both wrong: what has been delivered is not what was required, and the tests are also incorrect.

So, how can this be better?

Using a more BDD like approach, where the three amigos (Developer, Tester and Product Manager) discuss the requirement first.
Making sure each of them understands what is required (scope of work), who it is for (who the customer is), and why it is important to them.
They confirm this understanding by defining collaboratively the set of tests that will be used to prove what has been delivered is acceptable to the customer (what they needed and working in the way they need it to).
Then if the developer and tester fork at all then they both have a clear understanding of what is required.
The developer can ensure the developed code passes the acceptance tests.
The tester can ensure that in addition to passing the acceptance tests the developed code does not do anything unexpected and conforms to any non functional requirements that may have also been discussed etc.

The conclusion then is more likely to be a successful delivery as the work would not be accepted if the acceptance tests do not pass, and the developer and tester are on the same page as the Product Manager and can all see how they can work together towards the common goal.

Behaviour Driven Development – An Introduction

This is a presentation given by me and Marc Karbowiak at a local test meetup group called YVR Testing on the 2nd April 2014

PDF version of slides: BDD Intro

Intro

Disclaimers first: I am not an expert in Behaviour Driven Development (BDD), in fact I am just starting down this particular learning path. I have however been testing, and to a lesser extent automating tests, for many years. So I have learnt enough to know that this approach is a great one to try, as I can see how it will help to address many of the issues we experience. In particular it will clearly help prevent the types of issues that arise from misunderstandings, assumptions and ambiguity in our requirements.
Marc and I wanted to share some of our early experiences and those of other and better folks that precede us in learning BDD, as we feel strongly enough about this approach that we want to encourage and inspire others to learn and adopt BDD practices.

(slide 2) Setting the scene

We have probably all experienced ambiguous requirements, which means we have probably all experienced problems in our products as a result.
Some simple examples;

  • Product Manager (PM) & Interaction Designer (IxD) require a text box to have a 100 character limit.
  • Test automator leaves the requirements discussion, goes back to his desk and writes tests that will drive the UI to write one character and assert that the UI shows character count of 1
  • Developer leaves the same discussion, goes back to his desk and writes code which will show 100 characters and count down every time a character is entered.
  • The test fails and then a discussion ensues to figure out which one understood the requirement appropriately.

I have experienced many of these types of situations, quite often where the PM is saying that neither the tester nor the developer understood them correctly. In other words, both developer and tester misunderstood or misinterpreted the requirements, and now both need to go and re-do or refactor their work to deliver what the PM really wanted.

(slide 3) Are you often testing at the end of the cycle?

Perhaps you are in a waterfall like development lifecycle, or a fragile lifecycle?
Do the testers know ahead of time what they will test? Did they work that out from either a written requirements document or a requirements discussion at the beginning?

Or are the testers working with the developers, product managers and in our case interaction designers on a regular (daily?) basis to ensure we are always on the same page and on track to deliver what is really required?

(slide 4) Are you in an agile like environment?

In which case do you all speak the same language? Do you have a domain specific language that is understood and used by all?
I have been in many requirements discussions, story kickoffs or similar where it really seems like we are talking different languages.
In my current role we have a lot of domain specific terms which are either overused (used to mean more than one thing depending on context) or mean different things to different people. We recently had a problem where one team used the term ‘system variables’ to mean a specific kind of data we store about a member of one of our insight communities, while another team wanted to use the same term to refer to data we capture and store about a survey respondent’s computer system (e.g. browser & version, screen resolution and browser locale).
As a test why not ask 10 different people what a test plan is and see if you get 10 different answers.

(slide 5) Deadline approaching?

Does this mean you usually cut a few corners, rush your work, ignore some aspects of your process that perhaps you don’t think provide good value for the time they take?
So, why don’t we have time to do it right but we have time to do it twice?
(I forget where I first heard that phrase but it is a really powerful one for me)
We often cut corners or rush to meet a deadline, knowing really that we will have to come back and ‘fix it up’ or pay down some ‘technical debt’ later. And of course the cost of that will be higher than the cost of doing it the first time round.

(slide 6) A well known illustration of;

a) What the senior developer/designer designed
b) What got delivered
c) How it was installed at the customer site
d) What the customer really wanted

This is typically the result of some of the ways we have been working and the approaches we take.

(slide 7) So how can we be better?

We can adopt the test first approach – combine the red-green-refactor pattern of TDD with behaviours to get BDD
Defining a test first and then writing the code to pass that test provides many benefits;

  • Only write the code needed to pass the test (no waste)
  • Ensures code is testable – cannot pass test otherwise – requires observability etc
  • Test effectively documents the code
  • Safety net – test is added to CI system ensuring we never regress this code (if we do the test fails) – future code refactoring is done with confidence as we will know, as quickly as it takes to run these tests, if we have made a mistake
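As a minimal sketch of the test first idea, the Python below uses the 100 character text box from the earlier example. The names `remaining_characters` and `LIMIT`, and the choice of the countdown interpretation, are assumptions made for illustration only. The point is that the tests record which interpretation was agreed before the production code exists.

```python
import unittest

LIMIT = 100  # hypothetical limit, taken from the text box example


def remaining_characters(text: str) -> int:
    """Return how many more characters may be typed (countdown interpretation)."""
    if len(text) > LIMIT:
        raise ValueError("text exceeds the character limit")
    return LIMIT - len(text)


class RemainingCharactersTest(unittest.TestCase):
    # In a test first workflow these tests are written before the function
    # above and fail (red) until it is implemented (green).

    def test_empty_box_shows_full_limit(self):
        self.assertEqual(remaining_characters(""), 100)

    def test_counts_down_for_each_character(self):
        self.assertEqual(remaining_characters("a"), 99)

    def test_rejects_text_over_the_limit(self):
        with self.assertRaises(ValueError):
            remaining_characters("x" * 101)


if __name__ == "__main__":
    unittest.main()
```

Once these pass, the same tests can be added to the CI system and act as the safety net for later refactoring.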

(slide 8) Where do I start?

Another way to describe the test first pattern is the Acceptance Test Driven Development approach.
This fantastic diagram is borrowed from (add refs and links)
The four Ds
Follows the TDD red green refactor cycle as shown in the middle, i.e. this is iterative

(slide 9) Discuss

Ensure you have representation from all of the key roles to discuss the story or item of work. Typically we refer to the 3 amigos – Product Management (or Business Analyst or if possible the customer), Development and QA. At Vision Critical we also often have a 4th amigo – an interaction designer (IxD)
The discussion needs to unify the language, ensuring we are all discussing the same thing and have common and shared understanding
For example ensure we don’t mix development terms with product or customer terms, try to define a domain specific language (DSL) that we can all share and understand.

(slide 10) Distill

Discussion should then produce examples of the behaviour you want from the product or system. These examples are effectively how you will test you have developed what was really needed.
Use a ubiquitous language and structure to define these – Given – When – Then
Facilitates clear communication as well as structure that is easy to read and simple to follow

(slide 11) Develop

First develop the automation that asserts the behaviours (automate the tests first)
Then develop the code (production code) to pass those tests

(slide 12) Demo

Demonstrate the working code using the automated tests
Review the behaviour specifications with the customer or product manager
Add the tests to your CI system (keep running the tests to ensure fast feedback and a consistent safety net of tests)
Time to celebrate and perhaps retrospect on the story and capture anything you learnt and ideas for improvement so that you can apply all to the next story

(slide 13) Repeat the cycle for all the stories, learning and improving as you go.

(slide 14) Results

Hopefully you will have delivered what the customer really wanted and gained some additional benefits;

  • Executable specifications that can always be trusted to be true (otherwise your tests will be failing)
  • Automated regression tests that provide a safety net and fast feedback on those regressions – try to have these tests run as frequently as possible
  • Testable and thus maintainable code, not only can you look back at these tests in 6 months and know how the code works, but you also know the code is testable as it was written to pass tests

You will hopefully also feel proud of what you have achieved and will be recognised for that – if you are the only team doing this it will show!

(slide 15) What BDD is not

This all sounds great, so where do I get hold of this silver bullet or pink glittery unicorn?
Well BDD is not one of those, in fact it takes a lot of work to do it well, but it is worth it

(slide 16) How to get started?

There are a number of different BDD frameworks for the mainstream development languages, here are a few

  • SpecFlow is for .Net
  • Cucumber is mostly for Ruby
  • JBehave is for Java
  • Behat is for PHP

(slide 17) Gherkin anyone?

As well as being a pickled cucumber …
This provides the common and ubiquitous language that facilitates the simple and clear communication of behaviour
Here is an example of a feature, which contains a number of scenarios (tests for that feature)
The feature description here describes the feature and the context, in this case the problem the feature is trying to solve

(slide 18) Background

Background is a special keyword in Gherkin
In this case I am showing an example from the Cucumber Book that uses the background to setup the test preconditions – the Given for the following scenario(s)

(slide 19) Scenario

Here are the When and Then sections of this scenario
Very readable, understandable and clear

(slide 20) A question of style

A lot of folks (myself included) start writing scenarios in a similar way to how we would write test cases or code, by detailing all the steps that we need to execute in order to setup the product under test as well as perform the test and check the results.
This is an imperative style and it is not very readable, at least not when you are testing something less trivial than adding two whole numbers.
So we need to focus on a more declarative style, try to tell the story of the behaviours we want the product to have
This means hiding all of the details that are not relevant to the behaviour and keeping only the details that are important to the behaviour or the intent of the test

(slide 21) An imperative example

Lots of inconsequential detail here which means that the intent of the test is lost in the noise. We specify the email address and password but have to assume this means these are valid. Do we really need to know we clicked the login button? How does that help us understand if the product behaves correctly when we provide valid login credentials?

(slide 22) A declarative example

Hopefully you can all see this is much more readable and very clearly talks about what is important to the behaviour
This test is not about what makes a valid or invalid email address or password, it is about what happens when those are valid and the user is able to successfully login

(slide 23) DRY vs DAMP

Aim to tell a story rather than focusing on re-usability
An example here would be that we have steps like those in the imperative example;

Given I am on the login page
When I enter email as “example.user@mybiz.com”
And password as “Password1”
And I click the login button

Because these steps detail how I login we can easily re-use the same steps throughout the scenarios to login before executing more steps that are designed to assert a new behaviour, for example;

Given I am on the login page
When I enter email as “example.user@mybiz.com”
And password as “Password1”
And I click the login button
And I click the Start New Project button on the dashboard page
Then I should see a blank project
And I should be able to edit the project

Instead of re-using these steps this could have been written as;

Given I am on the dashboard page (it is not important for this behaviour to know what steps you took to login or what exact credentials you used)
When I start a new project
Then I should be able to edit my new project
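One way to picture how the declarative steps hide the login detail is the Python sketch below. `FakeApp` and the step function names are hypothetical stand-ins for the product under test and for step definitions; a real BDD framework would bind the Gherkin phrases to functions like these for you.

```python
class FakeApp:
    """Hypothetical stand-in for the product under test."""

    def __init__(self):
        self.page = "login"
        self.projects = []

    def login(self, email, password):
        # A real step would drive the UI or an API; this fake accepts
        # any non-empty credentials.
        if email and password:
            self.page = "dashboard"

    def start_new_project(self):
        if self.page != "dashboard":
            raise RuntimeError("must be on the dashboard to start a project")
        project = {"content": "", "editable": True}
        self.projects.append(project)
        return project


def given_i_am_on_the_dashboard_page(app):
    """Declarative step: how we log in is hidden inside the step."""
    app.login("example.user@mybiz.com", "Password1")
    assert app.page == "dashboard"


def when_i_start_a_new_project(app):
    return app.start_new_project()


def then_i_should_be_able_to_edit_my_new_project(project):
    assert project["editable"]


# The scenario now reads like the three declarative steps above.
app = FakeApp()
given_i_am_on_the_dashboard_page(app)
project = when_i_start_a_new_project(app)
then_i_should_be_able_to_edit_my_new_project(project)
```

The credentials still exist, but only inside the step, so the scenario text stays focused on the behaviour being specified.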

(slide 24) Scenario Table Example

Using data tables to test with multiple values that will not read well if all written on one line of a Given, When or Then
These enhance readability by keeping the data clear but separate from the declarative and meaningful phrases
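To make the idea concrete, here is a rough Python sketch of how a pipe-delimited data table, as written under a step, can be turned into rows of named values. `parse_table` and the sample data are assumptions for illustration and are not the API of any particular BDD framework.

```python
def parse_table(table: str) -> list[dict[str, str]]:
    """Turn a pipe-delimited table into one dict per data row."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in table.strip().splitlines()
    ]
    header, *body = rows  # the first row names the columns
    return [dict(zip(header, row)) for row in body]


# Hypothetical table that might follow a step such as
# "Given the following users exist"
users = parse_table("""
    | name  | email             | role   |
    | Alice | alice@example.com | admin  |
    | Bob   | bob@example.com   | editor |
""")

for user in users:
    print(user["name"], user["role"])
```

The step phrase stays short and declarative while the table carries the data.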

(slide 26) Scenario Outline Example

Sometimes you need to effectively test using the same steps but with a variety of test inputs, a scenario outline helps to avoid repeating steps each with different data values
In this case each row of the table is essentially one Given – When – Then scenario and will be executed sequentially
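The sketch below imitates a scenario outline in plain Python: the steps are written once and each row of the examples table supplies the data for one run. `remaining_characters`, `EXAMPLES` and `run_outline` are hypothetical names, reusing the 100 character text box example under the countdown interpretation.

```python
def remaining_characters(text: str, limit: int = 100) -> int:
    """Hypothetical behaviour under test: characters left before the limit."""
    return limit - len(text)


# Each row plays the role of one line in an Examples table:
# (typed text, expected remaining count)
EXAMPLES = [
    ("", 100),
    ("a", 99),
    ("x" * 100, 0),
]


def run_outline(examples):
    """Run the same steps once per row, in order, and record pass or fail."""
    results = []
    for typed, expected in examples:
        # Given an empty text box
        # When I type <typed>
        actual = remaining_characters(typed)
        # Then the counter shows <expected>
        results.append(actual == expected)
    return results


print(run_outline(EXAMPLES))
```

Adding another case is then a one line change to the table rather than a new copy of the steps.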

(slide 28) Hooks

These are really useful, they can simply call some code to setup or tear down your tests and are controlled using methods called Before and After
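For readers more familiar with xUnit style tools, the closest analogy is probably `setUp` and `tearDown`. The sketch below uses Python's `unittest` to show the order in which such hooks run around a scenario; it is an analogy under that assumption, not Cucumber's own hook API, and the `events` list and class name are made up for illustration.

```python
import unittest

events = []  # records the order in which hooks and the scenario run


class ProjectScenarioTest(unittest.TestCase):
    def setUp(self):
        # Plays the role of a Before hook: prepare state for the scenario.
        events.append("before")
        self.projects = []

    def tearDown(self):
        # Plays the role of an After hook: clean up whatever was created.
        events.append("after")
        self.projects.clear()

    def test_start_new_project(self):
        events.append("scenario")
        self.projects.append("blank project")
        self.assertEqual(len(self.projects), 1)


def run_scenarios():
    """Run the test case programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ProjectScenarioTest)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    outcome = run_scenarios()
    print(outcome.wasSuccessful(), events)
```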

(slide 29) Tags

Use these to label your features and scenarios within features
You can then execute only those features or scenarios that are tagged a certain way
Or filter out tests with a different tag
We use tags to group tests by team, to run certain subsets of tests in a certain environment, and now we are trying to use tags as a way of recording and reporting test coverage by labelling features and scenarios by the code area that they cover
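As a rough illustration of tag based selection, the Python sketch below filters a list of scenarios by one tag to include and one to exclude. The scenario names, the tag names and the `select` function are all hypothetical; real BDD tools provide their own command line options for this.

```python
SCENARIOS = [
    {"name": "Login with valid credentials", "tags": {"@team_a", "@smoke"}},
    {"name": "Start a new project", "tags": {"@team_b", "@smoke"}},
    {"name": "Export a large report", "tags": {"@team_b", "@slow"}},
]


def select(scenarios, include=None, exclude=None):
    """Return the names of scenarios that carry `include` and lack `exclude`."""
    selected = []
    for scenario in scenarios:
        if include and include not in scenario["tags"]:
            continue  # not labelled with the tag we asked for
        if exclude and exclude in scenario["tags"]:
            continue  # labelled with a tag we want to filter out
        selected.append(scenario["name"])
    return selected


print(select(SCENARIOS, include="@smoke"))
print(select(SCENARIOS, include="@team_b", exclude="@slow"))
```

The same labels could be counted per code area to give the kind of coverage report described above.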

And that wraps it up for the presentation part.