Observation – QA Matters

With reference to my original post on The Streetlight Effect; I still see this effect today.

Recently, I was discussing the testing of a mobile app with an experienced QA, who is a very experienced black box tester. I am currently encouraging and coaching this tester to learn more about programming, code design, code architecture etc as I think any knowledge in these areas can make you a more effective and efficient tester.

This tester was seeing some problems when interacting with the mobile app UI and was going to raise a bug with the developer. I asked this tester if they were observing the mobile app. requests and responses to and from the backend API, i.e. the requests and data being sent and received either across wifi or cellular data between the mobile app. and our cloud-based backend server API. The tester was not sure how to look at that traffic. The streetlight effect – I will look here (the UI) for my keys (any issues) because I know how to look here (the UI). So I shared a how-to I had written years ago for Charles Proxy which will allow the tester to inspect the https requests and contents along with timings and more, by proxying all requests made from, and responses to, the mobile app. through his laptop. This enabled the tester to see the requests that were failing as well as the resulting error message – the tester was then able to raise a much more detailed bug and the developer was able to get straight to the problem and fix it quickly. (As opposed to a bug like, when I do this in the app, I don’t get the UI screen/data I was expecting to see displayed. Where the developer would have to first reproduce the problem, watch the traffic themselves to pinpoint the problem – taking much longer and possibly with some back and forth to clarify reproduction steps etc.)

Another example is another very talented tester who is also a strong coder so has great white box testing skills. This tester was digging into some performance issues. Trying to understand why a request for a large dataset from an API was taking so long (when the system was otherwise quiescent), and would often fail if there was any other activity (API GETs and POSTs to requests for data and to store data). We are using AWS and there is a myriad of tools and monitoring capabilities to learn and get your head around. This tester was able to extract the time taken to complete the request for data and plot this against the size of the data extracted. If you think about this visually, this tester is looking at this from a black box perspective, making a request knowing what was requested and extracting the start time of the request from the log, then extracting the completion/response time from the same log, then plotting this against the size of data returned. (The tester was increasing the data stored and thus retrieved between each test/request). In this case, being capable of white box testing and understanding the code and system architecture this tester knew that there were several key components involved in servicing this request, but was not observing any of them. In order to understand what is really going on, and in this case be able to pinpoint the parts of the system that was taking a long time to service the request and which ones would fail when there was any other activity. Most performance issues are as a result of some form of resource exhaustion, e.g. CPU, Memory, Input-Output (IO), threads, connections etc. So, we really want to be able to see how these resources are being consumed when we interact with the system, as this can lead us to understand that our CPU usage is spiking to 100% when we do something and thus cannot do more when more requests come in, or that our memory spikes up and never comes back down to the pre-request level once the request is complete – in other words some memory is not being freed leading to a resource leakage (we will exhaust this resource over time). In this case, learning how to observe the individual service docker container resources and the database resources will likely lead us fairly quickly to where the problem(s) or weak link(s) in our data (chain) are for the request we are making.

In conclusion, what we need to do more often is to ask questions like;

What could I watch or monitor to see, in more detail, what is happening/going wrong?
What is the data flow – what path does the data takes through our system and what components are involved?
How is the system architected and how do all of the components communicate
How can I observe the communications between the system components?

If any of these questions result in a “I don’t know” or similar, then ask your colleagues for help, you are likely to learn something new, even if that something is that your colleagues also don’t know the answers to some of these questions.

Author: Stuart Ashman

I am currently working as the Director of QA at Vision Critical a market research software and services company. I have been working in a variety of roles involving testing and quality assurance for over 20 years. I started off testing flight deck instruments and progressed through GSM network operations software, Unix Operating Systems and Lights Out Management Firmware, into Anti Virus and Anti-Spam software and HW appliances, finally spending a short period of time testing cloud provisioning and control software before entering into my current position. View all posts by Stuart Ashman