Quality Bits

Performance Testing with Nicole van der Hoeven

November 28, 2023 Lina Zubyte Season 2 Episode 7

Performance testing for many people in tech is a daunting task: there's an overwhelming amount of tools, frameworks, or even different types of performance testing. Where can we start? Is it all about faster applications? Are there ever cases where speed takes a back seat?

This episode's guest Nicole van der Hoeven is a Senior Developer Advocate at Grafana Labs. With over 10 years of experience in software development and a focus on performance testing, Nicole has plenty of stories full of learnings to share about performance testing (and beyond). In this episode, you'll learn more about performance testing, learn about some resources worth checking out, and get a reminder on why any type of testing is all about... empathy.

Find Nicole on:

Mentions and resources:

If you liked this episode, you may as well enjoy this past episode of Quality Bits:
Observability, Monitoring, and Platform Engineering with Abigail Bangser https://www.buzzsprout.com/2037134/12664243

Follow Quality Bits host Lina Zubyte on:

Follow Quality Bits on your favorite listening platform and Twitter: https://twitter.com/qualitybitstech to stay updated with future content.

If you like this podcast and would like to support its making, feel free to buy me a coffee:

Thank you for listening! ✨

Lina Zubyte (00:06):
Hi everyone. Welcome to Quality Bits, a podcast about building high quality products and teams. I'm your host, Lina Zubyte. In this episode I'm talking to Nicole van der Hoeven. Nicole is a developer advocate at Grafana Labs. She often speaks about productivity topics like Obsidian, and she knows a lot about performance. What I like most about talking to her is that she keeps reminding us that testing is a lot about empathy and understanding our users. So in this conversation, even though we're talking about performance testing, what it is, and what benefits we have there, we also have lots of stories as well as a reminder to get to know the user. Enjoy this conversation.

Hi Nicole. Welcome to Quality Bits.

Nicole van der Hoeven (01:13):
Hi Lina. Thank you for having me. I'm glad to be here.

Lina Zubyte (01:16):
I'm so glad to get to talk to you. I've been following your work for quite a while, and I started using Obsidian because of you.

Nicole van der Hoeven (01:27):
That's great.

Lina Zubyte (01:27):
I'm getting this out of the way because you may mention it, you are known as someone who likes Obsidian and it works. You influence more people to use it.

Nicole van der Hoeven (01:38):
Yeah, I definitely love Obsidian. It's funny that you just got it out of the way. I'm sure it would've come up anyway.

Lina Zubyte (01:46):
So how would you briefly describe yourself?

Nicole van der Hoeven (01:48):
For those who don't know me, I'm Nicole van der Hoeven. I'm a senior developer advocate at Grafana Labs. I've been in performance testing for I think over 12 years now. Enough to lose track of time. I am also a YouTuber in my personal life, I guess. I talk about Obsidian and productivity and personal knowledge management. I'm a big note taker and yeah, I love to travel. I love slow travel and languages and people and all that stuff.

Lina Zubyte (02:28):
Music to my ears, slow travel. It's something fairly recent to me. What is the latest fun or random fact you heard that you didn't know about?

Nicole van der Hoeven (02:39):
Oh, okay. I just learned that the word München, the German name for Munich, actually relates to the monks that were there initially. So the city of Munich was initially just for monks. They had a whole marketplace, but it was only for monks. I learned this on a walking tour.

Lina Zubyte (03:05):
Wow, that's a great fact. I mean, I do know that quite a few things are influenced by monks like cappuccino even or Capuchin...

Nicole van der Hoeven (03:13):
And beer!

Lina Zubyte (03:13):
Monkeys, right? They are from the monk stuff. So to get back to the whole performance testing topic, how would you define performance testing? What is it? It's such a word that we throw around sometimes. How can we understand it in easier terms?

Nicole van der Hoeven (03:30):
Yeah, so I'm not going to give you a textbook definition. I think that that's easy to look up. I think in general application performance, or just performance in general, is the measure of how well something works for the intended purpose, and testing is just verifying that. I think that that vagueness or that ambiguity leads a lot of people to define it in different ways. I think a lot of people, for example, think of performance testing as being speed based, and certainly speed can be part of performance, but I also want to kind of move away from it just being speed based. I want to move away from it being in direct opposition to functional testing. We hear that a lot, functional and non-functional testing, and performance is often lumped in with non-functional testing. I would like to challenge that because I think that even things like speed, or how something works, are functional. If something falls over because of load, how can you say that's not functional? It is a functional issue.

Lina Zubyte (04:40):
I really like that because I think we sometimes also say functional bug, or it's like a functionality we're implementing, and then somehow people differentiate these strange terms. Once I was in a discussion where they were really opposed to using the word bug and people wanted to use defect, then a third group wanted to use issue, and I was like, oh my goodness, are we going into the semantics of everything?

Nicole van der Hoeven (05:05):
Yeah, I think that we can get lost in the weeds when we talk a lot about the terminologies. For example, a lot of people classify performance under reliability. Some people classify reliability as being under performance. I don't think it really matters as long as you're talking about the same thing. I mean words are words and they're different in every language, but as long as everyone is focused on quality and especially aspects of quality that are a little bit trickier to measure, then I think that you're on the right track.

Lina Zubyte (05:42):
What would you say are the benefits of performance testing? Why should we put in effort and time to do that?

Nicole van der Hoeven (05:51):
I think that the benefits of performance testing are the same as the benefits of any kind of testing. It's really about verifying user experience, but maybe aspects of user experience that are difficult to reproduce with a single user. When what we call manual testing is carried out, that's usually just one person going through a certain number of steps, and there is an element to performance of looking more holistically at the entire application as a system. It's kind of like an ecosystem. You could track one animal or you could track that animal and that herd of animals and how they interact with their environment and how they interact with the predators in the chain and the things that are below them on the food chain. It's the entire system. I think in general, you can think of performance in terms of an acronym that I got from one of my ex managers, Tim Koopmans.

He coined SEARS, S-E-A-R-S, which I think is also a department store in the US, but it stands for: scalability, elasticity, availability, reliability and speed. Those are kind of useful frames of reference I think for performance. And one of the benefits of doing performance testing is that you actually get to see what your application does in a very realistic situation. Sometimes that is under load, sometimes that is just in a production-like environment, and I think that we should look into performance and put effort towards improving it because if you don't, then you won't really know how other people view your application. It's all about getting that user experience dialed in.

Lina Zubyte (07:50):
So there's lots of types of performance testing and people also sometimes may confuse them or mix them up when they use the word performance, they may mean a lot of different things and there's soak testing, there's load testing, stress testing, there's lots of different kinds. How do you choose what you should start with or what you need in a certain situation?

Nicole van der Hoeven (08:17):
I think it should be primarily risk-based because it really depends. I think you should always start with why. With any kind of testing you need to know why you're doing the testing. It's not good enough to just say, oh, well performance testing is a checkbox and we need to check it somehow, so let's just do something and call it performance, and then that's our job. I think it's more useful to start out with what are the actual risks here, and the risks are different for every application. Some things might be more mission critical. If you are testing something to do with anything medical, really, it might be more prudent to focus on those things that doctors need access to immediately, for example, versus if you are trying to performance test an entertainment system, then maybe it's not going to be as mission critical. There's also the difference between something that is public facing and an intranet that is going to be used only by people within the company and doesn't carry any risk of damage to the reputation of the company. So I think you should start with what are the risks here and then you should test those first, and that might be something like, oh, well last month we already had a production incident that caused a lot of customers to complain. Maybe you should start with that. So every application's going to be different and there's always going to be a period of time where you are just trying to learn what the risks are for that particular industry and application.

Lina Zubyte (10:03):
A lot of companies tend to just have it as a checkbox that we should do performance. It's important, without understanding the reason why, and then it becomes sort of a reactive process. There's an outage in production and then we're suddenly realizing why it matters and why we should put in effort. But it could be a good starting point because then you're like, okay, this is the situation from which we can learn and this is what we can build on when it comes to performance.

Nicole van der Hoeven (10:36):
I think people usually want some sort of clear-cut answer: okay, first time on a brand new project, let's say you know nothing, here's what it is. And that was my disclaimer to say there is no silver bullet. However, there are some places that you can start. You can start by mapping out the entire system: what does a user do? Really understand what typical users do for your application and kind of trail them if you can. Go through historical metrics and see which pages have historically been accessed the most. What's the history of production incidents like? Even customer support: I think they're underutilized. I think it's super important for any performance engineer to go to customer support representatives and say, what are the sort of operational complaints that you've had that we maybe need to address? And start compiling a list that gives the entire picture of where your application is at, and then go from there.

Lina Zubyte (11:46):
I think that's very good advice for making sure you have good benchmarks because that is essential for any kind of testing we want to do. What are we aiming for? What are we comparing the results with? What does it mean that something is better or worse quality? We have to establish the baseline.

Nicole van der Hoeven (12:06):
And if it's an application that hasn't been released yet, there are still some things that you can do. For example, you can go to competitors' websites and benchmark the performance of their sites. I wouldn't suggest load testing any infrastructure that you don't own or control, but I think that any performance engineer worth their salt will also look at that perspective of it. In the absence of any other information, I think it can be useful to see what you're up against.

Lina Zubyte (12:37):
Yeah, absolutely. So performance testing as a topic has been in your life for quite a while now. How did you get into it? What sparked your curiosity to learn more about performance testing and you're still working with that, so what made you stay?

Nicole van der Hoeven (12:55):
Yeah, I love it. I guess I've just always been like a minmaxer. I don't know if you know that term, but in video gaming, it describes a type of gamer that is always optimizing, is always respeccing or changing what their character's like to eke out a little bit more productivity. I mean you can see this as a theme in my life. In my personal life, I talk about productivity and note taking and how to make the most of things, and I've always been interested in that in all aspects of my life, trying to optimize everything, and that can come with disadvantages as well. I mean, I don't think anybody really knows about testing when they're entering uni. I certainly didn't. So I came to tech in a non-traditional way. My degree is in economics and I minored in math, so a very different course for my career there that I didn't end up taking.

I was always interested in technology and I really loved it, because the first time that I heard about testing in general, that was the first time that I knew there was anything to computer science beyond development. I always thought, oh, development is going to mean that I'm just talking to machines and not to people, and I love the idea of testing being a bridge between people and machines, really trying to bring the human element to something that can be quite technical, and that's the part that I love about it. And the performance part, I think I kind of just lucked out on that and really connected with my interviewer for the first job, the first tech job that I had, and I loved it. I think I just took to it quite naturally.

Lina Zubyte (14:54):
What are some of the learnings or lessons that you had that you wish you learned earlier when it comes to performance testing?

Nicole van der Hoeven (15:04):
I think that very early on I was very focused on technical things. It's like what language should I learn next? What tools should I learn next? What framework should I get certified in? I think that that is a very common mistake. I don't want to say it's not like those things are wrong to do. Those things are important too, but what I wish that I'd learned sooner was that the most important quality that a performance tester or any tester can have is empathy. I think fundamentally our job is about understanding how people use software and really relentlessly optimizing with that in mind, prioritizing user experience over anything. But in order to be able to do that, you also need to know how they use it because how you might think they use it or how a developer or someone from the business might think that a user uses things may not be how they actually do.

So a lot of the job of a performance tester is really honing in on real use cases and not the logical use cases, because humans aren't entirely logical. Another thing that I wish that I had learned earlier is that sometimes it's not about optimizing the machines. Sometimes you also need to think about the human aspect that's involved. So for example, I've been in a situation where I was on a team where we were trying to improve the performance of a particular process, and that process, to kind of sum it up, created letters at the end that were sent to customers. We had performance tuned it so much and really made significant improvements. What we didn't realize was that when those letters were sent out, they were a little bit, not controversial, but they inspired a lot of customers to call and complain because they were about a legal change.

What we didn't optimize for was the customer support. They were a team of three people. They could not keep up with the number of calls that they were getting because we were sending out these letters so quickly, and we didn't account for that. We were focused on the technical aspects of it, on the computer parts, but that's not all that performance is. What's the point in having those letters out so quickly if the customers aren't able to really talk to someone from the company about them? So what we actually ended up doing, which is mind boggling and really quite a failure on our part, is we had to institute a delay in the process that we had performance tuned because we didn't want so many letters to be sent out that quickly after all. So this is a classic example of why you should start with a why.
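The delay described here amounts to matching the sending rate to what customer support can absorb. A minimal Python sketch of that arithmetic follows. Only the team size of three comes from the story; the function name, the number of calls an agent can take per day, and the number of calls expected per 100 letters are hypothetical values chosen for illustration.

```python
def max_letters_per_day(agents, calls_per_agent_per_day, calls_per_100_letters):
    """Largest daily batch of letters that support can still answer.

    All three inputs are assumptions to be replaced with real figures.
    Integer arithmetic is used so the result is rounded down, never up.
    """
    if agents <= 0 or calls_per_agent_per_day <= 0 or calls_per_100_letters <= 0:
        raise ValueError("all inputs must be positive")
    daily_call_capacity = agents * calls_per_agent_per_day
    return daily_call_capacity * 100 // calls_per_100_letters


# Hypothetical numbers: 3 agents, 40 calls each per day,
# and 10 calls expected for every 100 letters sent.
daily_cap = max_letters_per_day(3, 40, 10)  # 1200 letters per day
```

With a cap like this, the batch job would be throttled to the human bottleneck rather than to what the servers can manage.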

The reason for performance testing was not, as we thought, to just make it as fast as possible. Really what we failed to see was that the why was to give customers a chance to voice their complaints and a chance to understand the implications of these legal changes, and we failed to take that into account. I feel like this is something that I've seen over and over, so I'm a little sensitive to it and I now start to see it in other things. Like during the time of the coronavirus in the Netherlands, there was this hotline where you could call and make an appointment and you could get tested, and I don't know really what the system was like. I heard that it wasn't great, but even when they started to improve it, I personally had the experience of getting to the testing center and it being completely blocked in terms of traffic. Again, that's a case where it's like, yes, your system was able to make appointments, but nobody could get to their appointments in time because the physical location of the testing center was so blocked up with traffic that we couldn't actually get to that point. So it's again another case where you might have performance tuned the system, but that doesn't have any bearing on what the reality is when there are humans and physical logistics involved.

Lina Zubyte (19:50):
That's such a powerful message, which comes to this one point of empathy and seeing a bigger picture, not only having tunnel vision about a certain technology or tool, or it's just my part and my box and I just do this task and I don't care about anything else. I think more testers, QAs, anyone doing testing, humans in tech should keep it in mind. It's also very beautiful because whatever we're doing is likely a part of a bigger system. So we do have a little bit of an impact and we may be affecting something else with our work. And I think in general, talking about performance tools and performance testing, most of the tutorials that we have about performance testing are from tool vendors. So it's not that much talking about the why or the reasons why we're doing it. It's more like, oh, learn this or that, and this technical kind of aspect is the main thing that matters. But the hardest part very often is to understand why we're doing it, what we are comparing it with, and what kind of testing we should be doing in our case in general.

Nicole van der Hoeven (21:02):
Yeah, and I guess I should also say here that I do have a bias because I'm employed by Grafana Labs, which does have a performance testing tool, k6. However, I want to talk about a perfect example of why speed isn't always the best thing or what you should aim for. This is a little bit of a secret about our SaaS platform, k6 Cloud, and it speaks to the psychology of humans being more important than just speed. Because on our own site, and we are a performance testing service, there's a little animation that happens once you set up your test and you choose a number of users. Then you click run, and by this point you already have a script and everything. There's a nice animation of servers and traffic going to servers and getting sent to the cloud and all that. What people don't know is we actually stay on that page longer than we really need to just to finish playing that animation.

So that is the opposite of what most people think of when they think of performance testing as something that's super focused on speed to the exclusion of everything else. What we actually found is that it gave people nice feelings to watch this very satisfying animation to its conclusion, and it's only a few seconds, but keeping them on that screen longer actually made the experience better. So are you going to go with the experience that's qualitatively better or are you going to go with the experience that's quantitatively better but qualitatively worse? In that case, it doesn't have anything to do with the tool at all. It has to do with the humans that are using it. I think that's really missing sometimes. We always speak of performance in terms of response times as if experience can be quantified all the time into a number.
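The pattern behind that animation, holding a screen for a minimum time even when the work finishes sooner, can be sketched in a few lines of Python. This is an illustration of the general idea only and not how k6 Cloud is implemented; the function name and the injectable `clock` and `sleep` parameters are assumptions made so the sketch is easy to test.

```python
import time


def run_with_min_duration(task, min_seconds, clock=time.monotonic, sleep=time.sleep):
    """Run task(), then wait so the whole step lasts at least min_seconds.

    Returns the task result and the total time the user spent on the step.
    """
    start = clock()
    result = task()
    elapsed = clock() - start
    remaining = min_seconds - elapsed
    if remaining > 0:
        # The work finished early: keep the user on the screen a bit longer.
        sleep(remaining)
    return result, max(elapsed, min_seconds)
```

If the task takes longer than the minimum, no extra wait is added, so the floor never makes a slow step slower.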

Lina Zubyte (23:07):
Wow, that was a good lesson. Experience cannot be quantified just in a number. And as you say, even chatbots for example, they also have this waiting aspect. I worked once on a chatbot development and that's exactly what we had to implement, which is sometimes not being too quick with a response because a human is not ready for that. They may want a little bit of a break, a sign that something is happening, which gives us an impression that something is going on behind the scenes, and we expect it somehow. We don't want it to be all mechanical and without a heart.

Nicole van der Hoeven (23:48):
Yeah, I think relentless optimization is a trap, and there are some transactions, for example, if you were applying for a mortgage. So I've worked for the National Australia Bank where we did handle different applications in different stages for a mortgage, and that was a bit of an interesting one because while we wanted to tune the entire process in general, there were some parts where, for example, when you're getting the approval or disapproval or whatever for your mortgage, people don't trust it when you get a response in less than a second. You've just gone through this entire process of putting in your bank details and your employment history and your assets. Would you really trust it if you got an approval in 300 milliseconds? It's a lot of money that you're potentially thinking about. So that is different for every industry and for every use case.

There are some cases where it's actually better if it's a little bit longer, and that comes with more trust on the part of the user that like, okay, you really did safeguard my information and you did the necessary checks and you're not just giving me an answer just to give me an answer. It's so interesting. I still really love figuring out those things. I know when I was working for a gambling company, an entertainment company in Australia, we had to do with horse racing, and you would think that the peak load for the traffic for that site would be maybe hours to days before, as people placed their bets, when actually 80% of the traffic came within the last maybe five minutes before the start of a race, because we don't want to spend money earlier than we really need to. And another thing with that use case is that initially, when we were doing the load testing for that, what we didn't account for is that there's another spike in traffic just before the race ends because people are reloading their account pages looking to see if they've already gotten paid.

That's not something that you would necessarily think of unless you are watching user behavior. And that's when we get into things like observability because with testing, it's very much dealing with lab data. You're not dealing with real users when you're doing the testing and you still have to perform those tests with an element of humility thinking, I'm not going to be able to capture everything and there are still going to be some behaviors that I may just have to spot in production. And that's the importance of field data and observability for that data. You kind of really need both sides of the coin there.
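The betting pattern above can be turned into a rough load profile for a test. The Python sketch below is an illustration under stated assumptions: the 80% share and the five-minute rush come from the conversation, while the function name, the 60-minute lead time, and the total request count are hypothetical. It also assumes traffic is flat within each window, which real traffic will not be.

```python
def betting_load_profile(total_requests, lead_minutes=60, rush_minutes=5, rush_share=0.8):
    """Return the target requests for each minute leading up to race start.

    The last `rush_minutes` carry `rush_share` of all traffic; the rest is
    spread evenly over the earlier minutes (a simplifying assumption).
    """
    if not 0 < rush_minutes < lead_minutes:
        raise ValueError("rush_minutes must be between 0 and lead_minutes")
    if not 0 < rush_share < 1:
        raise ValueError("rush_share must be between 0 and 1")
    early_minutes = lead_minutes - rush_minutes
    early_rate = total_requests * (1 - rush_share) / early_minutes
    rush_rate = total_requests * rush_share / rush_minutes
    return [early_rate] * early_minutes + [rush_rate] * rush_minutes


# Hypothetical volume: 10,000 bet requests in the hour before one race.
profile = betting_load_profile(10_000)
peak_to_baseline = profile[-1] / profile[0]
```

The second spike, account-page reloads just before the race ends, could be modelled the same way as a separate scenario with its own share and window, ideally taken from production data rather than guessed.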

Lina Zubyte (26:55):
Some people may say that if we have monitoring or observability practices, why do we need performance testing or the opposite? What are your thoughts on that?

Nicole van der Hoeven (27:07):
To me, that's kind of like saying, well, I live in the Netherlands and we have great medical care, so why should I exercise? Why should I care about what I put into my body? Because I know that if I get sick, then I can just go to the hospital and they can operate on me. I think it's silly to discount either. It is good to have something in place for emergency situations, something to monitor your health after something's wrong, but it's equally silly not to try to prevent that. And I think that it's a false binary because people tend to put them against each other observability versus testing, but it's not in opposition at all. I think they actually work quite well hand in hand. It makes sense to me that you would want to try to prevent something and account for all the situations that you can possibly predict and maybe some that you don't predict, but it's also not enough because as I said, humans are irrational. Things happen in production that you might not ever be able to expect, and it's hubris to believe that you can possibly test for every situation.

Lina Zubyte (28:31):
How would you define the benchmarks if you do not have observability and monitoring? If it's a product that's open source.

Nicole van der Hoeven (28:39):
I think it depends what kind of performance you want benchmarks for. Do you know if it's front end performance or backend or do you not know either?

Lina Zubyte (28:51):
Everything, right? It's like we should care about it, but how and what matters? It's really hard to understand what matters when you don't have data and when you don't know your customers as well, so you don't know how they may be using it.

Nicole van der Hoeven (29:07):
So I think it would be good to start with some user interviews maybe, and I mean if you think that you are a user for it, maybe write down the things that you do and where you would go and the kind of user flow, and then seek to verify that behavior or contradict it with other users that you might identify in the community. And then you can go based on risk and see what people are using this for and what the most critical part of it is. Then zero in on that. And then, are there other things that maybe are not open source, web apps perhaps, that do that same thing that is so mission critical to you? Google has some great statistics per industry on these sorts of things. I mean, three seconds is the typical response time that they're looking for, but it also depends on whether it's a web app or something that is run locally on people's machines. That's going to change it as well. Maybe if it's something that's going to be published, then you likely will need some form of load testing, but if it's really just something that people download, then maybe you just need to performance test it based on one user.
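Once a target such as the three seconds mentioned here is chosen, measured response times can be checked against it. The Python sketch below is one simple way to do that, assuming a list of response times in seconds is already available; the function names and the sample values are made up for illustration, and the percentile uses the nearest-rank method rather than any particular tool's definition.

```python
def share_within_budget(samples, budget_seconds=3.0):
    """Fraction of response times at or under the budget."""
    if not samples:
        raise ValueError("samples must not be empty")
    return sum(1 for s in samples if s <= budget_seconds) / len(samples)


def nearest_rank_percentile(samples, percent):
    """Nearest-rank percentile for a whole-number percent from 1 to 100."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 1 <= percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    ordered = sorted(samples)
    rank = -(-percent * len(ordered) // 100)  # integer ceiling
    return ordered[rank - 1]


# Made-up response times in seconds for a single page.
response_times = [0.8, 1.2, 2.9, 3.4, 1.1, 0.9, 2.0, 1.5, 4.2, 1.0]
within = share_within_budget(response_times)
p95 = nearest_rank_percentile(response_times, 95)
```

Looking at a high percentile alongside the share within budget helps avoid the trap of a healthy-looking average hiding a slow tail.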

Lina Zubyte (30:30):
Yeah, I guess also to add to that, we could take a look at known performance issues in the past and use them as a base for defining the benchmarks. So you mentioned Google's statistics, which sounds like a good resource, but if we wanted to start in performance testing, are there any courses, books, anything that you would recommend to learn more about it?

Nicole van der Hoeven (30:55):
Yeah, so I cut my teeth on Steve Souders' books. He has two books actually. One is High Performance Web Sites and the other one is Even Faster Web Sites. That's more based on front-end performance. I think that front-end performance is a good place to start because it's like the Pareto rule: 80% of the gains that you can get are actually on the front end. And so I think that if you're just starting out, those are really good resources so you can understand that performance is not just about necessarily making the underlying infrastructure fast, but also how it's presented. Simple things like JavaScript being loaded at the start of a process versus at the end can already affect how things are rendered. After that, maybe go into backend performance testing. And for that, I would start getting into site reliability engineering, and the Google book on this topic is a very critical piece of work because Google has a lot of practice at making high performing infrastructure, and I think that there's a lot that you can learn just on YouTube. I am a little bit dubious about formal certifications, I guess. I think that a lot of formal education is just about regurgitating what you've memorized when performance testing is nothing like that in practice.

Lina Zubyte (32:33):
Thank you so much. I'm sure that I have some resources to check out right now and the listeners as well. So to wrap up this conversation, where we just scratched the tip of the iceberg when it comes to performance testing, I think it was really interesting to learn more about what matters, which is people and trying to see the bigger picture, asking ourselves why we're doing it. To come full circle, what is the one piece of advice you would give for building high quality products and teams?

Nicole van der Hoeven (33:07):
I think I already kind of talked about this, and it's to hone your empathy for each other, for the users of your product, for the builders of your product. I think the whole tech industry is kind of daunting sometimes, and there's a tendency for people to go into it thinking the most important thing for me to learn is technical things, when actually the entire industry would do better with a lot more empathy.

Lina Zubyte (33:37):
I really like that. Thank you so much for your time.

Nicole van der Hoeven (33:41):
Thank you for having me. It was great to talk to you.

Lina Zubyte (33:44):
That's it for today's episode. Thank you so much for listening. Check out the episode notes for any references we've made in this episode, and until the next time, do not forget to continue caring about and building those high quality products and teams. Bye.