The Testing Show: Automation and Defining “Done” (Part 1)
Have you wondered how your team could better utilize its automation resources? Does your definition of “Done” include new automation efforts for stories that are in flight? How about when changes to functionality (or new additions) cause your old tests to stop working? Do we play continuous catch-up, or is there a better way to apply automation efforts?
Angie Jones of LexisNexis joins us to talk about better ways to have those automation discussions, who should be responsible for what, and how everyone on the team can contribute to automation efforts (hint: you don’t need to be a coder to help make great automation, but it certainly helps).
Also, this week we delve into Spotify taking over hard drives with continuous writes that could shave years off of their operational life, and ask whether Uber’s autonomous vehicles are even close to ready for prime time.
This is part one of a two-part series. Come back in two weeks when we continue our conversation with Angie.
- Spotify bug is killing users’ desktop hard drives
- Tooth of the Weasel: Building Quality
- Uber admits to self-driving car ‘problem’ in bike lanes as safety concerns mount
- Google is spinning its self-driving car project out into its own company
- Uber says it’s reviewing incident of self-driving car running a red light
- The Clojure Programming Language
- Lisp (programming language)
- Uber supposedly found a legal loophole within San Francisco that allowed them to run these tests
- Mark up the world using beacons
- What is Apple iBeacon? Here’s what you need to know
- How to Get Automation Included in Your Definition of Done | Angie Jones
- The Forgotten Layer of the Test Automation Pyramid
[This is part one of a two-part episode with Angie Jones of LexisNexis. We will have part two of our conversation with Angie in two weeks.]
MICHAEL LARSEN: Hello and welcome to the testing show. I’m Michael Larsen your show producer, and today we’d like to welcome Perze Ababa.
PERZE ABABA: Hello, everyone.
MICHAEL LARSEN: Matt Heusser.
MATTHEW HEUSSER: Good morning, or good time zone you are in.
All: [Laughter] Sorry, I can’t.
MICHAEL LARSEN: Jess Ingrassellino
JESSICA INGRASSELLINO: Hey there.
MICHAEL LARSEN: And we’d like to welcome our special guest, Angie Jones. Angie, please introduce yourself.
ANGIE JONES: Hi everyone. I am a consulting automation engineer. I work over at LexisNexis in Raleigh, North Carolina, in the US. What I do is, I consult with several teams there on providing an automation strategy. Not only that, but I am also a practitioner, so I sit in with the teams and help them get going, build up automation frameworks, train their testers, educate the team on automation, that sort of thing.
MICHAEL LARSEN: Fantastic. All right, well then let’s get started with our news segment. Matt, you got something to share?
MATTHEW HEUSSER: So many things. I want to start with direct testing, and move backwards a little bit. Spotify writing to your hard drive. Spotify is an app, it seems to work fine, there seemed to be no significant problems, and yet it writes massive amounts of data to your hard drive. Very specifically, it’s a large database file that seems to be mostly redundant data, and it can be 5 to 10 GB of data in less than an hour, even when the app is running idle. Leaving it running for periods longer than a day can be up to 700 GB. The reason this is a big issue is if you have an SSD and you fill it up, it can suddenly become slower. I guess the testing angle here is this is not in the requirements. It’s not described anywhere. This software works fine. And yet, there’s this problem. And I don’t think that’s unusual. Our models of what we are testing, I think, need to expand. And also, how the heck did that happen? That seems like a big red flag, and how did somebody not catch that?
MICHAEL LARSEN: I can sort of understand this if you are actively listening. It seems excessive to me, but then again, if you want to be able to buffer your stream and be able to have continuous playback, that… it’s annoying, certainly, but it doesn’t seem completely out of the question, especially if you’ve got a steady stream going, but it also sounds to me like this is happening even when you are not playing. The app just being active is doing this.
MATTHEW HEUSSER: Well, even if you are streaming, it’s streaming. Put it in memory. Buffer it. Why is it writing to your hard drive that much? There are sort of client-y server-y things which you want to cache, or whatever, but peer to peer file sharing doesn’t do that much writing to your hard drive.
ANGIE JONES: You know, I go to a lot of dev conferences, as well as testing conferences, and this is something that’s being promoted quite a bit lately at these conferences: write to the user’s disk instead of trying to work from your server. I can see how they’ve gotten into this predicament. In those same sessions, I never hear anyone give the caveat of “oh, well, don’t write too much data.” You know what I mean? So they are encouraging developers to work this way for seamless operation across online/offline types of changes.
MATTHEW HEUSSER: Yeah, my guess is that somebody took that advice and wrote a code library, which had some extra junk in it, they didn’t care about it. Spotify included that code library, possibly open-source is my guess, or there’s just some error in a for loop somewhere.
ANGIE JONES: Yeah, it’s one of those things that, as a tester, like you said, it’s not mentioned in the reqs, it’s not something that we are explicitly looking for, but as a tester these are probably the types of considerations that we need to make and bring forward as information. I’m not sure if they did or didn’t, but 700 gigs, I don’t know if they’ve ever even gotten to the point where they were playing this for a full day. Who would think of that scenario, you know what I mean?
MATTHEW HEUSSER: Yeah, who would? I guess in the old days, we would have the requirements, and we would have the project plan, and we would create our little equivalence classes, and come up with all these ways to slice and dice the application. Most modern teams involve a tester in the whole process of delivery, and there’s a chance the tester could overhear or… it’s probably through code review, right? Someone should be, like, “What’s that about?” [laughter] Like, why… what is this file Mercury.db, and why are you writing it there, and what do you need it for? You bingled that for loop; it’s going to happen forever, which is, really, I think what happened.
PERZE ABABA: This is something that I’ve learned from Mark Tomlinson, with regards to understanding the data states of your application. You’re looking at five distinct sources that can affect performance by looking at where data is actually stored, or how it’s transferred. So you’re looking at storage, for example: permanent storage, which is where your disk drives or solid-state drives are. You should be able to look into temporary storage, too, your memory or caching solution. You’re looking at how your data is being manipulated, and whether that will affect performance; then you’re going to see how data is being moved, and also how data is being displayed, so you’re pretty much looking at the whole gamut of performance. Seeing these things as part of what you need to be checking, I guess Alan Page calls these the “ilities”; these are the things that we could have caught… I mean, Spotify testers would’ve been able to catch these things. Going beneath the surface of what might affect your application, not just from a functional perspective, would be a really good checklist to have when it comes to having these client/server type applications on your desktop.
ANGIE JONES: That’s a really good point, and we’re assuming that the Spotify testers didn’t find this, and they could very well have. From my understanding of that news story, even once Spotify was notified, it took them several months to address the issue. Maybe it just was that they didn’t know how to, or they had to come up with a new plan of delivering content.
MATTHEW HEUSSER: It’s very possible that they found it, and somebody said, “Well, this is kind of core to our architecture, so what are you going to do?” And then once the bug became serious, and became news, then they had to rethink it. I think that’s one of the most powerful things a tester can do, prevent “What are you going to do? Ship it anyway. Oops!”
MICHAEL LARSEN: The recorded amounts, the data that they’re talking about being measured and written, are in the terabytes, even when Spotify is idle and it’s not storing songs locally. When you’re just streaming it, what are you keeping? Why would it require that? The biggest challenge they’re referring to is that this could literally take years off the life of your hard drive.
JESSICA INGRASSELLINO: A question that I have, and I don’t know if it’s a legitimate question, because obviously I don’t know their software architecture or their database architecture; I don’t know how that’s working, or what they’re using, but I always wonder about security when you’re talking about going from anything that is stored somewhere else to something that is stored on my computer. What does that mean, and what does that look like? If somebody wanted to perform a malicious action, and have it perpetuate very easily, it seems like a really good way to do it. I don’t know, but that would be a concern I would have from a test perspective, if I knew the architecture.
MATTHEW HEUSSER: Moving on from the Spotify thing, and I hope more information comes out; if we figure out what the deal is, we’ll let it be known. Uber has a self-driving car problem in bike lanes, and safety concerns mount. Meanwhile, Google is talking about releasing its self-driving cars. If this works, it’s going to put a whole lot of taxicabs out of work. Until it works, it’s pretty darned dangerous.
MICHAEL LARSEN: Well, one example we can talk to right now, sharing something from The Verge from 14 December: Uber’s fleet of self-driving cars began to pick up passengers in San Francisco for the first time, and one of their vehicles was caught running a red light.
MATTHEW HEUSSER: The number of independent variables that that computer will have to actually track in order to actually drive, and the amount of technological programming… I mean, honestly, you’d have to write it in a functional language like Clojure or Lisp or something, and then actually having humans code review it and understand what it’s really doing and test it would be incredibly hard. It’s almost like the first airplane pilots; a lot of them died. A lot! Until we figured out how the heck to do it, and that was with mechanical controls.
ANGIE JONES: I saw something a bit earlier, I think it might have been The Verge as well, but it was about the same Uber self-driving vehicles and how they are unsafe for bicyclists. There were a couple of incidents, and I believe Uber admitted this themselves, that they’ve observed them making turns into the cycling lanes, so if there were cyclists in those lanes, they would’ve been hit.
MICHAEL LARSEN: I just did a little bit of searching, and at least on this particular article, they do have an update that was posted. It says “Uber appears to have completed its review, and concluded that it was the human driver, not the computer, that was at fault. These incidents were due to human error, a spokesperson said. This is why we believe so much in making the road safer by building self-driving Ubers.” [laughter]
MATTHEW HEUSSER: Well, yeah, of course Uber’s going to say that, right?
MICHAEL LARSEN: Right! [laughter]
PERZE ABABA: Yeah, that’s very convenient, right? [laughter]
MATTHEW HEUSSER: If you look at Uber’s public statement when the State of California said “you can’t do this, it’s against the law”, their response was kind of like “we have an obligation to do it because it’s the right thing”. The ultimate libertarian “Screw you!” sort of comment. They’ve had some success with that as a strategy. They go into a place where they’re told “nope, you need to have a taxi medallion” and they’re like “ehhh, we’re going to ignore you”, and the people love it so much that they vote a resolution to change the law.
MICHAEL LARSEN: One thing that was in the article here is that the California DMV did rebuke Uber because they did not get a permit for autonomous vehicle testing. Uber’s position was that, because its cars require a human driver at all times, they aren’t covered under the state’s permitting guidelines.
MATTHEW HEUSSER: Yeah, right, and they don’t care.
MICHAEL LARSEN: [laughter] I’m not laughing because it’s funny, I’m laughing because of just the hubris of it [laughter].
MATTHEW HEUSSER: So if it was me, I would find some unincorporated nothing in Nevada, with like a house and a post office… what used to be a post office, re-incorporate it, make some local law… probably New Hampshire, maybe, someplace that’s got a lot of independent people, put big gates up and say “this is our field testing area, and you can live here if you want, and you might die, but if you don’t (die), you’ll be testing the world’s first autonomous vehicles in a beta-like environment”. I think that’s what they need next, a public beta test that you know about. The freeway doesn’t go through here. There are posted signs. If you get on your bicycle, and you go on the bike path, that’s on you. Some people would. I think that’s the level of quality of these products.
MICHAEL LARSEN: Images of “The Running Man” and “Death Race 2000” are going through my head right now [laughter].
MATTHEW HEUSSER: Exactly. You know, unless that happens, I don’t see how this is going to work in California. It’s also interesting because humans make mistakes, too. Humans die. Humans forget. Humans turn the wrong way on a one way street.
PERZE ABABA: Well, you know, Uber supposedly found a legal loophole within San Francisco that allowed them to run these tests. Essentially, there is a legal definition of what an autonomous vehicle is, and Uber’s car doesn’t fall under that definition, which allowed them to “legally run these within San Francisco itself”. It might be a little bit of a stretch, but that seems to be what pushed them to run these things. On a different note, wouldn’t it be really nice if we had, like, a red light oracle somewhere that these cars can all pull information out of, so that they actually know when and when not to go?
MATTHEW HEUSSER: Like you put your red lights on the grid, and you put wireless on the car, and the car can hit an API and say “yep, this is a green light, I can go now.”
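As a sketch of that idea (everything here is hypothetical; no such public signal API exists, and the "service" below is just an in-memory stub standing in for a networked call), a car-side client might look like:

```python
# Hypothetical sketch of a car querying a traffic-signal "oracle".
# In reality this would be a networked API; here it is an in-memory
# stub so the fail-safe logic is the only thing being illustrated.

SIGNAL_STATES = {"main_and_5th": "red", "oak_and_2nd": "green"}

def get_signal_state(intersection_id):
    """Stand-in for an API call like GET /signals/<id>."""
    return SIGNAL_STATES.get(intersection_id, "unknown")

def can_proceed(intersection_id):
    # Fail safe: anything other than an explicit green means stop,
    # including signals the service has never heard of.
    return get_signal_state(intersection_id) == "green"

print(can_proceed("oak_and_2nd"))   # True
print(can_proceed("main_and_5th"))  # False
print(can_proceed("elm_and_9th"))   # False: unknown signals mean stop
```

The design point is the default: when the oracle cannot answer, the car must treat the intersection as red, not green.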
PERZE ABABA: Right. The cars can also kind of talk to each other. I think this is the concept behind Android Beacons or iBeacons, which have been out for a couple of years now, but for some reason it’s been very difficult to scale up. It might be a very naïve suggestion, but the technology is definitely out there.
JESSICA INGRASSELLINO: I actually did beacon testing for a while, probably about nine months, at one of the startups where I worked, and it’s an interesting technology, but it definitely needs work. It’s been probably two years since I’ve worked with it, but in terms of the beacons themselves and what they can do, and then of course the applications and geo-fencing and all of the stuff around communicating with the beacons, it’s still really in its infancy. It’ll be interesting to see how they choose to improve that, and I’d be curious if we could talk to the people who are testing it now.
MATTHEW HEUSSER: It might be worth trying to reach out to them. Their language and strategies might be wildly different from the way we think, but maybe we can learn from each other. Let’s talk about Angie. Angie Jones, I hear about you on the Twitters, I read about you at conferences. I think we went to TestBash, but really didn’t get to talk or something.
ANGIE JONES: Were you at TestBash in Philly?
MATTHEW HEUSSER: No, New York.
ANGIE JONES: Oh, OK.
PERZE ABABA: So I was at TestBash Philly, and I didn’t get the chance to shake Angie’s hand there, but I don’t know if she remembers me because there were so many people.
ANGIE JONES: Yeah, I do remember.
MATTHEW HEUSSER: You gave a talk at TestBash, right?
ANGIE JONES: I did. I spoke on in-sprint automation. The title of the talk was “How to Get Automation Included in Your Definition of Done”.
JESSICA INGRASSELLINO: I wasn’t at TestBash Philly, but Angie, I have seen two of your talks now, two of the titles, and I am so curious, and having worked in a very similar space, I’m really interested to hear the methods that you are using to do that. I’m just so excited to hear you share that with us.
ANGIE JONES: Yeah, yeah… I did that talk at TestBash, and then straight from Philly I flew out to London for the Selenium Conference and did the talk there. They have it on YouTube, so folks can check it out there, but I basically gave three techniques that people can use, very practical techniques that people were able to walk away with and apply right there in their day jobs on Monday morning. I talked about “Being Strategic” in how you automate, so I looked at Mike Cohn’s Testing Pyramid, the Automation Pyramid, and talked about how, when we’re presented with things that we need to automate, how easy it is to kind of stay in that UI zone. I talked about ways that you could make hybrid tests that might span a couple of those zones so that everything isn’t so UI heavy. That’s one way to kind of move quicker and have your tests faster and less brittle, which is important in an Agile environment. I also talked about communication. Our automation engineers, in a lot of places that I have worked in the past, have been kind of isolated and working in a silo, where they are their own team and they’re not involved in this whole Agile transformation that companies have taken on. I talked about how to embed the automation engineer in those Agile teams, and what does that look like? What considerations need to be made, and how can they work with the different team members to get what they need from an automation perspective, as well as inform and educate the team members on automation practices? The third part I talked about was to automate incrementally, taking on kind of the same Agile staging that’s done with development, and doing that in automation, so using concepts like TDD and programming by intention within automation practices, to ensure that you are not taking on too much within the sprint, and you’re coding only what’s necessary to get by.
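One way to read the "hybrid test" idea is: do the slow setup through the service layer, and exercise only the behavior under test through the UI. A minimal, self-contained sketch (the service and "UI" below are in-memory stand-ins invented for illustration, not any real app or framework):

```python
# Sketch of a "hybrid" test that spans pyramid layers: state is set up
# through the fast service layer, and only the step actually under
# test goes through the UI. Both layers here are in-memory stand-ins.

class UserService:                      # service/API layer (fast)
    def __init__(self):
        self.users = {}

    def create_user(self, name, password):
        self.users[name] = password

class LoginPage:                        # "UI" layer (slow in real life)
    def __init__(self, service):
        self.service = service

    def submit(self, name, password):
        # Stand-in for filling the form and clicking submit.
        return self.service.users.get(name) == password

# Hybrid test: no UI clicks to create the account, UI only for login.
service = UserService()
service.create_user("angie", "s3cret")      # setup via API, not UI
page = LoginPage(service)
assert page.submit("angie", "s3cret") is True
assert page.submit("angie", "wrong") is False
print("hybrid login test passed")
```

The same test done entirely through the UI would have to click through account creation first, making it both slower and more brittle, which is the trade-off the pyramid is about.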
MATTHEW HEUSSER: My experience with successful tooling, usually we come up with the things that will be automated before the story… the story kickoff, I guess, is the right term. For each story, each isolated incremental unit of work. Then it goes into test, and as a tester I do all the testing and we also do the tooling, and then we can call the story done. I’ve had some success with that. It sounds like you are talking about something a little bit different, and I’m wondering if you could go into a little more depth on how what you are recommending or suggesting is different from that.
ANGIE JONES: Having the automation engineer embedded, my argument is that they hear a lot of what is going on, and they can make better decisions about what it is that we are going to automate. I’m not an advocate of the “automate all the things” type of strategy, so I spoke against that, especially when you need to automate in sprint. I spoke about automating what’s important, and the way you would determine that is by being embedded in these teams and communicating with the different players, such as the business analyst, the testers, and the developers, and gathering information as a tester to figure out what should be automated, what are the risky areas that we should automate, and planning that into the sprint cycle.
MICHAEL LARSEN: Yeah, I found that to be one of the more challenging aspects. Our company is a relatively small group. Our development team is all of eight developers, and we have four testers, one of whom is a fully 100% dedicated automation engineer. It does seem like it’s always a game of catch-up. What suggestions might you have for a team like mine? What can we do? Do we jump in and kind of take on some of that extra busywork so that that automation tester can move forward? I hope I’m making sense [laughter].
ANGIE JONES: Yeah, perfect sense. There’s a lot that goes on in an automation engineer’s role besides just scripting tests. Things like maintaining the code; I always say that automation code is living, breathing code that we need to update and maintain, just as we would production code. That’s something that people don’t necessarily calculate in their measurements when they are trying to estimate how long it is going to take to do the automation of this new work. Again, having the automation engineers embedded, they can hear the types of changes coming down the pipeline and prepare for that. “Oh, well, I know that I will have to update all of these scripts that we have, so let’s build in some time for maintenance.” I suggest that automation engineers delegate some of these extra things out, so they can handle the in-sprint automation, but there are some other things that other people can handle. If you’re a developer, and you’re changing the features that will now break these tests, why can’t it be your responsibility as a developer to go back and update those tests? Of course, put your automation folks on your code reviews and everything, so that we make sure that you didn’t screw stuff up, and didn’t just make the tests pass just to pass. That’s something that the developer can own, especially if you are working in a continuous integration build or something where you see these tests fail. I encourage developers to own that piece, first looking at why the tests are failing. Let’s assume the reason is because there’s been a code change, and not because the tests are fragile or not working correctly. The other piece of that is the monitoring of builds that may not be continuous integration. Maybe they’re run a couple of times a day or something like that, not on every check-in. This is something that I encourage testers to get involved with, as we are “shifting left” and encouraging them to be more technical and get involved with automation.
It’s not just about coding and scripting. They can do things like triage failures that come from automation runs, learning how to become comfortable with these automation reports, reading the code and figuring out what went wrong and opening a bug if need be, and then maybe even updating the code if they feel comfortable enough to do so.
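Triage of the kind Angie describes gets easier when failures are grouped by their error rather than read one by one, since a dozen red tests often share one root cause. A minimal sketch of that grouping (the report format and error strings are invented for illustration):

```python
from collections import defaultdict

# Sketch: bucket failed tests by error message so a triaging tester
# can spot shared root causes (e.g. one broken locator failing many
# tests) instead of reading each failure individually.
def triage(failures):
    """failures: list of (test_name, error_message) tuples."""
    buckets = defaultdict(list)
    for test_name, error in failures:
        buckets[error].append(test_name)
    # Biggest buckets first: the likeliest single root causes.
    return sorted(buckets.items(), key=lambda kv: -len(kv[1]))

report = triage([
    ("test_login", "NoSuchElementException: #submit"),
    ("test_checkout", "NoSuchElementException: #submit"),
    ("test_search", "TimeoutException"),
])
for error, tests in report:
    print(f"{len(tests)} failure(s): {error} -> {tests}")
```

Here the tester sees immediately that one missing `#submit` element explains two of the three failures, which is a single bug to file, not two.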
MATTHEW HEUSSER: How do you decide what tests to automate, and how do you track the work? Some of the companies that I have worked with actually have automation tasks. Some of that is often just “the automator looks at the story and comes up with some stuff.” Other ones actually have a separate (and I don’t like this) automation group or person, and then the tester comes up with the test cases, which have to be documented so the automator can automate them. I really don’t like that, for lots of reasons. It causes a lot of waste and duplication. In your world, Angie, how do you decide what’s going to be automated, and how do you track it, so it’s not done until the automation runs?
ANGIE JONES: So the latter used to be my world. That was what I saw a lot with the isolated automation teams. In my talk I speak about getting away from that type of mindset, for all the reasons that you mentioned. I’m now doing a more Agile-friendly approach to automation engineering. Part of that whole communication piece that I speak of addresses “how do you decide what to automate?” That’s a very popular question that I get a lot. It’s working with the people on the team. Being embedded in the team, you’re going to hear a lot of information that you didn’t necessarily have when you were isolated out of the team. That goes into your decision about which ones you choose to automate. Other areas, like working with your business analyst, asking them questions like “How are our customers using this?” It’s not uncommon for me to ask the business analyst to pull me out some analytics about what’s being used and how it’s being used. Then I am able to make better, wiser decisions based on that. If we don’t have a lot of traffic in a certain area, I’m not going to write a whole bunch of automated tests around that, because that’s just more stuff that I have to maintain, and it’s going to make our continuous integration builds longer, with feedback that’s not necessarily crucial to check-ins. Focusing on the risk areas is my first piece of advice. You get that, again, from the business analyst, but you can also get that from your testers; as they’re exploring the application, they know a lot about what areas are problematic. They can give you some insights on that, and then the developers as well, so you listen for things in the standups where they are saying “I have to go in and mess with this part of the code that’s like spaghetti, and I hate going into that area of code.” That’s a red flag to me; maybe we need some additional automated testing around that area if it’s that fragile.
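The two inputs Angie names, usage from analytics and risk from tester/developer input, can be combined into a rough ranking of automation candidates. A toy sketch (the scoring formula, weights, and numbers are invented; only the shape of the decision is the point):

```python
# Toy sketch: rank automation candidates by usage (from analytics)
# and risk (from tester/developer input). All numbers are invented;
# a real team would calibrate its own scales and weights.
def prioritize(candidates, top_n=2):
    """candidates: list of (name, usage_0_to_10, risk_0_to_10)."""
    scored = [(name, usage * risk) for name, usage, risk in candidates]
    scored.sort(key=lambda item: -item[1])
    return [name for name, _score in scored[:top_n]]

picks = prioritize([
    ("checkout", 9, 8),      # heavy traffic, fragile "spaghetti" code
    ("login", 10, 3),        # heavy traffic, stable
    ("admin report", 1, 6),  # rarely used: probably not worth automating
])
print(picks)  # ['checkout', 'login']
```

The low-traffic admin report drops out exactly as Angie describes: automating it would add maintenance and build time without crucial feedback.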
PERZE ABABA: The challenges that I’ve seen when introducing automation to a new team… if you’re introducing automation for the first time, you can actually see a cultural shift within the team. There’s a lot of complication that you have to deal with when it comes to introducing automation, even when looking into which things we should automate first, which tests we can rely on, and what automation oracles we have that we can use and reuse, but I think at the end of it all, it’s really how the team processes information. I’ve been on teams where we had a really strong independent automation team that just focuses on regression, also looking into where in the application we can judiciously use technology to solve some of these problems, whether it’s a testing problem or a testability problem. The challenge really is, if not everybody is on board, if not everybody understands the ramifications of a failure, and whether we can debug that failure really fast, things will start falling behind. So it helps having these small introductions in how we can deal with this particular problem, and, as we go through the product’s coverage outline, being able to identify “this one we can automate, because we understand this; this one we can’t, because it’s just a complete black box for us”. Having that deeper understanding, and knowing how we can actually react to a failure, is very key. For me, personally, I see a lot of teams fail when it comes to automation, because it is looked at as if we are just replacing humans executing these things, and then there is no response when something fails. It just remains a failure. I really like how Angie framed this; I remember in her TestBash talk, the developer also needs to take ownership.
If it’s a testability feature that breaks because the developer introduced something, there has to be some sort of shallow check, something that can easily be triggered the moment the developer pushes code for a pull request, for example, that can validate that change and say “Oh, you actually broke this appointment creation API, and now 60% of our tests will fail because we can’t create any new appointments.” It’s this type of feedback that we can give to the developers, having, really, some sort of one-to-many or one-to-one relationship in showing that, because you broke this piece, we are blocked in these areas on what we can test, or what we can check. Of course, the developers will have to respond and fix that before we even merge that into master, or push that into a different branch for building within your CI pipeline.
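The "shallow check" Perze describes can be sketched as a gate: probe the core API first, and if it is broken, report the dependent tests as blocked rather than failed, so the developer gets one precise cause instead of a wall of red. Everything below (the API, the test names, the result format) is a hypothetical stand-in:

```python
# Sketch of a gating "shallow check": probe a core API before running
# the suite; if it is broken, mark dependent tests BLOCKED rather than
# FAILED so the developer sees one actionable cause. The API here is a
# hypothetical stub, hardwired to fail for illustration.
def create_appointment_api_ok():
    """Stand-in for a quick probe of the appointment-creation API."""
    return False  # pretend the developer's change just broke it

def run_suite(dependent_tests, api_ok=None):
    api_ok = create_appointment_api_ok() if api_ok is None else api_ok
    results = {}
    if not api_ok:
        # One precise cause instead of 60% of the suite going red.
        for test in dependent_tests:
            results[test] = "BLOCKED: appointment creation API is down"
        return results
    for test in dependent_tests:
        results[test] = "RAN"
    return results

for test, status in run_suite(["test_book", "test_cancel"]).items():
    print(test, "->", status)
```

The "blocked, not failed" distinction is the feedback relationship Perze is after: the report points at the one broken piece, and everything downstream is attributed to it rather than investigated separately.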
[This is the end of part one. Come back and join us for part two of our conversation with Angie Jones in two weeks.]