
  • GTAC - Day 2: Ibrahim El Far and Jason Arbon, October 27, 2011

  • >>James Whittaker: I actually think we have two speakers for this next one, a pair of

  • Googlers. Ibrahim El Far and Jason Arbon actually both

  • work for me. There were no special privileges getting them in this event. They were all

  • still voted for. But these are the two people that if they do have a bad presentation, I

  • can actually do something about it. So just saying. No pressure. We're continuing our

  • theme of browser automation. And I think this is going to be completely different from a

  • lot of the other discussions. But I heard the sauce guy, Adam, say something about BITE.

  • That's part of what we're going to demo here. And maybe Sauce will pick it up and do something

  • with it, because we have open sourced this. So I think Ibrahim is going to start.

  • Ibrahim actually kind of followed me. We were in Florida for a little while. And then we

  • both went to Microsoft. And then -- >>Ibrahim El Far: You followed me into Microsoft.

  • >>James Whittaker: I followed you. That's true. You went to Microsoft first and then

  • I followed you. And then I went to Google and you followed me.

  • And eventually we're going to have to stop this stuff. People will talk.

  • [ Laughter ] >>James Whittaker: And Jason Arbon is also

  • another former Microsofty-become-Googler. And, I don't know, maybe I should stop saying

  • that. I said former Microsofter. When your badge

  • -- the joke around Google is you don't know you're fired until you come to work and your

  • badge doesn't work. That's when you know. Okay. Let's welcome our new speakers.

  • [ Applause ] >>Jason Arbon: He's got -- Welcome to the

  • cloud; right? >>Ibrahim El Far: Excellent. Fantastic.

  • So hi. My name is Ibrahim. I've been at Google for about a year and a half. I'm a manager

  • of a team of about -- until recently, of seven or eight people who basically just do test

  • tools. We're going to talk about two of these tools today.

  • And I work closely with Jason. Jason is the visionary and the man behind most of these

  • tools. So, basically, if you dislike any of the ideas,

  • just blame him -- >>Jason Arbon: In implementation, it's him.

  • >>Ibrahim El Far: Yes, indeed. So I figured, though, I will want to just

  • start with a silent presentation reflecting on some of the things that we saw yesterday.

  • [ Laughter ] >>Ibrahim El Far: So, anyway, --

  • [ Applause ] >>Ibrahim El Far: So, just to give -- I want

  • just to gauge a little bit what kind of audience we have here. So how many of you do UI testing?

  • By -- oh, cool. How many of you do JavaScript, like write

  • JavaScript on a daily basis? Oh, excellent. Interesting.

  • I always found it interesting, by the way, that a lot of the people who deal with Web

  • testing on a daily basis just don't really deal directly with JavaScript.

  • How many of you have -- deal with manual testers, like, work with manual testers on a daily

  • basis? Cool.

  • How many of you have filed bugs? [ Laughter ]

  • >>Ibrahim El Far: How many of you have spent a whole lot of time talking to developers

  • about convincing them that they need to fix those bugs?

  • How many of you had the opposite experience of just, you know, developers just fixed it

  • on a regular basis. Oh, fantastic. Of course, Tony has it right.

  • So there's always a place for testers. Can you help me with the clicking, please.

  • >>Jason Arbon: Yes, sir. >>Ibrahim El Far: That's why I brought in

  • Jason. He's basically my clicker. >>Jason Arbon: Not funny.

  • >>Ibrahim El Far: Yeah, pretty much. [ Laughter ]

  • >>Ibrahim El Far: So crowd sourcing was mentioned yesterday. And -- as a powerful sort of way

  • that actually scales in terms of bringing the crowd to test, sort of like hiring a dedicated

  • set of testers. And that is especially true for consumer applications.

  • So I'm not going to pretend that this works for enterprise applications. I'm not going

  • to pretend this works for operating systems, or even back-end stuff. But for consumers

  • with a lot of UI stuff, this works pretty well.

  • So there's always a place for highly skilled bug hunters who are actually working for organizations

  • whose core business is testing. Actually, even if companies, large companies, like Microsoft

  • or Google or et cetera, decide one day that we're -- James is right and test is dead and

  • we're not going to hire any more testers, I think there's always going to be a place

  • for testers in organizations whose core business model is about testing. We'll get into that

  • a little bit later in the presentation. So you can keep arguing about who needs to

  • be doing it or you can argue about where we need to do it, in the United States, or in

  • India and China, or in your company versus outside of your company. But that's really

  • not the point. >>Jason Arbon: We didn't practice any of this,

  • by the way. >>Ibrahim El Far: This is not very rehearsed.

  • So, really, the work is not going away. If -- you know, basically, what we need is,

  • we still need to test your software. We need to do it in an exploratory way or a scripted

  • way, doesn't matter, but it still needs to be done. Bugs still need to be filed. And

  • -- why are you not clicking? >>Jason Arbon: Because you don't make the

  • click noise. We talked about the click noise. (Clicking).

  • >>Ibrahim El Far: But, I mean, how are we going to go through this? You're just not

  • doing your job. >>Jason Arbon: I'm doing it. I'm just doing

  • a review level 3.1. >>Ibrahim El Far: I'm already talking about,

  • you know, the work that's not going away. And -- yeah, okay. Thank you.

  • [ Laughter ] >>Ibrahim El Far: So I'm already done with

  • this bullet, so it's done. >>Jason Arbon: I can quit.

  • >>Ibrahim El Far: So the work is not going away.

  • And -- I'm just going to do it myself. >>Jason Arbon: Please.

  • [ Laughter ] >>Jason Arbon: Management.

  • >>Ibrahim El Far: But we don't really -- here's the thing. I don't know if you ever hung out

  • with a manual tester or if you've ever been one. When I first joined Microsoft a long

  • time ago, I was hired as a test engineer, which technically meant that I clicked buttons.

  • And life was kind of miserable, actually. I had to, you know, set up my machine, install

  • a whole lot of software. And even in today's world, with, you know, everything being -- going

  • into the cloud, you still have to install a lot of tools and spend a lot of time learning

  • about these tools. So -- Click.

  • [ Laughter ] >>Ibrahim El Far: -- you also tend to spend

  • a lot of time away from the application. The thing about exploratory testers and manual

  • testers is, really, they need to spend as much time as possible just playing with the

  • app. And they don't. They keep switching context a lot. They keep switching and saying, "Oh,

  • I need to actually create a test case in the test case manager." Really? I mean, seriously?

  • I need to enter data, like who's the author of this test and what's the title of this

  • test, and here are the steps of this test. And, by the way, here's how you execute this

  • test. Really? I mean, come on. We're in 2011 already?

  • Something like that? Yeah. So that's the other job that Jason is there

  • for. Click on my slides, and shake his head. >>Jason Arbon: I'm ahead of you.

  • >>Ibrahim El Far: That's right. Thanks. So also, another really terrible thing, another

  • terrible experience about this is, you know, we go ahead and we know that some test cases

  • are, you know, going to fail or pass. But we give them to testers -- because we haven't

  • got a chance, especially in the UI testing world, we haven't got a chance to automate

  • it, so what we do is we give them a whole bunch of spreadsheets and say, "Go ahead and

  • just run through -- run with them." And that's really unfortunate, because what

  • that means, instead of them exploring new functionality, they're stuck, essentially,

  • repeating the same thing over and over again. And then often we hire an army of them just

  • so we can cover things like all sorts of machine and browser combinations.

  • You know, they -- manual testers these days also still spend a whole lot of time testing

  • a whole lot of stuff that can be found by a machine. And so, really, test case managers,

  • bug filing systems, and record playback software, which is really central to the life of a manual

  • tester, I was going to say it just needs to die. But what I really mean is it just needs

  • to be invisible. It needs to be invisible so that the user of the software, who can

  • be a tester, or the developer of the software, or the manual tester, who is a dedicated manual

  • tester, anyone can actually -- [ Laughter ]

  • >>Ibrahim El Far: You know, we only have 30 minutes. So....

  • So, for us, when we talk about -- you know, at least on our team, when we talk about the

  • crowd, we actually don't really mean just people. We mean a crowd of people and a crowd

  • of machines. And it's really just the best of both worlds.

  • So this is the standard, oh, testing is labor-intensive. Ooh.

  • So let's go on to the next slide, actually. So I'm going to go through -- so I'm going

  • to go through just a bunch of the tools -- tool work that we've been doing. BITE stands for

  • browser integrated test environment. It's -- the client portions are already open source.

  • You can go and take a look at them. And I'm -- we're going to first take a look

  • at the record and playback feature. >>Jason Arbon: Your internal Google email

  • is popping up. >>Ibrahim El Far: Oh, yeah. Hopefully, -- oh,

  • well, just try to ignore that. Will that show up on video?

  • >>Jason Arbon: Yeah. >>Ibrahim El Far: Oh, crap. All right.

  • But I quit all notifiers, then. It's only the job that pays the bills.

  • So keep in mind the first thing that we wanted to do is to make sure that testers just don't

  • really -- especially exploratory testers don't have to do a lot of work to sort of enter

  • metadata about their test cases, and enough information for these test cases to be repeated.

  • So one of those core components is really a kind of traditionally boring

  • piece of testing called record playback, except we do it in a pretty cool way.

  • So what you do is, all you need is to install an extension. BITE is really just a Chrome

  • extension, at least today. And what you do is just you start -- go ahead

  • and play with your -- with your application, in this case, you know, just open up the Google

  • page. And then as you go through this application, essentially, what BITE is doing is recording

  • everything that you're doing. And without you necessarily writing any of the code, what's

  • being recorded is both -- I don't know if you can see this very well from where you're

  • sitting -- but you can see both, essentially, JavaScriptish kind of code that's being generated,

  • but also kind of an -- a plain English sort of presentation of the test case. Depending

  • on who's using this tool, they might want to take a look at something a little bit more

  • friendly than JavaScript as a representation of your test case.

  • You can do things like also add validation. So as, for example, you're going ahead and

  • playing with your app, you can say, for example, that these are your expectations about what

  • to find on a certain page, like some things are a must, some things are to be ignored,

  • some things are optional. And then also, as you are exploring your software,

  • we are taking, essentially, a snapshot or a screen shot of the application step by step.

  • So this way, basically, now, you have three ways of looking at your test case later. You

  • can either look at the JavaScript code and run that. You can look at the English, plain

  • English representation of the test. Or you can look at a sequence of screen shots.
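
  [Sketch, not shown in the talk: roughly what one recorded BITE step could carry so that all three views -- generated JavaScript, plain English, and screenshots -- come from the same data. The field names here are hypothetical, not BITE's actual schema; the real format lives in the open-sourced client code.]

      // One recorded step, in a made-up schema for illustration.
      var recordedStep = {
        action: 'type',                      // what the user did
        locator: '#search-box',              // CSS locator captured at record time
        value: 'gtac 2011',                  // text the user entered
        english: 'Type "gtac 2011" into the search box',
        screenshot: 'step-003.png',          // snapshot taken as the step ran
        validations: [
          { selector: '#search-button', rule: 'must' },     // must be present
          { selector: '.promo-banner',  rule: 'ignore' },   // ignored on playback
          { selector: '#suggestions',   rule: 'optional' }  // nice to have
        ]
      };

      // Playback is then just interpreting steps against the live page.
      function playStep(step) {
        var el = document.querySelector(step.locator);
        if (!el) throw new Error('Step failed finding element ' + step.locator);
        if (step.action === 'type') el.value = step.value;
        if (step.action === 'click') el.click();
      }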

  • So -- But here's the thing: So, again, how many of you on a daily basis do UI testing

  • and think that UI testing is kind of easy to break? Or brittle?

  • Yeah. It's like -- so I've never really worked too

  • much on UI testing until recently. But I've -- you know, I work very closely with people

  • who did. And it's amazing. I mean, a lot of the time that people spend in UI testing is

  • about, essentially -- well, reimplementing a whole lot of these test cases.

  • And what we are finding is, that if you have the right set of tools, it's actually far

  • less expensive, far cheaper, if you have the right set of tools to simply rerecord all

  • the test cases than to actually develop -- you know, fix test cases or even develop them

  • from scratch. Just our experience. We'll talk about that some more.

  • So, for example, in this particular tool, what we do is, if your test case fails, what

  • we allow you to do is, by a simple click -- this dialogue says this step failed finding element

  • blah, and it actually -- it actually highlights that element.

  • And what you do is, once you click on that element, then you -- the new element in the

  • page that's equivalent, it kind of fixes it automatically and finds all the other instances

  • of that element in the same test case and sort of fixes that as well.

  • So what that means is, if someone messes with your divs or with elements on the page, it's

  • not the end of the world. You can simply just say, oh, here's the search -- here's the new

  • search button, as opposed to -- and then it'll find all the instances of the old search button

  • and basically replace them with the new search button, as an example.
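
  [Sketch, not shown in the talk: the one-click repair idea. When playback fails on a stale locator, the tester clicks the equivalent element in the live page and every step that used the old locator gets re-pointed. The helper names are hypothetical.]

      // Patch every step in the test case that referenced the stale locator.
      function repairTestCase(steps, oldLocator, clickedElement) {
        var newLocator = cssPathFor(clickedElement);
        steps.forEach(function (step) {
          if (step.locator === oldLocator) {
            step.locator = newLocator;  // fix all instances in one shot
          }
        });
        return steps;
      }

      // A crude locator generator for the sketch: id if present,
      // else tag name plus position among same-tag siblings.
      function cssPathFor(el) {
        if (el.id) return '#' + el.id;
        var sameTag = Array.prototype.filter.call(
            el.parentNode.children,
            function (c) { return c.tagName === el.tagName; });
        var index = sameTag.indexOf(el);
        return el.tagName.toLowerCase() + ':nth-of-type(' + (index + 1) + ')';
      }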

  • So that's basically the record playback portion, it's basically record playback plus the ability

  • to fix broken tests reasonably easily. >>Jason Arbon: Can I add one thing real quick?

  • Thank you. The cool thing that we'll get to a little bit later is we're generating JavaScript;

  • right. We're not generating Java that needs to be compiled locally and deployed to machines.

  • As you generate these test cases, we save them automatically in the cloud. It's just saved to a server,

  • so you never have to worry about file management or local store or anything. And the core thing

  • is this stuff actually runs on Chrome OS. So it runs entirely in the browser, which

  • is one of the main motivating factors for this whole project. That's kind of a cool

  • part. >>Ibrahim El Far: How many of you when I was

  • going through the screen shots saw a test case?

  • Or the name of a test case? Or anything like that?

  • Actually, in the cloud, basically, as you have been -- as you're using this tool, test

  • cases are basically being created. So, in other words, you can basically come back later

  • and say, go to your test case manager that you plug into this tool and review the test

  • cases that you've gone through. >>Jason Arbon: And that was kind of a requirement

  • because James wrote these other Octomom books about testing and exploratory testing. And

  • people go do this and there's no record of what they did, no way to reproduce this stuff.

  • That's not funny, James. There's no way to reproduce this stuff; right?

  • There's no artifacts of what you did. There's no way to share that information.

  • >>Ibrahim El Far: And then another cool thing is, now that they have done the exploratory

  • work, these are turned essentially into regression tests.

  • And you can just keep running them over and again. And we'll talk about it some more later.

  • But, basically, we can also run them anywhere and potentially on other browsers as well,

  • not just on Chrome. Or on the particular operating system that

  • the exploratory tester happens to be running on.

  • >>Jason Arbon: Is it okay? Is our relationship going to survive this?

  • >>Ibrahim El Far: You keep on interrupting me.

  • >>Jason Arbon: You told me to. The inspiration was when James came to Google

  • originally, he had this original BS mock slide that people thought was real of, like, all

  • these, like, bug overlay information on top of Visual Studio and all this kind of stuff.

  • And I saw it and didn't realize it was a fake mock, and I wanted to build the same thing

  • for the browser; right? So -- Dude, the one time you have control of the

  • machine. [ Laughter ]

  • >>Jason Arbon: But just like, you know, when testers are going through this, they don't

  • have any context, don't know what test cases have been running, don't know what bugs have

  • been filed before. Just like modern pilots who deal with a lot of information, information

  • overload, we wanted to do a heads-up display similarly on the browser; right? So we're

  • kind of trying to take the bugs, the test case data, the debug data, the code coverage

  • data, and put that and overlay it right on the application while you're testing it to

  • avoid kind of just flipping between applications and different Web pages and stuff like that.
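
  [Sketch, not shown in the talk: the heads-up-display idea in miniature -- draw bug markers directly over the elements they refer to, so the tester never tabs away. The data shape and styling are illustrative only.]

      // Overlay a marker on each element that has known bugs filed against it.
      function drawBugOverlay(bugs) {
        bugs.forEach(function (bug) {
          var el = document.querySelector(bug.elementLocator);  // hypothetical field
          if (!el) return;  // element gone from this build; skip it
          var r = el.getBoundingClientRect();
          var marker = document.createElement('div');
          marker.textContent = 'bug ' + bug.id;
          marker.style.cssText =
              'position:absolute; z-index:99999; padding:2px;' +
              'background:rgba(255,200,0,0.8); font:11px sans-serif;' +
              'left:' + (r.left + window.scrollX) + 'px;' +
              'top:' + (r.top + window.scrollY) + 'px;';
          document.body.appendChild(marker);
        });
      }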

  • >>Ibrahim El Far: One of the ways, actually, that we -- one of the ways that testers spend

  • a lot of time switching context is going back and forth between their bug-tracking system,

  • as well as -- and their application. So the other feature of BITE is BITE bug filing,

  • where, again, without leaving your application and having installed only an extension with,

  • basically, nothing else on your client, you're able to also file bugs.

  • So, for example, this is an example from Maps. So you just go to Google Maps, for example,

  • and look for Seattle pizza. You get a bunch of results. And then you go

  • ahead and click on one of those results. And let's say that you found a problem with

  • this particular result. So all you've got to do is click on "report a bug." And then

  • this is very application-specific. So what -- often, what we do is take BITE and with

  • a little bit of customization for a particular domain, in this instance, it's Maps. If you

  • look -- it's kind of hard to read -- but a popup comes out and says, "Is this a problem

  • with a map tile? Is this inaccurate or incorrect results? Is this wrong or incorrect directions?"

  • It just starts asking questions. It asks you, please point where -- if it's

  • a visual problem or a problem with the UI, please point me where in the UI is the problem.

  • And then, of course, you can choose -- it's kind of hard to see, but it says the problem

  • I'm reporting is not related to the UI if it's not necessarily a UI problem.

  • So let's say you click on the -- this element that's highlighted. And now, in essence, pretty

  • much everything that you've -- that you need has already been kind of generated for you.

  • These include things like the screen shots that -- of everything that you've done, everything

  • that you've done so far, until -- until you are about to file the bug. You can -- you

  • know, including screen shots, including sort of all sorts of useful representations of

  • the test case, the code that is required to actually run the test case in a browser in

  • JavaScript. And then, of course, the capability of -- there's a button there for capability

  • of actually playing this -- replaying what you just did.

  • And then once you review this information, if you're satisfied, you say, "Bug it," and

  • you're done. You haven't left your application. You're still in your application. But you're

  • still able to also file bugs. >>Jason Arbon: Can I add something to that?

  • >>Ibrahim El Far: I was going to pause, anyway. >>Jason Arbon: Can I add some sugar?

  • Start getting worked up. The cool thing is this is the internal bug

  • filing tool now on Maps for internal, like, if you're on Google corp net.

  • And you used to get the dogfood bugs from everybody and everybody is using Maps and

  • these bugs were horrible. They're, like, repro steps from, like, program managers and marketing

  • people, and even from testers that aren't on the application.

  • The funny thing, if you use Maps, so people say, "Here's the repro you (indiscernible)."

  • "It's Maps.google.com. Right?" And they go, "Thank you."

  • Right? So what we're also doing in the background

  • is collecting a bunch of information from the DOM that's application-specific again,

  • lights up on Maps. That includes all the debug URLs and all that kind of context that the

  • developers need. So we talk about what information you need and we add it and scrape it on the

  • DOM. It's just a little bit of JavaScript. But the interesting thing that's going on

  • here is this: When people have this extension installed inside of Google, it doesn't record

  • anything. It's just sitting there silently; right? But when you're on Maps.google.com,

  • this little thing starts recording, right, as you're doing stuff. So when you go to,

  • like, file the bug, it actually has the script of everything you've done on Maps.google.com,

  • right, in that domain. So, like, you know, so, actually, the developers, when they get

  • the repro, it's not like I'm on this page and there's this bug. They can actually click

  • a link, and then we pull that JavaScript down into their version of the extension; right?

  • And it actually, like, goes to Maps.google.com. And then if they were browsing for 15 minutes,

  • it will just play the entire 15-minute sequence of what that user did. They may not even know

  • what the interesting thing they did to induce that error state.
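
  [Sketch, not shown in the talk: a content-script outline of the two behaviors just described -- record only on the domains you care about, and scrape app-specific debug context into the bug report. The selector and fields are made up; the real Maps scraper is keyed to that app's DOM.]

      var sessionLog = [];

      // Sit silent everywhere except the domains under test.
      if (/\.google\.com$/.test(location.hostname)) {
        document.addEventListener('click', function (e) {
          sessionLog.push({
            action: 'click',
            locator: e.target.id ? '#' + e.target.id : e.target.tagName,
            t: Date.now()
          });
        }, true);  // capture phase, so the page can't swallow the event
      }

      // Bundle what a developer needs when the tester hits "report a bug."
      function scrapeDebugContext() {
        var debugLink = document.querySelector('a[href*="debug"]');  // hypothetical
        return {
          url: location.href,
          debugUrl: debugLink ? debugLink.href : null,
          viewport: window.innerWidth + 'x' + window.innerHeight,
          script: sessionLog  // the whole session, replayable from the bug
        };
      }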

  • >>Ibrahim El Far: One of the holy grails you've been looking for as a developer is to go to your

  • issue tracking or bug tracking system, go to the bug, and just have it replayed.

  • This is a reality for a lot of scenarios right now.

  • >>Jason Arbon: Except it's a URL not a button. It's a feature request.

  • >>Ibrahim El Far: Let's not deviate off script. >>Jason Arbon: Dude.

  • >>Ibrahim El Far: Again. >>Jason Arbon: Saw your password.

  • >>Ibrahim El Far: Really? [ Laughter ]

  • >>Jason Arbon: I know how many characters it is -- they know how many characters it

  • is, too, now. The search space. >>Ibrahim El Far: The other thing that manual

  • testers also -- And, by the way, I say the word -- I use the word "manual testers," like,

  • loosely. It could be a dedicated tester again or it could be, like, just a user. It could

  • be, really, anyone who has essentially volunteered to test software.

  • So one of the issues that they also deal with is having to, essentially, file a whole lot

  • of duplicate bugs or not knowing where most of the bugs are, 'cause, as you know, when

  • there's an area where there's plenty of bugs, there's probably a whole lot more bugs in

  • that area. So you want to go ahead and attack that. Or vice versa, if no one has filed any

  • bugs in an area, you want to go and figure that out.

  • So let's go back to the Maps scenario. >>Jason Arbon: Sometimes it's even lonely

  • when you're a tester and you're in your application and you don't know what's been going on. That's

  • what I feel like all the time. I'm lonely. >>Ibrahim El Far: I'm with you.

  • So, again, if you want to -- if you want to report a bug, before you report a bug, you

  • can also see all the bugs that are related to the application that you're seeing right

  • now. And then if you activate overlay, what will happen is you actually will see a highlight

  • of this is where other people have filed bugs in this application. And you can click on

  • that and find out, oh, someone actually already filed a bug in this particular -- on this

  • particular element, and it's probably the same issue. So you can, on the spot, say, okay,

  • this is not a bug, if you're a developer, for example. Or you can say this is resolved

  • now, it's no longer an issue. You can add a comment if you're, for example, a colleague

  • and say, oh, yeah, I've seen this, too, but you've missed a couple of points. Maybe I

  • just want to add a comment. So it's -- actually, it really turns testing from a solitary

  • activity into a very social sort of activity. >>Jason Arbon: Now that I'm thinking about

  • it, maybe people just don't like me. This list of bugs here, right, isn't just

  • all the bugs from Maps.google.com. This is actually, like, we do relevant stuff here;

  • right? So as you're browsing, like, if the URL does change, we do some

  • smart filtering and stuff for relevance. We also look at the content of the DOM to

  • do the filtering, to do the smart query. So it's like there's thousands of these things.

  • Although we pretend there aren't any bugs in Maps. But we picked the right set that

  • are probably going to be relevant for the map tiles that you're viewing right now.
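
  [Sketch, not shown in the talk: the flavor of that relevance filtering. Scoring by URL and DOM content is the stated idea; the weights and bug fields here are invented.]

      // Rank known bugs by how relevant they are to the page being viewed,
      // so thousands of bugs collapse to a short, likely-relevant list.
      function relevantBugs(allBugs, pageUrl, pageText) {
        return allBugs
            .map(function (bug) {
              var score = 0;
              if (pageUrl.indexOf(bug.urlFragment) !== -1) score += 2;  // same app area
              if (pageText.indexOf(bug.keyword) !== -1) score += 1;     // DOM mentions it
              return { bug: bug, score: score };
            })
            .filter(function (s) { return s.score > 0; })
            .sort(function (a, b) { return b.score - a.score; })
            .slice(0, 20)
            .map(function (s) { return s.bug; });
      }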

  • Click. Now it's yellow. >>Ibrahim El Far: That's exciting.

  • So. >>Jason Arbon: Like James' shirt.

  • >>Ibrahim El Far: So that's basically sort of the major features of BITE. And, again,

  • the exciting -- to me, it's really exciting stuff, because, really, it eliminates -- it

  • sort of like hides most of the tools away from you, especially test case management

  • and bug filing systems, and lets you focus a lot more on testing. And because of that

  • last feature, it also sort of helps you be a little bit more aware of what's going on

  • elsewhere, without your having to go back to the issue tracking software and figuring

  • out, oh, has someone else filed a bug or has someone else tested that area of the application?

  • So that's -- But that takes care -- and those tools are really necessary if you want to

  • go for crowd sourcing where you really have testers who are not necessarily dedicated

  • 100% of the time or don't have time to be trained into -- you know, on specialized tools,

  • et cetera. But that's only half of the puzzle.

  • The other piece of this puzzle is really on stuff that testers do that, really, machines

  • ought to be doing. The particular example that we've started

  • working with is really just about layout. You know, we had an army of testers dedicated

  • to figuring out whether layout problems change over -- you know, layout -- there are layout

  • problems introduced over time for given applications. And, really, the problem with that is, most

  • of the time, you really don't have layout problems. Most of the time, you know, for

  • 90% of the cases, 95% of the cases, you're just verifying that there's absolutely no

  • problem. But it's a little bit more about just doing

  • stuff that a machine can do or should be doing. Are the bidi testing people here? No? Yes?

  • Hello. So -- that was "hello" in Arabic. You may have

  • heard, I spoke from right to left. [ Laughter ]

  • >>Ibrahim El Far: But the bidi stuff, actually, is a prime example of the kind of thing we're

  • talking about. It's like, most of the time, either you can resolve these things automatically.

  • And for a small percent of the time, you need sort of a human decision to figure out what's

  • going on. So Bots is about that.

  • Let's just jump into demo. So in Bots, what you do is you say, okay,

  • here's my application, the application I care about. Say I care about, for example, www.google.com,

  • because I test that stuff. You also care about a whole bunch of other

  • applications. And then you give it to -- you give it to

  • Bots. And then Bots takes care of things magically. As far as you're concerned, you don't really

  • necessarily know what exactly Bots does. All you know is it's trying to figure out whether,

  • in this case, it's testing for layout problems. But it could be testing for bidi problems,

  • it could be testing for very basic accessibility problems that can be detected only -- without

  • using the semantics of the application. And then over time, you know, what happens

  • is bots comes up with the results. You probably can't really see it here, but what it's doing

  • here is it's comparing the stable, dev, and canary versions of Chrome, but can you imagine

  • it also comparing Internet Explorer versus Chrome versus Firefox.

  • And what it's doing is it's telling us whether there are layout problems that were introduced

  • in the stable version versus the dev version, versus the canary version.

  • And so when you drill down what you get is sort of like a before and after sort of picture,

  • and the tool allows you to diff between the pages where the layout changed. And that's

  • the highlight of the diff. And what you can do then is you can click on the area that

  • changed and actually see the element that changed.
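
  [Sketch, not shown in the talk: a toy version of the layout bot's core -- snapshot element geometry under each browser build, then diff the snapshots and keep only real movement. The tolerance and fields are illustrative.]

      // Walk the page and record where every element landed.
      function layoutSnapshot() {
        return Array.prototype.map.call(
            document.querySelectorAll('body *'),
            function (el) {
              var r = el.getBoundingClientRect();
              return { tag: el.tagName, x: r.left, y: r.top, w: r.width, h: r.height };
            });
      }

      // Compare two snapshots, e.g. stable vs. canary, ignoring sub-pixel noise.
      function layoutDiff(before, after, tolerancePx) {
        var diffs = [];
        var n = Math.min(before.length, after.length);
        for (var i = 0; i < n; i++) {
          var a = before[i], b = after[i];
          if (Math.abs(a.x - b.x) > tolerancePx || Math.abs(a.y - b.y) > tolerancePx ||
              Math.abs(a.w - b.w) > tolerancePx || Math.abs(a.h - b.h) > tolerancePx) {
            diffs.push({ index: i, before: a, after: b });  // element to highlight
          }
        }
        return diffs;  // empty for the vast majority of pages, so no human looks
      }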

  • That saves -- it may sound a bit counterintuitive. We say most of the stuff is already taken

  • care of by automation, but actually it's not really. And so a lot of folks -- this is stuff

  • that does not come for free. And that's sort of the hope of our underlying

  • sort of theme of Bots. What we want is if -- if the stuff does not

  • require humans for the most part, if the stuff can be done by machines, then it ought to

  • be really free. You ought to get this for free.

  • And so layout problems are an example of that. And that's sort of what Bots does so far.

  • So this was just a very quick overview of how some of the tools work, but how does everything

  • sort of fit together in the grander scheme of things.

  • Jason? >>Jason Arbon: It does? Are you going to call

  • me the Octomom of test tools? That was the one thing we practiced actually and he didn't

  • do it. [ Laughter ]

  • >>Ibrahim El-Far: Can I still say it? >>Jason Arbon: No, it's not cool now.

  • >>Ibrahim El-Far: He is the Octomom of testing tools, by the way. Most of these tools are

  • his brainchild. >>Jason Arbon: I do tend to faint when I present,

  • so just watch out, front row. [ Laughter ]

  • >>Jason Arbon: It's not pretty. So let's go to -- all right. So one more comment

  • on the Bots again. One of the premises that Google operates on is that we push new builds, and

  • new products hourly, sometimes minutely, that's like the new term, the hot term, but we ship

  • so often, right, the reality is that there's very few CLs -- that's part of the whole quality

  • kind of attack: small number of changes, iterate very quickly, deploy, test, and flight

  • all these things. So the problem is tests can't keep up, but

  • the reality is these apps don't change that much. Google.com is about -- don't quote me,

  • but the apps don't change that much day to day. What testers are doing all the time is scanning

  • web pages, doing suites of regression tests to see if it looks the same as it did yesterday

  • or an hour ago or a minute ago, and it's a complete waste of time, and we have really

  • talented, bright people doing this ridiculous work all the time.

  • The idea of Bots, the main value out of Bots is it just scans it. It's very similar to

  • what Kevin was talking about. It was cool. So we're not totally like both crazy. We're

  • independently crazy, right? And but the cool thing is if the stuff doesn't

  • change you don't have to look at it. That's the key premise of the whole thing. So let

  • the Bots do this crazy work and diffing all the time, and only the things that require

  • a human brain get evaluated by one, because most of the stuff doesn't change.

  • Let's get into the interesting stuff. So this slide I created like an hour ago.

  • I don't know how it will go. So I want to say back to what James and Alberto

  • were saying at doomsday, whatever, yesterday, test hasn't evolved, right? And I was at Microsoft

  • and I started working on IE 4, which dates me now. I guess I'm an old man now. I was

  • working on IE 4 and guess what? I found myself working on Chrome doing the same damn thing.

  • And actually people were doing more manual work at Google than they were at Microsoft.

  • I said I thought this was the cloud and the future and I've landed in the middle of I

  • don't know where. Oh, man, Google is based on peer feedback,

  • so -- [ Laughter ]

  • >>Jason Arbon: That's why I've been quiet about some of these tools.

  • But really what happened is this is the way I visualize this stuff is that back in the

  • day Yahoo! was pretty cool when the web was pretty small and evolving pretty slowly, they

  • had humans index interesting sites and put them into this giant hierarchy of sites. It

  • worked pretty well for the time, right, like when I was in school.

  • And then Google came around, this company that pays my bills, and I love them and they

  • give me coffee, but they figured like this won't scale. It's changing faster, it's growing,

  • it's getting bigger. What do they do? They end up doing the modern search stack, which

  • is this crawl and index and using PageRank to surface the relevant stuff, and it's

  • not a perfect system either. It doesn't know every single URL and they're not always in

  • perfect order, but really what happens is when I looked around test was still doing

  • the manual curation of test cases. They're very directed and in these hierarchies in

  • case managers and stuff. But the world had changed and the web had changed, but testing

  • hadn't changed yet. So I want to say that the web and searches

  • made this transition and actually even Yahoo! has now because of Bing. But what would this

  • mean if you thought about this in the context of testing? How could you actually test to

  • catch up and do the same transition that we missed probably a decade ago?

  • So this is actually Ibrahim's girlfriend in the picture. I was like, who is this person?

  • I don't know. I just told him that. So some early results. The thing is Google

  • is based on peer review for feedback and review scores and your bonus, so I've been quiet

  • about the data, but what I've been doing under James' semi-watch and I've scammed him into

  • all these experiments and funding all these large teams is measuring humans and comparing

  • the human activity of manual testing and automation versus Bots and versus the crowd. That's what

  • we've been doing. We've had a lot of infrastructure to do that.

  • And BITE has been played with like how do we make the testers even faster? I'm trying

  • to help the manual testers, too, but I want to find out what's the answer to this stuff

  • because no one is experimenting. They're merely checking the features that came to the build

  • yesterday and the day before. So I figured I would experiment a little bit while I've

  • got a crazy manager that would fund it. So the fundamental -- first point is -- and

  • I could go on for hours and I've only got a few minutes here. Bots are faster and less

  • expensive for basic regression testing than humans.

  • Who would have thunk? We all know that, but we don't use machines for that. It's been

  • a theme of the conference, I think. I think the Bots are the first stage of that, the

  • simple layout Bots. I've got some samples, but I don't have time

  • for those things. Also, I've compared some -- I won't protect the innocent, some of the

  • Chrome testing, we had these dedicated teams of manual on-site vendors and full-timers doing

  • those manual tests and they used to be doing it weekly and they'd recycle the test pass

  • again and again and again indefinitely, right, and doing the same thing.

  • And then we passed one of these builds -- I did an experiment and passed one through uTest

  • for crowdsourcing and guess what? We caught a lot of bugs. The same build they came out

  • with one or two bugs, but the crowd testers came back with like 18 bugs. Literally it

  • was just that. I won't talk about the money, but like it was a lot cheaper than our dedicated

  • testing team. And they're also faster because you can parallelize

  • these people where you can't parallelize a team of five people.

  • So that's actually very exciting. And the quality -- it's not just more bugs.

  • They're actually interesting bugs. And somebody actually -- I'll probably blog

  • or something, I'll share a lot of that data over the next month or so.

  • The other is faster. We record bugs. This is where Simon Stewart -- he left, but I warned

  • him that I was going to bad-mouth him before he left.

  • The reality is we worked with the BITE record playback in the JavaScript generation, we

  • worked with the Chrome Web Store team and said, try this out, maybe it

  • will work for you. Give us feedback, and maybe we're stupid, it's a bad idea, the JavaScript

  • stuff instead of the WebDriver. And what happened is we came back a few days

  • -- we left them alone for a while and came back and we said how are you using the tool?

  • What's your feedback? We thought he would spend the time on -- when

  • the element is not there because the web page changed, we have the UX up to fix the code

  • on the fly kind of thing, and that's where we thought we were really clever.

  • But the reality is he said, I don't use that crap. He said, I just regenerate the test

  • case. It was like duh. It takes him like five seconds to try the feature out and we've generated

  • the test case, stored it for him and he can replay it at will and it's done. Once he records

  • it, it's persistent in the cloud and with one button you don't have to worry about all

  • these server farms and we have server farms that take care of this stuff. You can execute

  • this all in the browser itself. Anyway, we'll get on to the anti-WebDriver

  • stuff. How are we doing on time?

  • >>Ibrahim El-Far: Another five minutes. >>Jason Arbon: Okay. I will -- done. So here's

  • the new cycle. So I proposed this. Test isn't dead, it's just changing. There are dinosaurs

  • in the room. And some of them will actually become birds, right? Some of the brontosauruses

  • just fossilize. So this is the new workflow that I think will

  • be happening. These are converging. You've seen what Kevin has got and crowdsource and

  • people are talking about all these components. This is kind of the flow.

  • So a developer comes up in the number one slot and they just deploy the application.

  • You notice they don't have any listing of specifications or test plans or any of that

  • kind of junk. They just deploy the application. These Bots over here, we've done a lot of

  • work on running these things on Skytap actually, which rocks and they don't pay me for saying

  • that. But we're running the stuff on Skytap VMs. But the cool thing about it is that

  • the Bots run and 90, 95 percent of all pages don't change. We don't even look at them,

  • so we don't have any human involvement still. So the ones where we find diffs on the pages

  • where they're interesting and we haven't said ignore this stuff because it's dumb -- the

  • Bots are stupid sometimes. We say ignore it and it never does it again. But we come up

  • with a list of diffs. And rather than pass it to on-site dedicated testers, we pass it

  • off to the crowd with this BITE-enhanced overlay fancy heads-up display thing. We route it

  • to a crowd tester. The crowd tester looks at these diffs and says, that looks like a feature,

  • that looks like a bug. They don't really know all the time. We've talked about context with

  • crowdsource people. They don't always have the full context of the project or why, but

  • they don't always need to. What happens is so -- we've done the study.

  • So really it's actually higher than 95%, but we'll call it 95% of all diffs are identical.

  • The stuff that's different we filter out about 75% of that with the crowd and we've compared

  • that with on-site dedicated testers that know the product, know the developers, have lunch

  • with them, right? They're as good or better and actually faster than the dedicated testers

  • to filter out these diffs. So now, of all the possible work that we could

  • have done, we're down to three percent or less.
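
  [Sketch, not shown in the talk: the arithmetic behind that claim. If about 95% of pages diff clean, and the crowd filters out about 75% of the remaining 5%, then roughly 0.05 x 0.25 = 0.0125, so only about 1.25% of pages ever reach a developer -- inside the "three percent or less" bound quoted here.]

  • And when there is an issue, like they're loaded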

  • up, already got the browser open, they click the repro URL, it loads it up from the diff

  • and then they basically click on whatever is the problem. And they go next, next, next

  • and then the bug is filed. And that bug -- we've got feedback on it from

  • the early deployments to the Google Maps team -- is better than what their best tester filed,

  • because the best tester doesn't include all the debug information, and the best tester

  • still writes, a lot of times, "step one" -- it's embarrassing.

  • You space or do a tab. One, go to this URL, two, click this button, try to describe it.

  • There's none of that crap. It's faster and better data for the developers.

  • Then we just route this back. We route this back in the bugs dashboard back to the developer

  • and they just have a URL and they load the URL and they can see like, hey, nothing changed

  • or there's a couple of flagged bugs and they click on that and it takes them back to the

  • repro and to the possible bug. You realize what's going on in this whole

  • flow is that there's none of us in this flow. That's the cool, crazy thing. That's why I

  • think some dinosaurs think test is dead, but it's been automated and outsourced to the

  • crowd and to the machines. And this is this flow, we've actually tried

  • this flow once. It's early, but it actually works.

  • So we don't exist in this thing. That's where we've got to think about what we're going

  • to do. I argue we should actually participate in

  • making this flow better and also, like, work on instrumentation stuff, do more quality work

  • on the product. But also what's going on here, we found that we needed to do -- because they

  • were like the Bots are kind of stupid. They just find these diffs.

  • They are stupid. We have some interns working on like making them smarter and doing runtime

  • and some intelligence stuff, some pseudo AI stuff. I'm more practical than that, right?

  • Let them do the 90% job. But what happens is we have a continual set

  • of exploratory testers that are just assigned to the project. And we have a pool -- we've

  • tried pools of five and pools of 10. What we think works best is a dedicated setup where -- for

  • instance, half of them are dedicated, always on the same product. They have the context,

  • right? And the other half was able to rotate through for us. We got people with different browsers,

  • different opinions on stuff. They're looking for different bugs. They have different contexts.

  • So if you have continual exploratory testing going on with Bots, and then a simple "looks

  • like a bug or doesn't look like a bug" pass, you've basically solved about, you know, like almost

  • 100% of this kind of a testing problem. Because the developer is a smart guy. A lot

  • of us are also developers. You can just look at the bug. The problem is we spend all this

  • money and time in latency, having people sit at machines doing these repetitive tasks over

  • and over again. I think we can eliminate a vast majority of

  • that with this kind of workflow. We have the tools come together and we make the crowd

  • testers better with the BITE, more efficient. Wow. Okay. Thank you.

  • This is me from five years ago actually. [ Laughter ]

  • >>Jason Arbon: So I'll skip most of that stuff. Can you go back, though? Yeah. To the bell

  • bottoms. >>Ibrahim El-Far: Wrap it up.

  • >>Jason Arbon: Done. There's overhead and slide switching. Those are bell bottoms.

  • >>Ibrahim El-Far: There's like 10 minutes left.

  • >>Jason Arbon: There's a lot of stuff that isn't in that slide. You can look at that

  • better. It ain't there. It's actually a streamlined process.

  • The reason I say bell bottoms is that back to what James is saying, like testing hasn't

  • freaking changed. We get a little bit better, we're still loading up consoles, no offense

  • to the cool JavaScript stuff, but we still test the browser from a console. That's crazy

  • to me. We have a platform in the cloud. What I do is I call bell bottoms on stuff

  • and that gets me unhappy looks at work all the time.

  • I say, "why are you doing it that way," and they say, "well, that's the way I'm doing

  • it," I declare bell bottoms. You can use that or I don't know.

  • So the last thing here is if you want the context of all this crap -- sorry, I said

  • crap. [ Laughter ]

  • >>Jason Arbon: I'm too honest. [ Laughter ]

  • >>Jason Arbon: Yeah. My family raised me right unfortunately.

  • So there's a book coming out, this is the cover, the working draft of the cover, I think,

  • but there's all sorts of -- there's -- we talk about how we do testing, all the unit

  • testing, all this stuff, and it talks about a lot of these experiments. It talks about

  • the origin and who was doing BITE and why they did it and how we scammed James into

  • believing we could actually do some of these things. How when he told us not to do Bots

  • we had two interns do it anyway, and right in the same office and he didn't know.

  • [ Laughter ] >>Jason Arbon: It's actually totally true,

  • James. You know that, right? So that's all in the book. It's actually kind

  • of cool. The reason this picture is here is I thought

  • about all the trees James has killed with his books in his octopublishing career, and

  • I accidentally obscured his name on the book. We're not doing questions. I'm talking -- I'm

  • talking now. They like to hear me talk. [ Laughter ]

  • >>Jason Arbon: I want to say this is all about the cloud. This is really about cloud testing.

  • We talk about cloud testing. Cloud testing doesn't involve a console. It doesn't. Call

  • me a nut bag -- and I am a little bit. But it doesn't involve a console and it doesn't

  • involve Java binaries and distributing them to a pool of machines that we have to manage

  • and stuff like that. There's the cloud. The browser is a powerful platform. We can do

  • it all on the browser end. And this is -- we're more or less kind of

  • defunded now. We've broken the team up. People are going off to other teams and projects

  • and stuff like that because things in Google change fast.

  • But these things live on. We've open sourced them. Please participate and collaborate.

  • And yeah. Maybe we can do some cool web testing in the cloud, actually finally, maybe some

  • day do cloud testing. Done. Questions?

  • [ Applause ] Any questions? I'm going to get doubters.

  • "You're crazy. Doesn't work for my application." I'll do the -- I'll question, answer them

  • real quick. "It doesn't work for my application. My application

  • is special. I do a database or I do medical equipment." This stuff is applicable.

  • >>Ibrahim El-Far: How about let people ask the questions?

  • >>Jason Arbon: But I already answered them. [ Laughter ]

  • >>> Hi. Thank you for your talk. My name is Karen, from (saying name).

  • And, basically, you say that this tool is for the cloud. What happened with some -- with

  • the systems that have to interact with desktop information and then will be uploaded to the

  • cloud? So can these tools work together? I mean,

  • because sometimes you need to have Java code to simulate that you upload the information,

  • and then you need to verify it in the cloud. >>Jason Arbon: I would say that generally

  • it's a little myopic. We're focused on the browser and kind of on the Web; right because

  • it's Google, number one; right. But like we saw earlier, just a couple hours

  • ago, right, there's NativeDriver, these same APIs, right, so if you have to do desktop

  • stuff, you can treat it the same way. We've actually done some experiments like this where

  • we actually have JavaScript code and we're kicking off local -- we have a demo extension

  • in Chrome where we can actually run the console window inside the browser. But the script

  • that drove that thing was JavaScript. And it lives in the cloud and everything like

  • that. So rather than have everything, like, based in this client world and then, like,

  • tickling the browser at the last mile, right, or the last inch, it's -- you go the other

  • way, 'cause there's so many benefits of the cloud and we're not benefiting them.

  • Call me a little bit whack, but this is starting to happen, we saw it on Android and iOS, you

  • write stuff as if it was a Web application using these kinds of technologies. And then

  • if you have to do that, you have a little tiny shim you use, like a local proxy; right?

  • Or you can write an npm API or kind of an ACL client.

  • >>Ibrahim El Far: In practice, most of this stuff is taken care of by writing a bunch

  • of wrappers on top of -- kind of services that you can talk to from your extension.

  • So this way, even if you have an existing sort of legacy system, you just try to wrap

  • our service on top of it and interact with it as if it was just in the cloud.

  • In general, it's not clean. And Jason will argue that, you know, you shouldn't necessarily

  • care about -- you need to break away from the past altogether.

  • But, for example, for the teams, as a simple example, for some teams that really are insisting

  • on using WebDriver, for whatever reason, then what we do is we generate the WebDriver code

  • for those test cases as well and then allow people to run test cases.
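
  [Sketch, not shown in the talk: what generating WebDriver-style code from the same recorded steps could look like. The emitted API shape (By.css and so on) is WebDriverJS-flavored for illustration; BITE's actual emitter may differ.]

      // Translate recorded steps into WebDriver-style source for teams that want it.
      function toWebDriverCode(steps) {
        return steps.map(function (step) {
          var find = "driver.findElement(By.css('" + step.locator + "'))";
          if (step.action === 'click') return find + '.click();';
          if (step.action === 'type') return find + ".sendKeys('" + step.value + "');";
          return '// unsupported step: ' + step.action;
        }).join('\n');
      }

      // toWebDriverCode([{action: 'click', locator: '#search-button'}])
      //   => "driver.findElement(By.css('#search-button')).click();"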

  • >>Jason Arbon: That's a difference we have. I think you should never do it. I think those

  • rules should never cross. I think that you should do it from the cloud to start with

  • because of all the benefits of test execution, test distribution, reporting, portability,

  • all those benefits. But the reality is if you have a legacy system

  • and on the clients or your embedded systems and legacy code, don't do this. It wouldn't

  • be cost effective. But if you have a new project --

  • >>Ibrahim El Far: It's not an either/or proposition. >>Jason Arbon: There's this contention that

  • apps are moving toward the head. So we're trying to get out a little bit ahead of the

  • curve for answering. Anyone else?

  • Yeah. >>> You guys said you had about five or ten

  • people dedicated from uTest. How is that any different from just having five or ten

  • people that you have yourself if they're dedicated --

  • >>Jason Arbon: Cool. They're paid by uTest. They're 24/7. I think they have it on their

  • propaganda material. They're available 24/7. Everyone loves to build on a Friday night.

  • Got the last final build, Friday at 6:00. The test team, stay the weekend or not? You

  • don't have to worry about that. Push it to the crowd. They have this 24/7, the sun never

  • sets on uTest. That's one thing that's interesting. It affects real-world test cycles even at

  • Google. You can say you have a remote office in Hyderabad. But they have to orchestrate

  • handoffs. The big one that we found out with Chrome

  • was that you -- you know, you can only pay for, like, five or six dedicated full-time

  • head count -- but you can only have so many fixed people, and they're fixed resources whether

  • you're spending them or not and then they're sitting there. The reality is if you want

  • -- say you have a security patch, a security issue and you want to do a full test pass

  • and everyone wants to push it. What do you do? Turn the dial on the crowd and say give

  • me 100 testers in the next hour. And you can, like, parallelize -- just talking about parallelizing

  • Selenium execution, you can parallelize human execution through the crowd.

  • You get a lot of variety; right? Like the Google test engineers are kind of a myopic bunch,

  • they all went through the same interview process, live in the same place, always use Gmail.

  • And people with different context and different insight, right, from their various experiences

  • and other places they've tested over their lives really comes into play. That's where

  • we saw a lot of the value coming in from the crowd source testing, was that they would

  • have different perspective. They would see bugs that if they sat with our local engineers

  • straight on, they wouldn't see it. Is that good? Because I can go on for a little

  • while, but they don't pay me. Yeah.

  • >>> (Speaker is off microphone.) >>James Whittaker: How about now? There you

  • go. >>> My name is (saying name). I'm from Netflix.

  • Great talk. A few questions. >>Jason Arbon: Can you repeat that?

  • >>> Great talk. >>Jason Arbon: Thank you. Sorry.

  • [ Laughter ] [ Applause ]

  • >>> I totally believe in the vision. The first question is, how do you test your

  • test tools to make sure they're finding the real bugs?

  • >>Jason Arbon: So we -- ironically, like everyone else does, we didn't at first to be frank.

  • We didn't test these things. Partly because we called it an experiment, because

  • we're lazy in testing. We got away with it for a little while. What happened is one

  • of our developers, an SET, software engineer in test who is doing feature work said, oh,

  • crap, we pushed a bad build. That's when we started the test thing. We just instituted better

  • coding practices and making sure we did exploratory testing every time we did a release. We essentially

  • dogfooded that process; right? And we used BITE to file some bugs, actually, on our own

  • system, and it was working. So it was real organic, bottoms-up. And we only really started

  • testing it -- as the grim reaper was saying earlier, we only test it when we need

  • to know if it's working, right when it goes -- right at the last second before it goes

  • to a customer, or a little bit after when you're embarrassed. That's when we actually

  • added testing to the tools. >>> Second question. How do you ensure that

  • your crowd source testers are not filing dupe bugs?

  • >>Ibrahim El Far: We actually -- there's a mitigation part and sort of a back-end piece

  • to this. The -- we mitigate some of this by the bug

  • overlay that you have seen earlier. Essentially, what we do is we -- just like when you're

  • driving in a street and you see this thing that says, "Your speed is 35 miles an hour"

  • in a 30 zone, you basically get to see, okay, there are bugs in this area, there are already

  • preexisting bugs. That's part of it. The other piece is we now own the cloud that also

  • has the issues that we track. So that's -- we haven't done this yet, but we do plan -- since

  • the format of these bugs is kind of uniform, we can pretty much do a DIFF and figure out

  • if a bug already exists.
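
  [Sketch, not shown in the talk: one way a uniform bug format makes dedup a cheap diff. The field names are made up for illustration.]

      // Because BITE bugs share a uniform shape, near-duplicates reduce to
      // comparing a canonical key built from a few structured fields.
      function canonicalKey(bug) {
        // appDomain, elementLocator, and category are hypothetical fields.
        return [bug.appDomain, bug.elementLocator, bug.category]
            .join('|').toLowerCase();
      }

      function isDuplicate(newBug, existingBugs) {
        var key = canonicalKey(newBug);
        return existingBugs.some(function (b) { return canonicalKey(b) === key; });
      }

  • >>Jason Arbon: When we did our experiments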

  • with uTest, we actually gave them BITE; right? So they were filing bugs with BITE. So they

  • could actually see the existing bugs that were already filed. That's the whole purpose

  • of the overlay. And then also, uTest specifically has some PMs associated with stuff. So if

  • your team is, like, really anti-dupes and you don't have a tester on your side

  • who wants to dedupe things, they have program managers that will filter the stuff

  • for you. You just pay them a little bit more. >>James Whittaker: Okay. I will repeat. Great

  • talk. But now, unfortunately, the time is over.

  • But so we are going into a break. These two guys are going to be here for the rest of

  • the day. They're also going to be here for the beer and brats tonight.

  • So you guys, I know you guys got a lot of questions. Go ahead and continue to ask them

  • during the break. And tonight during the beer and brats session.

  • So I do want to say that these tools -- it's kind of a Google thing to build a tool and

  • get it, you know, almost done and then just let it go. Because that's the way you find

  • out whether your tools are really interesting and useful to people, is that they'll participate

  • to continue the tool even after you stop working on it.

  • So when we changed from a test team to a dev team, we gave these tools up. We open sourced

  • them and just threw them out to Google. And the Chrome team has picked up two of them.

  • And the open source folks are picking up more, we hope, you know, Mogotest, maybe they'll

  • take the Bots stuff. >>Jason Arbon: Mogotest, no, I have to say,

  • they do a better job. >>James Whittaker: They do a better job, Jason

  • said. Anyhow, please pick these tools up from open

  • source. Go to the Google testing blog, that's where the links -- or the open source blog,

  • all the links are there and you can find them on code.google.com. Do some cool and interesting

  • stuff with them. We'll keep the Chrome team honest, whatever updates they make, we will

  • continue to nag them to open source those as well so that we can continue these test

  • tools. Okay. Thanks a lot, guys. We're going into

  • a break. [ Applause ]
