Learn and share: Lesson learned from Google Test Automation Conference 2015

The 9th GTAC took place on November 10-11, 2015. Although I was not there, I enjoyed it very much :) How come?

It is because of the brilliant recordings:

recordings were available very soon after the event (2 weeks later)
great video & audio quality
audience questions were repeated by the moderator
but most importantly, the content was outstanding

This way, one can enjoy talks given by professionals from big companies such as the Google YouTube team, the Google Chrome OS team, Twitter, Uber, Spotify, Netflix, LinkedIn, Lockheed Martin and more.

I often try to watch conference videos. However, I usually give up before finishing them all, because they tend to become publicly available at least half a year after the event and are often outdated by then. These talks were different though!

Following are my notes from each talk. Hopefully, someone will find them useful and be encouraged to watch the full versions.

Keynote - Jürgen Allgayer (Google YouTube)

Cultural change, which consisted of:

take a SNAPSHOT of where we are (how many bugs occur in the staging phase, etc.)
make an SLA (what is our goal? there is a need for a tool which tells how effective each team is)
and agree on it across the whole organisation
continuously measure against the defined SLA to see where we are
how many manual tests are there? Where do we find bugs most often?

Goal: no manual regression testing; manual effort goes into exploratory testing instead.

The Uber Challenge of Cross-Application/Cross-Device Testing - Apple Chow & Bian Jiang

biggest challenge: two separate apps (driver and passenger), while a single scenario has to be completed using both apps.

solution: an in-house framework called Octopus, which is capable of running two emulators and managing the communication between them
Octopus uses signaling to make sure tests are executed in the right order -> asynchronous timeouts
Octopus focus: iOS, Android, parallel, signaling, extensible (it does not matter which UI framework is used)
the communication is done through USB as the most reliable channel
sending files to communicate - most reliable
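
The signaling idea can be illustrated with a small sketch. This is not Octopus code: the functions `send_signal` and `wait_for_signal`, the marker-file convention, and the signal name are all made up for illustration. Two test scripts coordinate through files, and the waiting side gives up after a timeout:

```python
import os
import tempfile
import time


def send_signal(signal_dir, name):
    """Announce that a step has finished by creating an empty marker file."""
    with open(os.path.join(signal_dir, name), "w"):
        pass


def wait_for_signal(signal_dir, name, timeout=5.0, poll_interval=0.05):
    """Block until the marker file appears; return False when the timeout expires."""
    deadline = time.monotonic() + timeout
    path = os.path.join(signal_dir, name)
    while time.monotonic() < deadline:
        if os.path.exists(path):
            return True
        time.sleep(poll_interval)
    return os.path.exists(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as signal_dir:
        # Passenger test: request a ride, then signal the driver test.
        send_signal(signal_dir, "ride_requested")
        # Driver test: continue only once the passenger has requested a ride.
        if wait_for_signal(signal_dir, "ride_requested", timeout=1.0):
            print("driver test can accept the ride")
```

In a real setup the two sides would run as separate processes, one per device or emulator, and the timeout keeps one side from hanging forever when the other one fails.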

Why is the communication not mocked? Answer: This is part of your happy path, to finally ensure you are good to go. It does not replace your unit tests.

Robot Assisted Test Automation - Hans Kuosmanen & Natalia Leinonen (OptoFidelity)

When is robot-based testing needed?

complex interacting components and apps
testing reboot or early boot reliability
medical industry, safety
Chrome OS uses it

Mobile Game Test Automation Using Real Devices - Jouko Kaasila (Bitbar/Testdroid)

use OpenCV for image comparison

side note: OpenCV is capable of a lot of interesting things (object detection, machine learning, video analysis, GPU accelerated computer vision), BSD licence

parallel server side execution
Appium server, Appium client, OpenCV - all on one virtual machine instance

screenshots do not go through internet

Chromecast Test Automation - Brian Gogan (Google)

testing WIFI functionality in ''Test beds'' (if I heard the name correctly)

small faraday cage which can block signal
shield rooms

Bad WIFI network - software emulated (netem)
(6:35) things that went bad

test demand exceeded device supply
test results varying across devices (e.g. HDMI) - solutions: support groups in device manager, add allocation wait time & alerts, SLA < 5 wait time for any device in any group, full traceability of device and test run

(6:44) things that went really wrong

unreliable devices, arbitrarily going offline for many reasons

fried hardware, overheating, loss of network connection, kernel bugs, broken recovery mechanism, mutable MAC - solutions: monitoring, logging, redundancy, connectivity - sanity checks at device allocation time, static IP, quarantine broken devices, buy good hardware
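
Several of these solutions (device groups, allocation wait time, sanity checks at allocation time, quarantine) fit together in a device manager. Here is a minimal sketch of how that could look. It is my own illustration, not the Chromecast team's tooling; the class `DevicePool` and the device names are hypothetical:

```python
import time


class DevicePool:
    """Toy device manager: devices belong to groups and broken ones are quarantined."""

    def __init__(self, devices_by_group):
        self.free = {group: list(devices) for group, devices in devices_by_group.items()}
        self.quarantined = set()

    def allocate(self, group, is_healthy, max_wait=0.0, poll_interval=0.01):
        """Return a healthy device from the group, or None if max_wait elapses."""
        deadline = time.monotonic() + max_wait
        while True:
            while self.free[group]:
                device = self.free[group].pop(0)
                if is_healthy(device):  # sanity check at allocation time
                    return device
                self.quarantined.add(device)  # keep broken devices out of rotation
            if time.monotonic() >= deadline:
                return None  # a real lab would raise an alert here
            time.sleep(poll_interval)

    def release(self, group, device):
        """Give a device back to its group once the test run is over."""
        self.free[group].append(device)


if __name__ == "__main__":
    pool = DevicePool({"hdmi": ["cast-1", "cast-2"]})
    # Pretend that cast-1 fails its connectivity check.
    device = pool.allocate("hdmi", is_healthy=lambda d: d != "cast-1")
    print(device, pool.quarantined)
```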

first prototype of the testing lab on cardboard

Using Robots for Android App Testing - Dr.Shauvik Roy Choudhary (Georgia Tech/Checkdroid)

3 ways to navigate / explore an app

random (Monkey)
model based
systematic exploration strategy
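
To make the difference between the strategies more concrete, here is a toy sketch. The `APP` graph is a made-up stand-in for a real app (screens and the actions that lead between them), and the two functions are my own simplification, not any of the tools compared in the talk:

```python
import random
from collections import deque

# Hypothetical app model: screen -> {UI action: resulting screen}.
APP = {
    "login": {"submit": "home"},
    "home": {"open_settings": "settings", "open_profile": "profile"},
    "settings": {"back": "home"},
    "profile": {"back": "home", "edit": "edit_profile"},
    "edit_profile": {"save": "profile"},
}


def random_exploration(app, start, steps, seed=0):
    """Monkey-style: fire random actions and record which screens were seen."""
    rng = random.Random(seed)
    screen, visited = start, {start}
    for _ in range(steps):
        action = rng.choice(sorted(app[screen]))
        screen = app[screen][action]
        visited.add(screen)
    return visited


def systematic_exploration(app, start):
    """Breadth-first: visit every screen reachable from the start exactly once."""
    visited, queue = {start}, deque([start])
    while queue:
        screen = queue.popleft()
        for next_screen in app[screen].values():
            if next_screen not in visited:
                visited.add(next_screen)
                queue.append(next_screen)
    return visited
```

The random walk may need many steps before it stumbles into a deeply nested screen, while the systematic walk is guaranteed to reach every screen that is reachable in the model.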

(12:00) - tools comparison

Your Tests Aren't Flaky - Alister Scott (Automattic)

A rerun culture is toxic.
There is no such thing as flakiness if you have a testable app.
Application testability is more than IDs for every element.
Application testability == Application usability.
How to kill flakiness

do not rerun tests, use flaky tests as an insight -> build testability

(16:10) - very strong statement for fighting flaky tests - I would make a big poster of it and put it where all testers in the QA department can see it :)

''What I do have are a very particular set of skills, skills I have acquired over a very long testing career. Skills that make me a nightmare for flaky tests like you. I will look for you, I will find you, and I will kill you'' - Liam Neeson, Test Engineer

Large-Scale Automated Visual Testing - Adam Carmi (Applitools)

why not pixel to pixel comparison?

anti-aliasing - differs on each machine, because a different algorithm is used
same with pixel brightness
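
A tiny sketch shows why a strict pixel-to-pixel comparison is so fragile. It is not the algorithm Applitools uses (that was not described in detail); the function `mismatch_ratio` and the sample grayscale "images" are invented for illustration:

```python
def mismatch_ratio(baseline, candidate, tolerance=0):
    """Fraction of pixels whose grayscale values differ by more than tolerance."""
    if len(baseline) != len(candidate) or any(
        len(a) != len(b) for a, b in zip(baseline, candidate)
    ):
        raise ValueError("images must have the same dimensions")
    total = sum(len(row) for row in baseline)
    differing = sum(
        1
        for row_a, row_b in zip(baseline, candidate)
        for a, b in zip(row_a, row_b)
        if abs(a - b) > tolerance
    )
    return differing / total


baseline = [[0, 0, 255, 255], [0, 0, 255, 255]]
# Same edge, but rendered with slightly different anti-aliasing and brightness.
candidate = [[0, 3, 252, 255], [0, 2, 253, 255]]

print(mismatch_ratio(baseline, candidate))               # strict: 0.5
print(mismatch_ratio(baseline, candidate, tolerance=8))  # tolerant: 0.0
```

A human would call the two images identical, yet the strict comparison reports half of the pixels as different. A fixed tolerance is only the simplest possible workaround and has its own problems, since it can also hide real differences.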

screenshot baseline maintenance should be codeless

I have stated and demonstrated the same in my diploma thesis :P

Hands Off Regression Testing - Karin Lundberg (Twitter) and Puneet Khanduri (Twitter)

in-house project Diffy - diffs the responses from 2 production servers and a new candidate server

clearer from this slide
interesting way of dealing with the "noise" (time-stamps, random numbers)

use production traffic by instrumenting clusters
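
As I understood the noise handling, fields that differ between the two production servers cannot be regressions, so they are ignored when the candidate is compared. A minimal sketch of that idea on flat dictionaries follows; the function names and the sample responses are my own, and the real Diffy works on live traffic and nested responses:

```python
def differing_fields(left, right):
    """Return the set of top-level keys whose values differ between two responses."""
    return {key for key in left.keys() | right.keys() if left.get(key) != right.get(key)}


def regressions(primary, secondary, candidate):
    """Fields where the candidate differs from production, minus production noise."""
    noise = differing_fields(primary, secondary)  # two production copies disagree
    return differing_fields(primary, candidate) - noise


primary = {"user": "ann", "total": 10, "timestamp": 1001}
secondary = {"user": "ann", "total": 10, "timestamp": 1002}
candidate = {"user": "ann", "total": 12, "timestamp": 1003}

print(regressions(primary, secondary, candidate))  # {'total'}
```

The timestamp differs everywhere, so it is classified as noise, while the changed total is reported.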

Automated Accessibility Testing for Android Applications - Casey Burkhardt (Google)

we all face accessibility problems on a daily basis: driving a car, cooking
accessibility is about challenging developers' assumptions that the user can hear, see the content, interact with the app, and distinguish colors
Android services: TalkBack, BrailleBack
(8:22) - common mistakes
(10:34) - accessibility test framework

can interact with Espresso

Statistical Data Sampling - Celal Ziftci (Google) and Ben Greenberg (MIT graduate student)

getting testing data from production

collecting logs from requests and responses

need to take the whole production data into consideration

they managed to reduce the sample to a minimum
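
I did not note down the exact sampling method from the talk, so the following is only a generic illustration of the problem: picking a small, unbiased sample from a log stream that is too large to hold in memory. It uses reservoir sampling, a standard technique that is not necessarily what the speakers used; the function name is my own:

```python
import random


def reservoir_sample(stream, k, seed=None):
    """Return k items chosen uniformly at random from a stream of unknown length."""
    rng = random.Random(seed)
    sample = []
    for index, item in enumerate(stream):
        if index < k:
            sample.append(item)
        else:
            # Replace an existing element with probability k / (index + 1).
            slot = rng.randint(0, index)
            if slot < k:
                sample[slot] = item
    return sample


if __name__ == "__main__":
    # Stand-in for a stream of production log entries.
    print(reservoir_sample(range(1_000_000), k=5, seed=42))
```

Every entry of the stream has the same chance of ending up in the sample, so the whole production data is taken into account even though only k entries are ever stored.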

Nest Automation Infrastructure - Usman Abdullah (Nest), Giulia Guidi (Nest) and Sam Gordon (Nest)

(4:15) - Challenges of IoT

coordinating sensors
battery powered devices

(4:55) - Solutions
motion detection challenges

end to end pipeline
reproducibility
test duration

Motion detection tested with camera in front of TV :)

Enabling Streaming Experiments at Netflix - Minal Mishra (Netflix)

Canary deployment for Web Apps

Canary release is a technique to reduce the risk of introducing a new software version in production by slowly rolling out the change to a small subset of users before rolling it out to the entire infrastructure and making it available to everybody.
I knew the process under different names: staged rollout of an Android app, or phased rollout.
Danilo Sato describes this in more detail here.
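
The "small subset of users" part is often implemented by hashing a stable user identifier into a bucket. The sketch below shows one common way to do that; it is not how Netflix does it (that was not covered in my notes), and the function names are hypothetical:

```python
import hashlib


def in_canary(user_id, rollout_percent):
    """Deterministically place a user in the canary group for a given percentage."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in the range 0..99
    return bucket < rollout_percent


def pick_version(user_id, rollout_percent):
    """Route a user to the new version only if they fall into the canary group."""
    return "new" if in_canary(user_id, rollout_percent) else "stable"


if __name__ == "__main__":
    users = [f"user-{i}" for i in range(1000)]
    for percent in (1, 10, 50, 100):
        share = sum(in_canary(u, percent) for u in users) / len(users)
        print(f"rollout {percent}% -> {share:.1%} of users on the new version")
```

Because the bucket depends only on the user id, a user who got the new version at 5% still gets it at 50%, so nobody flips back and forth while the rollout grows.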

Mock the Internet - Yabin Kang (LinkedIn)

Flashback proxy - their in-house project, which acts as a gateway proxy for a three-tier architecture's communication with the outside world (external partners, Google, Facebook, etc.)
it works in record and replay modes
it can act as a proxy between components of a three-tier architecture, or as a proxy for the communication of mobile clients
mocks the network layer
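
The record and replay modes can be pictured with a toy class. This is only a sketch of the general idea and has nothing to do with the real Flashback implementation; `RecordReplayProxy`, `fake_partner`, and the example URL are all made up:

```python
class RecordReplayProxy:
    """Toy proxy: records real responses once, then replays them without the network."""

    def __init__(self, backend=None):
        self.backend = backend  # callable(method, url) -> response; None in replay mode
        self.recordings = {}

    def request(self, method, url):
        key = (method, url)
        if self.backend is not None:  # record mode: call through and store
            self.recordings[key] = self.backend(method, url)
        if key not in self.recordings:
            raise LookupError(f"no recording for {method} {url}")
        return self.recordings[key]


def fake_partner(method, url):
    """Stands in for a real external service in this example."""
    return {"status": 200, "body": "hello"}


if __name__ == "__main__":
    recorder = RecordReplayProxy(backend=fake_partner)
    recorder.request("GET", "https://partner.example/api")

    replayer = RecordReplayProxy()  # no backend: replay only
    replayer.recordings = recorder.recordings
    print(replayer.request("GET", "https://partner.example/api"))
```

In replay mode an unknown request fails loudly instead of silently reaching the outside world, which is what makes the tests independent of external partners.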

Effective Testing of a GPS Monitoring Station Receiver - Andrew Knodt (Lockheed Martin)

GPS can be divided into three segments:

user segment (mobile client), which receives the signal
space segment - satellites
control segment - tells satellites what to do, 31 satellites currently operating

Monitoring station receiver, used in the control segment - measures the distance to each satellite
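
The distance measurement rests on simple arithmetic: the signal travels at the speed of light, so the distance is the travel time multiplied by c. A minimal sketch follows; the function name is made up, and it ignores the clock errors and atmospheric delays that a real receiver has to correct for:

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre


def range_from_travel_time(sent_at_s, received_at_s):
    """Distance in metres covered by a signal between transmission and reception."""
    return SPEED_OF_LIGHT_M_PER_S * (received_at_s - sent_at_s)


# A signal that travelled for 70 ms covered roughly 21,000 km.
print(range_from_travel_time(0.0, 0.070) / 1000)
```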

Automation on Wearable Devices - Anurag Routroy (Intel)

3:00 - how to set up a real Android wearable device to test on
7:00 - how to start an Appium session for a wearable device

Unified Infra and CI Integration Testing (Docker/Vagrant) - Maxim Guenis (Supersonic)

using Docker to create a database with pre-populated data (a MySQL snapshot), so each test session starts with fresh data
vagrant + docker

because they need iOS, Windows

Not using Docker in production: it is not mature enough, and there is legacy code
docker plus Selenium

it handles Selenium server
good for CI

Docker runs inside Jenkins slaves, runs smoothly
Running 100 browser instances simultaneously, requires powerful workstations though
One Selenium Grid for each stack

Test Suites and Program Analysis - Patrick Lam (University of Waterloo)

static vs. dynamic program analysis
great book: xUnit Test Patterns
copy-and-paste tests increase “test debt”
the verify part of a test often helps to find similarities among tests, which can later be refactored
Soot framework: an open-source library that analyses Java bytecode (also Android), used for finding refactorable test methods

Coverage is Not Strongly Correlated with Test Suite Effectiveness - Laura Inozemtseva (University of Waterloo)

how can we estimate the fault detection ability of a test suite? - mutation testing

good mutation candidates: change plus for minus, change constant values
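
Here is a minimal sketch of the plus-for-minus mutation, written with Python's `ast` module. The function `total_price` and both "test suites" are invented for the example. Both suites execute the same line, so they have the same coverage, yet only one of them detects the mutant:

```python
import ast

SOURCE = """
def total_price(price, shipping):
    return price + shipping
"""


class PlusToMinus(ast.NodeTransformer):
    """Mutation operator: replace every binary + with -."""

    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.op, ast.Add):
            node.op = ast.Sub()
        return node


def load(tree):
    """Compile a module AST and return its namespace."""
    namespace = {}
    exec(compile(ast.fix_missing_locations(tree), "<mutant>", "exec"), namespace)
    return namespace


def weak_suite(ns):
    return ns["total_price"](10, 0) == 10  # passes for + and for -


def strong_suite(ns):
    return ns["total_price"](10, 5) == 15


original = load(ast.parse(SOURCE))
mutant = load(PlusToMinus().visit(ast.parse(SOURCE)))

for suite in (weak_suite, strong_suite):
    assert suite(original)  # a suite must pass on the unmodified code
    print(suite.__name__, "kills mutant:", not suite(mutant))
```

The share of mutants a suite kills is the estimate of its fault detection ability; a real mutation testing tool generates many mutants with many operators instead of a single one.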

kind of well known and obvious facts presented

Fake Backends with RpcReplay - Matt Garrett (Google)

problem with mocks/stubs: we need to ensure they are working, so we test them as well
they record requests and responses (RPC server) and serve them instead of starting expensive servers
a continuous job updates the RPC logs
as a bonus, no problem with broken dependencies: tests run against the last green microservices, so if one microservice is broken, devs are not blocked.

Chrome OS Test Automation Lab - Simran Basi (Google) and Chris Sosa (Google)

Chrome OS development model

stable releases from branches, no development on branches, just cherry-picks from always stable trunk
all feature implementation and bug fixes on trunk first

using BuildBot - a CI framework
they are emulating changes in the distance between the WIFI router and the Chrome OS device
type of testing they are doing
they are using AutoTest
(19:40) Chrome OS partners' goals for testing: OEM, SoC, HW component vendors, independent BIOS vendors
Kinds of bugs found on real devices but not on emulators: WIFI, Bluetooth, kernel, touchpad, anything low level

Sunday, January 10, 2016