It took me half a year to get to the GTAC 2016 videos, but it was again absolutely worth it. Engineers from top companies present the challenges they have overcome. IMHO most talks at regular IT conferences either serve a purely marketing purpose without providing any value, or are aimed at beginners. GTAC is different! That is why I thought it might again be worth writing down notes from this year. Someone (similar to me, with 6 years of experience in Test Engineering) can skim them in 5 minutes and decide more easily which talk suits them.
What’s in your Wallet?
Galen - automated look-and-feel testing for responsive websites
Hygieia - a single, easy-to-use dashboard to visualize the near real-time status of the entire software delivery pipeline
Using test run automation statistics to predict which tests to run
(8:55) Which tests not to run?
Candidates for skipping: 100% successful during the last month, have > 100 test runs, and run on all branches
Key point: such tests are disabled only on trunk and stay enabled on the branches from which merges go to trunk; basically, when they fail during the merge process, they are re-enabled and run for at least another month (a sketch of the heuristic follows below)
They were able to save about 50% of build time.
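A minimal sketch of the selection heuristic as I understood it; the data structures and thresholds are my own illustration, not the actual implementation:

```python
from datetime import datetime, timedelta

def tests_safe_to_skip(history, now=None, min_runs=100, window_days=30):
    """Pick tests that can be disabled on trunk.

    `history` maps test name -> list of (timestamp, passed, branch) tuples.
    A test qualifies when, within the last `window_days`, it ran at least
    `min_runs` times, passed every time, and ran on more than one branch.
    """
    now = now or datetime.now()
    cutoff = now - timedelta(days=window_days)
    skippable = set()
    for test, runs in history.items():
        recent = [r for r in runs if r[0] >= cutoff]
        if len(recent) < min_runs:
            continue
        if not all(passed for _, passed, _ in recent):
            continue
        branches = {branch for _, _, branch in recent}
        if len(branches) > 1:  # stand-in for "runs on all branches"
            skippable.add(test)
    return skippable
```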
Selenium-based test automation for Windows and Windows Phone
Winium.Mobile - Windows support on mobile devices, separate from Appium
Winium.Desktop - open-sourced desktop automation; WPF, WinForms, any accessible app
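Since Winium.Desktop speaks the Selenium WebDriver protocol, a Windows app can be driven with a regular Selenium client. A hedged sketch written against a Selenium 3 client; the default localhost:9999 endpoint, the app path, and the element locator are my assumptions for illustration:

```python
from selenium import webdriver

# Winium.Desktop.Driver.exe must already be running on the Windows machine;
# by default it listens on port 9999 and speaks the WebDriver protocol.
driver = webdriver.Remote(
    command_executor='http://localhost:9999',
    desired_capabilities={'app': r'C:\Windows\System32\notepad.exe'})
try:
    # Elements are located through UI Automation properties
    # (classic Notepad's text area is a Win32 "Edit" control).
    edit = driver.find_element_by_class_name('Edit')
    edit.send_keys('Hello from Winium')
finally:
    driver.quit()
```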
The Quirkier Side of Testing
ML Algorithm for Setting up Mobile Test Environment
(09:50) Machine Learning algorithm to choose devices for test lab
“Can you hear me?” - Surviving Audio Quality Testing
(06:58) Audio software testing pyramid
(16:40) The POLQA algorithm for testing audio quality. Its inputs are a reference audio file and a recorded one; the result is a Mean Opinion Score (MOS), a grade from 0 to 5.
(18:08) Frequency analysis - identifies the speakers in the audio recording; each person speaks at a different frequency
(18:51) Speech presence - finds the regions in the recording where speech is present
(19:01) Amplitude analysis - verifies that speakers are neither too loud nor too quiet (a rough sketch of these checks follows after these notes)
(19:40) Live demo of web service which employs those algorithms
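The frequency, speech-presence and amplitude checks are conceptually straightforward signal processing. A rough numpy sketch of the idea - my own illustration, not the presenters' implementation:

```python
import numpy as np

def dominant_frequency(samples, sample_rate):
    """Estimate the strongest frequency component via FFT
    (a crude stand-in for the 'who is speaking' frequency analysis).
    `samples` is a float numpy array."""
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    return freqs[np.argmax(spectrum)]

def speech_regions(samples, sample_rate, frame_ms=20, threshold=0.02):
    """Mark frames whose RMS energy exceeds a threshold -
    a naive 'speech presence' detector."""
    frame = int(sample_rate * frame_ms / 1000)
    regions = []
    for start in range(0, len(samples) - frame, frame):
        chunk = samples[start:start + frame]
        rms = np.sqrt(np.mean(chunk ** 2))
        regions.append((start / sample_rate, rms > threshold))
    return regions

# Amplitude analysis: compare each speaker's RMS level against agreed bounds,
# e.g. flag a speaker whose level is far below the loudest one.
```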
IATF: A new Automated Cross-platform and Multi-device API Test Framework
(21:25) Test steps sequence diagram for testing communication between two clients connected to a server (via WebRTC protocol)
Using Formal Concept Analysis in software testing
Can be used for finding dependencies among method parameters, in the form of implications
(14:27) Can be used for analysis of a test report - nice example
Lattice analysis is equivalent to finding the most common descriptions of failed tests; in big systems the lattice is a good representation for finding similar functionality (toy example below)
Possible extension: ML to find the likely reason why a test failed.
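A tiny sketch of the "most common description of failed tests" idea: treat the failed tests as a set of FCA objects and compute the attributes they all share (the intent of that object set). The report data below is made up for illustration:

```python
def common_attributes(report):
    """report: dict mapping test name -> (passed: bool, attributes: set).
    Returns the attributes shared by every failed test - a candidate
    'description' of the failure, in the spirit of an FCA intent."""
    failed = [attrs for passed, attrs in report.values() if not passed]
    if not failed:
        return set()
    return set.intersection(*failed)

report = {
    'test_login':    (False, {'browser=IE', 'env=staging', 'uses_sso'}),
    'test_checkout': (False, {'browser=IE', 'env=staging', 'uses_cart'}),
    'test_search':   (True,  {'browser=IE', 'env=prod'}),
}
print(common_attributes(report))  # browser=IE, env=staging (order may vary)
```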
Flaky Tests in Continuous Integration: Current Practice at Google and Future Directions
SLA for developers: the time between committing and getting an answer is generally 3 hours
Not every change triggers test jobs right away; about ⅓ does
With ML they can be 90% sure that a test is flaky, so they do not have to rerun it 10 times as usual
(14:10) How to identify that tests are flaky: patterns, features, correlations (a simple non-ML signal is sketched below)
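The talk does not spell out the model, so as a stand-in here is the simplest non-ML flakiness signal: a test that both passes and fails against the same code snapshot. The data shape is my own assumption:

```python
from collections import defaultdict

def flaky_candidates(runs):
    """runs: iterable of (test_name, commit, passed) tuples.
    A test is a flaky candidate if, for at least one commit, it has both
    passing and failing runs (the code did not change, the verdict did)."""
    verdicts = defaultdict(set)
    for test, commit, passed in runs:
        verdicts[(test, commit)].add(passed)
    return {test for (test, _), seen in verdicts.items() if len(seen) == 2}
```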
Developer Experience, FTW!
(55:29) Firebase Test Lab may in the future be able to use real user actions to test the application
ClusterRunner: making fast test-feedback easy through horizontal scaling
Integration Testing with Multiple Mobile Devices and Services
Most frameworks are single-device, while E2E testing brings its own challenges: synchronizing steps between multiple devices, and a large range of equipment - attenuators, call boxes, power meters, wireless APs
Mobly - an open-source Python library from Google, used to test Android; it controls a collection of devices/equipment in a test bed (isolated mobile devices, network switches, IoT, etc.) - a minimal usage sketch follows at the end of this section
Centralized vs decentralized way of executing/dispatching test logic; Mobly is centralized - they found it easier to debug
(18:24) Cool demonstration - two phones and a watch: phone A gives a voice command to the watch, the watch initiates a call to phone B, phone B gets the call notification
Similar frameworks: openHTF, Firebase
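A minimal Mobly sketch of one test script driving two Android devices from a test bed; the device roles and shell commands are illustrative, not from the talk:

```python
from mobly import base_test, test_runner
from mobly.controllers import android_device


class TwoDeviceMessageTest(base_test.BaseTestClass):

    def setup_class(self):
        # Mobly hands us every Android device declared in the test bed config.
        self.devices = self.register_controller(android_device)
        self.sender, self.receiver = self.devices[0], self.devices[1]

    def test_send_and_receive(self):
        # The test script is the central dispatcher: it drives both devices
        # step by step, which is what keeps cross-device steps in sync.
        self.sender.adb.shell('input keyevent KEYCODE_WAKEUP')
        self.receiver.adb.shell('input keyevent KEYCODE_WAKEUP')
        # ... drive the sender, then assert on the receiver's state ...


if __name__ == '__main__':
    test_runner.main()
```

The test bed (which devices and equipment belong to it) lives in a separate YAML config passed to the runner, roughly `python two_device_test.py -c testbed.yml`.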
Scale vs Value: Test Automation at the BBC
They became overwhelmed by manual regression tests > BDD to define with different stakeholders what to automate > a separate team to ensure devices for testing are available (inventory, status of devices) > test lab (lots of smart TVs; the test lab is in a fire corridor :D)
PUMA - the plan for adopting the framework broadly within the company:
Prove core functionality - automated checks for the core value of your product or system, regularly audited to combat bloat
Understood by all - everyone cares, anyone can execute, visibility to all
Mandatory - part of the delivery pipeline, any failing check stops the build
Automated
Whatever framework you use, you need to step back and see what value it brings you: only important tests should run on real devices, etc.
Finding bugs in C++ libraries using LibFuzzer
What to fuzz: anything that consumes untrusted or complicated inputs: parsers of any kind, media codecs, network protocols, crypto, compression, compilers and interpreters, regular expression matchers, databases, browsers, text editors/processors, OS kernels, drivers, supervisors, Chrome UI
How to fuzz: generation-based, mutation-based, or guided mutation-based fuzzing
Mutation: e.g. bit flipping
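LibFuzzer itself is an in-process, coverage-guided C++ fuzzer, but plain mutation-based fuzzing is easy to illustrate; a toy bit-flipping loop (not how LibFuzzer works internally, just the mutation idea):

```python
import random

def flip_random_bit(data: bytes) -> bytes:
    """Return a copy of `data` with one randomly chosen bit flipped."""
    if not data:
        return data
    buf = bytearray(data)
    pos = random.randrange(len(buf))
    buf[pos] ^= 1 << random.randrange(8)
    return bytes(buf)

def fuzz(target, seed: bytes, iterations: int = 10_000):
    """Mutation-based fuzzing loop: mutate a seed input, feed it to
    `target`, and report inputs that make it raise."""
    for _ in range(iterations):
        candidate = flip_random_bit(seed)
        try:
            target(candidate)
        except Exception as exc:
            print(f'crash with input {candidate!r}: {exc}')
```

The "guided" variant additionally keeps mutants that reach new code coverage and mutates those further, which is what LibFuzzer does.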
How I learned to crash test a server
They programmed a power outlet into which you can ssh and turn any of its sockets on/off (see the sketch at the end of this section)
Crashing virtual machines vs crashing physical machines - both need to be done
Virtual machines: crashed from the host with a single command, both for KVM and VMware
BIOS has a setting to restore power state after AC power loss
On Windows there is the bcdedit utility, with which you can suppress the prompt (Start Windows Normally, ...) after an abrupt Windows restart
They did not find a systematic way to crash Windows with an internal command (how ironic :D)
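A hedged sketch of driving such an ssh-able outlet from a test, using paramiko; the host, credentials and the `outlet off 3` command are entirely hypothetical - the real device's CLI will differ:

```python
import paramiko

def set_socket(host, user, password, socket_id, on):
    """SSH into the programmable outlet and switch one socket on or off."""
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(host, username=user, password=password)
    try:
        state = 'on' if on else 'off'
        # Hypothetical outlet command - replace with the device's real CLI.
        client.exec_command(f'outlet {state} {socket_id}')
    finally:
        client.close()

# e.g. cut power to the server under test, then restore it:
# set_socket('powerbar.lab', 'admin', 'secret', 3, on=False)
# set_socket('powerbar.lab', 'admin', 'secret', 3, on=True)
```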