Maestro takes 4-6 seconds just to start. Before it taps a single button, you’ve already lost time to JVM initialization, class loading, and gRPC server setup. Then it allocates 350-400 MB of RAM for the privilege.
We ran both tools head-to-head on 8 real-world test flows. Here’s every number.
Headline Results
| Metric | maestro-runner | Maestro | Difference |
|---|---|---|---|
| Average test time | 14.2s | 34.1s | 2.4x faster |
| Peak RSS memory | 27 MB | 358 MB | 13x less |
| Startup time | 0.02s | 4.8s | 240x faster |
| Binary size | 21 MB | ~289 MB (JVM + deps) | 14x smaller |
Same Android emulator (Pixel 7, API 33), same machine (M2 MacBook Pro, 16 GB), same flow files. Each test was run 5 times; we report the median.
Per-Test Breakdown
These are real flows from a production app – login, navigation, form submission, scrolling lists, deep links, and multi-step checkout. Not synthetic benchmarks designed to flatter one tool.
| Test Flow | maestro-runner | Maestro | Speedup |
|---|---|---|---|
| Login + navigate | 8.3s | 19.7s | 2.4x |
| Form fill + submit | 11.1s | 28.4s | 2.6x |
| Scroll list + tap item | 9.8s | 25.1s | 2.6x |
| Deep link + assert | 6.2s | 14.8s | 2.4x |
| Multi-step checkout | 22.4s | 61.3s | 2.7x |
| Back navigation chain | 7.1s | 18.9s | 2.7x |
| Text input (Unicode) | 12.7s | 45.6s | 3.6x |
| Launch + clearState | 9.4s | 22.8s | 2.4x |
Most flows land in the 2.4-2.7x range, which is what we expected. The architectural overhead of gRPC serialization and hierarchy parsing adds a roughly constant tax per command.
The outlier is text input at 3.6x. We did not expect that gap. Maestro types character-by-character through gRPC, and under load it drops characters – which triggers retries, which compound the delay. maestro-runner uses direct ADB input with full Unicode support. No intermediary, no dropped characters, no retry loop.
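To make the difference concrete, here is a minimal Go sketch of the one-round-trip approach: escape the string once and hand the whole thing to a single `adb shell input text` invocation, instead of one gRPC call per character. The `escapeADBText` helper is hypothetical (the real maestro-runner's Unicode path may use a different mechanism); the escaping rules shown are the standard ones for `input text`, where spaces become `%s` and shell metacharacters are backslash-escaped.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// escapeADBText prepares a string for a single `adb shell input text` call:
// spaces become %s and shell metacharacters are backslash-escaped.
// (Hypothetical helper; the real runner's input path may differ.)
func escapeADBText(s string) string {
	var b strings.Builder
	for _, r := range s {
		switch r {
		case ' ':
			b.WriteString("%s")
		case '(', ')', '<', '>', '|', ';', '&', '*', '~', '"', '\'', '$', '`', '\\':
			b.WriteRune('\\')
			b.WriteRune(r)
		default:
			b.WriteRune(r)
		}
	}
	return b.String()
}

func main() {
	escaped := escapeADBText("hello world & goodbye")
	fmt.Println(escaped)

	// One command carries the entire string -- no per-character calls,
	// so nothing to drop and nothing to retry. (Not executed here.)
	cmd := exec.Command("adb", "shell", "input", "text", escaped)
	_ = cmd
}
```

Whatever the exact transport, the design point stands: one command per string means the failure-retry loop that compounds Maestro's delay simply has no place to occur.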
Resource Consumption
We measured a full 8-flow run with /usr/bin/time -l on macOS. The page fault count is the number that surprised us most.
| Resource | maestro-runner | Maestro |
|---|---|---|
| Peak RSS | 27 MB | 358 MB |
| Resident Set (avg) | 22 MB | 312 MB |
| Page faults | 1,842 | 89,431 |
| CPU time (user) | 1.8s | 14.2s |
| CPU time (system) | 0.9s | 3.1s |
| Involuntary ctx switches | 312 | 4,891 |
89,431 page faults versus 1,842. The JVM allocates aggressively at startup – reserving heap space it may never use – and the OS pays for it in page management overhead the entire run. This is not a bug in Maestro. It is a consequence of the JVM’s memory model. It is also why Maestro gets slower on memory-constrained CI runners: once the OS starts paging, every command pays a latency tax.
Why It’s Faster
Four architectural differences. Each is independently measurable, and each was a deliberate choice we made after reading Maestro's source code.
No JVM startup
We expected this to matter. We did not expect it to matter this much.
Maestro’s startup sequence: launch JVM, load ~2,000 classes, start gRPC server on port 7001, wait for device connection, begin test. Time: 4-6 seconds. maestro-runner’s startup: parse flags, connect to device over ADB, begin test. Time: ~20 milliseconds.
For a 60-second test, 5 seconds of startup is noise. For an 8-second test like “deep link + assert,” startup is 38% of total runtime. In CI, where you often run one flow per job, this is not a rounding error. It is the difference between a 15-second job and a 6-second job.
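The startup-share arithmetic is easy to verify. This small Go snippet plugs in the measured startup times (~4.8 s for Maestro, ~0.02 s for maestro-runner) against short and long flow bodies:

```go
package main

import "fmt"

// startupShare returns the fraction of total job time spent on startup.
func startupShare(startup, flow float64) float64 {
	return startup / (startup + flow)
}

func main() {
	// Startup times measured above; flow durations illustrative.
	fmt.Printf("8s flow,  JVM startup: %.0f%%\n", 100*startupShare(4.8, 8))  // ~38%
	fmt.Printf("60s flow, JVM startup: %.0f%%\n", 100*startupShare(4.8, 60)) // noise
	fmt.Printf("8s flow,  Go startup:  %.1f%%\n", 100*startupShare(0.02, 8))
}
```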
No gRPC intermediary
Maestro routes every device command through a gRPC layer:
YAML -> Kotlin Orchestrator -> gRPC -> Device Server -> UIAutomator2
maestro-runner talks directly to UIAutomator2’s HTTP server:
YAML -> Go Runner -> HTTP -> UIAutomator2
One less hop, one less serialization layer, one less source of StatusRuntimeException: UNKNOWN.
Native selectors first
This was the single biggest per-command speedup, and we discovered it almost by accident.
When you write tapOn: "Login", Maestro fetches the entire UI hierarchy, parses it into a tree, and walks it in Kotlin. Every time. maestro-runner tries native UIAutomator2 selectors first – text("Login") goes straight to the platform’s built-in search. Only if that misses does it fall back to hierarchy parsing.
Native queries run on-device in native code. Hierarchy parsing requires transferring the entire XML tree over HTTP, then parsing it in the runner. For common selectors – the vast majority of real-world flows – the native path is dramatically faster.
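The fallback logic itself is simple. Below is a hedged Go sketch of the two-tier lookup: a map stands in for the on-device native selector index, and the slow path parses a (tiny, illustrative) UIAutomator-style XML dump in the runner. The function names and XML shape are ours, not maestro-runner's actual internals.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// Node mirrors one element in a UIAutomator-style XML dump (simplified).
type Node struct {
	Text     string `xml:"text,attr"`
	Children []Node `xml:"node"`
}

// nativeQuery stands in for an on-device selector such as text("Login"):
// the search runs in native code and only the match crosses the wire.
func nativeQuery(index map[string]bool, sel string) bool {
	return index[sel]
}

// hierarchySearch is the slow path: pull the whole XML tree over HTTP,
// parse it in the runner, and walk it looking for the text.
func hierarchySearch(dump, sel string) bool {
	var root Node
	if err := xml.NewDecoder(strings.NewReader(dump)).Decode(&root); err != nil {
		return false
	}
	var walk func(Node) bool
	walk = func(n Node) bool {
		if n.Text == sel {
			return true
		}
		for _, c := range n.Children {
			if walk(c) {
				return true
			}
		}
		return false
	}
	return walk(root)
}

// find tries the fast native path first and only falls back to
// full-hierarchy parsing on a miss.
func find(index map[string]bool, dump, sel string) string {
	if nativeQuery(index, sel) {
		return "native hit"
	}
	if hierarchySearch(dump, sel) {
		return "hierarchy hit"
	}
	return "not found"
}

func main() {
	index := map[string]bool{"Login": true} // what the native selector can see
	dump := `<node text=""><node text="Login"/><node text="Checkout"/></node>`
	fmt.Println(find(index, dump, "Login"))    // native hit
	fmt.Println(find(index, dump, "Checkout")) // hierarchy hit
}
```

The key property: the expensive transfer-and-parse step is only ever paid on a miss, which for typical text selectors is the rare case.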
Go’s memory model
Go programs start with a small heap and grow as needed. The runtime is ~8 MB. No class loading phase, no JIT warmup, no garbage collector pause storm at startup.
Maestro’s JVM allocates ~200 MB at startup for the heap, loads the Kotlin standard library, the gRPC stack, and the Maestro class hierarchy before executing a single test command. That 200 MB is not free – it competes with the Android emulator for RAM, and on CI runners, the emulator always loses that fight.
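You can observe the small-heap claim directly from inside a Go process. The snippet below reads `runtime.MemStats` at startup; `HeapSys` is the memory the runtime has actually obtained from the OS for the heap, which for a fresh process is a few megabytes rather than a pre-reserved ~200 MB:

```go
package main

import (
	"fmt"
	"runtime"
)

// heapSysMB reports how much memory the Go runtime has obtained from
// the OS for the heap. For a freshly started process this is a few MB.
func heapSysMB() float64 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return float64(m.HeapSys) / (1 << 20)
}

func main() {
	fmt.Printf("heap reserved at startup: %.1f MB\n", heapSysMB())
}
```

The heap then grows on demand, so the process never competes with the emulator for memory it is not actually using.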
What This Feels Like in CI
Numbers in a table are one thing. Living with them daily is another.
A 20-flow suite in CI:
| Metric | maestro-runner | Maestro |
|---|---|---|
| Total execution time | ~4.7 min | ~11.4 min |
| CI minutes consumed | 5 min | 12 min |
| Memory headroom | 473 MB free (500 MB runner) | 142 MB free (500 MB runner) |
The memory headroom line is the one that changes your experience. With 142 MB free on a 500 MB runner, you are one background process away from the OS swapping – and once the OS starts swapping, tests do not just slow down. They become flaky. Random timeouts. Element-not-found errors on elements that are clearly visible. The kind of failures that pass on retry and erode your team’s trust in the test suite.
With 473 MB free, you have room. The emulator runs comfortably. Tests pass consistently.
On GitHub Actions (billed per minute), the same 20-flow suite saves ~7 minutes per run. At 10 runs per day, that is 70 minutes/day – roughly 35 hours/month of CI time. Real money, but honestly, the bigger win is that your team stops treating test failures as noise.
Methodology
- Machine: M2 MacBook Pro, 16 GB RAM, macOS 14.3
- Emulator: Pixel 7, API 33, arm64, 2 GB RAM
- Maestro version: 1.38.1 (latest at time of test)
- maestro-runner version: 0.1.0
- Runs per test: 5 (median reported)
- Measurement: `time` for wall clock, `/usr/bin/time -l` for resources, `ps` polling for RSS (approach inspired by Go's built-in benchmarking tools)
- Controls: Same flow files, same emulator image, fresh emulator boot between full runs, no other processes running
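The median-of-5 reporting is the same computation throughout: sort the five wall-clock samples and take the middle one. A minimal Go sketch (the sample values are hypothetical):

```go
package main

import (
	"fmt"
	"sort"
)

// median returns the middle value of an odd-length sample -- the same
// way each benchmark number above summarizes its 5 runs.
func median(samples []float64) float64 {
	s := append([]float64(nil), samples...) // copy so the caller's slice is untouched
	sort.Float64s(s)
	return s[len(s)/2]
}

func main() {
	// Hypothetical wall-clock times (seconds) for 5 runs of one flow.
	runs := []float64{8.1, 8.3, 8.9, 8.2, 8.4}
	fmt.Printf("median: %.1fs\n", median(runs)) // median: 8.3s
}
```

We report medians rather than means so a single slow outlier run (a GC pause, a background process) cannot skew the headline numbers.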
Both tools ultimately execute the same UIAutomator2 commands on the device. The difference is what sits between the runner and the device: maestro-runner uses direct HTTP; Maestro wraps UIAutomator2 behind gRPC. The benchmark measures the cost of that difference.