Flow File Structure
A flow file is a YAML document with an optional config section separated by ---:
# Config section (optional)
appId: com.example.app
name: Login Flow
tags:
- smoke
- login
---
# Steps section
- launchApp
- tapOn: "Login"
- inputText: "user@example.com"
- assertVisible: "Welcome"
If no config section is needed, omit the --- separator and write steps directly.
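As a rough illustration of the two-part layout, the sketch below splits a flow file's text on a line that contains only ---. The helper name splitFlowFile is hypothetical and is not part of maestro-runner; a real runner would rely on a YAML parser rather than string splitting.

```javascript
// Hypothetical helper: separate the optional config section from the steps.
// Assumes the separator is a line containing only "---".
function splitFlowFile(text) {
  const lines = text.split("\n");
  const sep = lines.findIndex((line) => line.trim() === "---");
  if (sep === -1) {
    // No separator: the whole document is the steps section.
    return { config: "", steps: text.trim() };
  }
  return {
    config: lines.slice(0, sep).join("\n").trim(),
    steps: lines.slice(sep + 1).join("\n").trim(),
  };
}

const withConfig = splitFlowFile("appId: com.example.app\n---\n- launchApp");
const stepsOnly = splitFlowFile('- launchApp\n- tapOn: "Login"');
console.log(withConfig.config); // appId: com.example.app
console.log(withConfig.steps);  // - launchApp
console.log(stepsOnly.config === ""); // true
```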
Flow Config Fields
| Field | Type | Description |
|---|---|---|
| appId | string | App package ID (Android) or bundle ID (iOS) |
| url | string | Web app URL (alternative to appId for web testing) |
| name | string | Human-readable flow name (shown in reports) |
| tags | string[] | Tags for filtering (--include-tags, --exclude-tags) |
| env | map | Environment variables available to the flow |
| timeout | int | Overall flow timeout in milliseconds |
| commandTimeout | int | Default timeout for all commands in ms (overrides driver default) |
| waitForIdleTimeout | int | Wait for device idle in ms (0 = disabled, omit = use global setting) |
| typingFrequency | int | WDA typing speed in keys/sec. Overrides --typing-frequency CLI flag. 0 = use WDA default (60) |
| onFlowStart | step[] | Lifecycle hook: steps to run before the main commands |
| onFlowComplete | step[] | Lifecycle hook: steps to run after the main commands (runs even if the flow fails) |
Example:
appId: com.example.app
name: Checkout Flow
tags:
- smoke
- checkout
env:
  BASE_URL: https://staging.example.com
  TEST_USER: testuser
timeout: 120000
commandTimeout: 15000
waitForIdleTimeout: 3000
onFlowStart:
- launchApp:
    appId: com.example.app
    clearState: true
onFlowComplete:
- takeScreenshot: final-state.png
---
- tapOn: "Shop"
- tapOn: "Add to Cart"
- tapOn: "Checkout"
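The statement that onFlowComplete runs even if the flow fails can be pictured as a try/finally around the main commands. This is only a sketch under stated assumptions: steps are modeled as plain functions, runFlowWithHooks is a hypothetical name, and how maestro-runner treats a failure inside onFlowStart is not specified here.

```javascript
// Hypothetical runner loop; each step is modeled as a function returning a label.
function runFlowWithHooks({ onFlowStart = [], commands = [], onFlowComplete = [] }) {
  const log = [];
  let error = null;
  try {
    for (const step of onFlowStart) log.push(step());
    for (const step of commands) log.push(step());
  } catch (err) {
    error = err; // remember the failure, but still run the completion hook
  } finally {
    for (const step of onFlowComplete) log.push(step());
  }
  return { passed: error === null, log };
}

const result = runFlowWithHooks({
  onFlowStart: [() => "launchApp"],
  commands: [
    () => "tapOn Shop",
    () => { throw new Error("element not found"); },
  ],
  onFlowComplete: [() => "takeScreenshot"],
});
console.log(result.passed); // false
console.log(result.log.join(", ")); // launchApp, tapOn Shop, takeScreenshot
```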
Common Step Properties
All steps support these optional fields:
| Field | Type | Description |
|---|---|---|
| optional | bool | If true, step failure won't fail the flow |
| label | string | Human-readable label for logging and reports |
| timeout | int | Step-specific timeout in milliseconds |

- tapOn:
    text: "Maybe Visible"
    optional: true
    label: "Dismiss optional popup"
    timeout: 3000
Tap & Gesture Commands
tapOn
Tap on an element.
# Simple text match
- tapOn: "Login"
# By ID
- tapOn:
    id: submit-btn
# With options
- tapOn:
    text: "Submit"
    longPress: true
    repeat: 2
    delay: 100
    retryTapIfNoChange: true
    waitUntilVisible: true
    waitToSettleTimeoutMs: 500
# Tap at relative point within element
- tapOn:
    id: slider
    point: "80%, 50%"
# Tap at screen percentage (no selector)
- tapOn:
    point: "50%, 90%"
# Tap at absolute pixel coordinates
- tapOn:
    point: "286, 819"
Absolute pixel coordinates are detected automatically — if the value contains %, it's treated as a percentage; otherwise as absolute pixels. Coordinates are validated against the device screen dimensions (e.g., "2000, 3000" on a 1080x2400 screen is rejected).
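The detection rule above can be sketched as follows. This is an illustration of the described behavior, not maestro-runner's actual parser: parsePoint is a hypothetical name, the 0 to 100 range check for percentages is an assumption, and the sketch covers screen-level points only (with a selector, percentages are relative to the element).

```javascript
// Hypothetical parser for the "point" value, following the rule described above.
function parsePoint(value, screen) {
  const isPercent = value.includes("%");
  const parts = value.split(",").map((p) => Number(p.replace("%", "").trim()));
  if (parts.length !== 2 || parts.some((n) => Number.isNaN(n))) {
    throw new Error(`Invalid point: "${value}"`);
  }
  const [x, y] = parts;
  if (isPercent) {
    // Assumption: percentages must lie in 0..100 and scale to the screen size.
    if (x < 0 || x > 100 || y < 0 || y > 100) {
      throw new Error(`Percentage out of range: "${value}"`);
    }
    return {
      x: Math.round((x / 100) * screen.width),
      y: Math.round((y / 100) * screen.height),
    };
  }
  // Absolute pixels are validated against the screen dimensions.
  if (x < 0 || x > screen.width || y < 0 || y > screen.height) {
    throw new Error(`Point "${value}" is outside the ${screen.width}x${screen.height} screen`);
  }
  return { x, y };
}

const screen = { width: 1080, height: 2400 };
console.log(parsePoint("50%, 90%", screen)); // { x: 540, y: 2160 }
console.log(parsePoint("286, 819", screen)); // { x: 286, y: 819 }
```

Calling parsePoint("2000, 3000", screen) throws in this sketch, mirroring the rejection described above.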
| Field | Type | Description |
|---|---|---|
| (selector) | | Element to tap (see Selectors) |
| longPress | bool | Perform a long press instead of tap |
| repeat | int | Number of taps |
| delay | int | Delay between repeated taps (ms) |
| point | string | Tap at "x%, y%" (percentage) or "x, y" (absolute pixels) |
| retryTapIfNoChange | bool | Retry tap if screen didn't change |
| waitUntilVisible | bool | Wait for element to become visible before tapping |
| waitToSettleTimeoutMs | int | Wait for UI to settle after tap (ms) |
doubleTapOn
Double-tap on an element.
- doubleTapOn: "Item"
- doubleTapOn:
    id: zoomable-image
longPressOn
Long-press on an element.
- longPressOn: "Delete"
- longPressOn:
    id: context-menu-target
tapOnPoint
Tap at specific coordinates.
# Absolute coordinates
- tapOnPoint:
    x: 100
    y: 200
# Percentage-based
- tapOnPoint:
    point: "50%, 90%"
# With long press
- tapOnPoint:
    x: 100
    y: 200
    longPress: true
swipe
Swipe gesture.
# Simple direction
- swipe: UP
- swipe: DOWN
- swipe: LEFT
- swipe: RIGHT
# With options
- swipe:
    direction: DOWN
    speed: 50
    duration: 300
# Custom start/end (percentage)
- swipe:
    start: "50%, 80%"
    end: "50%, 20%"
# Custom start/end (absolute)
- swipe:
    startX: 100
    startY: 500
    endX: 100
    endY: 200

| Field | Type | Description |
|---|---|---|
| direction | string | UP, DOWN, LEFT, RIGHT |
| start | string | Start point "x%, y%" |
| end | string | End point "x%, y%" |
| startX/startY | int | Absolute start coordinates |
| endX/endY | int | Absolute end coordinates |
| duration | int | Swipe duration in ms |
| speed | int | Speed 0-100 |
Note: Swipe coordinates now match Maestro behavior across all drivers. On iOS (WDA), the default swipe duration is 100ms.
scroll
Scroll the screen.
- scroll: DOWN
- scroll: UP
scrollUntilVisible
Scroll until an element becomes visible.
- scrollUntilVisible:
    element: "End of List"
    direction: DOWN
# With options
- scrollUntilVisible:
    element:
      id: footer
    direction: DOWN
    maxScrolls: 20
    speed: 40
    visibilityPercentage: 80
    centerElement: true

| Field | Type | Description |
|---|---|---|
| element | selector | Element to scroll to (text string or selector object) |
| direction | string | Scroll direction: UP, DOWN, LEFT, RIGHT |
| maxScrolls | int | Maximum scroll attempts |
| speed | int | Scroll speed 0-100 |
| visibilityPercentage | int | Percentage of element that must be visible |
| centerElement | bool | Scroll element to center of screen |
| timeout | int | Overall timeout in milliseconds for the scroll operation |
Navigation Commands
back
Press the back button.
- back
hideKeyboard
Hide the on-screen keyboard.
- hideKeyboard
Platform behavior:
- Android: Presses the BACK key
- iOS: Sends the RETURN key to dismiss
Tip: On Android, maestro-runner automatically detects when the keyboard covers a target element after inputText or inputRandom and suggests adding this command.
Text Input Commands
inputText
Type text into a field.
# Into the currently focused field
- inputText: "user@example.com"
# Into a specific element
- inputText:
    text: "username"
    id: email-field
Tip: On Android, if the soft keyboard covers the next element you want to interact with, maestro-runner detects this and suggests adding - hideKeyboard. It's good practice to add - hideKeyboard after text input if the next step interacts with an element that might be covered:
- inputText: "user@example.com"
- hideKeyboard
- tapOn: "Submit"
inputRandom
Generate and input random data.
- inputRandom: EMAIL
- inputRandom: NUMBER
- inputRandom: PERSON_NAME
- inputRandom: TEXT
# With length
- inputRandom:
    type: NUMBER
    length: 10
Types: EMAIL, NUMBER, PERSON_NAME, TEXT
Shorthands
- inputRandomEmail
- inputRandomNumber
- inputRandomPersonName
- inputRandomText
eraseText
Erase characters from the focused field.
# Erase 5 characters
- eraseText: 5
# Map form
- eraseText:
    characters: 10
If no count is specified, defaults to 50 characters.
copyTextFrom
Copy text from an element to the clipboard.
- copyTextFrom: "Price Label"
- copyTextFrom:
    id: total-amount
pasteText
Paste text from the clipboard.
- pasteText
setClipboard
Set the clipboard to a specific text value.
- setClipboard: "text to paste later"
- setClipboard:
    text: "clipboard content"
Assertion Commands
assertVisible
Assert that an element is visible on screen.
- assertVisible: "Welcome"
- assertVisible:
    id: success-banner
assertNotVisible
Assert that an element is NOT visible.
- assertNotVisible: "Error"
- assertNotVisible:
    id: loading-spinner
assertTrue
Assert a JavaScript condition evaluates to true.
- assertTrue: "1 === 1"
- assertTrue: "${count} > 0"
assertCondition
Assert a complex condition.
- assertCondition:
    visible:
      text: "Success"
- assertCondition:
    notVisible:
      id: error-dialog
- assertCondition:
    scriptCondition: "output.result === true"
- assertCondition:
    platform: Android

| Field | Type | Description |
|---|---|---|
| visible | selector | Assert element is visible |
| notVisible | selector | Assert element is not visible |
| scriptCondition | string | JavaScript expression that must be truthy |
| platform | string | Assert running on platform (Android, iOS) |
assertNoDefectsWithAI
Use AI vision to check the current screen for visual defects (broken layouts, overlapping elements, truncated text, etc.). Android and iOS only.
- assertNoDefectsWithAI
No parameters — analyzes the current screenshot automatically.
assertWithAI
Use AI vision to verify a natural language assertion against the current screen. Android and iOS only.
# Simple form
- assertWithAI: "The login button is visible and enabled"
# Map form
- assertWithAI:
    assertion: "The submit button is enabled and the form has no errors"
The assertion is a human-readable description of what should be true on screen.
extractTextWithAI
Use AI vision to extract specific text from the current screen and store it in a variable. Android and iOS only.
- extractTextWithAI:
    query: "price"
    variable: product_price
# Use the extracted value
- assertTrue: "${product_price} !== ''"

| Field | Type | Description |
|---|---|---|
| query | string | What to extract (e.g., "price", "phone number", "error message") |
| variable | string | Variable name to store the extracted text |
extendedWaitUntil
Wait until a condition is met.
- extendedWaitUntil:
    visible:
      text: "Ready"
- extendedWaitUntil:
    notVisible:
      id: loading-spinner
App Lifecycle Commands
launchApp
Launch an application.
# Simple
- launchApp
- launchApp: com.example.app
# With options
- launchApp:
    appId: com.example.app
    clearState: true
    stopApp: false
    newSession: true
    permissions:
      camera: allow
      location: deny
    arguments:
      --username: devicelab
      --password: robustest
    environment:
      BASE_URL: "https://api.example.com"
      LOG_LEVEL: "debug"
# Fresh session (Appium driver, useful for real iOS devices)
- launchApp:
    appId: com.example.app
    newSession: true

| Field | Type | Description |
|---|---|---|
| appId | string | App package/bundle ID (defaults to flow's appId) |
| clearState | bool | Clear app data before launch |
| clearKeychain | bool | Clear the iOS keychain before launch |
| stopApp | bool | Stop app before relaunching (default: true) |
| newSession | bool | Create a fresh Appium session (Appium driver only). Useful when clearState fails on real iOS devices. On iOS real devices, clearState is skipped when this is true |
| permissions | map | Set permissions: allow, deny, unset. Defaults to all: allow if not specified. See setPermissions for platform-specific behavior. |
| arguments | map | Launch arguments passed to the app. On iOS, these become ProcessInfo.arguments and can set UserDefaults via --key value syntax. |
| environment | map | Environment variables passed to the app. On iOS, these become ProcessInfo.processInfo.environment. Supports variable substitution (${var}). |
stopApp
Gracefully stop an app.
- stopApp
- stopApp: com.example.app
killApp
Forcefully kill an app.
- killApp
- killApp: com.example.app
clearState
Clear app data/state.
- clearState
- clearState: com.example.app
clearKeychain
Clear the device keychain. iOS only.
- clearKeychain
Also available as a launchApp option:
- launchApp:
    appId: com.example.app
    clearKeychain: true
setPermissions
Set app permissions.
- setPermissions:
    appId: com.example.app
    permissions:
      camera: allow
      location: deny
      contacts: allow
      notifications: allow
Permission shortcuts: location, camera, contacts, phone, microphone, bluetooth, storage, notifications, medialibrary, calendar, sms, all
Values: allow, deny, unset
iOS Simulator — permissions are handled via simctl privacy (silent, no dialogs):
- allow (default): pre-grants permission, app gets .authorized
- deny: revokes permission, app gets .denied
- unset: hands off, don't touch permissions at all
iOS Real Device — permissions are handled via WDA's defaultAlertAction:
- allow (default): auto-accepts all permission dialogs
- deny: auto-dismisses all permission dialogs
- Mixed permissions: not supported, user handles dialogs manually
If no permissions field is specified, defaults to all: allow.
# Grant everything (default behavior)
- launchApp:
    permissions:
      all: allow
# Deny everything
- launchApp:
    permissions:
      all: deny
# Don't touch permissions
- launchApp:
    permissions:
      all: unset
# Deny camera, grant rest
- launchApp:
    permissions:
      camera: deny
      notifications: allow
Alert Handling
acceptAlert
Accepts (taps Allow/OK) the current system alert dialog. Polls for up to 5 seconds (default) waiting for an alert to appear. If no alert appears, succeeds silently.
Useful for handling iOS permission dialogs when using permissions: { all: unset }.
- acceptAlert # wait up to 5s for alert, tap Allow
- acceptAlert:
    timeout: 3000 # custom timeout in ms
dismissAlert
Dismisses (taps Don't Allow/Cancel) the current system alert dialog. Same polling behavior as acceptAlert.
- dismissAlert # wait up to 5s for alert, tap Don't Allow
- dismissAlert:
    timeout: 3000 # custom timeout in ms
Note: These commands use WDA's alert API, which can interact with system-level dialogs (SpringBoard), unlike tapOn, which only sees the app's UI elements.
Device Control Commands
setLocation
Set the device's GPS location. Android and Appium only (not supported on iOS).
- setLocation:
    latitude: "37.7749"
    longitude: "-122.4194"
Values are strings to support variable substitution (e.g., "${LAT}").
setOrientation
Set device orientation.
- setOrientation: PORTRAIT
- setOrientation: LANDSCAPE
setAirplaneMode
Enable or disable airplane mode.
- Android: Uses cmd connectivity airplane-mode (Android 11+) or settings put (older versions)
- iOS: Automates the Settings app to toggle airplane mode on real devices
- setAirplaneMode:
    enabled: true
# Scalar syntax
- setAirplaneMode: enabled
- setAirplaneMode: disabled
toggleAirplaneMode
Toggle airplane mode on/off.
- toggleAirplaneMode
travel
Simulate location travel along a path. Android only.
- travel:
    points:
      - "37.7749, -122.4194"
      - "37.8044, -122.2712"
    speed: 50

| Field | Type | Description |
|---|---|---|
| points | string[] | Waypoints as "latitude, longitude" |
| speed | float | Speed in km/h |
openLink
Open a URL or deep link.
- openLink: "https://example.com"
- openLink: "myapp://page/detail"
- openLink:
    link: "myapp://settings"
    autoVerify: true
    browser: false

| Field | Type | Description |
|---|---|---|
| link | string | URL or deep link |
| autoVerify | bool | Auto-verify app link association |
| browser | bool | Force open in browser |
openBrowser
Open a URL in the device's browser.
- openBrowser: "https://example.com"
- openBrowser:
    url: "https://example.com/page"
Flow Control Commands
repeat
Repeat steps a fixed number of times or while a condition holds.
# Fixed count
- repeat:
    times: "3"
    commands:
      - tapOn: "Next"
      - swipe: LEFT
# While condition
- repeat:
    while:
      visible:
        text: "Load More"
    commands:
      - tapOn: "Load More"
      - scroll: DOWN

| Field | Type | Description |
|---|---|---|
| times | string | Number of iterations (string for variable support, e.g., "${COUNT}") |
| while | condition | Continue while condition is true |
| commands | step[] | Steps to repeat |
Condition fields: visible (selector), notVisible (selector), scriptCondition (string), platform (string).
The while condition supports a timeout field (ms to wait before declaring the condition is false). Default: 7 seconds.
- repeat:
    while:
      visible: "Delete"
      timeout: 2000
    commands:
      - tapOn: "Delete"
retry
Retry steps on failure.
# Inline commands
- retry:
    maxRetries: "3"
    commands:
      - tapOn: "Submit"
      - assertVisible: "Success"
# External flow file
- retry:
    maxRetries: "2"
    file: "submit-flow.yaml"
    env:
      MODE: test

| Field | Type | Description |
|---|---|---|
| maxRetries | string | Max retry attempts (string for variable support) |
| commands | step[] | Steps to retry |
| file | string | External flow file to retry |
| env | map | Environment variables for the retried flow |
runFlow
Run another flow file or inline commands.
# External file
- runFlow: "login.yaml"
# With options
- runFlow:
    file: "setup.yaml"
    env:
      MODE: test
    when:
      visible:
        text: "Not Logged In"
# Inline commands
- runFlow:
    commands:
      - tapOn: "Accept"
      - tapOn: "Continue"
    when:
      visible:
        text: "Terms and Conditions"
# With timeout: cancels mid-operation if the sub-flow takes too long
- runFlow:
    file: "slow-operation.yaml"
    timeout: 5000

| Field | Type | Description |
|---|---|---|
| file | string | Path to external flow file |
| commands | step[] | Inline steps (alternative to file) |
| when | condition | Only run if condition is true |
| env | map | Environment variables for the sub-flow |
| timeout | int | Cancel the sub-flow if it exceeds this many milliseconds. Element-finding polling loops inside the sub-flow are interrupted on expiry, and the failure is classified as TIMEOUT in reports. |
runScript / evalScript
Execute JavaScript.
# Inline script
- runScript: |
    function calculate() {
      return 42;
    }
    output.result = calculate();
# Script file
- runScript: "scripts/setup.js"
# With environment
- runScript:
    file: "scripts/setup.js"
    env:
      API_KEY: abc123
# Eval and store output
- evalScript: "output.total = ${price} * ${quantity}"
evalWebViewScript
Execute a JavaScript expression inside a mobile WebView and store the result. Android only (DeviceLab driver). Uses Chrome DevTools Protocol (CDP) to evaluate the expression in the WebView context, rather than the native app context.
# Evaluate an expression in the WebView
- evalWebViewScript: "document.title"
# Store the result in a variable
- evalWebViewScript:
    script: "document.querySelector('.price').textContent"
    env:
      COUPON: "${coupon_code}"
runWebViewScript
Run a JavaScript file inside a mobile WebView. Android only (DeviceLab driver). Similar to runScript but targets the WebView context via CDP instead of the native app.
# Run a script file in the WebView
- runWebViewScript: "scripts/fill-form.js"
# With environment variables
- runWebViewScript:
    file: "scripts/validate-page.js"
    env:
      EXPECTED_TITLE: "Checkout"
Media Commands
takeScreenshot
Take a screenshot and save to a file.
- takeScreenshot: "login-screen.png"
addMedia
Add media files (images, videos) to the device gallery. Android only.
- addMedia:
    files:
      - "test-data/photo.png"
      - "test-data/video.mp4"
File paths are relative to the flow file directory. Uses Android's media scanner to make files visible in the gallery.
startRecording / stopRecording
Record video of the device screen. Android only.
- startRecording: "test-run.mp4"
# ... test steps ...
- stopRecording: "test-run.mp4"
Other Commands
pressKey
Press a device key.
- pressKey: ENTER
- pressKey: HOME
- pressKey: BACK
waitForAnimationToEnd
Wait for UI animations to complete.
- waitForAnimationToEnd
defineVariables
Define variables for use in subsequent steps.
- defineVariables:
    USERNAME: testuser
    PASSWORD: secret123
    MAX_RETRIES: "5"
Variables are available via ${VARIABLE_NAME} syntax in subsequent steps.
JavaScript APIs
Scripts executed via runScript and evalScript have access to these built-in APIs:
| API | Description |
|---|---|
| console.log(), console.warn(), console.error() | Logging (output captured in reports) |
| setTimeout(), setInterval(), clearTimeout(), clearInterval() | Timers |
| http.get(), http.post(), http.put(), http.delete(), http.request() | HTTP client (default 30s timeout) |
| JSON.parse(), JSON.stringify() | Standard JSON support |
| maestro.platform | Current platform string ("android" or "ios") |
| maestro.copiedText | Last copied text from copyTextFrom |
| output.* | Output variables (accessible by subsequent flow steps) |
HTTP Module
var response = http.get("https://api.example.com/users");
if (response.ok) {
  output.userId = response.json.id;
}
http.post("https://api.example.com/data", {
  headers: {"Authorization": "Bearer " + token},
  body: {key: "value"}, // auto-converted to JSON
  timeout: 10000
});
Response object includes: status, body, headers, ok (2xx-3xx), json (auto-parsed if valid).
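To make the ok and json fields concrete, here is a small sketch of how they could be derived from a status code and a raw body. describeResponse is a hypothetical helper written for illustration; it is not the runner's http module, and leaving json undefined for a non-JSON body is an assumption.

```javascript
// Hypothetical reconstruction of the response shape described above.
function describeResponse(status, body, headers = {}) {
  let json; // assumption: stays undefined when the body is not valid JSON
  try {
    json = JSON.parse(body);
  } catch (err) {
    json = undefined;
  }
  return {
    status,
    body,
    headers,
    ok: status >= 200 && status < 400, // 2xx-3xx
    json,
  };
}

const found = describeResponse(200, '{"id": 7}');
const missing = describeResponse(404, "Not Found");
console.log(found.ok, found.json.id);   // true 7
console.log(missing.ok, missing.json);  // false undefined
```

Checking response.ok before reading response.json, as the example above does, avoids reading fields from an error payload.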
Variable Substitution
Variables are expanded in all string fields in your flow steps.
JavaScript Expressions (${...})
${expression} syntax supports full JavaScript expressions:
- inputText: "${Date.now()}"
- assertTrue: "${count > 0}"
Dollar Variables ($VAR)
Simple $VAR_NAME syntax is also supported:
- inputText: "$USERNAME"
Variable defaults
When expanding ${VAR} in flow YAML, you can provide a fallback using the
|| or ?? operators (matching Maestro's GraalJS syntax):
# Fall back to a literal default if the variable is undefined or empty
appId: ${APP_ID || "com.example.app"}
# Nullish coalescing — only fall back on undefined/null, keep empty strings
env:
  USERNAME: ${CUSTOM_USER ?? "devicelab"}
# Chain multiple fallbacks
env:
  PASSWORD: ${CUSTOM_PASS || ALT_PASS || "default-pass"}
Undefined variables resolve to undefined (no ReferenceError), letting
|| / ?? short-circuit cleanly. Useful for flows that run in multiple
environments with optional overrides.
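The difference between the two operators is standard JavaScript behavior and can be checked in isolation. The snippet below uses plain local variables (unset and empty are illustrative names) rather than flow variables.

```javascript
// Plain JavaScript semantics of the two fallback operators.
const unset = undefined;
const empty = "";

const fallbacks = {
  orUnset: unset || "com.example.app",        // undefined is falsy, default wins
  orEmpty: empty || "com.example.app",        // "" is falsy, default wins
  nullishUnset: unset ?? "devicelab",         // undefined is nullish, default wins
  nullishEmpty: empty ?? "devicelab",         // "" is not nullish, "" is kept
  chained: unset || unset || "default-pass",  // first truthy value wins
};
console.log(fallbacks);
```

In short, || replaces any falsy value (including an empty string), while ?? replaces only undefined and null.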
Variable Priority
Variable sources are merged in this order (later overrides earlier):
- System environment variables (uppercase, 3+ characters)
- Workspace config env
- CLI -e flags
- Flow-level env
- defineVariables steps during execution
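The merge order can be sketched with object spreads, where each later source overrides the earlier ones. resolveVariables is a hypothetical function, and the regular expression used for "uppercase, 3+ characters" is an assumption about what that filter means, not the runner's actual rule.

```javascript
// Hypothetical merge of variable sources, lowest priority first.
function resolveVariables({
  systemEnv = {},
  workspaceEnv = {},
  cliEnv = {},
  flowEnv = {},
  defined = {},
}) {
  // Assumption: "uppercase, 3+ characters" means names such as APP_ID or HOME.
  const eligible = Object.fromEntries(
    Object.entries(systemEnv).filter(([name]) => /^[A-Z][A-Z0-9_]{2,}$/.test(name))
  );
  // Later spreads override earlier ones.
  return { ...eligible, ...workspaceEnv, ...cliEnv, ...flowEnv, ...defined };
}

const vars = resolveVariables({
  systemEnv: { APP_ID: "com.system.app", MODE: "system", ci: "true" },
  workspaceEnv: { MODE: "workspace" },
  cliEnv: { APP_ID: "com.cli.app" },
  flowEnv: { MODE: "flow" },
});
console.log(vars.APP_ID); // com.cli.app
console.log(vars.MODE);   // flow
console.log("ci" in vars); // false
```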
Note: Variable substitution works in all flow config fields, including appId. For example:
appId: ${APP_ID}
---
- launchApp
maestro-runner -e APP_ID=com.example.app flow.yaml
Selectors
Selectors identify elements on screen. They can be a simple string (text match) or an object with multiple criteria.
Primary Selectors
| Field | Type | Description |
|---|---|---|
| text | string | Match by visible text (contains match, case-insensitive). Supports regex. |
| id | string | Match by resource ID (Android) or accessibility ID (iOS) |
| css | string | CSS selector (for web views) |

# Text match (shorthand)
- tapOn: "Login"
# Explicit text
- tapOn:
    text: "Login"
# By ID
- tapOn:
    id: submit-btn
# By CSS (web)
- tapOn:
    css: "#login-form button[type=submit]"
Size Matching
| Field | Type | Description |
|---|---|---|
| width | int | Match by element width |
| height | int | Match by element height |
| tolerance | int | Allowed size deviation in pixels |
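One way to read the three size fields together is sketched below. matchesSize is a hypothetical helper; applying the same tolerance to both width and height, and treating an omitted dimension as "any", are assumptions rather than documented behavior.

```javascript
// Hypothetical size check: an element matches when each requested dimension
// is within `tolerance` pixels of the element's actual dimension.
function matchesSize(element, { width, height, tolerance = 0 }) {
  const widthOk = width === undefined || Math.abs(element.width - width) <= tolerance;
  const heightOk = height === undefined || Math.abs(element.height - height) <= tolerance;
  return widthOk && heightOk;
}

const button = { width: 102, height: 48 };
console.log(matchesSize(button, { width: 100, height: 50, tolerance: 2 })); // true
console.log(matchesSize(button, { width: 100, height: 50, tolerance: 1 })); // false
console.log(matchesSize(button, { width: 102 }));                           // true
```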
State Filters
| Field | Type | Description |
|---|---|---|
| enabled | bool | Filter by enabled state |
| selected | bool | Filter by selected state |
| checked | bool | Filter by checked state |
| focused | bool | Filter by focused state |

- tapOn:
    text: "Submit"
    enabled: true
- assertVisible:
    id: checkbox
    checked: true
Index
When multiple elements match, select by zero-based index.
- tapOn:
    text: "Item"
    index: "2" # Taps the third matching element (0-indexed)
The value is a string to support variable substitution ("${IDX}").
Traits
Filter by element traits (comma-separated).
- tapOn:
    text: "Title"
    traits: "heading"
Relative Selectors
Find elements based on their position relative to other elements.
| Field | Type | Description |
|---|---|---|
| below | selector | Element is below this anchor |
| above | selector | Element is above this anchor |
| leftOf | selector | Element is left of this anchor |
| rightOf | selector | Element is right of this anchor |
| childOf | selector | Element is a child of this anchor |
| containsChild | selector | Element contains this child |
| containsDescendants | selector[] | Element contains all these descendants |
| insideOf | selector | Element's center point is inside this anchor |

# Tap the input field below the "Email" label
- tapOn:
    id: input-field
    below:
      text: "Email"
# Tap element to the right of a label
- tapOn:
    id: toggle
    rightOf:
      text: "Dark Mode"
# Tap inside a specific dialog
- tapOn:
    text: "OK"
    insideOf:
      id: confirmation-dialog
# Element that contains specific children
- tapOn:
    id: list-item
    containsChild:
      text: "Product A"
# Nested relative selectors
- tapOn:
    text: "Delete"
    below:
      text: "Username"
      rightOf:
        text: "Settings"
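The insideOf rule (the element's center point lies inside the anchor) is a simple geometric test. The sketch below assumes bounds are given as { x, y, width, height } in screen pixels; that representation and the name isInsideOf are illustrative, not taken from the runner.

```javascript
// Hypothetical geometry check for insideOf: compare the element's center
// point against the anchor's bounding box.
function isInsideOf(element, anchor) {
  const cx = element.x + element.width / 2;
  const cy = element.y + element.height / 2;
  return (
    cx >= anchor.x &&
    cx <= anchor.x + anchor.width &&
    cy >= anchor.y &&
    cy <= anchor.y + anchor.height
  );
}

const dialog = { x: 100, y: 400, width: 800, height: 600 };
const okButton = { x: 700, y: 900, width: 300, height: 80 };     // center (850, 940)
const strayButton = { x: 850, y: 900, width: 300, height: 80 };  // center (1000, 940)
console.log(isInsideOf(okButton, dialog));    // true
console.log(isInsideOf(strayButton, dialog)); // false
```

Note that okButton overflows the dialog's right edge but still matches, because only its center point is tested.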
Element Shorthand
In commands like scrollUntilVisible, the element field is an alias for text:
# These are equivalent
- scrollUntilVisible:
    element: "Footer"
    direction: DOWN
- scrollUntilVisible:
    element:
      text: "Footer"
    direction: DOWN
Regex in ID Selectors
The id selector supports regex patterns across all drivers. Auto-detected by the presence of regex metacharacters:
# Wildcard match
- tapOn:
    id: "username-.*"
# Alternation
- assertVisible:
    id: "(username|email)-input"
# Suffix anchor
- tapOn:
    id: "login.*-button$"
Regex Support
Text selectors automatically detect regex patterns:
# Literal text match
- tapOn: "Login"
# Regex match (auto-detected by quantifiers like *, +, ?, [], etc.)
- tapOn: "Item [0-9]+"
- assertVisible: "Price: \\$[0-9]+\\.99"
The following characters trigger regex mode: *, +, ?, [, ], {, }, |, (, ), ^ (at start), $ (at end). A lone . without a following quantifier is treated as literal (so mastodon.social matches literally).
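The trigger rule can be approximated with a short check. This is a sketch of the rule as stated above, not the runner's detection code: looksLikeRegex is a hypothetical name, and escaped characters (for example a trailing \$) are not given special treatment here.

```javascript
// Hypothetical detector: returns true when the text would be treated as regex
// under the rule described above.
function looksLikeRegex(text) {
  // Any of * + ? [ ] { } | ( ) anywhere in the text triggers regex mode.
  if (/[*+?\[\]{}|()]/.test(text)) return true;
  // ^ only counts at the start, $ only at the end.
  if (text.startsWith("^")) return true;
  if (text.endsWith("$")) return true;
  // A lone "." never triggers regex mode on its own.
  return false;
}

console.log(looksLikeRegex("Login"));            // false
console.log(looksLikeRegex("Item [0-9]+"));      // true
console.log(looksLikeRegex("mastodon.social"));  // false
console.log(looksLikeRegex("login.*-button$"));  // true
console.log(looksLikeRegex("$5 off"));           // false
```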