Flow Commands

Flow File Structure

A flow file is a YAML document. An optional config section comes first, separated from the steps by a --- line:

# Config section (optional)
appId: com.example.app
name: Login Flow
tags:
  - smoke
  - login
---
# Steps section
- launchApp
- tapOn: "Login"
- inputText: "user@example.com"
- assertVisible: "Welcome"

If no config section is needed, omit the --- separator and write steps directly.


Flow Config Fields

| Field | Type | Description |
|---|---|---|
| appId | string | App package ID (Android) or bundle ID (iOS) |
| url | string | Web app URL (alternative to appId for web testing) |
| name | string | Human-readable flow name (shown in reports) |
| tags | string[] | Tags for filtering (--include-tags, --exclude-tags) |
| env | map | Environment variables available to the flow |
| timeout | int | Overall flow timeout in milliseconds |
| commandTimeout | int | Default timeout for all commands in ms (overrides driver default) |
| waitForIdleTimeout | int | Wait for device idle in ms (0 = disabled, omit = use global setting) |
| typingFrequency | int | WDA typing speed in keys/sec. Overrides --typing-frequency CLI flag. 0 = use WDA default (60) |
| onFlowStart | step[] | Lifecycle hook: steps to run before the main commands |
| onFlowComplete | step[] | Lifecycle hook: steps to run after the main commands (runs even if the flow fails) |

Example:

appId: com.example.app
name: Checkout Flow
tags:
  - smoke
  - checkout
env:
  BASE_URL: https://staging.example.com
  TEST_USER: testuser
timeout: 120000
commandTimeout: 15000
waitForIdleTimeout: 3000
onFlowStart:
  - launchApp:
      appId: com.example.app
      clearState: true
onFlowComplete:
  - takeScreenshot: final-state.png
---
- tapOn: "Shop"
- tapOn: "Add to Cart"
- tapOn: "Checkout"

Common Step Properties

All steps support these optional fields:

| Field | Type | Description |
|---|---|---|
| optional | bool | If true, step failure won't fail the flow |
| label | string | Human-readable label for logging and reports |
| timeout | int | Step-specific timeout in milliseconds |

- tapOn:
    text: "Maybe Visible"
    optional: true
    label: "Dismiss optional popup"
    timeout: 3000

Tap & Gesture Commands

tapOn

Tap on an element.

# Simple text match
- tapOn: "Login"

# By ID
- tapOn:
    id: submit-btn

# With options
- tapOn:
    text: "Submit"
    longPress: true
    repeat: 2
    delay: 100
    retryTapIfNoChange: true
    waitUntilVisible: true
    waitToSettleTimeoutMs: 500

# Tap at relative point within element
- tapOn:
    id: slider
    point: "80%, 50%"

# Tap at screen percentage (no selector)
- tapOn:
    point: "50%, 90%"

# Tap at absolute pixel coordinates
- tapOn:
    point: "286, 819"

Absolute pixel coordinates are detected automatically — if the value contains %, it's treated as a percentage; otherwise as absolute pixels. Coordinates are validated against the device screen dimensions (e.g., "2000, 3000" on a 1080x2400 screen is rejected).
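The detection rule can be pictured with a small JavaScript sketch. This is an illustration only, not maestro-runner's implementation: `resolvePoint` and its `screen` argument are hypothetical names, and rounding percentages to whole pixels is an assumption.

```javascript
// Hypothetical helper illustrating the point-detection rule; not maestro-runner's code.
function resolvePoint(point, screen) {
  const parts = point.split(",").map((p) => p.trim());
  if (parts.length !== 2) {
    throw new Error(`Expected "x, y" but got "${point}"`);
  }
  // A "%" anywhere in the value switches to percentage mode.
  const isPercent = point.includes("%");
  const [x, y] = parts.map((part, i) => {
    const value = parseFloat(part); // parseFloat ignores a trailing "%"
    if (Number.isNaN(value)) {
      throw new Error(`Invalid coordinate "${part}"`);
    }
    const size = i === 0 ? screen.width : screen.height;
    // Assumption: percentages are scaled to pixels and rounded.
    return isPercent ? Math.round((value / 100) * size) : value;
  });
  // Reject points that fall outside the screen.
  if (x < 0 || y < 0 || x > screen.width || y > screen.height) {
    throw new Error(`Point "${point}" is outside ${screen.width}x${screen.height}`);
  }
  return { x, y };
}
```

Under these assumptions, "50%, 90%" on a 1080x2400 screen resolves to pixel (540, 2160), while "2000, 3000" throws.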

| Field | Type | Description |
|---|---|---|
| (selector) | | Element to tap (see Selectors) |
| longPress | bool | Perform a long press instead of tap |
| repeat | int | Number of taps |
| delay | int | Delay between repeated taps (ms) |
| point | string | Tap at "x%, y%" (percentage) or "x, y" (absolute pixels) |
| retryTapIfNoChange | bool | Retry tap if screen didn't change |
| waitUntilVisible | bool | Wait for element to become visible before tapping |
| waitToSettleTimeoutMs | int | Wait for UI to settle after tap (ms) |

doubleTapOn

Double-tap on an element.

- doubleTapOn: "Item"
- doubleTapOn:
    id: zoomable-image

longPressOn

Long-press on an element.

- longPressOn: "Delete"
- longPressOn:
    id: context-menu-target

tapOnPoint

Tap at specific coordinates.

# Absolute coordinates
- tapOnPoint:
    x: 100
    y: 200

# Percentage-based
- tapOnPoint:
    point: "50%, 90%"

# With long press
- tapOnPoint:
    x: 100
    y: 200
    longPress: true

swipe

Swipe gesture.

# Simple direction
- swipe: UP
- swipe: DOWN
- swipe: LEFT
- swipe: RIGHT

# With options
- swipe:
    direction: DOWN
    speed: 50
    duration: 300

# Custom start/end (percentage)
- swipe:
    start: "50%, 80%"
    end: "50%, 20%"

# Custom start/end (absolute)
- swipe:
    startX: 100
    startY: 500
    endX: 100
    endY: 200

| Field | Type | Description |
|---|---|---|
| direction | string | UP, DOWN, LEFT, RIGHT |
| start | string | Start point "x%, y%" |
| end | string | End point "x%, y%" |
| startX/startY | int | Absolute start coordinates |
| endX/endY | int | Absolute end coordinates |
| duration | int | Swipe duration in ms |
| speed | int | Speed 0-100 |

Note: Swipe coordinates now match Maestro behavior across all drivers. On iOS (WDA), the default swipe duration is 100ms.

scroll

Scroll the screen.

- scroll: DOWN
- scroll: UP

scrollUntilVisible

Scroll until an element becomes visible.

- scrollUntilVisible:
    element: "End of List"
    direction: DOWN

# With options
- scrollUntilVisible:
    element:
      id: footer
    direction: DOWN
    maxScrolls: 20
    speed: 40
    visibilityPercentage: 80
    centerElement: true

| Field | Type | Description |
|---|---|---|
| element | selector | Element to scroll to (text string or selector object) |
| direction | string | Scroll direction: UP, DOWN, LEFT, RIGHT |
| maxScrolls | int | Maximum scroll attempts |
| speed | int | Scroll speed 0-100 |
| visibilityPercentage | int | Percentage of element that must be visible |
| centerElement | bool | Scroll element to center of screen |
| timeout | int | Overall timeout in milliseconds for the scroll operation |

back

Press the back button.

- back

hideKeyboard

Hide the on-screen keyboard.

- hideKeyboard

Platform behavior:

  • Android: Presses the BACK key
  • iOS: Sends the RETURN key to dismiss

Tip: On Android, maestro-runner automatically detects when the keyboard covers a target element after inputText or inputRandom and suggests adding this command.


Text Input Commands

inputText

Type text into a field.

# Into the currently focused field
- inputText: "user@example.com"

# Into a specific element
- inputText:
    text: "username"
    id: email-field

Tip: On Android, if the soft keyboard covers the next element you want to interact with, maestro-runner detects this and suggests adding - hideKeyboard. It's good practice to add - hideKeyboard after text input if the next step interacts with an element that might be covered:

- inputText: "user@example.com"
- hideKeyboard
- tapOn: "Submit"

inputRandom

Generate and input random data.

- inputRandom: EMAIL
- inputRandom: NUMBER
- inputRandom: PERSON_NAME
- inputRandom: TEXT

# With length
- inputRandom:
    type: NUMBER
    length: 10

Types: EMAIL, NUMBER, PERSON_NAME, TEXT

Shorthands

- inputRandomEmail
- inputRandomNumber
- inputRandomPersonName
- inputRandomText

eraseText

Erase characters from the focused field.

# Erase 5 characters
- eraseText: 5

# Map form
- eraseText:
    characters: 10

If no count is specified, defaults to 50 characters.

copyTextFrom

Copy text from an element to the clipboard.

- copyTextFrom: "Price Label"
- copyTextFrom:
    id: total-amount

pasteText

Paste text from the clipboard.

- pasteText

setClipboard

Set the clipboard to a specific text value.

- setClipboard: "text to paste later"
- setClipboard:
    text: "clipboard content"

Assertion Commands

assertVisible

Assert that an element is visible on screen.

- assertVisible: "Welcome"
- assertVisible:
    id: success-banner

assertNotVisible

Assert that an element is NOT visible.

- assertNotVisible: "Error"
- assertNotVisible:
    id: loading-spinner

assertTrue

Assert a JavaScript condition evaluates to true.

- assertTrue: "1 === 1"
- assertTrue: "${count} > 0"

assertCondition

Assert a complex condition.

- assertCondition:
    visible:
      text: "Success"

- assertCondition:
    notVisible:
      id: error-dialog

- assertCondition:
    scriptCondition: "output.result === true"

- assertCondition:
    platform: Android

| Field | Type | Description |
|---|---|---|
| visible | selector | Assert element is visible |
| notVisible | selector | Assert element is not visible |
| scriptCondition | string | JavaScript expression that must be truthy |
| platform | string | Assert running on platform (Android, iOS) |

assertNoDefectsWithAI

Use AI vision to check the current screen for visual defects (broken layouts, overlapping elements, truncated text, etc.). Android and iOS only.

- assertNoDefectsWithAI

No parameters — analyzes the current screenshot automatically.

assertWithAI

Use AI vision to verify a natural language assertion against the current screen. Android and iOS only.

# Simple form
- assertWithAI: "The login button is visible and enabled"

# Map form
- assertWithAI:
    assertion: "The submit button is enabled and the form has no errors"

The assertion is a human-readable description of what should be true on screen.

extractTextWithAI

Use AI vision to extract specific text from the current screen and store it in a variable. Android and iOS only.

- extractTextWithAI:
    query: "price"
    variable: product_price

# Use the extracted value
- assertTrue: "${product_price} !== ''"

| Field | Type | Description |
|---|---|---|
| query | string | What to extract (e.g., "price", "phone number", "error message") |
| variable | string | Variable name to store the extracted text |

extendedWaitUntil

Wait until a condition is met.

- extendedWaitUntil:
    visible:
      text: "Ready"

- extendedWaitUntil:
    notVisible:
      id: loading-spinner

App Lifecycle Commands

launchApp

Launch an application.

# Simple
- launchApp
- launchApp: com.example.app

# With options
- launchApp:
    appId: com.example.app
    clearState: true
    stopApp: false
    newSession: true
    permissions:
      camera: allow
      location: deny
    arguments:
      --username: devicelab
      --password: robustest
    environment:
      BASE_URL: "https://api.example.com"
      LOG_LEVEL: "debug"

# Fresh session (Appium driver, useful for real iOS devices)
- launchApp:
    appId: com.example.app
    newSession: true

| Field | Type | Description |
|---|---|---|
| appId | string | App package/bundle ID (defaults to flow's appId) |
| clearState | bool | Clear app data before launch |
| clearKeychain | bool | Clear the iOS keychain before launch |
| stopApp | bool | Stop app before relaunching (default: true) |
| newSession | bool | Create a fresh Appium session (Appium driver only). Useful when clearState fails on real iOS devices. On iOS real devices, clearState is skipped when this is true |
| permissions | map | Set permissions: allow, deny, unset. Defaults to all: allow if not specified. See setPermissions for platform-specific behavior. |
| arguments | map | Launch arguments passed to the app. On iOS, these become ProcessInfo.arguments and can set UserDefaults via --key value syntax. |
| environment | map | Environment variables passed to the app. On iOS, these become ProcessInfo.processInfo.environment. Supports variable substitution (${var}). |

stopApp

Gracefully stop an app.

- stopApp
- stopApp: com.example.app

killApp

Forcefully kill an app.

- killApp
- killApp: com.example.app

clearState

Clear app data/state.

- clearState
- clearState: com.example.app

clearKeychain

Clear the device keychain. iOS only.

- clearKeychain

Also available as a launchApp option:

- launchApp:
    appId: com.example.app
    clearKeychain: true

setPermissions

Set app permissions.

- setPermissions:
    appId: com.example.app
    permissions:
      camera: allow
      location: deny
      contacts: allow
      notifications: allow

Permission shortcuts: location, camera, contacts, phone, microphone, bluetooth, storage, notifications, medialibrary, calendar, sms, all

Values: allow, deny, unset

iOS Simulator — permissions are handled via simctl privacy (silent, no dialogs):

  • allow (default) — pre-grants permission, app gets .authorized
  • deny — revokes permission, app gets .denied
  • unset — hands off, don't touch permissions at all

iOS Real Device — permissions are handled via WDA's defaultAlertAction:

  • allow (default) — auto-accepts all permission dialogs
  • deny — auto-dismisses all permission dialogs
  • Mixed permissions — not supported, user handles dialogs manually

If no permissions field is specified, defaults to all: allow.

# Grant everything (default behavior)
- launchApp:
    permissions:
      all: allow

# Deny everything
- launchApp:
    permissions:
      all: deny

# Don't touch permissions
- launchApp:
    permissions:
      all: unset

# Deny camera, grant rest
- launchApp:
    permissions:
      camera: deny
      notifications: allow

Alert Handling

acceptAlert

Accepts (taps Allow/OK) the current system alert dialog. Polls for up to 5 seconds (default) waiting for an alert to appear. If no alert appears, succeeds silently.

Useful for handling iOS permission dialogs when using permissions: { all: unset }.

- acceptAlert                    # wait up to 5s for alert, tap Allow
- acceptAlert:
    timeout: 3000               # custom timeout in ms

dismissAlert

Dismisses (taps Don't Allow/Cancel) the current system alert dialog. Same polling behavior as acceptAlert.

- dismissAlert                   # wait up to 5s for alert, tap Don't Allow
- dismissAlert:
    timeout: 3000               # custom timeout in ms

Note: These commands use WDA's alert API, which can interact with system-level dialogs (SpringBoard), unlike tapOn, which only sees the app's UI elements.


Device Control Commands

setLocation

Set the device's GPS location. Android and Appium only (not supported on iOS).

- setLocation:
    latitude: "37.7749"
    longitude: "-122.4194"

Values are strings to support variable substitution (e.g., "${LAT}").

setOrientation

Set device orientation.

- setOrientation: PORTRAIT
- setOrientation: LANDSCAPE

setAirplaneMode

Enable or disable airplane mode.

  • Android: Uses cmd connectivity airplane-mode (Android 11+) or settings put (older versions)
  • iOS: Automates the Settings app to toggle airplane mode on real devices

- setAirplaneMode:
    enabled: true

# Scalar syntax
- setAirplaneMode: enabled
- setAirplaneMode: disabled

toggleAirplaneMode

Toggle airplane mode on/off.

- toggleAirplaneMode

travel

Simulate location travel along a path. Android only.

- travel:
    points:
      - "37.7749, -122.4194"
      - "37.8044, -122.2712"
    speed: 50

| Field | Type | Description |
|---|---|---|
| points | string[] | Waypoints as "latitude, longitude" |
| speed | float | Speed in km/h |

openLink

Open a URL or deep link.

- openLink: "https://example.com"
- openLink: "myapp://page/detail"

- openLink:
    link: "myapp://settings"
    autoVerify: true
    browser: false

| Field | Type | Description |
|---|---|---|
| link | string | URL or deep link |
| autoVerify | bool | Auto-verify app link association |
| browser | bool | Force open in browser |

openBrowser

Open a URL in the device's browser.

- openBrowser: "https://example.com"
- openBrowser:
    url: "https://example.com/page"

Flow Control Commands

repeat

Repeat steps a fixed number of times or while a condition holds.

# Fixed count
- repeat:
    times: "3"
    commands:
      - tapOn: "Next"
      - swipe: LEFT

# While condition
- repeat:
    while:
      visible:
        text: "Load More"
    commands:
      - tapOn: "Load More"
      - scroll: DOWN

| Field | Type | Description |
|---|---|---|
| times | string | Number of iterations (string for variable support, e.g., "${COUNT}") |
| while | condition | Continue while condition is true |
| commands | step[] | Steps to repeat |

Condition fields: visible (selector), notVisible (selector), scriptCondition (string), platform (string).

The while condition supports a timeout field (ms to wait before declaring the condition is false). Default: 7 seconds.

- repeat:
    while:
      visible: "Delete"
      timeout: 2000
    commands:
      - tapOn: "Delete"

retry

Retry steps on failure.

# Inline commands
- retry:
    maxRetries: "3"
    commands:
      - tapOn: "Submit"
      - assertVisible: "Success"

# External flow file
- retry:
    maxRetries: "2"
    file: "submit-flow.yaml"
    env:
      MODE: test

| Field | Type | Description |
|---|---|---|
| maxRetries | string | Max retry attempts (string for variable support) |
| commands | step[] | Steps to retry |
| file | string | External flow file to retry |
| env | map | Environment variables for the retried flow |

runFlow

Run another flow file or inline commands.

# External file
- runFlow: "login.yaml"

# With options
- runFlow:
    file: "setup.yaml"
    env:
      MODE: test
    when:
      visible:
        text: "Not Logged In"

# Inline commands
- runFlow:
    commands:
      - tapOn: "Accept"
      - tapOn: "Continue"
    when:
      visible:
        text: "Terms and Conditions"

# With timeout — cancels mid-operation if the sub-flow takes too long
- runFlow:
    file: "slow-operation.yaml"
    timeout: 5000

| Field | Type | Description |
|---|---|---|
| file | string | Path to external flow file |
| commands | step[] | Inline steps (alternative to file) |
| when | condition | Only run if condition is true |
| env | map | Environment variables for the sub-flow |
| timeout | int | Cancel the sub-flow if it exceeds this many milliseconds. Element-finding polling loops inside the sub-flow are interrupted on expiry, and the failure is classified as TIMEOUT in reports. |

runScript / evalScript

Execute JavaScript.

# Inline script
- runScript: |
    function calculate() {
      return 42;
    }
    output.result = calculate();

# Script file
- runScript: "scripts/setup.js"

# With environment
- runScript:
    file: "scripts/setup.js"
    env:
      API_KEY: abc123

# Eval and store output
- evalScript: "output.total = ${price} * ${quantity}"

evalWebViewScript

Execute a JavaScript expression inside a mobile WebView and store the result. Android only (DeviceLab driver). Uses Chrome DevTools Protocol (CDP) to evaluate the expression in the WebView context, rather than the native app context.

# Evaluate an expression in the WebView
- evalWebViewScript: "document.title"

# Map form, with environment variables
- evalWebViewScript:
    script: "document.querySelector('.price').textContent"
    env:
      COUPON: "${coupon_code}"

runWebViewScript

Run a JavaScript file inside a mobile WebView. Android only (DeviceLab driver). Similar to runScript but targets the WebView context via CDP instead of the native app.

# Run a script file in the WebView
- runWebViewScript: "scripts/fill-form.js"

# With environment variables
- runWebViewScript:
    file: "scripts/validate-page.js"
    env:
      EXPECTED_TITLE: "Checkout"

Media Commands

takeScreenshot

Take a screenshot and save to a file.

- takeScreenshot: "login-screen.png"

addMedia

Add media files (images, videos) to the device gallery. Android only.

- addMedia:
    files:
      - "test-data/photo.png"
      - "test-data/video.mp4"

File paths are relative to the flow file directory. Uses Android's media scanner to make files visible in the gallery.

startRecording / stopRecording

Record video of the device screen. Android only.

- startRecording: "test-run.mp4"
# ... test steps ...
- stopRecording: "test-run.mp4"

Other Commands

pressKey

Press a device key.

- pressKey: ENTER
- pressKey: HOME
- pressKey: BACK

waitForAnimationToEnd

Wait for UI animations to complete.

- waitForAnimationToEnd

defineVariables

Define variables for use in subsequent steps.

- defineVariables:
    USERNAME: testuser
    PASSWORD: secret123
    MAX_RETRIES: "5"

Variables are available via ${VARIABLE_NAME} syntax in subsequent steps.


JavaScript APIs

Scripts executed via runScript and evalScript have access to these built-in APIs:

| API | Description |
|---|---|
| console.log(), console.warn(), console.error() | Logging (output captured in reports) |
| setTimeout(), setInterval(), clearTimeout(), clearInterval() | Timers |
| http.get(), http.post(), http.put(), http.delete(), http.request() | HTTP client (default 30s timeout) |
| JSON.parse(), JSON.stringify() | Standard JSON support |
| maestro.platform | Current platform string ("android" or "ios") |
| maestro.copiedText | Last copied text from copyTextFrom |
| output.* | Output variables (accessible by subsequent flow steps) |

HTTP Module

var response = http.get("https://api.example.com/users");
if (response.ok) {
  output.userId = response.json.id;
}

http.post("https://api.example.com/data", {
  headers: {"Authorization": "Bearer " + token},
  body: {key: "value"},  // auto-converted to JSON
  timeout: 10000
});

Response object includes: status, body, headers, ok (2xx-3xx), json (auto-parsed if valid).


Variable Substitution

Variables are expanded in all string fields in your flow steps.

JavaScript Expressions (${...})

${expression} syntax supports full JavaScript expressions:

- inputText: "${Date.now()}"
- assertTrue: "${count > 0}"

Dollar Variables ($VAR)

Simple $VAR_NAME syntax is also supported:

- inputText: "$USERNAME"

Variable defaults

When expanding ${VAR} in flow YAML, you can provide a fallback using the || or ?? operators (matching Maestro's GraalJS syntax):

# Fall back to a literal default if the variable is undefined or empty
appId: ${APP_ID || "com.example.app"}

# Nullish coalescing — only fall back on undefined/null, keep empty strings
env:
  USERNAME: ${CUSTOM_USER ?? "devicelab"}

# Chain multiple fallbacks
env:
  PASSWORD: ${CUSTOM_PASS || ALT_PASS || "default-pass"}

Undefined variables resolve to undefined (no ReferenceError), letting || / ?? short-circuit cleanly. Useful for flows that run in multiple environments with optional overrides.
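The difference between the two operators is standard JavaScript behavior and can be checked outside a flow. In the sketch below, property lookups on a hypothetical `vars` object stand in for flow variables, and the values are made up for illustration.

```javascript
// Hypothetical variable values; CUSTOM_PASS is deliberately left undefined.
const vars = { APP_ID: "", CUSTOM_USER: "", ALT_PASS: "alt-pass" };

// || falls back on any falsy value, including the empty string.
const appId = vars.APP_ID || "com.example.app";

// ?? falls back only on undefined or null, so the empty string is kept.
const username = vars.CUSTOM_USER ?? "devicelab";

// Chained || returns the first truthy operand.
const password = vars.CUSTOM_PASS || vars.ALT_PASS || "default-pass";
```

Here `appId` becomes "com.example.app", `username` stays an empty string, and `password` becomes "alt-pass". Prefer ?? when an empty string is a meaningful value that should not be replaced.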

Variable Priority

Variable sources are merged in this order (later overrides earlier):

  1. System environment variables (uppercase, 3+ characters)
  2. Workspace config env
  3. CLI -e flags
  4. Flow-level env
  5. defineVariables steps during execution
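The merge order above behaves like successive object spreads in JavaScript, where later sources overwrite keys from earlier ones. This is a sketch of the precedence only; the source objects and their values are hypothetical, and maestro-runner's internal merge may be implemented differently.

```javascript
// Hypothetical variable sources, listed from lowest to highest priority.
const systemEnv = { APP_ID: "com.example.system", MODE: "prod" };
const workspaceEnv = { MODE: "staging", TEST_USER: "workspace-user" };
const cliFlags = { APP_ID: "com.example.cli" };
const flowEnv = { TEST_USER: "flow-user" };
const definedVars = { MODE: "test" };

// Later spreads overwrite keys from earlier ones ("later overrides earlier").
const resolved = {
  ...systemEnv,
  ...workspaceEnv,
  ...cliFlags,
  ...flowEnv,
  ...definedVars,
};
```

With these inputs, `resolved.APP_ID` comes from the CLI flag, `resolved.TEST_USER` from the flow-level env, and `resolved.MODE` from defineVariables.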

Note: Variable substitution works in all flow config fields, including appId. For example:

appId: ${APP_ID}
---
- launchApp

maestro-runner -e APP_ID=com.example.app flow.yaml

Selectors

Selectors identify elements on screen. They can be a simple string (text match) or an object with multiple criteria.

Primary Selectors

| Field | Type | Description |
|---|---|---|
| text | string | Match by visible text (contains match, case-insensitive). Supports regex. |
| id | string | Match by resource ID (Android) or accessibility ID (iOS) |
| css | string | CSS selector (for web views) |

# Text match (shorthand)
- tapOn: "Login"

# Explicit text
- tapOn:
    text: "Login"

# By ID
- tapOn:
    id: submit-btn

# By CSS (web)
- tapOn:
    css: "#login-form button[type=submit]"

Size Matching

| Field | Type | Description |
|---|---|---|
| width | int | Match by element width |
| height | int | Match by element height |
| tolerance | int | Allowed size deviation in pixels |
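
One way to read the tolerance field is as a symmetric pixel allowance around the requested width and height. The JavaScript sketch below works under that assumption; `matchesSize` and the bounds shape are hypothetical names, and the actual matching logic may differ.

```javascript
// Hypothetical size check; assumes tolerance is a symmetric pixel allowance (default 0).
function matchesSize(bounds, { width, height, tolerance = 0 }) {
  // A dimension that is not specified in the selector always matches.
  const within = (actual, expected) =>
    expected === undefined || Math.abs(actual - expected) <= tolerance;
  return within(bounds.width, width) && within(bounds.height, height);
}
```

Under this reading, a 102x48 element matches width: 100, height: 50 with tolerance: 2, but not with tolerance: 1.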

State Filters

| Field | Type | Description |
|---|---|---|
| enabled | bool | Filter by enabled state |
| selected | bool | Filter by selected state |
| checked | bool | Filter by checked state |
| focused | bool | Filter by focused state |

- tapOn:
    text: "Submit"
    enabled: true

- assertVisible:
    id: checkbox
    checked: true

Index

When multiple elements match, select by zero-based index.

- tapOn:
    text: "Item"
    index: "2"    # Taps the third matching element (0-indexed)

The value is a string to support variable substitution ("${IDX}").

Traits

Filter by element traits (comma-separated).

- tapOn:
    text: "Title"
    traits: "heading"

Relative Selectors

Find elements based on their position relative to other elements.

| Field | Type | Description |
|---|---|---|
| below | selector | Element is below this anchor |
| above | selector | Element is above this anchor |
| leftOf | selector | Element is left of this anchor |
| rightOf | selector | Element is right of this anchor |
| childOf | selector | Element is a child of this anchor |
| containsChild | selector | Element contains this child |
| containsDescendants | selector[] | Element contains all these descendants |
| insideOf | selector | Element's center point is inside this anchor |

# Tap the input field below the "Email" label
- tapOn:
    id: input-field
    below:
      text: "Email"

# Tap element to the right of a label
- tapOn:
    id: toggle
    rightOf:
      text: "Dark Mode"

# Tap inside a specific dialog
- tapOn:
    text: "OK"
    insideOf:
      id: confirmation-dialog

# Element that contains specific children
- tapOn:
    id: list-item
    containsChild:
      text: "Product A"

# Nested relative selectors
- tapOn:
    text: "Delete"
    below:
      text: "Username"
      rightOf:
        text: "Settings"
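
The insideOf rule is a center-point test, not full containment: an element that partly overflows its anchor can still match. A minimal JavaScript sketch of that check follows; the `{ x, y, width, height }` bounds shape and the function name are assumptions made for illustration.

```javascript
// Hypothetical bounds shape: { x, y, width, height } in pixels.
function centerInside(element, anchor) {
  // Compute the element's center point.
  const cx = element.x + element.width / 2;
  const cy = element.y + element.height / 2;
  // The center must fall within the anchor's rectangle (edges inclusive).
  return (
    cx >= anchor.x &&
    cx <= anchor.x + anchor.width &&
    cy >= anchor.y &&
    cy <= anchor.y + anchor.height
  );
}
```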

Element Shorthand

In commands like scrollUntilVisible, a plain string in the element field is shorthand for a text selector:

# These are equivalent
- scrollUntilVisible:
    element: "Footer"
    direction: DOWN

- scrollUntilVisible:
    element:
      text: "Footer"
    direction: DOWN

Regex in ID Selectors

The id selector supports regex patterns across all drivers. Regex mode is auto-detected by the presence of regex metacharacters:

# Wildcard match
- tapOn:
    id: "username-.*"

# Alternation
- assertVisible:
    id: "(username|email)-input"

# Suffix anchor
- tapOn:
    id: "login.*-button$"

Regex Support

Text selectors automatically detect regex patterns:

# Literal text match
- tapOn: "Login"

# Regex match (auto-detected by metacharacters such as *, +, ?, [ and ])
- tapOn: "Item [0-9]+"
- assertVisible: "Price: \\$[0-9]+\\.99"

The following characters trigger regex mode: *, +, ?, [, ], {, }, |, (, ), ^ (at start), $ (at end). A lone . without a following quantifier is treated as literal (so mastodon.social matches literally).
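
The trigger rule can be expressed as a short JavaScript check. This is a sketch of the rule as documented here, not the tool's actual detection code, and `looksLikeRegex` is a hypothetical name.

```javascript
// Sketch of the documented trigger rule; the real detection logic may differ.
function looksLikeRegex(text) {
  // These characters trigger regex mode anywhere in the string.
  if (/[*+?\[\]{}|()]/.test(text)) return true;
  // ^ counts only at the start, $ only at the end.
  // A lone "." is not in either list, so it stays literal.
  return text.startsWith("^") || text.endsWith("$");
}
```

Under this rule, "Item [0-9]+" and "login.*-button$" are treated as regex, while "Login" and "mastodon.social" are matched literally.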