Your Appium test passes 8 out of 10 times. Nobody trusts the test results anymore. Your CI pipeline has a “re-run failed tests” step because everyone assumes the first run might be a fluke.

This is the test flakiness problem, and it’s solvable. (If you haven’t seen the numbers, the real cost of flaky tests is staggering – Microsoft pegged it at $1.14M/year.)

After analyzing thousands of flaky Appium tests across mobile teams, we’ve identified the root causes and fixes. This guide covers each one with actual code examples.

What Makes a Test Flaky?

A flaky test passes on some runs and fails on others with no change to the application code or the test itself. Its result is non-deterministic.

Root Causes (Ranked by Frequency)

Based on our analysis and industry research (also see Selenium WebDriver documentation and Google Testing Blog):

Cause | Frequency | Fix Difficulty
Timing/synchronization issues | ~45% | Medium
Poor element locators | ~25% | Easy
Test data dependencies | ~15% | Medium
Environment instability | ~10% | Hard
Actual bugs (rare!) | ~5% | Varies

Let’s fix each one.

Fix 1: Synchronization Issues (~45% of Flakiness)

The most common cause: your test tries to interact with an element before it’s ready. This is also a major source of slow Appium tests.

The Problem

java
// ❌ FLAKY: Element might not be clickable yet
driver.findElement(By.id("submit_button")).click();

// ❌ FLAKY: Sleep doesn't guarantee element is ready
Thread.sleep(2000);
driver.findElement(By.id("submit_button")).click();

// ❌ FLAKY: Implicit wait only waits for presence, not clickability
driver.manage().timeouts().implicitlyWait(Duration.ofSeconds(10));
driver.findElement(By.id("submit_button")).click();

The Fix: Explicit Waits with Specific Conditions

java
// ✅ STABLE: Wait for clickability before clicking
WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));
WebElement button = wait.until(
    ExpectedConditions.elementToBeClickable(By.id("submit_button"))
);
button.click();

// ✅ STABLE: Wait for visibility before reading text
WebElement message = wait.until(
    ExpectedConditions.visibilityOfElementLocated(By.id("success_message"))
);
String text = message.getText();

// ✅ STABLE: Wait for element to disappear (loading spinners)
wait.until(
    ExpectedConditions.invisibilityOfElementLocated(By.id("loading_spinner"))
);

Custom Wait Conditions

Sometimes standard conditions aren’t enough:

java
// The lambda parameter is named d so it cannot clash with a local variable named driver

// Wait for element to have specific text
wait.until(d -> {
    WebElement element = d.findElement(By.id("status"));
    return "Complete".equals(element.getText());
});

// Wait until at least 10 list items have loaded
wait.until(d -> {
    List<WebElement> items = d.findElements(By.className("list_item"));
    return items.size() >= 10;
});

// Wait for the button to become enabled
// (getAttribute returns null when "disabled" is absent, so compare null-safely)
wait.until(d -> {
    WebElement button = d.findElement(By.id("submit"));
    return !"true".equals(button.getAttribute("disabled"));
});

Wait Strategy Reference

Action | Wait For | Condition
Click button | Clickable | elementToBeClickable
Read text | Visible | visibilityOfElementLocated
Check exists | Present | presenceOfElementLocated
Fill input | Visible + Enabled | elementToBeClickable
Assert gone | Invisible | invisibilityOfElementLocated
Navigate | URL change | urlContains
List load | Count stable | Custom wait
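
The "count stable" row has no built-in condition, so it needs a custom wait. The following is a minimal sketch of one way to write it, using only the JDK: it polls a count until several consecutive reads return the same value. The class name `StableCountWait` and its parameters are illustrative assumptions, not part of Selenium or Appium.

```java
import java.time.Duration;
import java.util.function.IntSupplier;

/** Polls a count until it stops changing. A sketch, not a Selenium or Appium API. */
final class StableCountWait {

    /**
     * Returns the count once `stableReads` consecutive re-reads match the previous value.
     * Throws IllegalStateException if that does not happen before the timeout.
     */
    static int waitUntilStable(IntSupplier count, int stableReads,
                               Duration timeout, Duration pollInterval) {
        if (stableReads < 1) {
            throw new IllegalArgumentException("stableReads must be at least 1");
        }
        long start = System.nanoTime();
        int last = count.getAsInt();
        int unchanged = 0; // consecutive re-reads equal to `last`

        while (unchanged < stableReads) {
            if (System.nanoTime() - start >= timeout.toNanos()) {
                throw new IllegalStateException(
                    "Count did not stabilize; last value was " + last);
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", e);
            }
            int current = count.getAsInt();
            if (current == last) {
                unchanged++;
            } else {
                last = current;   // the list is still growing or shrinking
                unchanged = 0;
            }
        }
        return last;
    }
}
```

In a test, the supplier would typically wrap a lookup such as `() -> driver.findElements(By.className("list_item")).size()`. The polling sleep lives inside the wait helper, where it is bounded by the timeout, rather than in the test body.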

Fix 2: Poor Element Locators (~25% of Flakiness)

Fragile locators break when the UI changes slightly.

The Problem

java
// ❌ FLAKY: Depends on UI hierarchy structure
driver.findElement(By.xpath(
    "//android.view.ViewGroup[3]/android.widget.Button[2]"
));

// ❌ FLAKY: Index-based selection
driver.findElement(By.xpath("(//android.widget.Button)[4]"));

// ❌ FLAKY: Relies on displayed text (changes with localization)
driver.findElement(By.xpath("//android.widget.Button[@text='Submit']"));

The Fix: Stable Locator Strategies

java
// Note: AppiumBy replaces MobileBy, which is deprecated as of Appium Java client 8

// ✅ STABLE: Accessibility ID (best cross-platform option)
driver.findElement(AppiumBy.accessibilityId("submit_button"));

// ✅ STABLE: Resource ID (Android)
driver.findElement(By.id("com.example.app:id/submit_button"));

// ✅ STABLE: iOS Predicate with stable attributes
driver.findElement(AppiumBy.iOSNsPredicateString(
    "type == 'XCUIElementTypeButton' AND name == 'submit_button'"
));

// ✅ STABLE: UiAutomator with resource ID (Android)
driver.findElement(AppiumBy.androidUIAutomator(
    "new UiSelector().resourceId(\"com.example.app:id/submit_button\")"
));

Locator Stability Guide

Locator Type | Stability | Speed | Use When
Accessibility ID | Excellent | Fast | Always (first choice)
Resource ID | Excellent | Fast | Android-specific
iOS Predicate | Good | Fast | iOS with name/label
iOS Class Chain | Moderate | Medium | iOS hierarchical lookup
XPath with attributes | Poor | Slow | Last resort
XPath with index | Very Poor | Slow | Never
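
The "never" row can be enforced mechanically. As a rough sketch, the helper below flags XPath strings that select by numeric position. The class `LocatorLint` is a hypothetical name, and the regex is a heuristic: it does not catch `position()` or `last()`, and it would misfire on an attribute value that itself contains something like `[3]`.

```java
import java.util.regex.Pattern;

/** Heuristic check for position-based XPath locators. Illustrative only. */
final class LocatorLint {

    // Matches a purely numeric predicate such as [3] or [ 12 ], which selects by position.
    private static final Pattern POSITIONAL_PREDICATE =
        Pattern.compile("\\[\\s*\\d+\\s*\\]");

    /** Returns true if the XPath contains at least one numeric positional predicate. */
    static boolean isIndexBasedXPath(String xpath) {
        return POSITIONAL_PREDICATE.matcher(xpath).find();
    }
}
```

A check like this could run in a unit test over the locator constants in a page-object layer, so index-based XPath is caught at review time rather than after it starts failing.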

Working with Developers

The best fix is adding accessibility IDs to your app:

Android (contentDescription):

xml
<Button
    android:id="@+id/submit_button"
    android:contentDescription="submit_button"
    android:text="Submit" />

iOS (accessibilityIdentifier):

swift
submitButton.accessibilityIdentifier = "submit_button"

React Native:

jsx
<Button testID="submit_button" title="Submit" />

Fix 3: Test Data Dependencies (~15% of Flakiness)

Tests that share data or depend on external state are inherently flaky.

The Problem

java
// ❌ FLAKY: Test depends on user existing in production database
@Test
void testLogin() {
    login("user@example.com", "password123");
    // Fails if user was deleted or password changed
}

// ❌ FLAKY: Tests run in undefined order, share state
@Test void testCreateOrder() { /* creates order #1234 */ }
@Test void testViewOrder() { /* assumes order #1234 exists */ }

// ❌ FLAKY: External API dependency
@Test void testWeatherWidget() {
    // Fails if weather API is slow or returns different data
}

The Fix: Test Data Isolation

Strategy 1: Create test data in setup

java
@BeforeEach
void setup() {
    // Create unique test user for this test run
    testUser = api.createTestUser(
        "test_" + UUID.randomUUID() + "@example.com"
    );
}

@AfterEach
void teardown() {
    // Clean up test data
    api.deleteUser(testUser.id);
}

@Test
void testLogin() {
    login(testUser.email, testUser.password);
    // Always works because we created the user
}

Strategy 2: Mock external dependencies

java
// Use WireMock for API dependencies
@BeforeEach
void setup() {
    wireMockServer.stubFor(
        get(urlEqualTo("/api/weather"))
            .willReturn(aResponse()
                .withStatus(200)
                .withBody("{\"temp\": 72, \"conditions\": \"sunny\"}")
            )
    );
}

Strategy 3: Use unique identifiers per test run

java
@Test
void testCreateAccount() {
    String uniqueEmail = "test_" + System.currentTimeMillis() + "@example.com";
    registerAccount(uniqueEmail, "password123");
    // No collision with other test runs
}
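
One caveat with a timestamp alone: two tests that start in the same millisecond, for example on parallel CI workers, produce the same address. A small sketch that avoids this by using a random UUID instead (the `TestData` class and `uniqueEmail` method are hypothetical names):

```java
import java.util.UUID;

/** Generates collision-resistant test identifiers. Illustrative helper, not a library API. */
final class TestData {

    /** Builds an address that differs on every call, including across parallel workers. */
    static String uniqueEmail(String prefix) {
        return prefix + "_" + UUID.randomUUID() + "@example.com";
    }
}
```

Calling `TestData.uniqueEmail("test")` in place of the timestamp expression keeps the rest of the test unchanged.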

Fix 4: Environment Instability (~10% of Flakiness)

The device, network, or app state causes inconsistent behavior. Understanding emulator vs real device testing helps reduce environment-related flakiness.

java
// ❌ FLAKY: Animation timing varies
driver.findElement(By.id("button")).click();
driver.findElement(By.id("result")).getText(); // Fails during animation

// ✅ STABLE: Disable animations for testing
// In Android developer settings OR via capability:
capabilities.setCapability("disableWindowAnimation", true);

Android: Disable animations via ADB

bash
adb shell settings put global window_animation_scale 0
adb shell settings put global transition_animation_scale 0
adb shell settings put global animator_duration_scale 0

java
// ❌ FLAKY: Network delay causes timeout
WebElement result = wait.until(
    ExpectedConditions.visibilityOfElementLocated(By.id("api_result"))
); // 10 second timeout isn't enough on slow network

// ✅ STABLE: Longer timeout for network operations
WebDriverWait networkWait = new WebDriverWait(driver, Duration.ofSeconds(30));
WebElement result = networkWait.until(
    ExpectedConditions.visibilityOfElementLocated(By.id("api_result"))
);

App State Flakiness

java
// ❌ FLAKY: Previous test left app in unexpected state
@Test void testDashboard() {
    // Assumes app is on home screen—might be on settings page
    clickElement("dashboard_icon");
}

// ✅ STABLE: Reset to known state at start
@BeforeEach
void resetToHomeScreen() {
    // Force app to known state
    if (!isOnHomeScreen()) {
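        // resetApp() is deprecated in recent Appium Java clients; restart the app instead.
        // This relaunches the app but does not clear its data. The app ID is an example.
        driver.terminateApp("com.example.app");
        driver.activateApp("com.example.app"); // Or navigate explicitly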
    }
}

Fix 5: Implement Proper Retry Logic (Last Resort)

If you’ve fixed the root causes and still have occasional flakiness from external factors, add retry logic—but do it right.

Bad Retry Pattern

java
// ❌ BAD: Retries entire test (TestNG retry analyzer), masks real failures
@Test(retryAnalyzer = RetryAnalyzer.class)
void testLogin() {
    // If this test is flaky, fix the root cause
}

Good Retry Pattern

java
// ✅ GOOD: Retry specific unreliable operations
public WebElement findElementWithRetry(By locator, int maxRetries) {
    int attempts = 0;
    while (attempts < maxRetries) {
        try {
            WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(5));
            return wait.until(ExpectedConditions.presenceOfElementLocated(locator));
        } catch (TimeoutException e) {
            attempts++;
            if (attempts >= maxRetries) throw e;
            // Log the retry for debugging
            logger.warn("Retry {} for locator {}", attempts, locator);
        }
    }
    throw new NoSuchElementException("Element not found after " + maxRetries + " retries");
}

When Retries Are Acceptable

Scenario | Retry OK? | Better Fix
Cloud device startup | Yes | Use local devices (see BrowserStack vs LambdaTest)
Third-party API call | Yes | Mock the API
Element not found | No | Fix locator/wait
Random timeout | No | Increase timeout properly
Stale element | No | Re-find element pattern
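
For the two "Yes" rows, the retry is safest when it is limited to one named exception type, so unrelated failures still surface immediately. The following is a sketch of that idea using only the JDK. `TargetedRetry` is a hypothetical helper, not a Selenium or Appium class.

```java
import java.util.function.Supplier;

/** Retries an action only for one expected exception type. Illustrative sketch. */
final class TargetedRetry {

    /**
     * Runs `action`, retrying only when it throws `retryOn` (or a subclass).
     * Any other exception propagates immediately. The last failure is rethrown
     * once `maxAttempts` attempts have been made.
     */
    static <T> T retry(Class<? extends RuntimeException> retryOn,
                       int maxAttempts, Supplier<T> action) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                boolean retryable = retryOn.isInstance(e);
                if (!retryable || attempt >= maxAttempts) {
                    throw e;
                }
                // Log every retry so the retry rate stays visible
                System.err.println("Retry " + attempt + " after "
                    + e.getClass().getSimpleName());
            }
        }
    }
}
```

A cloud session start might then look like `TargetedRetry.retry(SessionNotCreatedException.class, 3, () -> createDriver())`, where `createDriver()` stands in for whatever factory method the suite already has.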

Debugging Flaky Tests

When you find a flaky test, don’t just re-run and hope. Debug it.

Step 1: Enable Comprehensive Logging

java
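// Capture a screenshot after every test. TestInfo does not expose pass/fail;
// to capture only on failure, move this into a JUnit 5 AfterTestExecutionCallback
// and check ExtensionContext.getExecutionException().
@AfterEach
void captureScreenshot(TestInfo testInfo) throws IOException {
    File screenshot = ((TakesScreenshot) driver)
        .getScreenshotAs(OutputType.FILE);
    FileUtils.copyFile(screenshot,
        new File("screenshots/" + testInfo.getDisplayName() + ".png"));
}

// Enable additional logging
capabilities.setCapability("showIOSLog", true);  // iOS: include device logs in the Appium server log
capabilities.setCapability("enablePerformanceLogging", true);  // Android: Chromedriver performance logs (web/hybrid contexts)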

Step 2: Reproduce the Flakiness

bash
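# Run test 100 times to reproduce flakiness
# (cleanTest forces Gradle to re-execute the tests instead of reporting the task UP-TO-DATE)
for i in {1..100}; do
  ./gradlew cleanTest test --tests "com.example.FlakyTest"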
  if [ $? -ne 0 ]; then
    echo "Failed on run $i"
    break
  fi
done

Step 3: Check Timing Patterns

Flaky tests often fail:

  • Early morning (network congestion)
  • At hour boundaries (database cleanup jobs)
  • During high CI load (resource contention)
  • On specific devices (hardware variations)
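
To check for the time-of-day patterns above, one option is to bucket failure timestamps by hour. The sketch below assumes the failure times have already been exported from the CI system as `Instant` values. `FailureHistogram` is a hypothetical helper, and it counts in UTC.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Groups test failures by hour of day to expose time-based patterns. Illustrative sketch. */
final class FailureHistogram {

    /** Counts failures per hour of day (0-23, UTC), sorted by hour. */
    static Map<Integer, Integer> byHourUtc(List<Instant> failureTimes) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (Instant t : failureTimes) {
            int hour = t.atZone(ZoneOffset.UTC).getHour();
            counts.merge(hour, 1, Integer::sum);
        }
        return counts;
    }
}
```

A spike in one bucket is a hint rather than proof. It narrows the search to whatever else runs at that hour, such as scheduled jobs or peak CI load.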

Step 4: Isolate the Cause

java
// Add timing logs around suspicious sections
long start = System.currentTimeMillis();
WebElement element = wait.until(ExpectedConditions.visibilityOfElementLocated(locator));
long duration = System.currentTimeMillis() - start;
logger.info("Element {} found in {}ms", locator, duration);
// If duration varies wildly, you've found the flaky section

Preventing Future Flakiness

Pre-Commit Checks

Before merging a new test:

  1. Run it 20 times locally
  2. Run it on CI 50 times
  3. If any failures, don’t merge until fixed
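
The run counts matter because a flaky test can pass many times in a row. Assuming runs are independent and a test fails with probability p on each run, the chance that at least one of n runs fails is 1 - (1 - p)^n. A small sketch of that arithmetic (the class name `FlakeDetection` is illustrative):

```java
/** Arithmetic for how likely repeated runs are to expose a flaky test. Illustrative sketch. */
final class FlakeDetection {

    /**
     * Probability that at least one of `runs` independent runs fails,
     * given a per-run failure probability `failureRate` between 0.0 and 1.0.
     */
    static double chanceOfCatching(double failureRate, int runs) {
        return 1.0 - Math.pow(1.0 - failureRate, runs);
    }
}
```

With the 70 total runs above (20 local plus 50 on CI), a test that fails 5% of the time is caught with roughly 97% probability, while one that fails 1% of the time is caught only about half the time. Real runs are not fully independent, so these figures are estimates.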

Monitoring

Track flaky test metrics:

  • Flakiness rate per test
  • Retry rate in CI
  • Time-to-fix for flaky tests
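
As a sketch of the first metric, the helper below computes a per-test flakiness rate from a list of CI results. It assumes one working definition: a test is flaky if it both passed and failed on the same unchanged code, and its rate is the fraction of runs that failed. `FlakinessReport` and `Run` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Computes per-test flakiness rates from CI results. Illustrative sketch. */
final class FlakinessReport {

    /** One CI result for one test on an unchanged commit. */
    static final class Run {
        final String testName;
        final boolean passed;

        Run(String testName, boolean passed) {
            this.testName = testName;
            this.passed = passed;
        }
    }

    /** Failure fraction per test, reported only for tests that both passed and failed. */
    static Map<String, Double> flakinessRates(List<Run> runs) {
        Map<String, int[]> tally = new LinkedHashMap<>(); // {passes, failures}
        for (Run r : runs) {
            int[] t = tally.computeIfAbsent(r.testName, k -> new int[2]);
            t[r.passed ? 0 : 1]++;
        }
        Map<String, Double> rates = new LinkedHashMap<>();
        for (Map.Entry<String, int[]> e : tally.entrySet()) {
            int passes = e.getValue()[0];
            int failures = e.getValue()[1];
            // Tests that always fail are broken, not flaky, so they are excluded
            if (passes > 0 && failures > 0) {
                rates.put(e.getKey(), (double) failures / (passes + failures));
            }
        }
        return rates;
    }
}
```

A test that passes 8 of 10 runs gets a rate of 0.2 here. Tests that always pass or always fail are left out of the result.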

Code Review Checklist

  • No Thread.sleep() calls
  • All element interactions have explicit waits
  • Locators use accessibility IDs or resource IDs
  • Test creates its own test data
  • Test cleans up after itself
  • No dependencies on other test execution order

Summary: The Flakiness Fix Hierarchy

  1. First: Fix synchronization (explicit waits for specific conditions)
  2. Second: Fix locators (accessibility IDs > resource IDs > XPath)
  3. Third: Isolate test data (create/teardown in each test)
  4. Fourth: Stabilize environment (disable animations, increase timeouts)
  5. Last resort: Add targeted retry logic

Target reliability: 99%+ pass rate. If your test suite is below 95%, you have a systemic problem—not just a few flaky tests. Cloud testing costs compound flakiness problems—see our BrowserStack pricing analysis for cost implications.