Your Appium test passes 8 out of 10 times. Nobody trusts the test results anymore. Your CI pipeline has a “re-run failed tests” step because everyone assumes the first run might be a fluke.
This is the test flakiness problem, and it’s solvable. (If you haven’t seen the numbers, the real cost of flaky tests is staggering – Microsoft pegged it at $1.14M/year.)
After analyzing thousands of flaky Appium tests across mobile teams, we’ve identified the root causes and fixes. This guide covers each one with actual code examples.
What Makes a Test Flaky?
A flaky test produces inconsistent results—sometimes passes, sometimes fails—without any code changes. The result is non-deterministic.
Root Causes (Ranked by Frequency)
Based on our analysis and industry research (also see Selenium WebDriver documentation and Google Testing Blog):
| Cause | Frequency | Fix Difficulty |
|---|---|---|
| Timing/synchronization issues | ~45% | Medium |
| Poor element locators | ~25% | Easy |
| Test data dependencies | ~15% | Medium |
| Environment instability | ~10% | Hard |
| Actual bugs (rare!) | ~5% | Varies |
Let’s fix each one.
Fix 1: Synchronization Issues (~45% of Flakiness)
The most common cause: your test tries to interact with an element before it’s ready. This is also a major source of slow Appium tests.
The Problem
// ❌ FLAKY: Element might not be clickable yet
driver.findElement(By.id("submit_button")).click();
// ❌ FLAKY: Sleep doesn't guarantee element is ready
Thread.sleep(2000);
driver.findElement(By.id("submit_button")).click();
// ❌ FLAKY: Implicit wait only waits for presence, not clickability
driver.manage().timeouts().implicitlyWait(Duration.ofSeconds(10));
driver.findElement(By.id("submit_button")).click();
The Fix: Explicit Waits with Specific Conditions
// ✅ STABLE: Wait for clickability before clicking
WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));
WebElement button = wait.until(
ExpectedConditions.elementToBeClickable(By.id("submit_button"))
);
button.click();
// ✅ STABLE: Wait for visibility before reading text
WebElement message = wait.until(
ExpectedConditions.visibilityOfElementLocated(By.id("success_message"))
);
String text = message.getText();
// ✅ STABLE: Wait for element to disappear (loading spinners)
wait.until(
ExpectedConditions.invisibilityOfElementLocated(By.id("loading_spinner"))
);
Custom Wait Conditions
Sometimes standard conditions aren’t enough:
// Wait for element to have specific text
wait.until(driver -> {
WebElement element = driver.findElement(By.id("status"));
return element.getText().equals("Complete");
});
// Wait for at least 10 list items to load (a minimum count, not a stability check)
wait.until(driver -> {
List<WebElement> items = driver.findElements(By.className("list_item"));
return items.size() >= 10;
});
// Wait for the "disabled" attribute to clear
wait.until(driver -> {
    WebElement button = driver.findElement(By.id("submit"));
    // getAttribute returns null when the attribute is absent, so compare null-safely
    return !"true".equals(button.getAttribute("disabled"));
});
Wait Strategy Reference
| Action | Wait For | Condition |
|---|---|---|
| Click button | Clickable | elementToBeClickable |
| Read text | Visible | visibilityOfElementLocated |
| Check exists | Present | presenceOfElementLocated |
| Fill input | Visible + Enabled | elementToBeClickable |
| Assert gone | Invisible | invisibilityOfElementLocated |
| Navigate (web view context) | URL change | urlContains |
| List load | Count stable | Custom wait |
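The "Count stable" row needs a little more than a minimum-count check: the list should stop growing before you assert on it. Here is a minimal sketch of that idea, written against a plain Supplier so it runs without a driver. StableValueWaiter and its parameters are hypothetical names, not part of Selenium or Appium.

```java
import java.time.Duration;
import java.util.Objects;
import java.util.function.Supplier;

/** Hypothetical helper (not a Selenium/Appium class): polls a value until it stops changing. */
final class StableValueWaiter {

    private StableValueWaiter() {}

    /**
     * Returns the value once it has been identical for requiredStablePolls
     * consecutive readings, or throws IllegalStateException on timeout.
     */
    static <T> T waitUntilStable(Supplier<T> source, int requiredStablePolls,
                                 Duration pollInterval, Duration timeout) {
        if (requiredStablePolls < 1) {
            throw new IllegalArgumentException("requiredStablePolls must be >= 1");
        }
        long deadline = System.nanoTime() + timeout.toNanos();
        T last = source.get();
        int stablePolls = 1; // the first reading counts as one observation
        while (stablePolls < requiredStablePolls) {
            if (System.nanoTime() - deadline >= 0) {
                throw new IllegalStateException("Value did not stabilize within " + timeout);
            }
            pause(pollInterval);
            T current = source.get();
            if (Objects.equals(current, last)) {
                stablePolls++;
            } else {
                last = current;   // value changed: start counting again
                stablePolls = 1;
            }
        }
        return last;
    }

    // A short, bounded sleep between polls (unlike a blind Thread.sleep before an action).
    private static void pause(Duration interval) {
        try {
            Thread.sleep(interval.toMillis());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting", e);
        }
    }
}
```

With a driver, the supplier could be something like `() -> driver.findElements(By.className("list_item")).size()`, with three stable polls 250 ms apart as a starting point to tune.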
Fix 2: Poor Element Locators (~25% of Flakiness)
Fragile locators break when the UI changes slightly.
The Problem
// ❌ FLAKY: Depends on UI hierarchy structure
driver.findElement(By.xpath(
"//android.view.ViewGroup[3]/android.widget.Button[2]"
));
// ❌ FLAKY: Index-based selection
driver.findElement(By.xpath("(//android.widget.Button)[4]"));
// ❌ FLAKY: Relies on displayed text (changes with localization)
driver.findElement(By.xpath("//android.widget.Button[@text='Submit']"));
The Fix: Stable Locator Strategies
// ✅ STABLE: Accessibility ID (best cross-platform option)
// AppiumBy replaces MobileBy, which is deprecated in Appium java-client 8
driver.findElement(AppiumBy.accessibilityId("submit_button"));
// ✅ STABLE: Resource ID (Android)
driver.findElement(By.id("com.example.app:id/submit_button"));
// ✅ STABLE: iOS Predicate with stable attributes
driver.findElement(AppiumBy.iOSNsPredicateString(
    "type == 'XCUIElementTypeButton' AND name == 'submitButton'"
));
// ✅ STABLE: UiAutomator with resource ID (Android)
driver.findElement(AppiumBy.androidUIAutomator(
    "new UiSelector().resourceId(\"com.example.app:id/submit_button\")"
));
Locator Stability Guide
| Locator Type | Stability | Speed | Use When |
|---|---|---|---|
| Accessibility ID | Excellent | Fast | Always (first choice) |
| Resource ID | Excellent | Fast | Android-specific |
| iOS Predicate | Good | Fast | iOS with name/label |
| iOS Class Chain | Moderate | Medium | iOS hierarchical lookup |
| XPath with attributes | Poor | Slow | Last resort |
| XPath with index | Very Poor | Slow | Never |
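If "XPath with index: Never" is the rule, it helps to check for it mechanically rather than rely on reviewers spotting it. The sketch below flags positional predicates such as [3], [last()], or [position() = 2]. XPathLint is a hypothetical name, and the regex is a heuristic: it will also flag a bracketed number that appears inside an attribute value.

```java
import java.util.regex.Pattern;

/** Hypothetical lint helper: detects index-based XPath locators. */
final class XPathLint {

    // Matches predicates like [3], [ 12 ], [last()], [position() = 2], [position()>1]
    private static final Pattern POSITIONAL_PREDICATE = Pattern.compile(
        "\\[\\s*(\\d+|last\\(\\)|position\\(\\)\\s*[<>=!]+\\s*\\d+)\\s*\\]");

    private XPathLint() {}

    /** Returns true if the XPath selects elements by position rather than by attribute. */
    static boolean usesPositionalIndex(String xpath) {
        return POSITIONAL_PREDICATE.matcher(xpath).find();
    }
}
```

You could call this from a unit test that scans your page objects' locator strings and fails the build when one matches.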
Working with Developers
The best fix is adding accessibility IDs to your app:
Android (contentDescription):
<Button
android:id="@+id/submit_button"
android:contentDescription="submit_button"
android:text="Submit" />
iOS (accessibilityIdentifier):
submitButton.accessibilityIdentifier = "submit_button"
React Native:
<Button testID="submit_button" title="Submit" />
Fix 3: Test Data Dependencies (~15% of Flakiness)
Tests that share data or depend on external state are inherently flaky.
The Problem
// ❌ FLAKY: Test depends on user existing in production database
@Test
void testLogin() {
login("[email protected]", "password123");
// Fails if user was deleted or password changed
}
// ❌ FLAKY: Tests run in undefined order, share state
@Test void testCreateOrder() { /* creates order #1234 */ }
@Test void testViewOrder() { /* assumes order #1234 exists */ }
// ❌ FLAKY: External API dependency
@Test void testWeatherWidget() {
// Fails if weather API is slow or returns different data
}
The Fix: Test Data Isolation
Strategy 1: Create test data in setup
@BeforeEach
void setup() {
// Create unique test user for this test run
testUser = api.createTestUser(
"test_" + UUID.randomUUID() + "@example.com"
);
}
@AfterEach
void teardown() {
// Clean up test data
api.deleteUser(testUser.id);
}
@Test
void testLogin() {
login(testUser.email, testUser.password);
// Always works because we created the user
}
Strategy 2: Mock external dependencies
// Use WireMock for API dependencies
@BeforeEach
void setup() {
wireMockServer.stubFor(
get(urlEqualTo("/api/weather"))
.willReturn(aResponse()
.withStatus(200)
.withBody("{\"temp\": 72, \"conditions\": \"sunny\"}")
)
);
}
Strategy 3: Use unique identifiers per test run
@Test
void testCreateAccount() {
// A UUID avoids collisions even when parallel tests start in the same millisecond
String uniqueEmail = "test_" + UUID.randomUUID() + "@example.com";
registerAccount(uniqueEmail, "password123");
// No collision with other test runs
}
Fix 4: Environment Instability (~10% of Flakiness)
The device, network, or app state causes inconsistent behavior. Understanding emulator vs real device testing helps reduce environment-related flakiness.
Device-Related Flakiness
// ❌ FLAKY: Animation timing varies
driver.findElement(By.id("button")).click();
driver.findElement(By.id("result")).getText(); // Fails during animation
// ✅ STABLE: Disable animations for testing
// In Android developer settings OR via capability:
capabilities.setCapability("disableWindowAnimation", true);
Android: Disable animations via ADB
adb shell settings put global window_animation_scale 0
adb shell settings put global transition_animation_scale 0
adb shell settings put global animator_duration_scale 0
Network-Related Flakiness
// ❌ FLAKY: Network delay causes timeout
WebElement result = wait.until(
ExpectedConditions.visibilityOfElementLocated(By.id("api_result"))
); // 10 second timeout isn't enough on slow network
// ✅ STABLE: Longer timeout for network operations
WebDriverWait networkWait = new WebDriverWait(driver, Duration.ofSeconds(30));
WebElement result = networkWait.until(
ExpectedConditions.visibilityOfElementLocated(By.id("api_result"))
);
App State Flakiness
// ❌ FLAKY: Previous test left app in unexpected state
@Test void testDashboard() {
// Assumes app is on home screen—might be on settings page
clickElement("dashboard_icon");
}
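// ✅ STABLE: Reset to known state at start
@BeforeEach
void resetToHomeScreen() {
    // Force app to known state
    if (!isOnHomeScreen()) {
        // resetApp() is deprecated in recent Appium Java clients; restarting the
        // app is one replacement (or navigate explicitly). Package name is a placeholder.
        driver.terminateApp("com.example.app");
        driver.activateApp("com.example.app");
    }
}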
Fix 5: Implement Proper Retry Logic (Last Resort)
If you’ve fixed the root causes and still have occasional flakiness from external factors, add retry logic—but do it right.
Bad Retry Pattern
// ❌ BAD: Retries entire test, masks real failures (TestNG retryAnalyzer shown)
@Test(retryAnalyzer = RetryAnalyzer.class)
void testLogin() {
// If this test is flaky, fix the root cause
}
Good Retry Pattern
// ✅ GOOD: Retry specific unreliable operations
public WebElement findElementWithRetry(By locator, int maxRetries) {
int attempts = 0;
while (attempts < maxRetries) {
try {
WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(5));
return wait.until(ExpectedConditions.presenceOfElementLocated(locator));
} catch (TimeoutException e) {
attempts++;
if (attempts >= maxRetries) throw e;
// Log the retry for debugging
logger.warn("Retry {} for locator {}", attempts, locator);
}
}
throw new NoSuchElementException("Element not found after " + maxRetries + " retries");
}
When Retries Are Acceptable
| Scenario | Retry OK? | Better Fix |
|---|---|---|
| Cloud device startup | Yes | Use local devices (see BrowserStack vs LambdaTest) |
| Third-party API call | Yes | Mock the API |
| Element not found | No | Fix locator/wait |
| Random timeout | No | Increase timeout properly |
| Stale element | No | Re-find element pattern |
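One way to read the "Re-find element pattern" in the last row: when a reference goes stale, repeat the lookup and the action together, and only for that one exception type. The sketch below shows the shape with a generic helper. TargetedRetry and retryOn are hypothetical names, and the helper is deliberately free of Selenium types so it runs on its own.

```java
import java.util.function.Supplier;

/** Hypothetical helper: retries an action only for one named exception type. */
final class TargetedRetry {

    private TargetedRetry() {}

    /**
     * Runs the action up to maxAttempts times. Only exceptions of the given
     * type trigger another attempt; anything else propagates immediately,
     * so unrelated failures are never masked.
     */
    static <T> T retryOn(Class<? extends RuntimeException> retryable,
                         int maxAttempts, Supplier<T> action) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                if (!retryable.isInstance(e)) {
                    throw e; // not the failure we agreed to tolerate
                }
                lastFailure = e;
            }
        }
        throw lastFailure; // every attempt failed with the retryable type
    }
}
```

In a test, the action would re-locate the element on every attempt, for example `retryOn(StaleElementReferenceException.class, 3, () -> driver.findElement(locator).getText())`. Holding a WebElement outside the lambda would defeat the purpose.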
Debugging Flaky Tests
When you find a flaky test, don’t just re-run and hope. Debug it.
Step 1: Enable Comprehensive Logging
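// Capture a screenshot after each test.
// Note: @AfterEach cannot see whether the test passed. To capture only on
// failure, move this into a JUnit 5 AfterTestExecutionCallback extension
// and check context.getExecutionException().
@AfterEach
void captureScreenshot(TestInfo testInfo) throws IOException {
    File screenshot = ((TakesScreenshot) driver)
        .getScreenshotAs(OutputType.FILE);
    FileUtils.copyFile(screenshot,
        new File("screenshots/" + testInfo.getDisplayName() + ".png"));
}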
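// Surface extra device-side logs (these are session capabilities, not Appium server flags)
capabilities.setCapability("showIOSLog", true); // iOS: include device logs in the Appium log
capabilities.setCapability("enablePerformanceLogging", true); // Web/hybrid contexts: browser performance log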
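One detail in the screenshot step is easy to miss: the file name comes from testInfo.getDisplayName(), and JUnit 5 display names can contain parentheses, spaces, or slashes (the default for a method is "testLogin()"). A small sanitizer keeps the path valid on every platform. This is a sketch; ScreenshotNames and safePath are hypothetical names.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: turns a test display name into a safe screenshot path. */
final class ScreenshotNames {

    private ScreenshotNames() {}

    static Path safePath(String directory, String displayName) {
        // Replace every run of characters outside [A-Za-z0-9._-] with one underscore
        String cleaned = displayName.replaceAll("[^A-Za-z0-9._-]+", "_");
        // Drop underscores left at either end, e.g. from a trailing "()"
        cleaned = cleaned.replaceAll("^_+|_+$", "");
        if (cleaned.isEmpty()) {
            cleaned = "unnamed_test";
        }
        return Paths.get(directory, cleaned + ".png");
    }
}
```

In the capture method you would pass `ScreenshotNames.safePath("screenshots", testInfo.getDisplayName()).toFile()` as the copy target.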
Step 2: Reproduce the Flakiness
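# Run the test up to 100 times; stop at the first failure
for i in {1..100}; do
    # cleanTest forces a real re-run; otherwise Gradle reports the task UP-TO-DATE after a pass
    if ! ./gradlew cleanTest test --tests "com.example.FlakyTest"; then
        echo "Failed on run $i"
        break
    fi
done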
Step 3: Check Timing Patterns
Flaky tests often fail:
- Early morning (network congestion)
- At hour boundaries (database cleanup jobs)
- During high CI load (resource contention)
- On specific devices (hardware variations)
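To check for time-of-day patterns like these, bucket your failure timestamps by hour and see whether any hour stands out. A minimal sketch, assuming you can export failure times from CI as Instant values; FailureTimeBuckets is a hypothetical name.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: counts test failures per hour of day. */
final class FailureTimeBuckets {

    private FailureTimeBuckets() {}

    /** Returns hour-of-day (0-23, in the given zone) mapped to the number of failures in that hour. */
    static Map<Integer, Integer> failuresByHour(List<Instant> failureTimes, ZoneId zone) {
        Map<Integer, Integer> counts = new TreeMap<>(); // sorted by hour for easy reading
        for (Instant failure : failureTimes) {
            int hour = failure.atZone(zone).getHour();
            counts.merge(hour, 1, Integer::sum);
        }
        return counts;
    }
}
```

Use the time zone your CI infrastructure runs in, so a spike lines up with scheduled jobs you can actually look up.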
Step 4: Isolate the Cause
// Add timing logs around suspicious sections
long start = System.currentTimeMillis();
WebElement element = wait.until(ExpectedConditions.visibilityOfElementLocated(locator));
long duration = System.currentTimeMillis() - start;
logger.info("Element {} found in {}ms", locator, duration);
// If duration varies wildly, you've found the flaky section
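"Varies wildly" is easier to act on with a number attached. One rough heuristic is to compare the slowest sample to the median across repeated runs. The class name and the threshold below are assumptions to tune for your suite, not established values.

```java
import java.util.Arrays;

/** Hypothetical helper: flags a timed section whose slowest run dwarfs its typical run. */
final class DurationSpread {

    private DurationSpread() {}

    /** Median of the samples; for an even count, the midpoint of the two middle values (integer division). */
    static long median(long[] durationsMs) {
        if (durationsMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = durationsMs.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
    }

    /** True if the slowest sample exceeds the median by more than the given factor. */
    static boolean variesWildly(long[] durationsMs, double maxToMedianFactor) {
        long typical = median(durationsMs); // also rejects empty input
        long slowest = Arrays.stream(durationsMs).max().getAsLong();
        return slowest > typical * maxToMedianFactor;
    }
}
```

A section that normally takes 130 ms but occasionally takes several seconds is a good candidate for a longer, condition-based wait.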
Preventing Future Flakiness
Pre-Commit Checks
Before merging a new test:
- Run it 20 times locally
- Run it on CI 50 times
- If any failures, don’t merge until fixed
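Why so many runs? If runs are independent and a test fails with probability p, the chance that n runs all pass is (1 - p)^n. A test that fails 5% of the time still passes 20 runs in a row about 36% of the time; across the combined 70 runs, that drops below 3%. The sketch below does the arithmetic. The independence assumption is a simplification, and the class name is hypothetical.

```java
/** Hypothetical helper: how likely repeated runs are to expose a flaky test. */
final class FlakeDetection {

    private FlakeDetection() {}

    /** Probability that at least one of `runs` independent runs fails. */
    static double detectionProbability(double failureRate, int runs) {
        if (failureRate < 0.0 || failureRate > 1.0 || runs < 0) {
            throw new IllegalArgumentException("failureRate must be in [0,1] and runs >= 0");
        }
        return 1.0 - Math.pow(1.0 - failureRate, runs);
    }

    /** Smallest number of runs whose detection probability reaches `confidence`. */
    static int runsNeeded(double failureRate, double confidence) {
        if (failureRate <= 0.0 || failureRate >= 1.0 || confidence <= 0.0 || confidence >= 1.0) {
            throw new IllegalArgumentException("failureRate and confidence must be in (0,1)");
        }
        // Solve (1 - p)^n <= 1 - c for n
        return (int) Math.ceil(Math.log(1.0 - confidence) / Math.log(1.0 - failureRate));
    }
}
```

For rarer flakes the required run count grows quickly, which is why ongoing monitoring matters as much as the pre-merge gate.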
Monitoring
Track flaky test metrics:
- Flakiness rate per test
- Retry rate in CI
- Time-to-fix for flaky tests
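"Flakiness rate per test" needs a definition before you can track it. One workable choice, used in this sketch, is the fraction of commits on which a test both passed and failed; a test that fails consistently on a commit counts as a real failure, not a flake. The Run type and class name are hypothetical, and your CI export will have its own shape.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical report: per-test flakiness from CI run history. */
final class FlakinessReport {

    /** One execution of one test on one commit. */
    static final class Run {
        final String testName;
        final String commit;
        final boolean passed;

        Run(String testName, String commit, boolean passed) {
            this.testName = testName;
            this.commit = commit;
            this.passed = passed;
        }
    }

    private FlakinessReport() {}

    /** Maps each test to the fraction of its commits that saw both a pass and a failure. */
    static Map<String, Double> flakinessRateByTest(List<Run> runs) {
        // test name -> commit -> distinct outcomes observed on that commit
        Map<String, Map<String, Set<Boolean>>> outcomes = new TreeMap<>();
        for (Run run : runs) {
            outcomes.computeIfAbsent(run.testName, k -> new HashMap<>())
                    .computeIfAbsent(run.commit, k -> new HashSet<>())
                    .add(run.passed);
        }
        Map<String, Double> rates = new TreeMap<>();
        for (Map.Entry<String, Map<String, Set<Boolean>>> entry : outcomes.entrySet()) {
            Map<String, Set<Boolean>> byCommit = entry.getValue();
            long mixedCommits = byCommit.values().stream()
                    .filter(seen -> seen.size() == 2) // both true and false were observed
                    .count();
            rates.put(entry.getKey(), (double) mixedCommits / byCommit.size());
        }
        return rates;
    }
}
```

This only detects flakiness on commits that ran a test more than once, so it pairs naturally with tracking the CI retry rate.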
Code Review Checklist
- No Thread.sleep() calls
- All element interactions have explicit waits
- Locators use accessibility IDs or resource IDs
- Test creates its own test data
- Test cleans up after itself
- No dependencies on other test execution order
Frequently Asked Questions
Why are my Appium tests flaky?
The most common causes are timing issues (not waiting for elements properly), poor element locators, inconsistent test data, and environment instability. In the breakdown above, synchronization problems account for roughly 45% of flaky Appium tests, more than any other single cause.
Should I add retry logic to handle flaky tests?
Retry logic masks the problem rather than fixing it. Use retries as a last resort for genuinely unreliable third-party dependencies, but first fix the root cause. A test that needs retries is a test that needs fixing.
Is Appium inherently flaky?
No. Appium is not inherently flaky — bad practices make it flaky. Teams that follow proper wait strategies, use stable locators, and isolate test data achieve 95%+ reliability.
How do I debug flaky Appium tests?
Enable video recording and screenshot capture on failure. Run the test 50-100 times in isolation to reproduce the flakiness. Check Appium server logs for timeout errors or stale element reference exceptions (StaleElementReferenceException).
Summary: The Flakiness Fix Hierarchy
- First: Fix synchronization (explicit waits for specific conditions)
- Second: Fix locators (accessibility IDs > resource IDs > XPath)
- Third: Isolate test data (create/teardown in each test)
- Fourth: Stabilize environment (disable animations, increase timeouts)
- Last resort: Add targeted retry logic
Target reliability: 99%+ pass rate. If your test suite is below 95%, you have a systemic problem—not just a few flaky tests. Cloud testing costs compound flakiness problems—see our BrowserStack pricing analysis for cost implications.