
Test automation basics

How UI automation actually works (locators, actions, assertions, and waits), plus the Page Object Model, what makes a suite maintainable, and the rise of self-healing AI locators.

When an automated test runs, a robot is using your app. It finds a button, clicks it, types into a box, and then checks whether the screen reacted the way it should. That's the whole loop. Strip away the jargon and every UI automation tool (whether it's Selenium, Playwright, Cypress, or a no-code recorder) does the same four things: it finds things, acts on them, checks the result, and waits at the right moments. Master those four and the rest is detail.

The four building blocks

Let's meet the pieces. Every automated UI step is some combination of these:

| Piece | In plain words | Tiny example |
| --- | --- | --- |
| Locator (selector) | How the robot finds an element on the page | the button with text 'Log in' |
| Action | What the robot does to it | click it, type into it, hover over it |
| Assertion | The check: what must be true afterwards | the page now shows 'Welcome back' |
| Wait | Pausing until the app is ready, not a fixed number of seconds | wait until the dashboard appears |

Read that as a sentence: find the login button (locator), click it (action), wait for the next screen to load (wait), then confirm the welcome message appears (assertion). That single sentence is a complete automated test.
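To make the four pieces concrete, here is a minimal sketch in plain Python. It does not use Selenium, Playwright, or Cypress: `FakeApp` and `run_login_check` are made-up names, and the "app" is a toy object whose welcome message appears a moment after the click. Treat it as an illustration of the loop, not as any real tool's API.

```python
import time


class FakeApp:
    """Hypothetical stand-in for a real app under test."""

    def __init__(self, load_delay=0.05):
        self.elements = [{"role": "button", "text": "Log in"}]
        self._load_delay = load_delay
        self._ready_at = None  # set when the login button is clicked

    def click(self, element):
        if element["text"] == "Log in":
            self._ready_at = time.monotonic() + self._load_delay

    def visible_text(self):
        # The welcome message only shows up once "loading" has finished.
        loaded = self._ready_at is not None and time.monotonic() >= self._ready_at
        return "Welcome back" if loaded else ""


def run_login_check(app, timeout=2.0):
    # Locator: find the button by role and visible text.
    button = next(el for el in app.elements
                  if el["role"] == "button" and el["text"] == "Log in")
    # Action: click it.
    app.click(button)
    # Wait: poll until the message appears or the time limit passes.
    deadline = time.monotonic() + timeout
    while "Welcome back" not in app.visible_text() and time.monotonic() < deadline:
        time.sleep(0.01)
    # Assertion: the page must now show the welcome message.
    assert "Welcome back" in app.visible_text(), "welcome message never appeared"
    return app.visible_text()
```

Calling `run_login_check(FakeApp())` walks through all four pieces in order. Real tools hide most of this behind their own APIs, but the shape is the same.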

Locators: the part that breaks the most

A locator is the robot's way of pointing at something on screen. Humans see a green Log in button; the robot needs a precise instruction to find that exact element among hundreds. Locators can target an element by its id, its visible text, its role ("the button labelled Submit"), or a path through the page's structure.

Good locators are stable locators

Prefer locators tied to meaning (a button's visible text or its accessibility role) over ones tied to position or fragile auto-generated names. A locator like "the button that says Buy now" survives a redesign. A locator like "the third div inside the fourth section" shatters the moment anyone touches the layout. Stable locators are the single biggest factor in whether your suite is a joy or a nightmare.
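The difference can be shown with a toy page, modelled here as a plain Python list of elements. This is only a sketch: the layouts, the `by_position` and `by_role_and_text` helpers, and the inserted banner are all invented for illustration, and a real tool would query the live page instead.

```python
# Toy page: an ordered list of elements, standing in for a real page structure.
old_layout = [
    {"role": "link", "text": "Home"},
    {"role": "button", "text": "Buy now"},
]

# After a redesign, a banner is inserted above the button.
new_layout = [
    {"role": "link", "text": "Home"},
    {"role": "banner", "text": "Summer sale"},
    {"role": "button", "text": "Buy now"},
]


def by_position(page, index):
    """Fragile: depends on where the element happens to sit."""
    return page[index]


def by_role_and_text(page, role, text):
    """Stable: depends on what the element means to a user."""
    matches = [el for el in page if el["role"] == role and el["text"] == text]
    if len(matches) != 1:
        raise LookupError(f"expected exactly one {role} {text!r}, found {len(matches)}")
    return matches[0]


# The positional locator silently points at the wrong element after the redesign.
print(by_position(old_layout, 1)["text"])
print(by_position(new_layout, 1)["text"])
# The meaning-based locator still finds the same button.
print(by_role_and_text(new_layout, "button", "Buy now")["text"])
```

Note that the positional locator does not even fail loudly here: it returns the banner, so the test would click the wrong thing.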

Locators are also where most flakiness is born. If the robot looks for an element before it has loaded, the test fails for no real reason. That's why waits matter so much.

Waits: the secret to non-flaky tests

Modern apps load in pieces. You click Log in, and the dashboard appears a fraction of a second later. A naive script that checks instantly will fail; the dashboard isn't there yet. The beginner's fix ("just wait 5 seconds") is the wrong fix: it's slow when the app is fast and still flaky when the app is slow.

The right approach is a smart wait: wait until the welcome message appears, then continue, however long that takes, up to a sensible limit. Good tools do much of this automatically. The mindset to carry: never wait a fixed amount of time when you can wait for a condition.
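A smart wait is usually just a polling loop with a time limit. The helper below is a minimal sketch of that idea; `wait_until`, its parameter names, and its default values are assumptions made for this example. Established tools ship their own versions, so in practice you would use theirs.

```python
import time


def wait_until(condition, timeout=5.0, interval=0.05):
    """Poll `condition` until it returns something truthy or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result  # continue as soon as the app is ready
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


# Usage: a pretend dashboard that becomes ready about 0.1 seconds from now.
ready_at = time.monotonic() + 0.1
wait_until(lambda: time.monotonic() >= ready_at, timeout=2.0, interval=0.01)
```

The call returns shortly after the condition becomes true instead of always burning the full limit, and it fails with a clear error if the limit is reached.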

The Page Object Model: tidy tests that age well

Imagine you've written 50 tests that all log in, and every one of them spells out find the email field, type, find the password field, type, find the login button, click. Now the login page gets redesigned. You have to fix the same thing in 50 places. Misery.

The Page Object Model (POM) solves this. The idea is simple: for each page, write one place that describes how to interact with it ("here's how you log in on this page") and have all your tests call that. When the login page changes, you fix it in exactly one spot and all 50 tests heal at once. It's the difference between 50 copies of a phone number scribbled on sticky notes and one entry in your contacts.

The three layers, from top to bottom:

  • Tests: describe *what* to check
  • Page objects: describe *how* to use each page
  • The app: the real pages and buttons

That middle layer is the whole trick. Tests say what they want ("log in, then check the dashboard"); page objects know how ("the email field is here, the button is there"). Change the how in one place, and every test that depends on it is fixed for free.
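Here is a rough sketch of that split in Python. Everything in it is hypothetical: `FakeBrowser` stands in for whatever driver your tool provides, and the locator strings such as `"label=Email"` are a made-up notation, not the syntax of any real library.

```python
class FakeBrowser:
    """Hypothetical driver that just records what it was asked to do."""

    def __init__(self):
        self.fields = {}
        self.clicked = []

    def fill(self, locator, value):
        self.fields[locator] = value

    def click(self, locator):
        self.clicked.append(locator)


class LoginPage:
    """Page object: the one place that knows HOW to use the login page."""

    # Made-up locator notation; if the page is redesigned, only these change.
    EMAIL = "label=Email"
    PASSWORD = "label=Password"
    SUBMIT = "button=Log in"

    def __init__(self, browser):
        self.browser = browser

    def log_in(self, email, password):
        self.browser.fill(self.EMAIL, email)
        self.browser.fill(self.PASSWORD, password)
        self.browser.click(self.SUBMIT)


# A test only says WHAT it wants; it never mentions fields or buttons.
browser = FakeBrowser()
LoginPage(browser).log_in("ada@example.com", "s3cret")
```

If the button is later relabelled, changing `SUBMIT` is the whole fix, and every test that calls `log_in` picks it up.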

What makes automation maintainable

  • Stable locators tied to meaning, not fragile positions.
  • Smart waits for conditions, never fixed sleeps.
  • One source of truth per page (the Page Object Model) so a change lands in one place.
  • Independent tests that don't depend on each other's leftovers: each sets up its own data and cleans up after itself.
  • Clear names and assertions so a failure tells you what broke, not just that something broke.
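The last two points can be sketched with Python's standard `unittest` module. The shared `STORE` and the `FakeUserStore` class are invented stand-ins for a real test environment; the point is only that each test creates its own data, removes it afterwards, and attaches a message that says what went wrong.

```python
import unittest


class FakeUserStore:
    """Hypothetical shared backend, like a test database."""

    def __init__(self):
        self.users = set()

    def add(self, email):
        self.users.add(email)

    def remove(self, email):
        self.users.discard(email)


STORE = FakeUserStore()  # shared by every test, as a real environment would be


class SignupTests(unittest.TestCase):
    def setUp(self):
        # Each test sets up its own data, named after the test itself.
        self.email = f"user-{self.id()}@example.com"
        STORE.add(self.email)

    def tearDown(self):
        # ...and cleans up, so nothing leaks into the next test.
        STORE.remove(self.email)

    def test_new_user_is_listed(self):
        self.assertIn(self.email, STORE.users,
                      "a newly created user should appear in the user list")

    def test_store_holds_only_this_tests_user(self):
        self.assertEqual(STORE.users, {self.email},
                         "found leftover users from another test")
```

Because neither test relies on the other's data, they can run in any order, or alone, and give the same result.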

The modern twist: self-healing and AI locators

The oldest pain in automation is the brittle locator: a developer renames a button, and a hundred tests go red even though nothing is actually broken for users. Self-healing locators are a newer answer. Instead of pinning to a single fragile attribute, the tool remembers several signals about an element (its text, its role, its neighbours, its position) and when one signal changes, it falls back on the others to find the same element again, then quietly updates itself. Modern AI-assisted tools take this further, recognising elements much the way a person would.
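The fallback idea can be sketched as a simple scoring function. This is a toy model under stated assumptions, not how any particular product works: the `heal` function, the signal names (`id`, `role`, `text`, `near`), and the `min_score` threshold are all invented for the example.

```python
def heal(fingerprint, page, min_score=2):
    """Find the element matching the most remembered signals, then refresh the memory."""

    def score(element):
        # One point for every remembered signal the element still matches.
        return sum(1 for key, value in fingerprint.items() if element.get(key) == value)

    best = max(page, key=score, default=None)  # ties go to the earliest element
    if best is None or score(best) < min_score:
        raise LookupError("no element matches enough remembered signals")
    # "Quietly updates itself": store the element's current signals for next time.
    fingerprint.update({key: best.get(key) for key in fingerprint})
    return best


# What the tool remembered about the button on an earlier run.
fingerprint = {"id": "buy-btn", "role": "button", "text": "Buy now", "near": "Price"}

# A developer has since renamed the id; the other three signals still match.
page = [
    {"id": "nav-home", "role": "link", "text": "Home", "near": "Logo"},
    {"id": "purchase-button", "role": "button", "text": "Buy now", "near": "Price"},
]

found = heal(fingerprint, page)
print(found["id"], fingerprint["id"])
```

The threshold matters: set it too low and the locator may "heal" onto the wrong element, which is one reason this approach reduces maintenance without removing it.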

This doesn't make maintenance disappear, but it dramatically lowers the everyday breakage that used to make UI automation so exhausting. It's one of the biggest reasons automated end-to-end testing is far more practical in 2026 than it was a decade ago.

Key takeaways
  • Every UI automation tool does the same four things: locate, act, assert, and wait.
  • Stable locators tied to meaning (text, role) are the single biggest factor in a maintainable suite.
  • Never wait a fixed number of seconds when you can wait for a condition: that's how you kill flakiness.
  • The Page Object Model puts each page's how-to in one place, so a redesign means one fix instead of fifty.
  • Self-healing and AI locators reduce the everyday breakage that made UI automation painful, making end-to-end testing far more practical today.