This entry is part 3 of 3 in the series SpecFlow & Selenium

What is Selenium

Selenium is a portable software testing framework for web applications. In essence, it automates browsers: it simulates how a person interacts with a browser, e.g. clicking buttons, selecting check boxes, typing text and so on.

There are two major ways of using the power of Selenium:

  • Selenium IDE – in short, it lets us click the record button, perform actions in the browser and then replay them from the same IDE. In other words, it records automation test scripts: the IDE generates scripts based on our actions in the browser. The generated scripts can also be edited manually when needed.
  • WebDriver – a framework for writing UI tests manually in code. Manually written tests are more maintainable than generated scripts with ugly member names, and you have the full power of a programming language such as C# at your disposal.

Testers very often rely on Selenium IDE when they need to automate some UI-tests, usually due to the lack of programming skills, while programmers more often rely on the WebDriver since it gives more freedom and power. In this course, we’re going to talk about the WebDriver 99% of the time.

The WebDriver API can be consumed from different languages such as C#, Java, Ruby etc. We're going to write code in C#, referencing the Selenium WebDriver NuGet packages. Roughly speaking, there are as many WebDriver implementations as there are browsers; we're going to work with the Chrome implementation. A specific WebDriver implementation manages the browser directly: Selenium 3 WebDriver sends commands to the browser itself rather than injecting JavaScript code into a page to manipulate the browser and the page elements. So, we need three components to work with Selenium:

  • An installed web browser.
  • A web driver, which is really a small web server that receives commands from our script and passes them to the browser. Each browser has its own driver because each browser exposes different commands.
  • A script, which, as already mentioned, can be written in different programming languages such as Java, C# or Ruby.

So, this is how the overall process works. In short, with the power of WebDriver, you can automate almost anything a user can do in a browser:

  • Navigate to any URL
  • Resize the Browser window
  • Click on any button
  • Select any element
  • Upload files
  • Download files
  • and so on
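A couple of these actions can be sketched in C# as follows. This is a minimal sketch, assuming the Selenium.WebDriver and Selenium.WebDriver.ChromeDriver NuGet packages are installed; the address https://example.com is only a placeholder and the class name is hypothetical.

```csharp
using System;
using System.Drawing;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

class CapabilitiesSketch
{
    static void Main()
    {
        // Launches a Chrome window controlled by the Chrome driver
        IWebDriver driver = new ChromeDriver();
        try
        {
            // Navigate to a URL (placeholder address, an assumption)
            driver.Navigate().GoToUrl("https://example.com");

            // Resize the browser window
            driver.Manage().Window.Size = new Size(1024, 768);

            // Read something back from the browser to confirm it is under our control
            Console.WriteLine(driver.Title);
        }
        finally
        {
            // Close the browser and stop the driver process
            driver.Quit();
        }
    }
}
```

Wrapping the work in try/finally is a design choice here: it makes sure the browser and the driver process are shut down even if one of the calls throws.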

Key Elements of Selenium

The key parts when we work with Selenium are:

  • WebDriver (IWebDriver in C#) – the most important type, responsible for managing a browser. Almost all WebDriver API calls go through the WebDriver object.
  • WebElement (IWebElement in C#) – a type which encapsulates any element on a web page, such as a button, a link or an input element. It provides methods for manipulating elements: clicking on them, sending text to them and so on.
  • By – a class which represents an extremely important notion called a "locator". It encapsulates the information WebDriver uses to search for a particular element on a page.
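How the three types fit together can be sketched roughly like this. The page is a hypothetical one inlined as a data: URL so that the example does not depend on any real site; the element ids "query" and "go" and the class name are assumptions made purely for illustration.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

class KeyElementsSketch
{
    static void Main()
    {
        // Hypothetical page with one text input and one button
        string html = "<input id='query' type='text'><button id='go'>Go</button>";

        // IWebDriver: manages the browser
        IWebDriver driver = new ChromeDriver();
        try
        {
            driver.Navigate().GoToUrl("data:text/html," + Uri.EscapeDataString(html));

            // By: describes how to find an element
            By inputLocator = By.Id("query");

            // IWebElement: the element that was found, ready to be manipulated
            IWebElement input = driver.FindElement(inputLocator);
            input.SendKeys("selenium");

            IWebElement button = driver.FindElement(By.Id("go"));
            Console.WriteLine(button.Text); // the visible label of the button
            button.Click();
        }
        finally
        {
            driver.Quit();
        }
    }
}
```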

Writing First Automation Example

Let’s write our first automation example without digging into details too much. Visit the following link:

You’ll see three buttons and a link. Here is the picture:

I drew black boxes around elements that we want to click on. As I mentioned earlier, there is a concept of a “locator” that is used for searching the elements. Let’s talk about locators a bit.

Selenium Locators

The most important thing you need to learn to successfully use Selenium concerns the problem of locating web elements. To search for an element, you need to provide the so-called locator to a web driver. Locators are represented by the By class. We have eight ways of defining a locator.

Assuming that a page has the following button:

[code lang="html"] <button id="UID" class="beautifulbtn" name="Name"></button>[/code]

we can find it:

  • by ID => By.Id("UID");
    Id stands for "identifier". Ids are supposed to be unique in a document object model, or DOM for short (the HTML standard requires it, although browsers do not enforce it). Think of the DOM as the document structure.
  • by Name => By.Name("Name"); A name is not guaranteed to be unique in a DOM, but you're still able to search for an element by its name if that element defines one. HTML coders are not obliged to provide it.
  • by class name => By.ClassName("beautifulbtn"); On complex web pages, the appearance of almost all the elements is defined by CSS classes. CSS stands for Cascading Style Sheets and, in general, it allows applying styles to elements in a convenient fashion. So, you're able to locate an element by its CSS class by calling the corresponding API.
  • by tag => By.TagName("button"); You can search for an element or elements by tag. In most cases, searching by tag, you'll find more than one element and then you'll have to filter them to find the one you really need. By tag, we mean regular HTML tags such as "a", "button", "input", "img" and so on.
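The four locators above can be tried out against the sample button with a sketch like the following. To keep it self-contained, the button is served from a data: URL instead of a real page, and it is given a visible label ("Go") that is not part of the original markup; both are assumptions for illustration, as is the class name.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

class SimpleLocatorsSketch
{
    static void Main()
    {
        // The sample button from the text, with an assumed label added
        string html = "<button id='UID' class='beautifulbtn' name='Name'>Go</button>";

        IWebDriver driver = new ChromeDriver();
        try
        {
            driver.Navigate().GoToUrl("data:text/html," + Uri.EscapeDataString(html));

            // Four different ways of describing the same element
            By[] locators =
            {
                By.Id("UID"),
                By.Name("Name"),
                By.ClassName("beautifulbtn"),
                By.TagName("button")
            };

            foreach (By locator in locators)
            {
                // FindElement returns the first match and throws
                // NoSuchElementException when nothing matches
                IWebElement element = driver.FindElement(locator);
                Console.WriteLine($"{locator} -> <{element.TagName}> {element.Text}");
            }
        }
        finally
        {
            driver.Quit();
        }
    }
}
```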

The next two options are about searching for an element by link text. Let’s assume that we have the following two hyperlinks defined in HTML:

[code lang="html"]
<a href="…">text</a>
<a href="…">Another text</a>
[/code]

We can find them:

  • by link text => By.LinkText("text"); To find the first element by text, you need to call the LinkText method and pass a string that fully matches the link text. Of course, if you need to find the second element with this method, you need to pass the "Another text" string.
  • by partial link text => By.PartialLinkText("text"); PartialLinkText allows finding elements by searching for matching substrings within link texts. So, passing "text" in this case, you'll find both elements. Passing "Another", you'll find only the second element.
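The difference between the two link locators can be checked with a small sketch that counts the matches. The two hyperlinks are inlined as a data: URL; the href value about:blank and the class name are placeholders chosen for this example, not part of the original markup.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

class LinkTextSketch
{
    static void Main()
    {
        // The two sample links from the text; the href value is a placeholder
        string html = "<a href='about:blank'>text</a> <a href='about:blank'>Another text</a>";

        IWebDriver driver = new ChromeDriver();
        try
        {
            driver.Navigate().GoToUrl("data:text/html," + Uri.EscapeDataString(html));

            // FindElements returns every match (an empty collection when there is none)
            int exact = driver.FindElements(By.LinkText("text")).Count;
            int partial = driver.FindElements(By.PartialLinkText("text")).Count;
            int another = driver.FindElements(By.PartialLinkText("Another")).Count;

            Console.WriteLine($"LinkText(\"text\") matches: {exact}");
            Console.WriteLine($"PartialLinkText(\"text\") matches: {partial}");
            Console.WriteLine($"PartialLinkText(\"Another\") matches: {another}");
        }
        finally
        {
            driver.Quit();
        }
    }
}
```

FindElements is used here instead of FindElement because it reports how many elements a locator matches rather than returning only the first one.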

The remaining locators are:

  • by CSS selector
  • by XPath

These locators allow querying a document object model by writing either an XPath expression or a CSS selector. I'm going to cover these locators separately, especially the CSS selectors since they are the most important. Now that we are acquainted with Selenium locators, let's try to find the buttons we talked about on the page from our first example.
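Just to give a first impression of what these two locators look like, here is a minimal sketch that finds the sample button from the beginning of this section. The particular selector and XPath expression are only one possible way to describe that button; the data: URL, the button label and the class name are assumptions made to keep the example self-contained.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

class QueryLocatorsSketch
{
    static void Main()
    {
        // The sample button from the text, with an assumed label added
        string html = "<button id='UID' class='beautifulbtn' name='Name'>Go</button>";

        IWebDriver driver = new ChromeDriver();
        try
        {
            driver.Navigate().GoToUrl("data:text/html," + Uri.EscapeDataString(html));

            // CSS selector: a button tag with id "UID" and class "beautifulbtn"
            IWebElement byCss = driver.FindElement(By.CssSelector("button#UID.beautifulbtn"));

            // XPath: any button element whose name attribute equals "Name"
            IWebElement byXPath = driver.FindElement(By.XPath("//button[@name='Name']"));

            Console.WriteLine($"CSS selector found <{byCss.TagName}> {byCss.Text}");
            Console.WriteLine($"XPath found <{byXPath.TagName}> {byXPath.Text}");
        }
        finally
        {
            driver.Quit();
        }
    }
}
```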

Automating Elements

Before writing any code, we need to install two NuGet packages: Selenium.WebDriver and one of the drivers, for example, Selenium.WebDriver.ChromeDriver.

The driver is an EXE file. After the ChromeDriver NuGet package is installed, that executable is copied to the bin folder whenever the application is built. It is the heart of the Chrome web driver.

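Assuming that we have a regular console application project, with using directives for OpenQA.Selenium and OpenQA.Selenium.Chrome at the top of the file, we can instantiate a web driver and navigate to a web page as follows:

[code lang="csharp"]
class Program
{
    private static IWebDriver _driver;

    static void Main(string[] args)
    {
        _driver = new ChromeDriver(); // launches Chrome through the driver executable
        _driver.Navigate().GoToUrl("…"); // the address of the page linked above
    }
}
[/code]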

Before writing locators, I’ll implement a function which clicks on an element and navigates back to the initial page:

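[code lang="csharp"]
static void ClickOn(By locator)
{
    IWebElement element = _driver.FindElement(locator); // reuse the shared driver
    element.Click();
    //to have some time to watch what happens on the screen
    Thread.Sleep(2000); // arbitrary 2-second pause; needs using System.Threading
    _driver.Navigate().Back(); // return to the initial page
}
[/code]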


And now we can write the following locators:

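[code lang="csharp"]
static void TestSimpleLocators()
{
    ClickOn(By.LinkText("Click me using this link text!"));
    ClickOn(By.PartialLinkText("Click me using"));

    var buttons = _driver.FindElements(By.TagName("button"));
    foreach (var button in buttons)
        Console.WriteLine(button.Text); //just logging; needs using System
}

static void Main()
{
    _driver = new ChromeDriver();
    _driver.Navigate().GoToUrl("…"); // the address of the page linked above
    TestSimpleLocators();
    _driver.Quit(); // close the browser when done
}
[/code]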


If you did everything right, you can launch the app and watch how Selenium opens a Chrome web browser, navigates to the specified web page and clicks on buttons and links. Now you have a basic understanding of Selenium.

You, as a reader of my blog, are eligible for taking my video course “Automate Applications with SpecFlow and Selenium WebDriver in C#” with a maximum possible discount just for $9.99

