Mastering Selenium with Python: A Comprehensive Learning Guide

Introduction to Selenium and Web Automation with Python

Overview

Selenium is a powerful tool for web automation that allows you to programmatically interact with web pages. You can use it to automate tasks like form submission, clicking buttons, or even scraping data from web pages. This first unit focuses on setting up Selenium with Python and running your first automation script.

Prerequisites

Install Python

Ensure you have Python installed. You can download it from the official Python website.

Install Selenium

Install the Selenium package via pip:

pip install selenium

Download WebDriver

Selenium requires a WebDriver to interact with the web browser.

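  1. ChromeDriver – for Google Chrome
    Download from: the ChromeDriver download page
  2. GeckoDriver – for Mozilla Firefox
    Download from: the GeckoDriver project page

Make sure to add the WebDriver executable to your system PATH. Note that Selenium 4.6 and later bundle Selenium Manager, which can locate or download a matching driver automatically, so a manual download is often unnecessary on recent versions.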
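
Before running a script, it can help to confirm that the driver executable is actually discoverable. The sketch below uses only the standard library. The helper name find_driver is made up for illustration, and the executable names are assumed to be the usual chromedriver and geckodriver.

```python
import shutil


def find_driver(name):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


# Report where each driver lives, if it can be found at all
for driver_name in ("chromedriver", "geckodriver"):
    path = find_driver(driver_name)
    status = path if path else "not found on PATH"
    print(f"{driver_name}: {status}")
```

If a driver is reported as not found, either add its folder to PATH or pass its location explicitly when creating the WebDriver.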

Basic Implementation

Step 1: Import Libraries

First, import the necessary libraries.

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time

Step 2: Set Up WebDriver

Initialize the WebDriver. Here, we will use Chrome as an example.

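# Adjust the path to the location where you downloaded ChromeDriver.
# Selenium 4 takes the driver path through a Service object
# (the executable_path argument was removed in Selenium 4.10).
from selenium.webdriver.chrome.service import Service

driver = webdriver.Chrome(service=Service('/path/to/chromedriver'))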

Step 3: Open a Website

Use the WebDriver to open a website.

driver.get("http://www.example.com")
print(driver.title)  # Print the title of the webpage

Step 4: Interact with the Web Page

Find elements and perform actions like clicking buttons, entering text, etc.

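# Find the search box using its name attribute
# (find_element_by_name was removed in Selenium 4.3; use By locators instead)
from selenium.webdriver.common.by import By

search_box = driver.find_element(By.NAME, 'q')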

# Enter text into the search box
search_box.send_keys('Selenium documentation')

# Submit the search form
search_box.send_keys(Keys.RETURN)

Step 5: Explicit Waits

Use explicit waits to wait for conditions before executing the next steps.

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

try:
    # Wait until the element is located
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, "resultStats"))
    )
finally:
    # Close the browser
    driver.quit()
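
Conceptually, an explicit wait is a polling loop: check a condition, sleep briefly, and give up after a deadline. The sketch below is a simplified stand-in written in plain Python, not Selenium's own implementation. The name wait_until and the demonstration condition are hypothetical. The 0.5 second default mirrors WebDriverWait's default poll frequency.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Demonstration with a stand-in condition that succeeds on the third check
attempts = {"count": 0}


def ready_on_third_check():
    attempts["count"] += 1
    return "ready" if attempts["count"] >= 3 else None


print(wait_until(ready_on_third_check, timeout=2.0, poll_interval=0.01))
```

WebDriverWait follows the same idea, except that the condition receives the driver as an argument and a timeout raises Selenium's TimeoutException.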

Full Example Code

Here’s a complete snippet that automates a search on Google:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.service import Service

# Initialize the WebDriver (Selenium 4 takes the driver path via a Service object)
driver = webdriver.Chrome(service=Service('/path/to/chromedriver'))

try:
    # Open Google
    driver.get("http://www.google.com")
    
    # Find the search box
    search_box = driver.find_element(By.NAME, 'q')
    
    # Enter search text
    search_box.send_keys('Selenium documentation')
    
    # Submit the search
    search_box.send_keys(Keys.RETURN)
    
    # Wait until the results are displayed
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, "resultStats"))
    )
    
    print(driver.title)

finally:
    # Close the browser
    driver.quit()

This basic introduction covers setting up Selenium with Python and running a simple web automation script. You are now ready to explore more complex automation tasks.

Setting Up Your Development Environment

1. Install Python and Selenium

To leverage Selenium with Python for browser automation, make sure you have Python installed. You can verify this by opening a terminal and running:

python --version

If Python is not installed, download and install it from python.org.

Next, install Selenium using pip:

pip install selenium

2. Download the WebDriver

The WebDriver is essential for Selenium to interface with a web browser. You need to download the WebDriver for the specific browser you intend to use (e.g., ChromeDriver for Google Chrome).

For instance, to use ChromeDriver:

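  1. Download ChromeDriver from chromedriver.chromium.org/downloads (for Chrome 115 and later, builds are published on the Chrome for Testing availability dashboard instead).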
  2. Ensure the version matches your installed Chrome version.
  3. Add the ChromeDriver executable to your system PATH.
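
The version check in step 2 can be sketched as a comparison of major version numbers. This assumes the common rule of thumb that the browser and driver should share the same major version. The function names and the version strings below are made up for illustration.

```python
def major_version(version):
    """Extract the leading major number from a dotted version string."""
    return int(version.strip().split(".")[0])


def versions_match(browser_version, driver_version):
    """True when browser and driver share the same major version."""
    return major_version(browser_version) == major_version(driver_version)


# Example version strings (invented for this sketch)
print(versions_match("114.0.5735.198", "114.0.5735.90"))  # same major version
print(versions_match("115.0.5790.102", "114.0.5735.90"))  # different major version
```

In practice you would read the real values from the browser's About page and from running chromedriver --version.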

3. Create a Python Script

Once everything is installed and set up, create a new Python script to start automating browser tasks. Here’s a sample script:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

# Start a new browser session
driver = webdriver.Chrome()  # Make sure 'chromedriver' is in your PATH

# Open a website
driver.get("http://www.google.com")

# Find the search box
search_box = driver.find_element("name", "q")

# Enter search query & hit enter
search_box.send_keys("Selenium WebDriver")
search_box.send_keys(Keys.RETURN)

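# Pause briefly so the results are visible, then close the browser.
# (implicitly_wait() only sets the element-lookup timeout; it does not pause the script.)
import time

time.sleep(5)
driver.quit()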

4. Running the Script

Save your script with a .py extension and run it from the terminal:

python your_script_name.py

5. Verifying and Debugging

Ensure your setup is working correctly by verifying the browser launches and performs the steps specified in your script without errors. If any issues arise:

  • Verify the correct version of the WebDriver is installed.
  • Ensure the WebDriver executable is included in your system PATH.
  • Check Selenium documentation for any latest changes or additional setups required for specific browsers.

By following these steps, you should have your development environment set up to unlock the power of Selenium with Python and efficiently automate browser tasks.

Unlock the Power of Selenium with Python to Automate Browser Tasks Efficiently

This section demonstrates how to use Selenium with Python to automate common browser tasks: navigating to a webpage, interacting with web elements, and extracting information from a page.

1. Importing Selenium Libraries

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time

2. Initializing the WebDriver

# Initialize Chrome WebDriver
driver = webdriver.Chrome()

3. Navigating to a Webpage

# Open URL
driver.get("https://example.com")

4. Interacting with Web Elements

Finding Elements

# Find element by ID
element_by_id = driver.find_element(By.ID, "element-id")

# Find element by Name
element_by_name = driver.find_element(By.NAME, "element-name")

# Find element by XPath
element_by_xpath = driver.find_element(By.XPATH, "//tag[@attribute='value']")

Clicking Element

# Click an element
element_by_id.click()

Sending Keys to an Input Field

# Send keys to an input field
input_field = driver.find_element(By.NAME, "search")
input_field.send_keys("Selenium")
input_field.send_keys(Keys.RETURN)

5. Waiting for Elements

Explicit Wait

# Wait for an element to be clickable and perform an action
wait = WebDriverWait(driver, 10)
element = wait.until(EC.element_to_be_clickable((By.ID, 'submit-button')))
element.click()

Implicit Wait

# Implicit wait
driver.implicitly_wait(10)  # seconds

6. Extracting Information from a Webpage

# Extract text from a webpage
heading = driver.find_element(By.TAG_NAME, "h1").text
print(heading)

7. Closing the Browser

# Close the browser
driver.quit()

8. Complete Example

Here’s a complete script that puts it all together.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time

# Initialize Chrome WebDriver
driver = webdriver.Chrome()

# Open URL
driver.get("https://example.com")

# Perform actions
try:
    # Wait and click element
    wait = WebDriverWait(driver, 10)
    element = wait.until(EC.element_to_be_clickable((By.ID, 'some-button-id')))
    element.click()

    # Interact with input field
    search_box = driver.find_element(By.NAME, "search")
    search_box.send_keys("Selenium")
    search_box.send_keys(Keys.RETURN)

    # Extract and print heading
    heading = driver.find_element(By.TAG_NAME, "h1").text
    print(heading)

finally:
    # Close the browser
    driver.quit()

This script accomplishes basic web navigation and interaction using Selenium with Python, demonstrating efficient automation techniques.

Understanding Selenium WebDriver

1. Overview

Selenium WebDriver is a browser automation framework that lets you run tests against different browsers. It drives the browser by sending commands to a browser-specific driver (such as ChromeDriver or GeckoDriver), which carries them out in the browser and lets your script interact with elements on the page. It’s an essential tool for testing web applications and automating repetitive web tasks.

2. Importing Required Libraries

Assuming you have Selenium installed and a development environment set up as described in the previous units, begin by importing the necessary components.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

3. Initializing WebDriver

To automate a browser task, you first need to create a WebDriver instance corresponding to the browser of your choice.

# Create an instance of a Chrome WebDriver (You can use Firefox, Safari, etc.)
driver = webdriver.Chrome()
# For Firefox: driver = webdriver.Firefox()
# For Safari: driver = webdriver.Safari()

4. Opening a Webpage

Navigate to a specific URL using the get method.

driver.get("https://www.example.com")

5. Interacting with Web Elements

Finding Elements

Use various techniques to locate web elements like id, name, class name, tag name, CSS selector, and XPath.

# By ID
element_by_id = driver.find_element(By.ID, "element-id")

# By Name
element_by_name = driver.find_element(By.NAME, "element-name")

# By Class Name
element_by_class = driver.find_element(By.CLASS_NAME, "element-class")

# By CSS Selector
element_by_css = driver.find_element(By.CSS_SELECTOR, ".element-class")

# By XPath
element_by_xpath = driver.find_element(By.XPATH, "//*[@id='element-id']")
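
These strategies overlap: a simple ID, name, or class locator can also be written as a CSS selector. The sketch below shows the correspondence, relying on the fact that in the Python bindings the By constants are plain strings (for example, By.ID is "id" and By.CLASS_NAME is "class name"). The helper to_css is hypothetical and only covers these three strategies.

```python
def to_css(strategy, value):
    """Translate a simple id/name/class locator into a CSS selector string."""
    if strategy == "id":
        return f'[id="{value}"]'
    if strategy == "name":
        return f'[name="{value}"]'
    if strategy == "class name":
        return f".{value}"
    raise ValueError(f"no simple CSS equivalent for strategy {strategy!r}")


# The same placeholder locators used above, expressed as CSS selectors
print(to_css("id", "element-id"))
print(to_css("name", "element-name"))
print(to_css("class name", "element-class"))
```

Either form can be passed to find_element; which one to prefer is mostly a matter of readability.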

Performing Actions

You can perform actions like clicking, sending keys, and submitting forms.

# Click an element
element_by_id.click()

# Enter text into an input field
element_by_name.send_keys("Your text here")

# Simulate pressing Enter key
element_by_name.send_keys(Keys.RETURN)

# Submit a form
form_element = driver.find_element(By.ID, "form-id")
form_element.submit()

6. Handling Alerts and Pop-ups

Automate handling browser alerts and pop-ups.

# Switch to alert and accept it (dismiss, send_keys methods are also available)
alert = driver.switch_to.alert
alert.accept()

7. Taking Screenshots

Capture the current browser window.

driver.save_screenshot("screenshot.png")

8. Closing Browser

Close the browser when your tasks are completed.

# Close the current window
driver.close()

# Quit the WebDriver session and close all associated windows
driver.quit()

9. Example: Complete Code

Here’s a complete code snippet that incorporates the above operations:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

# Initialize the WebDriver
driver = webdriver.Chrome()

# Set an implicit wait: element lookups will poll for up to 10 seconds
# (this configures a timeout; it does not pause the script)
driver.implicitly_wait(10)

# Open a webpage (placeholder URL; the page is assumed to contain an input named "q")
driver.get("https://www.example.com")

# Interact with the web page
search_element = driver.find_element(By.NAME, "q")
search_element.send_keys("Selenium WebDriver")
search_element.send_keys(Keys.RETURN)

# Take a screenshot
driver.save_screenshot("search_results.png")

# Close the browser
driver.quit()

This code demonstrates the initial steps you need to automate a web task using Selenium WebDriver. Feel free to expand on this by integrating more complex actions and controls into your automation scripts.

Navigating and Interacting with Web Elements

In this section, we will explore how to interact with and navigate through web elements using Selenium WebDriver and Python.

Locating Elements

Before interacting with elements, you need to locate them on the page. Selenium provides several methods to find elements:

By ID

element = driver.find_element(By.ID, 'element_id')

By Name

element = driver.find_element(By.NAME, 'element_name')

By XPath

element = driver.find_element(By.XPATH, '//*[@id="element_id"]')

By CSS Selector

element = driver.find_element(By.CSS_SELECTOR, '.class_name')

Interacting with Elements

Clicking a Button

button = driver.find_element(By.ID, 'submit_button')
button.click()

Sending Text to Input Field

input_field = driver.find_element(By.NAME, 'username')
input_field.send_keys('my_username')

Clearing Text from Input Field

input_field = driver.find_element(By.NAME, 'username')
input_field.clear()

Selecting from a Dropdown

from selenium.webdriver.support.ui import Select

dropdown = Select(driver.find_element(By.ID, 'dropdown_id'))
dropdown.select_by_visible_text('Option Text')

Submitting a Form

form = driver.find_element(By.ID, 'form_id')
form.submit()

Waiting for Elements

To ensure elements are present before interacting, use explicit waits.

Import WebDriverWait and expected_conditions

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

Explicit Wait Example

element = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.ID, 'element_id'))
)

Handling Alerts

Accepting an Alert

alert = driver.switch_to.alert
alert.accept()

Dismissing an Alert

alert = driver.switch_to.alert
alert.dismiss()

Navigating Browser History

Going Back

driver.back()

Going Forward

driver.forward()

Refreshing the Page

driver.refresh()

Complete Example

Here’s a consolidated example to demonstrate the discussed concepts:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import Select

# Initialize WebDriver
driver = webdriver.Chrome()

# Open URL
driver.get('https://example.com')

# Locate and click a button
button = driver.find_element(By.ID, 'submit_button')
button.click()

# Wait for an input field to be present
input_field = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.NAME, 'username'))
)

# Clear and type into the input field
input_field.clear()
input_field.send_keys('my_username')

# Locate and select an option from dropdown
dropdown = Select(driver.find_element(By.ID, 'dropdown_id'))
dropdown.select_by_visible_text('Option Text')

# Submit the form
form = driver.find_element(By.ID, 'form_id')
form.submit()

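# Handle alert if present
from selenium.common.exceptions import TimeoutException

try:
    alert = WebDriverWait(driver, 10).until(EC.alert_is_present())
    alert.accept()
except TimeoutException:
    # The wait raises TimeoutException when no alert appears in time
    print("No alert found")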

# Navigate back, forward, and refresh
driver.back()
driver.forward()
driver.refresh()

# Close the browser
driver.quit()

This implementation covers typical web interactions using Selenium WebDriver in Python, such as locating elements, clicking buttons, filling out forms, and navigating browser history.

Advanced WebDriver Techniques

Handling Alerts

Automating the handling of alerts requires understanding the Alert interface in Selenium. Here is an example for interacting with alert boxes:

from selenium import webdriver
from selenium.webdriver.common.alert import Alert

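# Assume driver is already initialized and the loaded page is showing an alert
driver.switch_to.alert.accept()  # Accepts the alert

# Alternatively, dismiss the alert instead (an alert can be handled only once,
# so use either accept() or dismiss() for a given alert, not both)
driver.switch_to.alert.dismiss()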

# Interacting with prompt alerts
alert = driver.switch_to.alert
alert.send_keys("Some text")
alert.accept()

Working with IFrames

Switching to an iframe requires using the switch_to.frame method. Here’s an example:

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get('http://example.com')

# Assuming there is an iframe on the page
iframe = driver.find_element(By.TAG_NAME, 'iframe')
driver.switch_to.frame(iframe)  # Switch to the iframe

# Do operations inside iframe
driver.find_element(By.ID, 'element_in_iframe').click()

# Switch back to the main document
driver.switch_to.default_content()

Taking Screenshots

Taking a screenshot can be useful for debugging or documentation purposes. Here’s how to capture a screenshot:

driver.save_screenshot('screenshot.png')

# Alternative method
driver.get_screenshot_as_file('screenshot_alt.png')
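
When screenshots are taken repeatedly for debugging, a fixed file name gets overwritten on every run. One option is to build a timestamped name first. This is a small sketch using only the standard library. The helper name screenshot_name and the file name pattern are assumptions for illustration.

```python
from datetime import datetime


def screenshot_name(label, when=None):
    """Build a file name such as 'login_20240131_093005.png'."""
    when = when or datetime.now()
    return f"{label}_{when.strftime('%Y%m%d_%H%M%S')}.png"


# A fixed timestamp is used here so the result is reproducible
print(screenshot_name("login", datetime(2024, 1, 31, 9, 30, 5)))
```

The returned string can then be passed to driver.save_screenshot().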

Executing JavaScript

Sometimes, interacting with the web page using pure WebDriver commands may not suffice. In such cases, you can execute JavaScript:

# Scroll to the bottom of the page
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

# Fetching title of the current page
title = driver.execute_script("return document.title;")

Handling Multiple Windows

Managing different windows or browser tabs can be crucial when dealing with multi-page workflows:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Assume driver is already initialized
main_window = driver.current_window_handle
handles_before = driver.window_handles  # Record handles before opening the new window

driver.find_element(By.LINK_TEXT, 'Open new window').click()

# Wait until a new window/tab has opened (the condition returns True/False, not a handle)
WebDriverWait(driver, 10).until(EC.new_window_is_opened(handles_before))

# Switch to the handle that was not present before the click
new_window = [h for h in driver.window_handles if h not in handles_before][0]
driver.switch_to.window(new_window)

# Perform actions on the new window/tab
driver.find_element(By.ID, 'new_element').click()

# Switch back to the original window
driver.close()  # Close current window
driver.switch_to.window(main_window)
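
The core of window switching is comparing the list of window handles before and after the action that opens a window. The sketch below isolates that step in plain Python. The function name newly_opened and the handle strings are invented for illustration; real handles are opaque IDs returned by driver.window_handles.

```python
def newly_opened(handles_before, handles_after):
    """Return handles present after the click but not before, in order."""
    known = set(handles_before)
    return [handle for handle in handles_after if handle not in known]


# Invented handle values standing in for driver.window_handles snapshots
before = ["window-A"]
after = ["window-A", "window-B"]
print(newly_opened(before, after))
```

Comparing snapshots this way avoids relying on the order of the handle list, which is not guaranteed to put the newest window last.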

Customizing Network Conditions

To simulate various network conditions, Selenium 4 introduced support for Chrome DevTools Protocol:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless")

# The desired_capabilities argument was removed in Selenium 4.10;
# browser settings are passed through Options instead
service = Service('/path/to/chromedriver')
driver = webdriver.Chrome(service=service, options=options)

# Customizing network conditions
driver.execute_cdp_cmd('Network.enable', {})
driver.execute_cdp_cmd('Network.emulateNetworkConditions', {
  'offline': False,
  'latency': 100,  # Additional latency (ms)
  'downloadThroughput': 500 * 1024 / 8,  # Maximal throughput (bytes per second)
  'uploadThroughput': 500 * 1024 / 8  # Maximal throughput (bytes per second)
})
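
The throughput values above are expressed in bytes per second, so a link speed quoted in kilobits per second has to be converted first. The sketch below wraps that arithmetic in a helper, using the same convention as the example (1 kilobit counted as 1024 bits). The function names are hypothetical; only the dictionary keys come from the command shown above.

```python
def kbps_to_bytes_per_second(kilobits_per_second):
    """Convert kilobits per second to bytes per second (1024 bits per kilobit)."""
    return kilobits_per_second * 1024 / 8


def network_profile(latency_ms, download_kbps, upload_kbps, offline=False):
    """Build the parameter dictionary for Network.emulateNetworkConditions."""
    return {
        "offline": offline,
        "latency": latency_ms,
        "downloadThroughput": kbps_to_bytes_per_second(download_kbps),
        "uploadThroughput": kbps_to_bytes_per_second(upload_kbps),
    }


# 500 kbps in each direction with 100 ms of added latency
slow_link = network_profile(latency_ms=100, download_kbps=500, upload_kbps=500)
print(slow_link)
```

The resulting dictionary can be passed as the second argument to driver.execute_cdp_cmd('Network.emulateNetworkConditions', ...).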

Conclusion

These implementations provide practical demonstrations of advanced Selenium WebDriver techniques, including handling alerts, iframes, taking screenshots, executing JavaScript, managing multiple windows, and customizing network conditions.

Automated Testing Strategies Using Selenium with Python

Unit 7: Unlock the Power of Selenium with Python to Automate Browser Tasks Efficiently

This unit will cover the practical implementation of automated testing strategies using Selenium with Python.

Structure of Testing Strategies

  1. Writing Test Cases
  2. Running Tests Sequentially
  3. Parallel Testing
  4. Generating Test Reports

Writing Test Cases

import unittest
from selenium import webdriver
from selenium.webdriver.common.by import By

class GoogleSearchTest(unittest.TestCase):
    def setUp(self):
        self.driver = webdriver.Chrome()
    
    def test_search_in_google(self):
        driver = self.driver
        driver.get("http://www.google.com")
        self.assertIn("Google", driver.title)
        # find_element_by_name was removed in Selenium 4.3; use By locators
        elem = driver.find_element(By.NAME, "q")
        elem.send_keys("Selenium")
        elem.submit()
        self.assertIn("Selenium", driver.page_source)
    
    def tearDown(self):
        self.driver.quit()

if __name__ == "__main__":
    unittest.main()

Running Tests Sequentially

With the unittest framework, test cases run sequentially by default. Make sure all test files can be found by test discovery (by default, files matching test*.py).

# Command to run unittest
python -m unittest discover -s tests

Parallel Testing

To run tests in parallel, you can use pytest with the pytest-xdist plugin.


  1. Install pytest and pytest-xdist:


    pip install pytest pytest-xdist


  2. Example pytest test file (test_google.py):


import pytest
from selenium import webdriver
from selenium.webdriver.common.by import By

@pytest.fixture
def driver():
    driver = webdriver.Chrome()
    yield driver
    driver.quit()

def test_search_in_google(driver):
    driver.get("http://www.google.com")
    assert "Google" in driver.title
    elem = driver.find_element(By.NAME, "q")
    elem.send_keys("Selenium")
    elem.submit()
    assert "Selenium" in driver.page_source


  3. Command to run tests in parallel:


    pytest -n 4

Generating Test Reports

Use pytest with pytest-html for HTML reports.


  1. Install pytest-html:


    pip install pytest-html


  2. Command to generate an HTML report:


    pytest --html=report.html

Example Project Structure

Here’s an example of how to structure your files for organized test automation.

project/
|-- tests/
|   |-- test_google.py
|-- requirements.txt
|-- report.html

Conclusion

By following these guidelines, you can implement an effective automated testing strategy using Selenium with Python. This structure supports writing test cases, running them sequentially or in parallel, and generating reports to keep track of test results.

Best Practices and Real-World Applications

Efficient Element Locators

Use unique identifiers for locating web elements to ensure stability and performance.

Example:

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get('http://example.com')

# Use unique IDs or CSS selectors for performance
element = driver.find_element(By.ID, 'uniqueElementId')
element.click()

driver.quit()

Explicit Waits

To handle synchronization issues, use explicit waits instead of hardcoded sleep.

Example:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium import webdriver

driver = webdriver.Chrome()
driver.get('http://example.com')

# Use explicit wait
element = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.ID, 'uniqueElementId'))
)
element.click()

driver.quit()
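
The built-in expected conditions do not cover every case, and WebDriverWait.until() accepts any callable that takes the driver as its argument. The sketch below defines one such custom condition. The class name element_has_text, the "status" locator, and the stand-in driver objects are all hypothetical. The stand-ins exist only so the condition can be tried without launching a browser.

```python
class element_has_text:
    """Wait condition: the located element exists and has non-empty text."""

    def __init__(self, locator):
        self.locator = locator  # a (strategy, value) tuple, e.g. ("id", "status")

    def __call__(self, driver):
        element = driver.find_element(*self.locator)
        return element if element.text.strip() else False


# Stand-in objects so the condition can be exercised without a browser
class FakeElement:
    def __init__(self, text):
        self.text = text


class FakeDriver:
    def __init__(self, text):
        self._element = FakeElement(text)

    def find_element(self, by, value):
        return self._element


condition = element_has_text(("id", "status"))
print(bool(condition(FakeDriver(""))))      # element present but still empty
print(bool(condition(FakeDriver("Done"))))  # element present with text
```

With a real driver, the condition would be used as WebDriverWait(driver, 10).until(element_has_text((By.ID, "status"))).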

Page Object Model (POM)

Create reusable and maintainable code using POM.

Example:

from selenium import webdriver
from selenium.webdriver.common.by import By


class LoginPage:
    def __init__(self, driver):
        self.driver = driver
        # Locators are kept as (strategy, value) tuples in one place
        self.username_locator = (By.NAME, 'user')
        self.password_locator = (By.NAME, 'pass')
        self.login_button_locator = (By.NAME, 'login')

    def login(self, username, password):
        self.driver.find_element(*self.username_locator).send_keys(username)
        self.driver.find_element(*self.password_locator).send_keys(password)
        self.driver.find_element(*self.login_button_locator).click()


class TestLogin:
    def test_login(self):
        driver = webdriver.Chrome()
        try:
            driver.get('http://example.com/login')

            login_page = LoginPage(driver)
            login_page.login('myusername', 'mypassword')
        finally:
            # Always close the browser, even if the login steps fail
            driver.quit()
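
A side benefit of the Page Object Model is that the page class only depends on the driver's find_element method, so its logic can be checked with a stand-in driver before a browser is involved. The sketch below assumes that idea. RecordingDriver and RecordingElement are hypothetical test doubles, and the locators use the plain strings that By.NAME resolves to.

```python
class LoginPage:
    # ("name", ...) is equivalent to (By.NAME, ...) in the Python bindings
    USERNAME = ("name", "user")
    PASSWORD = ("name", "pass")
    LOGIN_BUTTON = ("name", "login")

    def __init__(self, driver):
        self.driver = driver

    def login(self, username, password):
        self.driver.find_element(*self.USERNAME).send_keys(username)
        self.driver.find_element(*self.PASSWORD).send_keys(password)
        self.driver.find_element(*self.LOGIN_BUTTON).click()


class RecordingElement:
    """Test double that records actions instead of touching a browser."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def send_keys(self, text):
        self.log.append((self.name, "send_keys", text))

    def click(self):
        self.log.append((self.name, "click", None))


class RecordingDriver:
    def __init__(self):
        self.log = []

    def find_element(self, by, value):
        return RecordingElement(value, self.log)


fake = RecordingDriver()
LoginPage(fake).login("myusername", "mypassword")
for entry in fake.log:
    print(entry)
```

The recorded log shows the order of interactions, which makes it easy to assert that the page object does what the test expects.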

Exception Handling

Handle exceptions gracefully to ensure robust automation scripts.

Example:

from selenium.common.exceptions import NoSuchElementException, TimeoutException
from selenium.webdriver.common.by import By
from selenium import webdriver

driver = webdriver.Chrome()

try:
    driver.get('http://example.com')
    element = driver.find_element(By.ID, 'nonExistingElement')
    element.click()
except NoSuchElementException as e:
    print(f'Error: {e}')
except TimeoutException as e:
    print(f'Error: {e}')
finally:
    driver.quit()
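
Beyond catching an exception once, flaky steps are sometimes wrapped in a retry helper. The sketch below shows one possible shape for such a helper in plain Python. The name retry, its parameters, and the simulated flaky function are assumptions for illustration. Explicit waits are usually the better first choice, with retries reserved for failures that waits cannot cover.

```python
import time


def retry(action, attempts=3, delay=0.5, exceptions=(Exception,)):
    """Run action(); on a listed exception, wait and try again.

    Re-raises the last exception if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except exceptions as error:
            last_error = error
            if attempt < attempts:
                time.sleep(delay)
    raise last_error


# Simulated flaky step: fails twice, then succeeds
calls = {"count": 0}


def flaky_lookup():
    calls["count"] += 1
    if calls["count"] < 3:
        raise LookupError("element not ready yet")
    return "found"


print(retry(flaky_lookup, attempts=5, delay=0.01, exceptions=(LookupError,)))
```

With Selenium, the exceptions argument would typically be a tuple such as (NoSuchElementException,), and action would be a small function that performs the lookup and click.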

Headless Browsers

Run tests in headless mode for CI/CD pipelines to save resources.

Example:

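from selenium.webdriver.chrome.options import Options
from selenium import webdriver

options = Options()
# The options.headless attribute was deprecated in Selenium 4.8 and later removed;
# pass the browser flag directly ("--headless=new" needs Chrome 109+, older versions use "--headless")
options.add_argument('--headless=new')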

driver = webdriver.Chrome(options=options)
driver.get('http://example.com')

print(driver.title)

driver.quit()

Real-World Application: Automated Form Submission

A common real-world application is automating form submission for data entry tasks.

Example:

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get('http://example.com/form')

# Assume form has fields with unique name attributes
driver.find_element(By.NAME, 'firstName').send_keys('John')
driver.find_element(By.NAME, 'lastName').send_keys('Doe')
driver.find_element(By.NAME, 'email').send_keys('john.doe@example.com')
driver.find_element(By.NAME, 'submit').click()

driver.quit()
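
For data entry tasks with many records, the field-by-field calls can be driven from a dictionary instead of being written out each time. The sketch below assumes every field is located by its name attribute, as in the example above. The function fill_form and the stub classes are hypothetical. The stubs stand in for a real driver so the flow can be tried without a browser.

```python
def fill_form(driver, field_values, submit_name="submit"):
    """Type each value into the field with the matching name, then submit."""
    for field_name, value in field_values.items():
        field = driver.find_element("name", field_name)  # same as By.NAME
        field.clear()
        field.send_keys(value)
    driver.find_element("name", submit_name).click()


# Stub objects that record what a real driver would have been asked to do
class StubField:
    def __init__(self, name, actions):
        self.name = name
        self.actions = actions

    def clear(self):
        self.actions.append((self.name, "clear"))

    def send_keys(self, text):
        self.actions.append((self.name, "send_keys", text))

    def click(self):
        self.actions.append((self.name, "click"))


class StubDriver:
    def __init__(self):
        self.actions = []

    def find_element(self, by, value):
        return StubField(value, self.actions)


record = {"firstName": "John", "lastName": "Doe", "email": "john.doe@example.com"}
stub = StubDriver()
fill_form(stub, record)
for action in stub.actions:
    print(action)
```

With a real browser, the same function would be called with a webdriver.Chrome() instance once the form page has loaded.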

These examples are designed to be directly applicable and demonstrate various best practices for using Selenium in real-world scenarios. Implement these to enhance the efficiency and robustness of your automation scripts.
