Mastering Data Filtering in SQL

by | SQL

Setting Up Your SQL Environment

Step 1: Install MySQL Server

Download MySQL Server from the official website
Run the installer
Choose the setup type (Typical, Complete, or Custom)
Follow the installation prompts to complete the installation

Step 2: Install MySQL Workbench

Download MySQL Workbench
Run the installer
Follow the installation prompts to complete the installation

Step 3: Configure MySQL Server

Open MySQL Workbench
Connect to MySQL Server
Click on the + icon to create a new connection
Enter connection name, hostname (localhost), port (default 3306), username (root), and password
Test connection
Click Test Connection to ensure connectivity

Step 4: Setup Initial Database and Table


  1. Create a Database


    CREATE DATABASE sample_db;


  2. Use the Database


    USE sample_db;


  3. Create a Table


    CREATE TABLE employees (
    id INT AUTO_INCREMENT PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    department VARCHAR(50),
    salary DECIMAL(10, 2)
    );


  4. Insert Sample Data


    INSERT INTO employees (first_name, last_name, department, salary) VALUES
    ('John', 'Doe', 'Engineering', 60000.00),
    ('Jane', 'Smith', 'Sales', 55000.00),
    ('Sam', 'Brown', 'HR', 50000.00);

Verification

  1. Run a Basic Query
    SELECT * FROM employees;

Conclusion

This setup will allow you to proceed with your project, filtering data tables using SQL. Ensure that MySQL Server and Workbench are functioning correctly and verify your initial data setup using the provided SQL commands.

SQL Implementation: Basic SELECT and WHERE Clauses

-- Select specific columns from a table
SELECT first_name, last_name, email
FROM customers;

-- Select all columns from a table
SELECT *
FROM orders;

-- Filter rows based on a condition
SELECT order_id, order_date, customer_id
FROM orders
WHERE order_date = '2023-10-01';

-- Filter rows with multiple conditions using AND
SELECT product_id, product_name, price
FROM products
WHERE price > 100 AND category = 'Electronics';

-- Filter rows with multiple conditions using OR
SELECT employee_id, first_name, department
FROM employees
WHERE department = 'Sales' OR department = 'Marketing';

-- Using a NOT condition
SELECT customer_id, first_name, last_name
FROM customers
WHERE NOT country = 'USA';

-- Using LIKE for pattern matching
SELECT product_id, product_name
FROM products
WHERE product_name LIKE 'A%'; -- Products that start with 'A'

-- Using IN for filtering specific values
SELECT customer_id, first_name, last_name
FROM customers
WHERE country IN ('USA', 'Canada', 'UK');

-- Using BETWEEN for range filtering
SELECT order_id, order_date, total_amount
FROM orders
WHERE total_amount BETWEEN 500 AND 1000;

-- Combining multiple clauses
SELECT employee_id, first_name, last_name, department
FROM employees
WHERE department = 'Engineering' AND hire_date BETWEEN '2020-01-01' AND '2023-10-01';

Each SQL query should help you practice how to retrieve and filter data efficiently from your database tables. You may run these queries in your SQL environment to see their real-life utility.

-- Filtering records from a 'customers' table

-- Assuming we have a 'customers' table structured as follows:
-- customers (customer_id INT, name VARCHAR, age INT, country VARCHAR, active BOOLEAN)

-- Filtering customers who are active AND from the USA
SELECT *
FROM customers
WHERE active = true
  AND country = 'USA';

-- Filtering customers who are either from the USA OR older than 30
SELECT *
FROM customers
WHERE country = 'USA'
   OR age > 30;

-- Filtering customers who are active AND (from the USA OR older than 30)
SELECT *
FROM customers
WHERE active = true
  AND (country = 'USA' OR age > 30);

-- Filtering customers who are not active OR are younger than 25
SELECT *
FROM customers
WHERE active = false
   OR age < 25;

-- Filtering customers who are not from the USA and are older than 40 or inactive
SELECT *
FROM customers
WHERE country  'USA'
  AND (age > 40 OR active = false);
-- Let's assume we have a table named 'employees' with columns 'id', 'name', and 'email'

-- Pattern Matching with LIKE
-- Fetch employees whose names start with 'A'
SELECT * FROM employees
WHERE name LIKE 'A%';

-- Fetch employees whose email contains 'example'
SELECT * FROM employees
WHERE email LIKE '%example%';

-- Regular Expressions with PostgreSQL
-- Fetch employees whose name contains exactly 5 letters
SELECT * FROM employees
WHERE name ~ '^\w{5}
ย 
#39;; -- Fetch employees whose email ends with '.com' SELECT * FROM employees WHERE email ~ '\.com
ย 
#39;; -- MySQL REGEXP examples -- Fetch employees whose name starts with 'A' and ends with 'n' SELECT * FROM employees WHERE name REGEXP '^A.*n
ย 
#39;; -- Fetch employees whose email contains a digit SELECT * FROM employees WHERE email REGEXP '[0-9]'; 

Notes:

In these examples, LIKE uses % as a wildcard for any sequence of characters.
The ~ operator in PostgreSQL is used for regular expressions.
REGEXP is used for regular expressions in MySQL.

Apply these queries to filter data in your own SQL database environment.

Working with NULL Values in SQL

1. Selecting Rows with NULL Values

-- Select rows where 'column_name' is NULL
SELECT *
FROM your_table
WHERE column_name IS NULL;

2. Selecting Rows Without NULL Values

-- Select rows where 'column_name' is NOT NULL
SELECT *
FROM your_table
WHERE column_name IS NOT NULL;

3. Handling NULL with COALESCE

-- Replace NULL with a default value
SELECT COALESCE(column_name, 'default_value') AS column_name
FROM your_table;

4. Using NULLIF

-- Return NULL if two expressions are equal
SELECT NULLIF(column1, column2) AS result
FROM your_table;

5. Conditional Count Ignoring NULLs

-- Count rows ignoring NULLs in 'column_name'
SELECT COUNT(column_name) AS non_null_count
FROM your_table
WHERE column_name IS NOT NULL;

6. SUM Ignoring NULLs

-- Calculate sum ignoring NULLs
SELECT SUM(column_name) AS total_sum
FROM your_table;

7. Updating NULL Values

-- Update NULL values to a specified default value
UPDATE your_table
SET column_name = 'default_value'
WHERE column_name IS NULL;

8. Joining Tables Handling NULLs

-- Left join, ensuring rows with NULL in 'column_b' are maintained
SELECT a.*, b.*
FROM table_a a
LEFT JOIN table_b b ON a.id = b.a_id
WHERE b.column_b IS NULL OR b.column_b IS NOT NULL;

These SQL snippets can be directly employed in real-life projects to effectively work with NULL values in Data Engineering tasks.

Date Filtering and Range Queries in SQL

Date Filtering

Scenario: You have a ‘transactions’ table with a ‘transaction_date’ column.

-- Filtering transactions on a specific date
SELECT * 
FROM transactions 
WHERE transaction_date = '2023-10-01';

Scenario: Filtering transactions in a specific month.

-- Using the MONTH() and YEAR() functions 
SELECT * 
FROM transactions 
WHERE MONTH(transaction_date) = 10 
AND YEAR(transaction_date) = 2023;

Range Queries

Scenario: You need to find transactions between two specific dates.

-- Filtering transactions between two dates
SELECT *
FROM transactions
WHERE transaction_date BETWEEN '2023-10-01' AND '2023-10-31';

Scenario: Find transactions before a certain date or after a certain date.

-- Filtering transactions before a specific date
SELECT *
FROM transactions
WHERE transaction_date  '2023-10-01';

Scenario: Find transactions in the last 7 days.

-- Using DATE_SUB() to get transactions from the last 7 days
SELECT *
FROM transactions
WHERE transaction_date >= CURDATE() - INTERVAL 7 DAY;

Practical Query Combination

Scenario: Complex query combining date filtering and range queries to find transactions in the last month that are over $1000.

-- Combining date range and amount filtering
SELECT *
FROM transactions
WHERE transaction_date >= CURDATE() - INTERVAL 1 MONTH
AND amount > 1000;

This implementation allows filtering based on specific dates, ranges, and conditions, providing practical use cases for date-based data selection in SQL.

Related Posts