SQL (Structured Query Language)

standard language for accessing and manipulating databases

Important

although SQL is an ANSI/ISO standard, there are different versions of the SQL language most of the SQL database programs also have their own proprietary extensions in addition to the SQL standard each database uses its own variation of SQL

basic operations:
- execute queries against a database
- retrieve data from a database
- insert records in a database
- update records in a database
- delete records from a database
- create new databases
- create new tables in a database
- create stored procedures in a database
- create views in a database
- set permissions on tables, procedures, and views

order of operations in SQL query:

FROM => JOIN => WHERE => GROUP BY => HAVING => SELECT => ORDER BY => LIMIT

SELECT: retrieve data from database
- *: select all
FROM: select the target(s) table(s)
- list the tables from where data is fetched
- able to create an alias for a table: FROM users u
WHERE: filter records based on conditions
- BETWEEN: selects values within given range (values can be numbers, text or dates)
- LIKE: filter records that matches the string
  - wildcards: used for pattern matching
    - %: zero or more chars
      - SELECT * FROM products WHERE product_name LIKE 'banana%';
    - _: single char
      - SELECT * FROM products WHERE product_name LIKE '__oot';
GROUP BY: group records based on one or more columns
- joins rows of a column that have same value
- you must pass HOW to group the other values in the remaining columns
- SELECT column1, COUNT(*) FROM table_name GROUP BY column1;
HAVING: similar to the WHERE clause, but
- operates on groups after they've been grouped, rather than rows before they've been grouped
- IMPORTANT: you can't use HAVING before GROUP BY
ORDER BY: order records based on one or more columns
LIMIT: limit the number of rows returned
- LIMIT 3;: show only the first 3 results
OFFSET: skip a number of rows before starting to return results
- OFFSET 5;: skip first 5 results and show the rest
INSERT: add new rows to a table
- data types: syntax is different depending on the DB (mysql, postgresql, etc)
  - numeric
    - data types that exist on mysql and postgresql
      - NUMERIC(precision, scale)
        
        precision: total number of digits that can be stored
        
        scale: maximum number of digits to the right of the decimal point
      - INTEGER
      - DECIMAL
      - FLOAT
      - SMALLINT
      - BIGINT
      - DOUBLE PRECISION
    - only mysql
      - DOUBLE
    - only postgresql
      - REAL
  - character string: CHAR, VARCHAR
  - binary: BINARY(32)
  - boolean: BOOLEAN
  - date and time: DATE, TIME, DATETIME, TIMESTAMP
UPDATE: modify existing data in a table
DELETE: remove data from a table
CREATE TABLE: create new table in the database
- FOREIGN KEY: creates a column with the values of a column in another table
  - CASCADE: defines what happens to dependent rows in the foreign key table when the referenced row is updated or deleted
    - ON DELETE: when a record in the primary table is deleted, any records in the foreign key table that reference the deleted primary key will also be deleted
    - ON UPDATE: changes on primary table also apply to child tables
ALTER table: modify existing table in the database
DROP table: delete a table and its data from the database
TRUNCATE table: delete all rows from a table without deleting the table itself
RENAME: ALTER TABLE users RENAME TO customers;
- postgresql syntax

Warning

when creating a table, the last attribute should not have a comma after it, just like in the example below

CREATE TABLE Customers (
    CustomerID INT PRIMARY KEY,
    Name VARCHAR(100) NOT NULL,
    Email VARCHAR(100) UNIQUE
);

CREATE DATABASE: create new database
DROP mydatabase: delete a database and all its contents
set operations: combine results from two SELECT statements (they must have the same number of columns and compatible types)
- UNION: combines and remove duplicates
- INTERSECT: returns only rows that appear in both result sets
- EXCEPT(postgresql) or MINUS(oracle): returns rows from the first set that aren't in the second

SELECT name FROM students
UNION
SELECT name FROM teachers;

restrictions: enforce rules on tables/columns to ensure data integrity
- table restrictions
  - column restrictions
    - NULL
      - you can't compare NULL with equal sign (e.g. WHERE NOME=NULL)
      - use IS keywork (e.g. WHERE NOME IS NULL)
    - NOT NULL
    - UNIQUE
    - FOREIGN KEY
    - CHECK enforces a condition on values in a column
- assertions: apply conditions across the whole table
constraints: enforce rules on table's data
- PRIMARY KEY: value can't be NULL and has to be UNIQUE
  - the primary key can be one column or a set of columns (e.g. PRIMARY KEY (user_id, order_id))
- FOREIGN KEY: links the value of a column in table1 to the value of another column in table2
  - CASCADE: explained before
- UNIQUE: all values in the column are distinct
- CHECK: all values in the column satisfy a condition
  - CONSTRAINT chk_stock CHECK (stock_quantity >= 0)
- DEFAULT: defines the initial value of a column when inserting a new row
  - balance DECIMAL(10, 2) DEFAULT 0.00
- NOT NULL: a column cannot have NULL value
  - email VARCHAR(100) NOT NULL
ALIAS: temporary name for column
- select firstname as "first name", lastname as "last name" from employees;
DISTINCT: removes duplicate rows from the results of a query
- SELECT DISTINCT city FROM customers;
aggregate functions: return only 1 result
- SUM(): return the sum of all values in the column
- AVG(): return the average value of the column
- MIN(): return minimum value
- MAX(): return maximum value
- COUNT(column_or_table): count number of rows in result set with non-NULL values in specified column or table

Important

aggregate functions CAN be used inside: SELECT, HAVING, ORDER BY aggregate functions CANNOT be used inside: FROM, WHERE

AS: rename a table or column temporarily in a query
- column aliasing: SELECT name AS full_name FROM users;
- table aliasing: SELECT u.name FROM users AS u;
ON: tells sql which columns to compare to join rows from two tables
- defines how to join rows by matching column values
- use cases:
  - JOIN
  - CREATE TRIGGER
  - FOREIGN KEY
  - GRANT/REVOKE
  - CREATE INDEX

SELECT *
FROM orders o
JOIN customers c ON o.customer_id = c.id;

named constraints

rules that allow data to be entered into a table only if it meets the predefined conditions

examples
- NOT NULL: ensures a column cannot have NULL value
- UNIQUE: ensures all values in a column are different
  - can also be used to ensure a combination of columns is unique, also called "composite unique constraint"
- PRIMARY KEY: uniquely identifies each row in a table
- FOREIGN KEY: ensures referential integrity between tables
- DEFAULT: sets a default value for a column
- CHECK: ensures all values in a column satisfy certain conditions
adding a constraint: ALTER TABLE users ADD CONSTRAINT email_unique UNIQUE (email);
- another example: ALTER TABLE users ADD CONSTRAINT primary_key PRIMARY KEY (city, name, id);
renaming a constraint: ALTER TABLE users RENAME CONSTRAINT email_unique TO unique_email_address;
removing a constraint: ALTER TABLE users DROP CONSTRAINT email_unique;
naming a constraint: ALTER TABLE users ADD CONSTRAINT fk_city FOREIGN KEY (city_id) REFERENCES cities(id);

CREATE TABLE workout_exercises (
  id UUID PRIMARY KEY,
  workout_id UUID NOT NULL REFERENCES workouts(id) ON DELETE CASCADE,
  exercise_id UUID NOT NULL REFERENCES exercises(id) ON DELETE CASCADE,
  sets INTEGER,
  reps INTEGER,
  order_number INTEGER,
  CONSTRAINT unique_workout_exercise UNIQUE (workout_id, exercise_id)
);

Important

try to name constraints when creating them

subqueries

IN: filter results from another sql query
- checks if a value exists in a list or subquery result
- use when the subquery may return multiple values
EXISTS
ALL
SOME
ANY

SELECT id, song_name, artist_id
FROM songs
WHERE artist_id IN (
    SELECT id
    FROM artists
    WHERE artist_name LIKE 'Rick%'
);

use = when the subquery is expected to return a single value
- if the subquery returns more than one value, the query will fail with an error
- this operator is used for one-to-one comparisons

SELECT *
FROM transactions
WHERE user_id = (
    SELECT id
    FROM users
    WHERE name = 'David'
);

join

combines data from multiple tables into a single result set

combines related data
reduces data redundancy
simplify queries

types of join	join conditions
INNER JOIN	NATURAL
LEFT OUTER JOIN	ON PREDICATE
RIGHT OUTER JOIN	USING (A1, A2, ...)
FULL OUTER JOIN	---

natural join: automatically joins matching values
- simpler join operation
inner join: intersection
left join: left table
right join: right table
full join: L U R

INNER JOIN

returns only rows where there's a match in both tables

SELECT *
FROM orders o
INNER JOIN customers c ON o.customer_id = c.id;

SELECT students.name, classes.name
FROM students
INNER JOIN classes on classes.class_id = students.class_id;

table_name.column_name
the ON clause specifies the condition to join the tables
- if the columns on the ON clause have the same name, the column won't appear twice in the result
when joining tables, if there are columns with same name, they will appear twice in the result

LEFT OUTER JOIN = LEFT JOIN

returns all rows from the left table, plus matching rows from the right table (NULLs if no match)

has the same number of rows as the left table
adds matching data from the right table (or NULL if there's no match)

SELECT e.name, d.name
FROM employees e
LEFT JOIN departments d
ON e.department_id = d.id;

RIGHT OUTER JOIN (not supported by SQLite)

opposite of left join

has the same number of rows as the right table
adds matching data from the left table (or NULL if there's no match)
not supported in SQLite

example: returns all departments, even if no employee is assigned

SELECT employees.name, departments.name
FROM employees
RIGHT JOIN departments ON employees.dept_id = departments.id;

FULL OUTER JOIN = FULL JOIN (not supported by SQLite)

returns all rows from both tables, with NULLs where there's no match on either side

number of rows in result = rows that matched + rows only in LEFT table with no match in RIGHT + rows only in RIGHT table with no match in LEFT

not supported in SQLite and MySQL

example: returns all employees and all departments. If there's no match, it still shows the row with NULL values

SELECT employees.name, departments.name
FROM employees
FULL OUTER JOIN departments ON employees.dept_id = departments.id;

trigger

events that can trigger: SELECT, UPDATE, DELETE
- can happen BEFORE or AFTER event

CREATE TRIGGER log_insert
AFTER INSERT ON users
FOR EACH ROW
EXECUTE FUNCTION log_user_insert();

view

alias for a SQL query that can be treated as a table

CREATE VIEW it_employees AS SELECT id, first_name, last_name, salary
FROM employees
WHERE department = 'IT';

performance

index

performance optimization structure that allows the database to retrieve rows more quickly

creates a binary tree
faster to look up values in a column
- O(log n)
slower write operations
primary keys are indexed by default
it's fairly common to name an index after the column it's created on with a suffix of _idx
you shouldn't index too many columns
- indexes take up space
- create performance overhead when inserting or updating data
- each time you insert a record, that record needs to be added to many trees

CREATE INDEX email_idx ON users(email);

remove index query
- postgresql: DROP INDEX email_idx;
- mysql: ALTER TABLE your_table DROP INDEX index_name;

multi-column indexes

speed up look ups that depend on multiple columns
only add multi-column indexes if you're doing frequent lookups on a specific combination of columns
A multi-column index is sorted by the first column first, the second column next, and so forth
first column lookup in multi-column index has same performance as a single-column index
2nd or 3rd columns lookups in multi-column index have bad performance compared to single-column lookups

CREATE INDEX first_name_last_name_age_idx
ON users (first_name, last_name, age);

transaction control

manage changes made by a transaction and endure database integrity

use transaction control (BEGIN, COMMIT, ROLLBACK) when:
- you’re doing multiple related updates that must succeed or fail together
- you’re updating multiple tables and want consistency
- you want to recover from errors during the process (e.g., using SAVEPOINT)
BEGIN/START TRANSACTION: starts a new transaction
- use case: when you want multiple SQL statements to be executed as a single atomic unit
  - atomic unit: either all of the sql statements succeed or none of them do
COMMIT: makes all changes made in the current transaction permanent
- use case: after a set of successful operations, use COMMIT to save them to the database

BEGIN;

INSERT INTO accounts (user_id, balance) VALUES (1, 1000);
UPDATE accounts SET balance = balance - 200 WHERE user_id = 1;

COMMIT;

ROLLBACK: undo changes made in the current transaction
- if something goes wrong, cancel all operations since the last BEGIN

BEGIN;

UPDATE accounts SET balance = balance - 200 WHERE user_id = 1;

-- Something goes wrong
ROLLBACK;

SAVEPOINT: sets a named point within a transaction that you can roll back to without rolling back the entire transaction
- use case: useful in large transactions where you may want to undo only part of it

BEGIN;

INSERT INTO accounts (user_id, balance) VALUES (2, 500);
SAVEPOINT after_insert;

UPDATE accounts SET balance = balance / 0 WHERE user_id = 2; -- This will cause an error

ROLLBACK TO after_insert; -- Undo only the update, keep the insert

COMMIT; -- Save the insert

queries

CREATE TABLE employees(
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    age INTEGER,
    is_manager BOOLEAN,
    salary INTEGER
);

SELECT id, name
FROM users
WHERE id = 1;

SELECT employee_name, salary
FROM employees
WHERE salary BETWEEN 30000 and 60000;

SELECT product_name, quantity
FROM products
WHERE quantity NOT BETWEEN 20 and 100;

SELECT *
FROM Orders
WHERE OrderDate BETWEEN '01/07/1996' AND '31/07/1996';

delete all records in users table:

DELETE FROM users;

DELETE FROM users
WHERE email = 'example@email.com';

Important

use = when comparing non-NULL values use IS NULL or IS NOT NULL only when checking for NULLs

DELETE FROM users
WHERE email IS NULL;

SELECT name, price, quantity
FROM products
ORDER BY price DESC;

ALTER TABLE employees
RENAME TO contractors;

ALTER TABLE employees
ADD birthdate DATE;

INSERT INTO employees (name, age, is_manager, salary)
VALUES ('John Doe', 30, true, 50000);

ALTER TABLE contractors
RENAME COLUMN salary TO invoice;

UPDATE users
SET name = 'John Doe', salary = 20000
WHERE id = 5;

CREATE TABLE departments (
    id INTEGER PRIMARY KEY,
    department_name TEXT NOT NULL
);

CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    department_id INTEGER,
    CONSTRAINT fk_departments
    FOREIGN KEY (department_id) REFERENCES departments(id)
);

SELECT COUNT(*) FROM tableName;

SELECT COUNT(columnName) FROM tableName WHERE condition;

SELECT SUM(columnName) FROM tableName;

SELECT SUM(columnName) FROM tableName WHERE condition;

SELECT max(price)
FROM products;

SELECT product_name, min(price)
FROM products;

SELECT AVG(columnName) FROM tableName;

SELECT AVG(columnName) FROM tableName WHERE condition;

SELECT album_id, count(song_id)
FROM songs
GROUP BY album_id;

SELECT album_id, count(id) as count
FROM songs
GROUP BY album_id
HAVING count > 5;

CREATE TABLE Customers (
    CustomerID INT PRIMARY KEY,
    Name VARCHAR(100)
);

CREATE TABLE Orders (
    OrderID INT PRIMARY KEY,
    CustomerID INT,
    Product VARCHAR(100),
    FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID) ON DELETE CASCADE
);

TRUNCATE TABLE Employees;

CREATE TABLE Departments (
    DeptID INT PRIMARY KEY,
    DeptName VARCHAR(100)
);

CREATE TABLE Employees (
    EmpID INT PRIMARY KEY,
    EmpName VARCHAR(100),
    DeptID INT,
    FOREIGN KEY (DeptID) REFERENCES Departments(DeptID) ON UPDATE CASCADE
);

example combining several constraints:

CREATE TABLE employees (
    employee_id INT PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(100) UNIQUE,
    department_id INT,
    manager_id INT,
    salary DECIMAL(10, 2) CHECK (salary > 0),
    hire_date DATE DEFAULT CURRENT_DATE,
    CONSTRAINT fk_department
        FOREIGN KEY (department_id) REFERENCES departments (department_id)
        ON DELETE SET NULL
        ON UPDATE CASCADE,
    CONSTRAINT fk_manager
        FOREIGN KEY (manager_id) REFERENCES employees (employee_id)
        ON DELETE SET NULL
        ON UPDATE CASCADE
);

SELECT product_name, shipment_status
FROM products
WHERE shipment_status IN ('shipped', 'preparing', 'out of stock');

SELECT EmployeeName
FROM Employees
WHERE EmployeeID IN (SELECT EmployeeID FROM Orders WHERE OrderDate > '2020-01-01');

SELECT email
FROM employees
INTERSECT
SELECT email
FROM contractors;

SELECT email
FROM employees
UNION
SELECT email
FROM contractors;

SELECT email
FROM employees
MINUS
SELECT email
FROM contractors;

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

SQL (Structured Query Language)

named constraints

subqueries

join

INNER JOIN

LEFT OUTER JOIN = LEFT JOIN

RIGHT OUTER JOIN (not supported by SQLite)

FULL OUTER JOIN = FULL JOIN (not supported by SQLite)

trigger

view

performance

index

multi-column indexes

transaction control

queries

FilesExpand file tree

sql.md

Latest commit

History

sql.md

File metadata and controls

SQL (Structured Query Language)

named constraints

subqueries

join

INNER JOIN

LEFT OUTER JOIN = LEFT JOIN

RIGHT OUTER JOIN (not supported by SQLite)

FULL OUTER JOIN = FULL JOIN (not supported by SQLite)

trigger

view

performance

index

multi-column indexes

transaction control

queries