databases postgresql MySQL MongoDB
standard language for accessing and manipulating databases
Important
although SQL is an ANSI/ISO standard, there are different versions of the SQL language most of the SQL database programs also have their own proprietary extensions in addition to the SQL standard each database uses its own variation of SQL
- basic operations:
- execute queries against a database
- retrieve data from a database
- insert records in a database
- update records in a database
- delete records from a database
- create new databases
- create new tables in a database
- create stored procedures in a database
- create views in a database
- set permissions on tables, procedures, and views
order of operations in SQL query:
FROM => JOIN => WHERE => GROUP BY => HAVING => SELECT => ORDER BY => LIMIT
-
SELECT: retrieve data from database*: select all
-
FROM: select the target(s) table(s)- list the tables from where data is fetched
- able to create an alias for a table:
FROM users u
-
WHERE: filter records based on conditionsBETWEEN: selects values within given range (values can be numbers, text or dates)LIKE: filter records that matches the string- wildcards: used for pattern matching
%: zero or more charsSELECT * FROM products WHERE product_name LIKE 'banana%';
_: single charSELECT * FROM products WHERE product_name LIKE '__oot';
- wildcards: used for pattern matching
-
GROUP BY: group records based on one or more columns- joins rows of a column that have same value
- you must pass HOW to group the other values in the remaining columns
SELECT column1, COUNT(*) FROM table_name GROUP BY column1;
-
HAVING: similar to theWHEREclause, but- operates on groups after they've been grouped, rather than rows before they've been grouped
- IMPORTANT: you can't use
HAVINGbeforeGROUP BY
-
ORDER BY: order records based on one or more columns -
LIMIT: limit the number of rows returnedLIMIT 3;: show only the first 3 results
-
OFFSET: skip a number of rows before starting to return resultsOFFSET 5;: skip first 5 results and show the rest
-
INSERT: add new rows to a table- data types: syntax is different depending on the DB (mysql, postgresql, etc)
- numeric
- data types that exist on mysql and postgresql
NUMERIC(precision, scale)- precision: total number of digits that can be stored
- scale: maximum number of digits to the right of the decimal point
- INTEGER
- DECIMAL
- FLOAT
- SMALLINT
- BIGINT
- DOUBLE PRECISION
- only mysql
- DOUBLE
- only postgresql
- REAL
- data types that exist on mysql and postgresql
- character string: CHAR, VARCHAR
- binary: BINARY(32)
- boolean: BOOLEAN
- date and time:
DATE,TIME,DATETIME,TIMESTAMP
- numeric
- data types: syntax is different depending on the DB (mysql, postgresql, etc)
-
UPDATE: modify existing data in a table -
DELETE: remove data from a table -
CREATE TABLE: create new table in the databaseFOREIGN KEY: creates a column with the values of a column in another tableCASCADE: defines what happens to dependent rows in the foreign key table when the referenced row is updated or deletedON DELETE: when a record in the primary table is deleted, any records in the foreign key table that reference the deleted primary key will also be deletedON UPDATE: changes on primary table also apply to child tables
-
ALTER table: modify existing table in the database -
DROP table: delete a table and its data from the database -
TRUNCATE table: delete all rows from a table without deleting the table itself -
RENAME:ALTER TABLE users RENAME TO customers;- postgresql syntax
Warning
when creating a table, the last attribute should not have a comma after it, just like in the example below
CREATE TABLE Customers (
CustomerID INT PRIMARY KEY,
Name VARCHAR(100) NOT NULL,
Email VARCHAR(100) UNIQUE
);-
CREATE DATABASE: create new database -
DROP mydatabase: delete a database and all its contents -
set operations: combine results from two
SELECTstatements (they must have the same number of columns and compatible types)UNION: combines and remove duplicatesINTERSECT: returns only rows that appear in both result setsEXCEPT(postgresql) orMINUS(oracle): returns rows from the first set that aren't in the second
SELECT name FROM students
UNION
SELECT name FROM teachers;-
restrictions: enforce rules on tables/columns to ensure data integrity
- table restrictions
- column restrictions
NULL- you can't compare
NULLwith equal sign (e.g.WHERE NOME=NULL) - use
ISkeywork (e.g.WHERE NOME IS NULL)
- you can't compare
NOT NULLUNIQUEFOREIGN KEYCHECKenforces a condition on values in a column
- column restrictions
- assertions: apply conditions across the whole table
- table restrictions
-
constraints: enforce rules on table's data
PRIMARY KEY: value can't be NULL and has to be UNIQUE- the primary key can be one column or a set of columns (e.g.
PRIMARY KEY (user_id, order_id))
- the primary key can be one column or a set of columns (e.g.
FOREIGN KEY: links the value of a column in table1 to the value of another column in table2CASCADE: explained before
UNIQUE: all values in the column are distinctCHECK: all values in the column satisfy a conditionCONSTRAINT chk_stock CHECK (stock_quantity >= 0)
DEFAULT: defines the initial value of a column when inserting a new rowbalance DECIMAL(10, 2) DEFAULT 0.00
NOT NULL: a column cannot have NULL valueemail VARCHAR(100) NOT NULL
-
ALIAS: temporary name for columnselect firstname as "first name", lastname as "last name" from employees;
-
DISTINCT: removes duplicate rows from the results of a querySELECT DISTINCT city FROM customers;
-
aggregate functions: return only 1 result
SUM(): return the sum of all values in the columnAVG(): return the average value of the columnMIN(): return minimum valueMAX(): return maximum valueCOUNT(column_or_table): count number of rows in result set with non-NULL values in specified column or table
Important
aggregate functions CAN be used inside: SELECT, HAVING, ORDER BY
aggregate functions CANNOT be used inside: FROM, WHERE
AS: rename a table or column temporarily in a query- column aliasing:
SELECT name AS full_name FROM users; - table aliasing:
SELECT u.name FROM users AS u;
- column aliasing:
ON: tells sql which columns to compare to join rows from two tables- defines how to join rows by matching column values
- use cases:
JOINCREATE TRIGGERFOREIGN KEYGRANT/REVOKECREATE INDEX
SELECT *
FROM orders o
JOIN customers c ON o.customer_id = c.id;rules that allow data to be entered into a table only if it meets the predefined conditions
-
examples
NOT NULL: ensures a column cannot have NULL valueUNIQUE: ensures all values in a column are different- can also be used to ensure a combination of columns is unique, also called "composite unique constraint"
PRIMARY KEY: uniquely identifies each row in a tableFOREIGN KEY: ensures referential integrity between tablesDEFAULT: sets a default value for a columnCHECK: ensures all values in a column satisfy certain conditions
-
adding a constraint:
ALTER TABLE users ADD CONSTRAINT email_unique UNIQUE (email);- another example:
ALTER TABLE users ADD CONSTRAINT primary_key PRIMARY KEY (city, name, id);
- another example:
-
renaming a constraint:
ALTER TABLE users RENAME CONSTRAINT email_unique TO unique_email_address; -
removing a constraint:
ALTER TABLE users DROP CONSTRAINT email_unique; -
naming a constraint:
ALTER TABLE users ADD CONSTRAINT fk_city FOREIGN KEY (city_id) REFERENCES cities(id);
CREATE TABLE workout_exercises (
id UUID PRIMARY KEY,
workout_id UUID NOT NULL REFERENCES workouts(id) ON DELETE CASCADE,
exercise_id UUID NOT NULL REFERENCES exercises(id) ON DELETE CASCADE,
sets INTEGER,
reps INTEGER,
order_number INTEGER,
CONSTRAINT unique_workout_exercise UNIQUE (workout_id, exercise_id)
);Important
try to name constraints when creating them
IN: filter results from another sql query- checks if a value exists in a list or subquery result
- use when the subquery may return multiple values
EXISTSALLSOMEANY
SELECT id, song_name, artist_id
FROM songs
WHERE artist_id IN (
SELECT id
FROM artists
WHERE artist_name LIKE 'Rick%'
);- use
=when the subquery is expected to return a single value- if the subquery returns more than one value, the query will fail with an error
- this operator is used for one-to-one comparisons
SELECT *
FROM transactions
WHERE user_id = (
SELECT id
FROM users
WHERE name = 'David'
);combines data from multiple tables into a single result set
- combines related data
- reduces data redundancy
- simplify queries
| types of join | join conditions |
|---|---|
| INNER JOIN | NATURAL |
| LEFT OUTER JOIN | ON PREDICATE |
| RIGHT OUTER JOIN | USING (A1, A2, ...) |
| FULL OUTER JOIN | --- |
- natural join: automatically joins matching values
- simpler join operation
- inner join: intersection
- left join: left table
- right join: right table
- full join: L U R
returns only rows where there's a match in both tables
SELECT *
FROM orders o
INNER JOIN customers c ON o.customer_id = c.id;SELECT students.name, classes.name
FROM students
INNER JOIN classes on classes.class_id = students.class_id;table_name.column_name- the
ONclause specifies the condition to join the tables- if the columns on the
ONclause have the same name, the column won't appear twice in the result
- if the columns on the
- when joining tables, if there are columns with same name, they will appear twice in the result
returns all rows from the left table, plus matching rows from the right table (NULLs if no match)
- has the same number of rows as the left table
- adds matching data from the right table (or
NULLif there's no match)
SELECT e.name, d.name
FROM employees e
LEFT JOIN departments d
ON e.department_id = d.id;opposite of left join
-
has the same number of rows as the right table
-
adds matching data from the left table (or
NULLif there's no match) -
not supported in SQLite
example: returns all departments, even if no employee is assigned
SELECT employees.name, departments.name
FROM employees
RIGHT JOIN departments ON employees.dept_id = departments.id;returns all rows from both tables, with NULLs where there's no match on either side
number of rows in result = rows that matched + rows only in LEFT table with no match in RIGHT + rows only in RIGHT table with no match in LEFT
- not supported in SQLite and MySQL
example: returns all employees and all departments. If there's no match, it still shows the row with NULL values
SELECT employees.name, departments.name
FROM employees
FULL OUTER JOIN departments ON employees.dept_id = departments.id;- events that can trigger:
SELECT,UPDATE,DELETE- can happen
BEFOREorAFTERevent
- can happen
CREATE TRIGGER log_insert
AFTER INSERT ON users
FOR EACH ROW
EXECUTE FUNCTION log_user_insert();alias for a SQL query that can be treated as a table
CREATE VIEW it_employees AS SELECT id, first_name, last_name, salary
FROM employees
WHERE department = 'IT';performance optimization structure that allows the database to retrieve rows more quickly
- creates a binary tree
- faster to look up values in a column
- O(log n)
- slower write operations
- primary keys are indexed by default
- it's fairly common to name an index after the column it's created on with a suffix of
_idx - you shouldn't index too many columns
- indexes take up space
- create performance overhead when inserting or updating data
- each time you insert a record, that record needs to be added to many trees
CREATE INDEX email_idx ON users(email);- remove index query
- postgresql:
DROP INDEX email_idx; - mysql:
ALTER TABLE your_table DROP INDEX index_name;
- postgresql:
- speed up look ups that depend on multiple columns
- only add multi-column indexes if you're doing frequent lookups on a specific combination of columns
- A multi-column index is sorted by the first column first, the second column next, and so forth
- first column lookup in multi-column index has same performance as a single-column index
- 2nd or 3rd columns lookups in multi-column index have bad performance compared to single-column lookups
CREATE INDEX first_name_last_name_age_idx
ON users (first_name, last_name, age);manage changes made by a transaction and endure database integrity
-
use transaction control (
BEGIN,COMMIT,ROLLBACK) when:- you’re doing multiple related updates that must succeed or fail together
- you’re updating multiple tables and want consistency
- you want to recover from errors during the process (e.g., using SAVEPOINT)
-
BEGIN/START TRANSACTION: starts a new transaction- use case: when you want multiple SQL statements to be executed as a single atomic unit
- atomic unit: either all of the sql statements succeed or none of them do
- use case: when you want multiple SQL statements to be executed as a single atomic unit
-
COMMIT: makes all changes made in the current transaction permanent- use case: after a set of successful operations, use
COMMITto save them to the database
- use case: after a set of successful operations, use
BEGIN;
INSERT INTO accounts (user_id, balance) VALUES (1, 1000);
UPDATE accounts SET balance = balance - 200 WHERE user_id = 1;
COMMIT;ROLLBACK: undo changes made in the current transaction- if something goes wrong, cancel all operations since the last
BEGIN
- if something goes wrong, cancel all operations since the last
BEGIN;
UPDATE accounts SET balance = balance - 200 WHERE user_id = 1;
-- Something goes wrong
ROLLBACK;SAVEPOINT: sets a named point within a transaction that you can roll back to without rolling back the entire transaction- use case: useful in large transactions where you may want to undo only part of it
BEGIN;
INSERT INTO accounts (user_id, balance) VALUES (2, 500);
SAVEPOINT after_insert;
UPDATE accounts SET balance = balance / 0 WHERE user_id = 2; -- This will cause an error
ROLLBACK TO after_insert; -- Undo only the update, keep the insert
COMMIT; -- Save the insertCREATE TABLE employees(
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
age INTEGER,
is_manager BOOLEAN,
salary INTEGER
);SELECT id, name
FROM users
WHERE id = 1;SELECT employee_name, salary
FROM employees
WHERE salary BETWEEN 30000 and 60000;SELECT product_name, quantity
FROM products
WHERE quantity NOT BETWEEN 20 and 100;SELECT *
FROM Orders
WHERE OrderDate BETWEEN '01/07/1996' AND '31/07/1996';delete all records in users table:
DELETE FROM users;DELETE FROM users
WHERE email = 'example@email.com';Important
use = when comparing non-NULL values
use IS NULL or IS NOT NULL only when checking for NULLs
DELETE FROM users
WHERE email IS NULL;SELECT name, price, quantity
FROM products
ORDER BY price DESC;ALTER TABLE employees
RENAME TO contractors;ALTER TABLE employees
ADD birthdate DATE;INSERT INTO employees (name, age, is_manager, salary)
VALUES ('John Doe', 30, true, 50000);ALTER TABLE contractors
RENAME COLUMN salary TO invoice;UPDATE users
SET name = 'John Doe', salary = 20000
WHERE id = 5;CREATE TABLE departments (
id INTEGER PRIMARY KEY,
department_name TEXT NOT NULL
);CREATE TABLE employees (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
department_id INTEGER,
CONSTRAINT fk_departments
FOREIGN KEY (department_id) REFERENCES departments(id)
);SELECT COUNT(*) FROM tableName;SELECT COUNT(columnName) FROM tableName WHERE condition;SELECT SUM(columnName) FROM tableName;SELECT SUM(columnName) FROM tableName WHERE condition;SELECT max(price)
FROM products;SELECT product_name, min(price)
FROM products;SELECT AVG(columnName) FROM tableName;SELECT AVG(columnName) FROM tableName WHERE condition;SELECT album_id, count(song_id)
FROM songs
GROUP BY album_id;SELECT album_id, count(id) as count
FROM songs
GROUP BY album_id
HAVING count > 5;CREATE TABLE Customers (
CustomerID INT PRIMARY KEY,
Name VARCHAR(100)
);
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
CustomerID INT,
Product VARCHAR(100),
FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID) ON DELETE CASCADE
);TRUNCATE TABLE Employees;CREATE TABLE Departments (
DeptID INT PRIMARY KEY,
DeptName VARCHAR(100)
);
CREATE TABLE Employees (
EmpID INT PRIMARY KEY,
EmpName VARCHAR(100),
DeptID INT,
FOREIGN KEY (DeptID) REFERENCES Departments(DeptID) ON UPDATE CASCADE
);example combining several constraints:
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
email VARCHAR(100) UNIQUE,
department_id INT,
manager_id INT,
salary DECIMAL(10, 2) CHECK (salary > 0),
hire_date DATE DEFAULT CURRENT_DATE,
CONSTRAINT fk_department
FOREIGN KEY (department_id) REFERENCES departments (department_id)
ON DELETE SET NULL
ON UPDATE CASCADE,
CONSTRAINT fk_manager
FOREIGN KEY (manager_id) REFERENCES employees (employee_id)
ON DELETE SET NULL
ON UPDATE CASCADE
);SELECT product_name, shipment_status
FROM products
WHERE shipment_status IN ('shipped', 'preparing', 'out of stock');SELECT EmployeeName
FROM Employees
WHERE EmployeeID IN (SELECT EmployeeID FROM Orders WHERE OrderDate > '2020-01-01');SELECT email
FROM employees
INTERSECT
SELECT email
FROM contractors;SELECT email
FROM employees
UNION
SELECT email
FROM contractors;SELECT email
FROM employees
MINUS
SELECT email
FROM contractors;