# SQL Basics & PostgreSQL

Let's run some SQL together!

Refer to this [repo](https://github.com/wanghejin/24orbital-db) for the complete code.

## PostgreSQL Setup

### Installation

[PostgreSQL Download Package Installer(EDB)](https://www.postgresql.org/download/)

{% hint style="danger" %}
Please select at least `PostgreSQL Server` and `pgAdmin` for installation!
{% endhint %}

![Selecting components](https://2807223923-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTUqAJOgHs57S8lmqdxRV%2Fuploads%2Fgit-blob-c2b9f94746c75aef610e5fcd54f32fc379be949e%2Fimage-2.png?alt=media)

{% hint style="info" %}
In case you still forget to select pgAdmin, you can always downloaded it separately [here](https://www.pgadmin.org/download/)
{% endhint %}

### Database Connection

When you open up pgAdmin, click on Servers, there should already be a default Server `PostgreSQL <version>`, with a `postgres` database in it. In case you don't, we can also create a new server and database with the following steps:

![create server](https://2807223923-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTUqAJOgHs57S8lmqdxRV%2Fuploads%2Fgit-blob-1086a2a8fb5e6f776ca1a47f3a6065c4f0354bca%2Fimage.png?alt=media)

```
Recommended Config:

Hostname: localhost
user: postgres (default)
```

![create database](https://2807223923-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTUqAJOgHs57S8lmqdxRV%2Fuploads%2Fgit-blob-6f32caba0ef7aeeae5fda5a2c1bcc892efe35986%2Fimage-1.png?alt=media)

After you are connected to the database, you can open the query tool and run your SQL queries there.

## C~~R~~UD

### Create Table

```sql
CREATE TABLE users (
  id SERIAL PRIMARY KEY,
  email VARCHAR(255) UNIQUE,  
  username TEXT,
  password_hash TEXT
);

-- Create Notebook table
CREATE TABLE notebooks (
  id SERIAL PRIMARY KEY,
  nb_name TEXT,  -- avoid reserved keywords like 'name'
  nb_description TEXT,
  created_at TIMESTAMP,
  created_by INT,
  FOREIGN KEY (created_by) REFERENCES users(id)
);

-- Create Note table
CREATE TABLE notes (
  id SERIAL PRIMARY KEY,
  title TEXT,
  content TEXT,
  created_at TIMESTAMP,
  notebook_id INT,
  FOREIGN KEY (notebook_id) REFERENCES notebooks(id)
);
```

### Insert

```sql
INSERT INTO users (email, username, password_hash) VALUES ('alice@example.com', 'alice', 'hash1');

INSERT INTO users VALUES (DEFAULT, 'charlie@example.com', 'charlie', 'hash2');

-- Do refer to the repo for complete code!
```

### Update

```sql
UPDATE users SET username = 'alex' WHERE username = 'alice';
```

### Delete

```sql
DELETE FROM users WHERE id = 1;
```

### Drop

```sql
DROP TABLE users;
```

## Simple Queries

```sql
-- Retrieve all users:
SELECT * FROM users;

-- Retrieve a specific user by their email:
SELECT * FROM users WHERE email = 'bob@example.com';

-- Retrieve all distinct usernames:
SELECT DISTINCT username FROM users;

-- Retrieve all users without a specific username:
SELECT * FROM users WHERE username IS NULL;

-- Retrieve all users whose username is not null:
SELECT * FROM users WHERE username IS NOT NULL;
```

## Algebraic Queries

### Inner Join

```sql
-- Retrieve all notebooks along with their corresponding user information:
SELECT nb.*, u.username, u.email
FROM notebooks nb
INNER JOIN users u ON nb.created_by = u.id;
-- is equivalent to
SELECT nb.*, u.username, u.email
FROM notebooks nb, users u
WHERE nb.created_by = u.id;
```

### Left Outer Join

{% hint style="info" %}
[Here](https://www.codeproject.com/Articles/33052/Visual-Representation-of-SQL-Joins) is an excellent article explaining the different join concepts, do give it a read!
{% endhint %}

OUTER JOINS are joins that return matched values and unmatched values from either or both tables. In the case of left outer join, we return matched values and unmatched values from the table to the left of the JOIN keyword.

```sql
-- Retrieve all notebooks along with their note ids, even if the notebook does not contain any note:
SELECT nb.*, nt.id AS note_id
FROM notebooks nb
LEFT OUTER JOIN notes nt ON nb.id = nt.notebook_id;
```

## Set Operators

`UNION`, `INTERSECT`, `EXCEPT`

They eliminate duplicates unless annotated with the keyword `ALL`.

To use the set operators on two tables, they must satisfy the condition:

* The two queries must be union-compatible.
* They must return the same number of columns with the same domains in the same order.

```sql
-- select customers who downloaded both version 1.0 and 2.0 for game Aerified
SELECT d.customerid FROM downloads d
WHERE d.name = 'Aerified' AND d.version = '1.0'
INTERSECT
SELECT d.customerid FROM downloads d
WHERE d.name = 'Aerified' AND d.version = '2.0';
```

## Aggregate Queries

`COUNT`, `MAX`, `MIN`, `AVG`, `SUM`, `STDDEV`, ...

```sql
-- Retrieve the total number of users:
SELECT COUNT(*) FROM users;

-- Retrieve total number of users w/ a username:
SELECT COUNT(username) FROM users;

-- Retrieve total number of distinct usernames:
SELECT COUNT(DISTINCT username) FROM users;
```

## Group By, Having

* `GROUP BY` creates groups of records that have the same values for the specified fields **before** computing the aggregate functions. Without a clause, only one group is formed implicitly by SQL as soon as an aggregate function is called.
  * Aggregate functions cannot appear before `GROUP BY`(in `WHERE`), hence we have `HAVING`.
* `HAVING` adds conditions to be checked after the evaluation of the `GROUP BY` clause. It only involves aggregate functions, columns listed in the `GROUP BY` clause and subqueries.
* Attributes in the `SELECT` clause must appear in the `GROUP BY` clause, unless it is used in an aggregate function.

### A slightly complicated query:

Try to figure out what it does!

```sql
SELECT u.username, COUNT(nb.id) AS total_notes
FROM users u
LEFT OUTER JOIN notebooks nb ON u.id = nb.created_by
LEFT OUTER JOIN notes nt ON nb.id = nt.notebook_id
GROUP BY u.id, u.username
ORDER BY total_notes DESC;
```

## Nested Queries

Nest a subquery in a query.

* Subqueries can be used only in `FROM` clause or `WHERE` clause
* Comparison to a subquery should always use the quantifier `ALL` or `ANY`

Many ways to do the same thing. Just for illustration.

```sql
-- Retrieve user who created the most notes
SELECT username, total_notes
FROM (SELECT u.username, COUNT(nb.id) AS total_notes
        FROM users u
        LEFT OUTER JOIN notebooks nb ON u.id = nb.created_by
        LEFT OUTER JOIN notes nt ON nb.id = nt.notebook_id
        GROUP BY u.id, u.username
        ORDER BY total_notes DESC) 
LIMIT 1;
```
