In the previous article, we looked at setting up our Postgres docker image and interacting with it using sql files and the interactive terminal psql. This article is going to be following along from that, going into a bit more detail on how we can search our tables and extract meaningful information, as well as an introduction into table relationships.
Although our database table is pretty simple and small, I'd like at this point to introduce the concept of a data dump and how we can initialise our database with some data. This will be very useful for testing purposes as you can create a fresh database with data that you expect.
If you have been following along from the previous article, here is how you would go about creating a dump from your docker container:
docker exec postgres pg_dump -U test test > test_020223.sql
This is telling docker to execute the commandpg_dump -U test test > test_020223.sql
on the container named postgres
. Our command provides a user oftest
using the-U
flag, then specifies the name of the database that we wish to take a dump of, followed by the file on our local system that we want to have the sql pasted into.
For what we've done in the previous article, you should get the following file created:
--
-- PostgreSQL database dump
--
-- Dumped from database version 15.1 (Debian 15.1-1.pgdg110+1)
-- Dumped by pg_dump version 15.1 (Debian 15.1-1.pgdg110+1)
SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SELECT pg_catalog.set_config('search_path', '', false);
SET check_function_bodies = false;
SET xmloption = content;
SET client_min_messages = warning;
SET row_security = off;
SET default_tablespace = '';
SET default_table_access_method = heap;
--
-- Name: friends; Type: TABLE; Schema: public; Owner: test
--
CREATE TABLE public.friends (
id integer NOT NULL,
name character varying(80),
age integer
);
ALTER TABLE public.friends OWNER TO test;
--
-- Name: friends_id_seq; Type: SEQUENCE; Schema: public; Owner: test
--
CREATE SEQUENCE public.friends_id_seq
AS integer
START WITH 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1;
ALTER TABLE public.friends_id_seq OWNER TO test;
--
-- Name: friends_id_seq; Type: SEQUENCE OWNED BY; Schema: public; Owner: test
--
ALTER SEQUENCE public.friends_id_seq OWNED BY public.friends.id;
--
-- Name: friends id; Type: DEFAULT; Schema: public; Owner: test
--
ALTER TABLE ONLY public.friends ALTER COLUMN id SET DEFAULT nextval('public.friends_id_seq'::regclass);
--
-- Data for Name: friends; Type: TABLE DATA; Schema: public; Owner: test
--
COPY public.friends (id, name, age) FROM stdin;
5 Sam 38
6 Fred 21
.
--
-- Name: friends_id_seq; Type: SEQUENCE SET; Schema: public; Owner: test
--
SELECT pg_catalog.setval('public.friends_id_seq', 1, false);
--
-- Name: friends friends_pkey; Type: CONSTRAINT; Schema: public; Owner: test
--
ALTER TABLE ONLY public.friends
ADD CONSTRAINT friends_pkey PRIMARY KEY (id);
--
-- PostgreSQL database dump complete
--
It's interesting reading through this dump and comparing it to the queries we made to create the data in the first place. The dump is a lot more verbose and it seems that Postgres does a lot for us, especially with regards to the SERIAL
id key.
To recreate your database from this dump, we first need to ensure that we do not have a database called test
:
docker exec postgres dropdb -U test test
Again we need to specify the user here as it is the super user for our postgres instance. Then to recreate the database from the dump is very similar to how we created the dump in the first place:
docker exec postgres pg_dump -U test test < test_020223.sql
The only change here is that the arrow is pointing in the opposite direction.
We can do a simple check to see if this has worked. As a refresher, here are two ways that you can interact with the database to follow along:
docker exec -it postgres psql -U test
docker exec postgres psql -U test -f /app/YOUR_FILE_NAME_HERE.sql
For option 2, remember that the file path specified in this command has/app
because of the volume that has been set up in thedocker-compose.yml
file. The file should be a sibling to yourdocker-compose.yml
file on your local machine.
For whichever method you choose to follow along with, the SQL I provide in this article will work the same.
With all that said, let's check that our database has been restored correctly:
SELECT * from friends;
If you've done it correctly, this will return:
id | name | age
----+------+-----
2 | Sam | 38
3 | Fred | 21
(2 rows)
Hurray, I've made a new friend. Unfortunately they're also called Sam, so I'll need to add surnames to our friends table so that I can distinguish between them.
ALTER TABLE friends
ADD COLUMN surname VARCHAR;
Now we're left with some empty values for our existing friends
id | name | age | surname
----+------+-----+---------
2 | Sam | 38 |
3 | Fred | 21 |
We can use the UPDATE
statement from the previous article to add in these surnames:
UPDATE friends
SET surname = 'Brown'
WHERE name = 'Sam';
UPDATE friends
SET surname = 'Green'
WHERE name = 'Fred';
Now before we add our new friend, I think we should change the name of the name column to be “first_name” instead. We can do this with confidence here as we are just exploring postgres, however in a real world application we need to be very careful when renaming column names as there may be queries which rely on them.
ALTER TABLE friends
RENAME COLUMN name TO first_name;
Now our table should look like this:
id | first_name | age | surname
----+------------+-----+---------
2 | Sam | 38 | Brown
3 | Fred | 21 | Green
Let's INSERT
our new friend like we've done before.
INSERT INTO friends (first_name, surname, age)
VALUES ('Sam', 'Blue', 29);
Notice here that we've entered our data in a different order to what the table is in. We can do this because we've specified what each value corresponds to. Our table now looks like this:
id | first_name | age | surname
----+------------+-----+---------
2 | Sam | 38 | Brown
3 | Fred | 21 | Green
4 | Sam | 29 | Blue
Going forward, I'd like to make sure that the first name and surname is always entered, so I can ALTER
the table to set a NOT NULL
constraint to these two columns:
ALTER TABLE friends
ALTER COLUMN first_name SET NOT NULL;
ALTER TABLE friends
ALTER COLUMN surname SET NOT NULL;
Now if we try an incorrect insert statement, like
INSERT INTO friends (first_name, age)
VALUES ('Phil', 29);
We will get an error telling us that we are violating the not null constraint on surname.
Now that I have three whole friends, I'm struggling to remember where they live, so let's add a new table, “addresses” and relate it to our friends table. In a real life situation, each friend will only have one address (unless they're very wealthy), but more than one friend might live at the same address. So we want our relationship to beone address for each friend, butmany friends for each address. So we want a one-to-many relationship. First we'll create our new addresses table:
CREATE TABLE addresses (
id SERIAL PRIMARY KEY,
street VARCHAR(30) NOT NULL,
city VARCHAR(30) NOT NULL
);
Then we'll add our address_id reference column to our friends table:
ALTER TABLE friends
ADD COLUMN address_id INT;
Finally we'll add the constraint which gives us the relationship we want:
ALTER TABLE friends
ADD CONSTRAINT fk_address FOREIGN KEY(address_id) REFERENCES addresses(id);
Now our relationship is set up, we can add a new address for Fred:
INSERT INTO addresses (street, city)
VALUES ('23 Fake Street', 'Fakeville');
UPDATE friends
SET address_id = (
SELECT id FROM addresses
WHERE street = '23 Fake Street' AND city = 'Fakeville'
)
WHERE first_name = 'Fred';
Most of this should be starting to look familiar, however this is the first time that we have done a query within a query. We're selecting theid
fromaddresses
for the address that we just inserted, in order to set the correct value on ourfriends
table.
Now we can select our friends with addresses by using a JOIN:
SELECT * from friends LEFT JOIN addresses on address_id = addresses.id;
And we'll get this result:
id | first_name | age | surname | address_id | id | street | city
----+------------+-----+---------+------------+----+----------------+-----------
2 | Sam | 38 | Brown | | | |
4 | Sam | 29 | Blue | | | |
3 | Fred | 21 | Green | 1 | 1 | 23 Fake Street | Fakeville
Now you may be wondering what the point of that constraint we added was if we still need to specify which columns we're joining on. The constraint makes sure that there is a valid and appropriate relationship there to join on. Another benefit is found when using an ORM with your database. An ORM allows you to interact with the database in a more object oriented way and when a relationship is set up like this, it would allow you to do something like friend.address
to access the address in the relationship. I'll look to go into more detail on this in another article, when I start writing about Sequelize.
You'll notice in the query above that I've used aLEFT JOIN
. This will show all rows from the LEFT table (friends in this case) and only show the matching values from the right. There are a few other types ofJOIN
to be aware of.
RIGHT JOIN
— If we change our query to use a RIGHT JOIN instead we will get all the data from the right table (addresses) and only the values that match on the left table. To see this properly, I'm going to add another address which is not linked to a friend:
INSERT INTO addresses (street, city)
VALUES ('42 Test Road', 'Testtown');
SELECT * from friends RIGHT JOIN addresses on address_id = addresses.id;
id | first_name | age | surname | address_id | id | street | city
----+------------+-----+---------+------------+----+----------------+-----------
3 | Fred | 21 | Green | 1 | 1 | 23 Fake Street | Fakeville
| | | | | 3 | 42 Test Road | Testtown
INNER JOIN
This will return only the rows that would have entries on both tables:
SELECT * from friends INNER JOIN addresses on address_id = addresses.id;
id | first_name | age | surname | address_id | id | street | city
----+------------+-----+---------+------------+----+----------------+-----------
3 | Fred | 21 | Green | 1 | 1 | 23 Fake Street | Fakeville
FULL JOIN
This will return all rows for both tables:
SELECT * from friends FULL JOIN addresses on address_id = addresses.id;
id | first_name | age | surname | address_id | id | street | city
----+------------+-----+---------+------------+----+----------------+-----------
2 | Sam | 38 | Brown | | | |
4 | Sam | 29 | Blue | | | |
3 | Fred | 21 | Green | 1 | 1 | 23 Fake Street | Fakeville
| | | | | 2 | 42 Test Road | Testtown
This diagram is a helpful visual aid of the different joins:
What I've covered so far is some of the most common things that you'd be doing with a database. In my next article I plan to talk through getting started with the Sequelize ORM I mentioned in this article.
Self taught software developer with 11 years experience excelling at JavaScript/Typescript, React, Node and AWS.
I love learning and teaching and have mentored several junior developers over my career. I find teaching is one of the best ways to solidify your own learning, so in the past few years I've been maintaining a technical blog where I write about some things that I've been learning.
I'm passionate about building a teams culture and processes to make it efficient and satisfying to work in. In many roles I have improved the quality and reliability of the code base by introducing or improving the continuous integration pipeline to include quality gates.