PostgreSQL — dumping, relationships, and more

In the previous article, we looked at setting up our Postgres docker image and interacting with it using sql files and the interactive terminal psql. This article is going to be following along from that, going into a bit more detail on how we can search our tables and extract meaningful information, as well as an introduction into table relationships.

wooden person sitting on toilet
Not this kind of dump… (Photo by Giorgio Trovato on Unsplash)

Data Dump

Although our database table is pretty simple and small, I'd like at this point to introduce the concept of a data dump and how we can initialise our database with some data. This will be very useful for testing purposes as you can create a fresh database with data that you expect.

If you have been following along from the previous article, here is how you would go about creating a dump from your docker container:

docker exec postgres pg_dump -U test test > test_020223.sql

This is telling docker to execute the commandpg_dump -U test test > test_020223.sqlon the container named postgres. Our command provides a user oftest using the-U flag, then specifies the name of the database that we wish to take a dump of, followed by the file on our local system that we want to have the sql pasted into.

For what we've done in the previous article, you should get the following file created:

--
-- PostgreSQL database dump
--

-- Dumped from database version 15.1 (Debian 15.1-1.pgdg110+1)
-- Dumped by pg_dump version 15.1 (Debian 15.1-1.pgdg110+1)

SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SELECT pg_catalog.set_config('search_path', '', false);
SET check_function_bodies = false;
SET xmloption = content;
SET client_min_messages = warning;
SET row_security = off;

SET default_tablespace = '';

SET default_table_access_method = heap;

--
-- Name: friends; Type: TABLE; Schema: public; Owner: test
--

CREATE TABLE public.friends (
    id integer NOT NULL,
    name character varying(80),
    age integer
);


ALTER TABLE public.friends OWNER TO test;

--
-- Name: friends_id_seq; Type: SEQUENCE; Schema: public; Owner: test
--

CREATE SEQUENCE public.friends_id_seq
    AS integer
    START WITH 1
    INCREMENT BY 1
    NO MINVALUE
    NO MAXVALUE
    CACHE 1;


ALTER TABLE public.friends_id_seq OWNER TO test;

--
-- Name: friends_id_seq; Type: SEQUENCE OWNED BY; Schema: public; Owner: test
--

ALTER SEQUENCE public.friends_id_seq OWNED BY public.friends.id;


--
-- Name: friends id; Type: DEFAULT; Schema: public; Owner: test
--

ALTER TABLE ONLY public.friends ALTER COLUMN id SET DEFAULT nextval('public.friends_id_seq'::regclass);


--
-- Data for Name: friends; Type: TABLE DATA; Schema: public; Owner: test
--

COPY public.friends (id, name, age) FROM stdin;
5 Sam 38
6 Fred 21
.


--
-- Name: friends_id_seq; Type: SEQUENCE SET; Schema: public; Owner: test
--

SELECT pg_catalog.setval('public.friends_id_seq', 1, false);


--
-- Name: friends friends_pkey; Type: CONSTRAINT; Schema: public; Owner: test
--

ALTER TABLE ONLY public.friends
    ADD CONSTRAINT friends_pkey PRIMARY KEY (id);


--
-- PostgreSQL database dump complete
--

It's interesting reading through this dump and comparing it to the queries we made to create the data in the first place. The dump is a lot more verbose and it seems that Postgres does a lot for us, especially with regards to the SERIAL id key.

To recreate your database from this dump, we first need to ensure that we do not have a database called test:

docker exec postgres dropdb -U test test

Again we need to specify the user here as it is the super user for our postgres instance. Then to recreate the database from the dump is very similar to how we created the dump in the first place:

docker exec postgres pg_dump -U test test < test_020223.sql

The only change here is that the arrow is pointing in the opposite direction.

We can do a simple check to see if this has worked. As a refresher, here are two ways that you can interact with the database to follow along:

  1. Start the interactive terminal psql by runningdocker exec -it postgres psql -U test
  2. Add the query to an sql file locally and run it withdocker exec postgres psql -U test -f /app/YOUR_FILE_NAME_HERE.sql

For option 2, remember that the file path specified in this command has/app because of the volume that has been set up in thedocker-compose.yml file. The file should be a sibling to yourdocker-compose.yml file on your local machine.

For whichever method you choose to follow along with, the SQL I provide in this article will work the same.

With all that said, let's check that our database has been restored correctly:

SELECT * from friends;

If you've done it correctly, this will return:

 id | name | age 
----+------+-----
  2 | Sam  |  38
  3 | Fred |  21
(2 rows)

Adding a column to our table

Hurray, I've made a new friend. Unfortunately they're also called Sam, so I'll need to add surnames to our friends table so that I can distinguish between them.

ALTER TABLE friends
ADD COLUMN surname VARCHAR;

Now we're left with some empty values for our existing friends

 id | name | age | surname 
----+------+-----+---------
  2 | Sam  |  38 | 
  3 | Fred |  21 | 

We can use the UPDATE statement from the previous article to add in these surnames:

UPDATE friends
SET surname = 'Brown'
WHERE name = 'Sam';

UPDATE friends
SET surname = 'Green'
WHERE name = 'Fred';

Now before we add our new friend, I think we should change the name of the name column to be “first_name” instead. We can do this with confidence here as we are just exploring postgres, however in a real world application we need to be very careful when renaming column names as there may be queries which rely on them.

ALTER TABLE friends
RENAME COLUMN name TO first_name;

Now our table should look like this:

 id | first_name | age | surname 
----+------------+-----+---------
  2 | Sam        |  38 | Brown
  3 | Fred       |  21 | Green

Let's INSERT our new friend like we've done before.

INSERT INTO friends (first_name, surname, age) 
VALUES ('Sam', 'Blue', 29);

Notice here that we've entered our data in a different order to what the table is in. We can do this because we've specified what each value corresponds to. Our table now looks like this:

 id | first_name | age | surname 
----+------------+-----+---------
  2 | Sam        |  38 | Brown
  3 | Fred       |  21 | Green
  4 | Sam        |  29 | Blue

Going forward, I'd like to make sure that the first name and surname is always entered, so I can ALTER the table to set a NOT NULL constraint to these two columns:

ALTER TABLE friends
ALTER COLUMN first_name SET NOT NULL;

ALTER TABLE friends
ALTER COLUMN surname SET NOT NULL;

Now if we try an incorrect insert statement, like

INSERT INTO friends (first_name, age) 
VALUES ('Phil', 29);

We will get an error telling us that we are violating the not null constraint on surname.

Adding a table relationship

Now that I have three whole friends, I'm struggling to remember where they live, so let's add a new table, “addresses” and relate it to our friends table. In a real life situation, each friend will only have one address (unless they're very wealthy), but more than one friend might live at the same address. So we want our relationship to beone address for each friend, butmany friends for each address. So we want a one-to-many relationship. First we'll create our new addresses table:

CREATE TABLE addresses (
      id     SERIAL PRIMARY KEY,
      street VARCHAR(30) NOT NULL,
      city   VARCHAR(30) NOT NULL
      );

Then we'll add our address_id reference column to our friends table:

ALTER TABLE friends
    ADD COLUMN address_id INT;

Finally we'll add the constraint which gives us the relationship we want:

ALTER TABLE friends
    ADD CONSTRAINT fk_address FOREIGN KEY(address_id) REFERENCES addresses(id);

Now our relationship is set up, we can add a new address for Fred:

INSERT INTO addresses (street, city)
VALUES ('23 Fake Street', 'Fakeville');

UPDATE friends 
SET address_id = (
  SELECT id FROM addresses
  WHERE street = '23 Fake Street' AND city = 'Fakeville'
  )
  WHERE first_name = 'Fred';

Most of this should be starting to look familiar, however this is the first time that we have done a query within a query. We're selecting theid fromaddresses for the address that we just inserted, in order to set the correct value on ourfriends table.

Now we can select our friends with addresses by using a JOIN:

SELECT * from friends LEFT JOIN addresses on address_id = addresses.id;

And we'll get this result:

 id | first_name | age | surname | address_id | id |     street     |   city    
----+------------+-----+---------+------------+----+----------------+-----------
  2 | Sam        |  38 | Brown   |            |    |                | 
  4 | Sam        |  29 | Blue    |            |    |                | 
  3 | Fred       |  21 | Green   |          1 |  1 | 23 Fake Street | Fakeville

Now you may be wondering what the point of that constraint we added was if we still need to specify which columns we're joining on. The constraint makes sure that there is a valid and appropriate relationship there to join on. Another benefit is found when using an ORM with your database. An ORM allows you to interact with the database in a more object oriented way and when a relationship is set up like this, it would allow you to do something like friend.address to access the address in the relationship. I'll look to go into more detail on this in another article, when I start writing about Sequelize.

Other Types of Join

You'll notice in the query above that I've used aLEFT JOIN. This will show all rows from the LEFT table (friends in this case) and only show the matching values from the right. There are a few other types ofJOIN to be aware of.

RIGHT JOIN — If we change our query to use a RIGHT JOIN instead we will get all the data from the right table (addresses) and only the values that match on the left table. To see this properly, I'm going to add another address which is not linked to a friend:

INSERT INTO addresses (street, city)
VALUES ('42 Test Road', 'Testtown');

SELECT * from friends RIGHT JOIN addresses on address_id = addresses.id;
 id | first_name | age | surname | address_id | id |     street     |   city    
----+------------+-----+---------+------------+----+----------------+-----------
  3 | Fred       |  21 | Green   |          1 |  1 | 23 Fake Street | Fakeville
    |            |     |         |            |  3 | 42 Test Road   | Testtown

INNER JOIN This will return only the rows that would have entries on both tables:

SELECT * from friends INNER JOIN addresses on address_id = addresses.id;

 id | first_name | age | surname | address_id | id |     street     |   city    
----+------------+-----+---------+------------+----+----------------+-----------
  3 | Fred       |  21 | Green   |          1 |  1 | 23 Fake Street | Fakeville

FULL JOIN This will return all rows for both tables:

SELECT * from friends FULL JOIN addresses on address_id = addresses.id;

 id | first_name | age | surname | address_id | id |     street     |   city    
----+------------+-----+---------+------------+----+----------------+-----------
  2 | Sam        |  38 | Brown   |            |    |                | 
  4 | Sam        |  29 | Blue    |            |    |                | 
  3 | Fred       |  21 | Green   |          1 |  1 | 23 Fake Street | Fakeville
    |            |     |         |            |  2 | 42 Test Road   | Testtown

This diagram is a helpful visual aid of the different joins:

Venn diagrams showing different types of joins
source

What I've covered so far is some of the most common things that you'd be doing with a database. In my next article I plan to talk through getting started with the Sequelize ORM I mentioned in this article.

Richard Bell

Self taught software developer with 11 years experience excelling at JavaScript/Typescript, React, Node and AWS.

I love learning and teaching and have mentored several junior developers over my career. I find teaching is one of the best ways to solidify your own learning, so in the past few years I've been maintaining a technical blog where I write about some things that I've been learning.

I'm passionate about building a teams culture and processes to make it efficient and satisfying to work in. In many roles I have improved the quality and reliability of the code base by introducing or improving the continuous integration pipeline to include quality gates.