Skip to content

Chapter 1: Designing the Database

Excellent! Let's start by designing the database for our blog system. The goal is to create a schema that is both normalized and easily expandable, while also being pre-optimized for common queries.


1. Core Entities

We'll begin with four core entities:

  • User: The user or author.
  • Post: A blog article.
  • Comment: A comment on a post.
  • Tag: A tag or topic.

2. Relationships

  • UserPost: One-to-Many (One User can write many Posts).
  • PostComment: One-to-Many (One Post can have many Comments).
  • UserComment: One-to-Many (One User can write many Comments).
  • PostTag: Many-to-Many (One Post can have many Tags, and one Tag can be assigned to many Posts). This relationship will require a join table.

3. Detailed Schema Design

Here is the SQL structure to create the tables, with optimized choices already applied:

sql
-- Users Table
CREATE TABLE users (
    id UUID PRIMARY KEY, -- UUIDv7 will be generated by the application
    username VARCHAR(50) UNIQUE NOT NULL,
    email VARCHAR(255) UNIQUE NOT NULL,
    password_hash TEXT NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Posts Table
CREATE TABLE posts (
    id UUID PRIMARY KEY,
    author_id UUID NOT NULL REFERENCES users(id),
    title VARCHAR(255) NOT NULL,
    slug VARCHAR(255) UNIQUE NOT NULL, -- For friendly URLs, indexed
    content TEXT,
    published_at TIMESTAMPTZ, -- NULL if it's a draft
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),

    -- Denormalization for read performance (See Case 15)
    author_username VARCHAR(50) NOT NULL
);

-- Comments Table
CREATE TABLE comments (
    id UUID PRIMARY KEY,
    post_id UUID NOT NULL REFERENCES posts(id),
    author_id UUID NOT NULL REFERENCES users(id),
    parent_id UUID REFERENCES comments(id), -- For nested comments (See Case 20)
    content TEXT NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Tags Table
CREATE TABLE tags (
    id SERIAL PRIMARY KEY, -- Use SERIAL for tags as they are few and don't need UUIDs
    name VARCHAR(50) UNIQUE NOT NULL,
    slug VARCHAR(50) UNIQUE NOT NULL
);

-- Join table for the many-to-many relationship
CREATE TABLE post_tags (
    post_id UUID NOT NULL REFERENCES posts(id) ON DELETE CASCADE,
    tag_id INT NOT NULL REFERENCES tags(id) ON DELETE CASCADE,
    PRIMARY KEY (post_id, tag_id) -- Composite primary key
);

4. Initial Indexing Strategy

In addition to the indexes automatically created for PRIMARY KEY and UNIQUE constraints, we need to add the following indexes to serve common queries:

sql
-- Speed up finding posts by an author and sorting by creation date
CREATE INDEX idx_posts_author_id_created_at ON posts(author_id, created_at DESC);

-- Speed up finding comments for a specific post
CREATE INDEX idx_comments_post_id ON comments(post_id);

-- Speed up the reverse lookup from a tag to posts
CREATE INDEX idx_post_tags_tag_id ON post_tags(tag_id);

5. Analysis of Optimization Decisions

  • UUID for Primary Keys: We will assume the application generates UUIDv7 (See Case 18). This helps avoid the database becoming a bottleneck for ID generation, and the index will be more efficient than with UUIDv4.
  • TIMESTAMPTZ (Timestamp with Time Zone): Always use this type for time-related columns to avoid timezone confusion.
  • Denormalization in the posts table: Storing author_username directly means we don't need to JOIN with the users table when displaying a list of posts, making read queries much faster.
  • ON DELETE CASCADE: In the post_tags table, this ensures that when a post or a tag is deleted, the related links are automatically removed, keeping the data clean.