BanTool Documentation

Excellent! We'll continue the series with a very practical problem in almost every application: optimizing search functionality.

Case 20: Supercharging Search: `LIKE` vs. Full-Text Search

The Story

System: A knowledge-sharing website or a technical blog.
Table: articles, containing 2,000,000 articles.
Columns: id, title, content (of type TEXT).
The Problem: The application needs a powerful search function that allows users to find articles containing a set of keywords, for example: "optimize database performance".

The Common but Inefficient Approach: Using `ILIKE`

The simplest approach is to use multiple ILIKE (or case-insensitive LIKE) clauses.

sql

EXPLAIN ANALYZE
SELECT id, title
FROM articles
WHERE
    content ILIKE '%optimize%'
AND content ILIKE '%database%'
AND content ILIKE '%performance%';

The Bottleneck

Inevitable Seq Scan: Because the search condition starts with a wildcard (%), the database cannot use a standard B-Tree index. It is forced to perform a Seq Scan—sequentially reading all 2 million articles.
Not Linguistically Smart: LIKE is just a simple string comparison.
- No Stemming: A search for "optimize" will not find an article containing the word "optimizing."
- No Stop Words: A search for "how to do" will waste resources looking for extremely common words like "how" and "to."
- No Ranking: There is no way to know which article is more relevant. An article that mentions "database" once is treated the same as an in-depth article about "database."

The Solution: Use Full-Text Search (FTS) ✅

The Logic: PostgreSQL provides a powerful, built-in full-text search system. It works in two steps:
1. Indexing: It converts text into a special data type called a tsvector. This process splits words, reduces them to their root form (stemming), and removes stop words. This tsvector can then be efficiently indexed using a GIN index.
2. Querying: It converts the user's search string into a tsquery, which can understand operators like AND (&), OR (|), and NOT (!).

The Steps:

sql

-- Step 1 (do this once): Add a tsvector column to store the processed content
ALTER TABLE articles ADD COLUMN content_tsv tsvector;

-- Step 2: Populate the new column
UPDATE articles SET content_tsv = to_tsvector('english', title || ' ' || content);
-- Note: Choose the appropriate language.

-- Step 3: Create a GIN index to speed up searches
CREATE INDEX idx_articles_content_tsv ON articles USING GIN(content_tsv);

-- Step 4: Create a trigger to automatically update the tsv column on changes
-- (This is a crucial step in a real application to keep data fresh)

The Optimized Query:

sql

EXPLAIN ANALYZE
SELECT
    id,
    title,
    -- Rank the relevance of the results
    ts_rank(content_tsv, to_tsquery('english', 'optimize & database & performance')) as relevance
FROM articles
WHERE
    content_tsv @@ to_tsquery('english', 'optimize & database & performance')
ORDER BY
    relevance DESC
LIMIT 50;

Analyzing the Result

Superior Performance: Instead of a Seq Scan, the query will use the GIN index for a nearly instantaneous search, even on millions of rows. The execution time drops from seconds (or minutes) to just a few milliseconds.
Smarter Search Results:
- Thanks to stemming, a search for "optimize" will also return results for "optimizing" and "optimization."
- The & operator ensures that all keywords are present.
Ranking is Available: ts_rank allows you to order the results by relevance, bringing the best articles to the top and significantly improving the user experience.

Conclusion:

For text search functionality, LIKE and ILIKE are the wrong tools for the job.
PostgreSQL's Full-Text Search is a specialized, powerful solution that not only provides superior performance but also delivers much higher-quality search results.

Case 20: Supercharging Search: LIKE vs. Full-Text Search ​

The Story ​

The Common but Inefficient Approach: Using ILIKE ​

The Bottleneck ​