SQL Professionals Are Withholding These 9 Database Optimisation Tricks From You

Most developers learn just enough SQL to get by: SELECT, INSERT, UPDATE, DELETE, and perhaps a few JOINs. They may even add indexes to frequently queried columns. Yet there are many database optimisation techniques that can significantly boost performance without requiring a rewrite of your application.

I've worked as a database consultant for years, and in that time I've seen countless applications buckle under slow queries that could have been fixed with a few focused optimisations. These aren't arcane academic techniques; they're practical fixes that seasoned database administrators use every day.

The nine SQL optimisation techniques below can improve database performance dramatically, sometimes by orders of magnitude.

1. Partial Indexes: Focussing on What Really Counts
Partial indexes are one of the most effective techniques to speed up certain queries while minimising index overhead, yet few developers utilise them.

-- Standard index (indexes everything)
CREATE INDEX idx_orders_status ON orders(status);

-- Partial index (only indexes active orders, which might be 5% of the table)
CREATE INDEX idx_orders_active ON orders(created_at) WHERE status = 'active';


Why it works:

  • Much smaller index size, which means faster operations and lower disk/memory usage
  • Especially useful for columns with skewed value distributions
  • The index is updated only when a record matches the WHERE condition
  • Ideal for queries that always filter on the same value (e.g., active users, current orders)

 
On one client's e-commerce platform, only 2% of the rows in a 120GB orders table were active. Switching to a partial index cut their query times from seconds to milliseconds and shrank the index by 98%.
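A quick way to see the benefit: a query whose WHERE clause matches the index predicate can use the partial index directly. A minimal sketch (table and column names follow the example above; the id column is assumed):

-- Matches the index predicate, so idx_orders_active can be used
SELECT id, created_at
FROM orders
WHERE status = 'active'
  AND created_at > CURRENT_DATE - INTERVAL '7 days'
ORDER BY created_at DESC;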

2. Covering Indexes: Getting Rid of Table Access Completely
A covering index contains all of the columns that a query requires, so the database can use the index alone to answer the query without ever contacting the table.

-- Query that needs to be optimized
SELECT user_id, created_at, status 
FROM orders 
WHERE status = 'processing' 
ORDER BY created_at DESC 
LIMIT 10;

-- Covering index that includes all needed columns
CREATE INDEX idx_orders_status_created_user ON orders(status, created_at DESC, user_id);

Why it works:

  • Eliminates costly table lookups
  • Extremely fast for read-heavy workloads
  • Especially effective for pagination queries
  • Can convert random I/O into sequential I/O
I've seen a covering index cut a social media application's newsfeed query from 1.2 seconds to under 20 ms, a 60x improvement, without altering a single line of application code.

3. Expression Indexes: Making Functions Searchable
Ever used a function on a column in a WHERE clause? Without an appropriate index, it forces a full table scan.
-- Slow query using a function on a column
SELECT * FROM users 
WHERE LOWER(email) = 'user@example.com';

-- Create an expression index
CREATE INDEX idx_users_lower_email ON users(LOWER(email));

Why it works:

  • Lets the database use an index for function-based filtering
  • Ideal for case-insensitive lookups
  • Works with string operations, date manipulation, and more
  • Enables efficient sorting on computed values
This technique rescued a SaaS application that was doing a case-insensitive email lookup on every login: database CPU usage dropped by 45% as soon as the index was deployed.
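One caveat, shown as a minimal sketch on the same users table (the id column is assumed): the query must use exactly the expression that was indexed, or the planner falls back to scanning the table.

-- Can use idx_users_lower_email: the expression matches the index definition
SELECT id, email FROM users WHERE LOWER(email) = 'user@example.com';

-- Cannot use that index: a different expression (nothing indexes UPPER(email))
SELECT id, email FROM users WHERE UPPER(email) = 'USER@EXAMPLE.COM';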

4. INCLUDE-Based Index-Only Scans
Modern databases let you add non-key columns to an index; they aren't used for searching, but they are available for index-only scans.
-- PostgreSQL syntax (other databases have equivalents)
CREATE INDEX idx_products_category ON products(category_id) 
INCLUDE (name, price, image_url);

-- Now this query can be satisfied entirely from the index
SELECT name, price, image_url 
FROM products 
WHERE category_id = 42;

Why it works:

  • Gives you the benefits of a covering index with a smaller key size
  • Better suited to indexes used for range scans
  • Keeps the index structure itself lean and efficient
  • Reduces the need for many specialized indexes

After using this method, the category page load time for a client's product catalog with five million items decreased from 600 ms to 40 ms.

5. EXPLAIN ANALYZE: Seeing What Your Queries Actually Do
Not exactly a secret, but badly underused. EXPLAIN ANALYZE shows your query's actual execution plan along with precise timings:

-- PostgreSQL syntax (other databases have similar commands)
EXPLAIN ANALYZE
SELECT * FROM orders 
WHERE customer_id = 123 
AND created_at > CURRENT_DATE - INTERVAL '30 days';

What you will discover:

  • Whether your indexes are being used the way you expect
  • Where the time is actually being spent
  • How many rows are examined versus returned
  • Whether sorts happen in memory or spill to disk
  • Whether the optimizer's row estimates are accurate
I've lost count of the cases where developers spent days optimizing the wrong query because they didn't spend five minutes running EXPLAIN ANALYZE to find out what was actually slow.
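In PostgreSQL, adding the BUFFERS option shows how much data each plan node read from cache versus disk, which helps answer the "where is the time going" question. A minimal sketch using the same query as above:

-- PostgreSQL: BUFFERS reports shared buffer hits and disk reads per plan node
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders 
WHERE customer_id = 123 
AND created_at > CURRENT_DATE - INTERVAL '30 days';

In the output, compare each node's estimated row count with its actual row count, and watch for sequential scans on large tables where you expected an index scan.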

6. Materialized Views: Precomputing Complex Queries
Materialized views cache the result of a complex query so it can be read almost instantly, trading some data freshness and extra refresh work for read speed.
-- PostgreSQL syntax
CREATE MATERIALIZED VIEW daily_sales_summary AS
SELECT 
    date_trunc('day', created_at) AS day,
    product_id,
    SUM(quantity) AS units_sold,
    SUM(quantity * price) AS revenue
FROM order_items
JOIN orders ON order_items.order_id = orders.id
WHERE orders.status = 'completed'
GROUP BY 1, 2;

-- Refresh when needed
REFRESH MATERIALIZED VIEW daily_sales_summary;

Why it works:

  • Precomputes expensive aggregations and joins
  • Ideal for analytics and reporting queries
  • Can be indexed just like a regular table
  • Perfect for data that changes periodically rather than continuously
With materialized views that refresh every hour, a retail analytics dashboard that previously took more than 12 seconds to load now loads in less than 100ms.
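Because a materialized view can be indexed like an ordinary table, it's worth adding indexes that match how the view is queried. A minimal sketch for the view defined above; in PostgreSQL, the unique index also makes non-blocking refreshes possible:

-- Index the materialized view just like a regular table
CREATE UNIQUE INDEX idx_daily_sales_day_product
ON daily_sales_summary(day, product_id);

-- With a unique index in place, the view can be refreshed without blocking readers
REFRESH MATERIALIZED VIEW CONCURRENTLY daily_sales_summary;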

7. The Trap of the Multi-Column Index
Many developers don't realize that one multi-column index is not the same thing as a separate single-column index on each column they filter on.
-- Ineffective approach
CREATE INDEX idx_orders_customer ON orders(customer_id);
CREATE INDEX idx_orders_status ON orders(status);

-- For a query that filters on both columns
SELECT * FROM orders 
WHERE customer_id = 123 AND status = 'shipped';

-- Effective approach
CREATE INDEX idx_orders_customer_status ON orders(customer_id, status);
Why the second strategy is more effective:
  • The database typically uses only one index per table access in a given query.
  • Column order within the index matters a great deal.
  • With separate single-column indexes, often only one of them actually gets used.
  • With the right column order, a single multi-column index can serve several query patterns.
Simply understanding multi-column indexes and column ordering can double or triple query performance for common filtering patterns.
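A minimal sketch of why column order matters, assuming the usual leftmost-prefix rule for B-tree indexes:

-- Can use idx_orders_customer_status: the filter starts with the leading column
SELECT * FROM orders WHERE customer_id = 123;

-- Generally cannot use it efficiently: the leading column (customer_id) is not filtered
SELECT * FROM orders WHERE status = 'shipped';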

8. Sparse Indexes for High-Cardinality Data
Conventional B-tree indexes can become unwieldy on high-cardinality columns (columns with a very large number of distinct values). Sparse indexing techniques can help.
-- Instead of indexing every value, create a filtered index for important ranges
CREATE INDEX idx_logs_recent ON logs(timestamp, level, message) 
WHERE timestamp > CURRENT_DATE - INTERVAL '7 days';

-- Or use a function to reduce cardinality
CREATE INDEX idx_temperature_readings_hour ON sensor_data(
    sensor_id, 
    DATE_TRUNC('hour', timestamp)
);
Why it works:
  • Significantly smaller index size
  • Better selectivity for the most common queries
  • Lower maintenance overhead
  • Often more effective than directly indexing every value of a high-cardinality column
By using time-based sparse indexes instead of indexing every timestamp, one time-series application improved query performance while shrinking its indexes by 95%.
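A minimal sketch of a query that lines up with the hourly index above (the sensor_data columns come from the index definition; the reading column is an assumed name):

-- Groups readings per sensor per hour; the expression matches the index
SELECT sensor_id,
       DATE_TRUNC('hour', timestamp) AS hour,
       AVG(reading) AS avg_reading  -- 'reading' is an assumed column
FROM sensor_data
WHERE sensor_id = 42
  AND DATE_TRUNC('hour', timestamp) >= TIMESTAMP '2023-01-01 00:00'
GROUP BY sensor_id, DATE_TRUNC('hour', timestamp);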

9. Intelligent Partitioning: Beyond Simple Sharding
Table partitioning splits a big table into smaller, easier-to-manage chunks while keeping it a single logical table from your queries' point of view.
-- PostgreSQL declarative partitioning example
CREATE TABLE orders (
    id SERIAL,
    customer_id INTEGER,
    amount DECIMAL(10,2),
    created_at TIMESTAMP NOT NULL
) PARTITION BY RANGE (created_at);

-- Create partitions by month
CREATE TABLE orders_202301 PARTITION OF orders
    FOR VALUES FROM ('2023-01-01') TO ('2023-02-01');
    
CREATE TABLE orders_202302 PARTITION OF orders
    FOR VALUES FROM ('2023-02-01') TO ('2023-03-01');
    
-- And so on...
Why it works:
  • Partition pruning lets queries skip irrelevant partitions entirely
  • Maintenance operations can target individual partitions
  • Improves cache utilisation
  • Makes archiving old data much simpler
  • Partitions can be spread across different storage tiers
After applying appropriate time-based partitioning, a client's system holding three years of order data (billions of rows) saw a 50–200x speedup on its most common queries.
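A minimal sketch of partition pruning with the table above: because the filter is on the partition key, the planner only has to touch the February partition.

-- EXPLAIN shows that only the orders_202302 partition is scanned; the rest are pruned
EXPLAIN
SELECT customer_id, SUM(amount)
FROM orders
WHERE created_at >= '2023-02-01' AND created_at < '2023-03-01'
GROUP BY customer_id;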

Bringing Everything Together: A Practical Example
Let's examine a typical e-commerce scenario: showing a user's most recent orders with pagination.

Original Query:
SELECT o.id, o.created_at, o.total_amount, o.status, 
       COUNT(i.id) AS item_count
FROM orders o
LEFT JOIN order_items i ON o.id = i.order_id
WHERE o.user_id = 12345
GROUP BY o.id, o.created_at, o.total_amount, o.status
ORDER BY o.created_at DESC
LIMIT 20 OFFSET 0;
Strategy for Optimization:
  • Use a covering index on the orders table for filtering and sorting.
  • Add an INCLUDE clause for columns that are regularly retrieved but not filtered on.
  • Consider a materialized view for highly active users.
Optimal Method:
-- Create an efficient index
CREATE INDEX idx_orders_user_created ON orders(user_id, created_at DESC)
INCLUDE (total_amount, status);

-- Rewrite the query to pre-aggregate item counts in a subquery instead of grouping the whole result
SELECT o.id, o.created_at, o.total_amount, o.status, 
       COALESCE(i.item_count, 0) AS item_count
FROM orders o
LEFT JOIN (
    SELECT order_id, COUNT(*) AS item_count
    FROM order_items
    GROUP BY order_id
) i ON o.id = i.order_id
WHERE o.user_id = 12345
ORDER BY o.created_at DESC
LIMIT 20;
Even on tables with millions of rows, this combination of techniques can turn a query that took seconds into one that reliably returns results in under 10 ms.


Beyond SQL: When to Look at Other Approaches
Sometimes the best way to optimize SQL is to use less of it:

  • Caching: consider Redis or application-level caching if the data doesn't change often.
  • Denormalization: strategic duplication of data can avoid costly joins.
  • Read replicas: offload reporting queries to dedicated database instances.
  • Specialized databases: think about document stores for deeply hierarchical data, time-series databases for metrics, and so on.
The hallmark of a truly seasoned database professional is knowing when to optimize SQL and when to look beyond it, as I frequently remind clients during database consults at CodersStop.

Conclusion: Approach Database Performance Strategically

Database optimization isn't about applying every technique you know; it's about identifying your specific bottlenecks and applying the right solutions for your workload.

Begin with these steps:
  1. Find slow queries with monitoring tools or log analysis.
  2. Understand access patterns by analyzing your application code.
  3. Run EXPLAIN ANALYZE to see what is really happening.
  4. Apply targeted optimizations grounded in real data.
  5. Measure the results to confirm the improvement.
The techniques discussed in this article can resolve the majority of database performance problems. They require knowledge and careful implementation, not costly hardware upgrades or full application rewrites.

Which database optimization strategies have you learned through your work? Leave a comment below with your questions and experiences!

Hi There, I'm Yahya, and I enjoy sharing knowledge and experiences.