Introduction to PostgreSQL Indexes

(dlt.github.io)

328 points | by dlt 13 days ago

12 comments

cdiamand 13 days ago

Linking to the postgresql docs since they are very well written and surprisingly enjoyable to read.
https://www.postgresql.org/docs/current/indexes-intro.html
brudgers 12 days ago

Related: Use the Index, Luke
https://use-the-index-luke.com/
jihadjihad 13 days ago

The section on multi-column indexes mirrors how I was taught and how I’ve generally handled such indexes in the past. But is it still true for more recent PG versions? I had an index and query similar to the third example, and IIRC PG was able to use an index, though I believe it was a bitmap index scan.
I am also unsure of the specific perf tradeoffs between index scan types in that case, but when I saw that happen in the EXPLAIN plan it was enough for me to call into question what had been hardcoded wisdom in my mind for quite some time.
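To make the bitmap case concrete, here is a toy Python model of the idea, not PostgreSQL's actual implementation. It assumes a page-level bitmap (real bitmaps can also track individual tuples), and every name in it (`heap`, `index`, `bitmap_style_scan`) is made up for illustration. The query filters only on the second column of an (a, b) index.

```python
# Toy model only: a "heap" of fixed-size pages and a two-column index on (a, b).
# None of these names correspond to PostgreSQL internals.
ROWS_PER_PAGE = 4

rows = [(a, b) for a in range(5) for b in range(8)]  # 40 rows
heap = [rows[i:i + ROWS_PER_PAGE] for i in range(0, len(rows), ROWS_PER_PAGE)]

# Index entries are (a, b, page_no), kept sorted by (a, b).
index = sorted(
    (a, b, page_no)
    for page_no, page in enumerate(heap)
    for (a, b) in page
)


def bitmap_style_scan(b_wanted):
    # Step 1: walk the whole index (there is no condition on the leading
    # column to narrow it) and remember only which pages might hold matches.
    candidate_pages = sorted(
        {page_no for (_, b, page_no) in index if b == b_wanted}
    )
    # Step 2: visit each candidate page once, in physical order, and
    # recheck the condition against every row stored on it.
    matches = []
    for page_no in candidate_pages:
        matches.extend(row for row in heap[page_no] if row[1] == b_wanted)
    return candidate_pages, matches


pages, matches = bitmap_style_scan(3)
print(len(pages), "of", len(heap), "pages visited")
```

In this model the scan still reads every index entry, but it touches each candidate heap page only once and skips the pages that cannot match.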
Further essential reading is the classic Use The Index, Luke [0] site, and the book is a great buy for the whole team.
0: https://use-the-index-luke.com/

- petergeoghegan 12 days ago
  
  > The section on multi-column indexes mirrors how I was taught and how I’ve generally handled such indexes in the past. But is it still true for more recent PG versions?
  No, it isn't. PostgreSQL 18 added support for index skip scan:
  https://youtu.be/RTXeA5svapg?si=_6q3mj1sJL8oLEWC&t=1366
  It's actually possible to use a multicolumn index with a query that only has operators on its lower-order columns in earlier versions. But that requires a full index scan, which is usually very inefficient.
  
  - dlt 12 days ago
    
    Hi Peter, author here. Thanks for weighing in with the extra context on index skip scan, and huge thanks for adding this to Postgres.
    I’m going to revise the multi-column index section to be more precise about when leftmost-prefix rules apply, and I’ll include a note on how skip scan changes the picture.
- glenjamin 12 days ago
  
  A bitmap index scan lets the database narrow down which pages could contain the data, but it still has to recheck the condition against the contents of those pages, so it will still not be as performant as a proper index scan.
  
  - isbvhodnvemrwvn 12 days ago
    
    With postgres indexes not containing liveness data for tuples you'll have to hit quite a lot of those pages anyway, unless they are frozen.
zozbot234 13 days ago

It would be nice to see out-of-the-box support in PostgreSQL for what's known as incremental view maintenance. It's very much like an index in that it gets updated automatically when the underlying data changes, but it supports that for arbitrary views, not just the special cases that ordinary database indexes cover.
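As a rough illustration of what "incremental" means here, the Python sketch below maintains a SUM ... GROUP BY result by applying per-row deltas instead of recomputing from the base table. It is a toy under stated assumptions: the class `SumByKeyView` and its methods are hypothetical, it handles only one aggregate, and it ignores transactions entirely.

```python
class SumByKeyView:
    """Toy stand-in for a view like: SELECT key, SUM(amount) ... GROUP BY key."""

    def __init__(self):
        self.base = []   # base table rows: (key, amount)
        self.view = {}   # maintained result: key -> [row_count, running_sum]

    def insert(self, key, amount):
        self.base.append((key, amount))
        entry = self.view.setdefault(key, [0, 0])
        entry[0] += 1        # one more row in this group
        entry[1] += amount   # apply the delta to the stored sum

    def delete(self, key, amount):
        self.base.remove((key, amount))
        entry = self.view[key]
        entry[0] -= 1
        entry[1] -= amount
        if entry[0] == 0:    # group is now empty, so it leaves the view
            del self.view[key]

    def snapshot(self):
        # Current contents of the incrementally maintained view.
        return {key: total for key, (_, total) in self.view.items()}

    def recompute(self):
        # Full recomputation from the base table, for comparison only.
        fresh = {}
        for key, amount in self.base:
            fresh[key] = fresh.get(key, 0) + amount
        return fresh


v = SumByKeyView()
v.insert("a", 10)
v.insert("b", 5)
v.insert("a", 7)
v.delete("a", 10)
print(v.snapshot() == v.recompute())
```

The row count per group is kept alongside the sum so that a delete can tell when a group should disappear without rescanning the base table.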

- BenoitP 12 days ago
  
  A hard problem, especially wrt transactions on a moving target.
  From memory, a handful of projects are dedicated to just this dimension of databases: Noria, Materialize, Apache Flink, GCP's Continuous Queries, Apache Spark Streaming Tables, Delta Tables, ClickHouse streaming tables, TimescaleDB, ksqlDB, StreamSQL; and probably dozens more. IIRC, since this is about Postgres, there is a recently created extension that tries to deal with this: pg_ivm
- lispisok 12 days ago
  
  If you have time-series data, TimescaleDB has this with continuous aggregates.
turbocon 13 days ago

This looks really awesome for Postgres.
For general B-tree index resources, this has been my go-to site for years: https://use-the-index-luke.com/
augusteo 12 days ago

Good timing for this article. The multi-column index advice was always confusing because the "leading column" rules had real performance implications, but bitmap index scans made it less catastrophic than the textbooks suggested.
Skip scan in PG 18 changes a lot of that conventional wisdom. Worth updating the mental model for anyone who learned indexing on older versions.
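A small Python sketch of that mental model, treating an index on (a, b) as nothing more than a sorted list of tuples. This is an assumption-laden toy, not how PostgreSQL implements skip scan; the function names and data are invented, and the entries are unique integer pairs.

```python
from bisect import bisect_left

# Toy "index" on (a, b): a sorted list of unique tuples. Hypothetical data
# with 3 distinct leading values and 1000 second-column values each.
index = sorted((a, b) for a in range(1, 4) for b in range(1, 1001))


def lookup_by_a(a):
    # Leading column constrained: binary search finds one contiguous range.
    lo = bisect_left(index, (a,))
    hi = bisect_left(index, (a + 1,))
    return index[lo:hi]


def full_scan_by_b(b):
    # Only the second column constrained, no skipping: every entry is examined.
    hits, examined = [], 0
    for entry in index:
        examined += 1
        if entry[1] == b:
            hits.append(entry)
    return hits, examined


def skip_scan_by_b(b):
    # Skip-scan idea: for each distinct leading value, probe (a, b) directly,
    # then jump to the first entry of the next distinct leading value.
    hits, probes, pos = [], 0, 0
    while pos < len(index):
        a = index[pos][0]
        probes += 1
        i = bisect_left(index, (a, b))
        if i < len(index) and index[i] == (a, b):
            hits.append(index[i])
        pos = bisect_left(index, (a + 1,))  # skip the rest of this a-group
    return hits, probes


print(full_scan_by_b(500)[1], "entries examined without skipping")
print(skip_scan_by_b(500)[1], "probes with skipping")
```

In this toy, the number of probes equals the number of distinct leading values, so the approach helps most when the leading column has few distinct values and helps little when it has many.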
morshu9001 12 days ago

The whole btree vs hash discussion is interesting. Many people assume "ID" columns should be hash, but iirc the default btree works best for those. Also, tree-like structures are fundamentally better for nearly-sequential value insertion.
The blog post that this links to comes to the opposite conclusion though, showing hash winning the benchmarks.
joaomsa 13 days ago

Essential reading. More in-depth than an introduction, yet not so impenetrable that only people dealing with the internals could follow it.
zmmmmm 12 days ago

I love this style of writing. Simple, humble and direct transfer of knowledge.
Anonyneko 12 days ago

Is there a use-the-index-luke for MongoDB...?