Monday, February 4, 2013

Use of PostgreSQL Indexes

I'm busy with investigating how indexing works in PostgreSQL and what are the ways to improve it. One particularly useful query I came across is this:

SELECT idstat.schemaname AS schema_name, idstat.relname AS table_name,
indexrelname AS index_name,
idstat.idx_scan AS times_used,
pg_size_pretty(pg_relation_size(idstat.relid)) AS table_size, pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
n_tup_upd + n_tup_ins + n_tup_del as num_writes,
indexdef AS definition
FROM pg_stat_user_indexes AS idstat JOIN pg_indexes ON (indexrelname = indexname AND idstat.schemaname = pg_indexes.schemaname)
JOIN pg_stat_user_tables AS tabstat ON idstat.relid = tabstat.relid
WHERE idstat.idx_scan  < 200
 AND indexdef !~* 'unique'
ORDER BY idstat.relname, indexrelname;

It returns the following information for each index in the database:

  • schema name
  • table name
  • index name
  • disk space used by the index, and the table
  • how many rows were inserted, deleted or updated
  • how many times the index was used

If the database was used for some time, the information may help to find out which indexes are not used at all, or used rarely but occupy a lot of space on the disk. Or it may suggest that something is not working as designed - a rarely used index that was expected to be used a lot is probably a warning sign.

The unfortunate complication that I came across was that the query returned absolutely no data after the database restore. My current understanding is that PostgreSQL does not backup the pg_catalog, and also the indexes are rebuilt when the database is restored. Therefore, if I do not have direct access to the database, I have to either ask someone to run the script (and give them the PostgreSQL/pgAdmin password), or somehow obtain a file system level copy of the database. In the future, I'll need to create a utility that extracts this information and saves it to a file.

References

Handling Growth with Postgres: 5 Tips From Instagram
Efficient Use of PostgreSQL Indexes
Finding Useless Indexes
9.20. System Administration Functions
27.2. The Statistics Collector
PostgreSQL index usage, backing up pg_catalog data
by . Also posted on my website

No comments: