PostgreSQL Vacuum and Analyze, Maintenance and Performance

I have been using PostgreSQL for a couple of years now, and I am very happy with its performance and its growth as an open source relational database. Today I would like to share how VACUUM and ANALYZE can help your database perform better.

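VACUUM reclaims the storage occupied by the rows that your deletes and updates leave behind. When you delete or update a record, the old row version is not removed right away; it stays in the table as a dead row until a VACUUM is done. A plain VACUUM makes that space available for reuse by the same table rather than handing it back to the operating system. For frequently updated tables, regular vacuuming is highly recommended.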

ANALYZE helps improve your queries by collecting statistics about the contents of the tables in the database. The results of the analysis are stored in the pg_statistic system catalog. The query planner uses these statistics to determine the most efficient execution plans, which improves the performance of your queries.

During routine maintenance, VACUUM and ANALYZE are run together as VACUUM ANALYZE to free up storage and improve the performance of your queries.

VACUUM FULL ANALYZE is worth considering for tables that have just gone through a bulk delete or update. Do note that VACUUM FULL rewrites the whole table, so the affected table is locked exclusively until the process is done, and extra disk space is required for the new copy. A normal VACUUM does not block reads and writes on the table, but it does add overhead while running alongside a live database.

The autovacuum daemon is turned on by default, and its default settings suit most scenarios. The daemon determines whether a table requires VACUUM, ANALYZE, or both, and runs them in the background. Even though it is turned on, you will notice that tables with few rows, or with few changes, are often not picked up by the autovacuum and autoanalyze routine, because they never cross the thresholds that trigger it.
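To see why small tables tend to be skipped, it helps to work through the trigger formula from the PostgreSQL documentation: a table is vacuumed once its dead rows exceed a base threshold plus a scale factor times its row count. The sketch below assumes the stock defaults (autovacuum_vacuum_threshold = 50, autovacuum_vacuum_scale_factor = 0.2, autovacuum_analyze_threshold = 50, autovacuum_analyze_scale_factor = 0.1); the function names are made up for illustration, and your server may be configured differently.

```python
def autovacuum_threshold(row_count, base_threshold=50, scale_factor=0.2):
    """Dead rows that must be exceeded before autovacuum vacuums a table.

    Defaults mirror the stock PostgreSQL settings (an assumption; check
    your own postgresql.conf or per-table storage parameters).
    """
    return base_threshold + scale_factor * row_count


def autoanalyze_threshold(row_count, base_threshold=50, scale_factor=0.1):
    """Changed rows that must be exceeded before autovacuum analyzes a table."""
    return base_threshold + scale_factor * row_count


if __name__ == "__main__":
    # Compare a small lookup table with larger ones.
    for rows in (100, 10_000, 1_000_000):
        print(rows, autovacuum_threshold(rows), autoanalyze_threshold(rows))
```

With these defaults, a 100-row table needs more than 70 dead rows before the daemon vacuums it, so a small table that changes only occasionally can go a long time without being touched.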

How do we know?

SELECT relname, last_analyze, last_vacuum, last_autoanalyze, last_autovacuum FROM pg_stat_all_tables
WHERE schemaname = 'public'
ORDER BY relname;

Here is the Python script that runs VACUUM ANALYZE on the tables not covered by the daemon. You can set the number of days to look back when checking whether PostgreSQL has already run an autovacuum.
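A minimal sketch of such a script follows. It is an illustration rather than a definitive implementation, and it rests on a few assumptions: the psycopg2 driver is installed, only the public schema is checked, and a table counts as stale when last_autovacuum or last_autoanalyze is empty or older than the given number of days. The helper names (quote_ident, is_stale, statements_for) are hypothetical.

```python
import sys
from datetime import datetime, timedelta, timezone

# Same statistics view as the query above, limited to the daemon's timestamps.
STATS_QUERY = """
    SELECT schemaname, relname, last_autovacuum, last_autoanalyze
    FROM pg_stat_all_tables
    WHERE schemaname = 'public'
    ORDER BY relname
"""


def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def is_stale(last_autovacuum, last_autoanalyze, cutoff):
    """True if either daemon timestamp is missing or older than cutoff."""
    return any(ts is None or ts < cutoff
               for ts in (last_autovacuum, last_autoanalyze))


def statements_for(rows, cutoff):
    """Build one VACUUM ANALYZE statement per stale table."""
    return [
        "VACUUM ANALYZE {}.{}".format(quote_ident(schema), quote_ident(table))
        for schema, table, last_vac, last_an in rows
        if is_stale(last_vac, last_an, cutoff)
    ]


def main(host, database, user, password, days):
    import psycopg2  # third-party driver, assumed to be installed

    cutoff = datetime.now(timezone.utc) - timedelta(days=int(days))
    conn = psycopg2.connect(host=host, dbname=database,
                            user=user, password=password)
    conn.autocommit = True  # VACUUM cannot run inside a transaction block
    try:
        with conn.cursor() as cur:
            cur.execute(STATS_QUERY)
            for statement in statements_for(cur.fetchall(), cutoff):
                print(statement)
                cur.execute(statement)
    finally:
        conn.close()


# Only run when all five arguments are given:
# host, database, user, password, days
if __name__ == "__main__" and len(sys.argv) == 6:
    main(*sys.argv[1:6])
```

Building the statements in a separate function keeps the selection logic testable without a database, and quoting the identifiers guards against table names with spaces or mixed case.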

Schedule the Python script to run at 4 a.m. (off-peak hours) using crontab.

$ crontab -e
00 04  * * *   /python/path/python /script/path/vacuum_analyze.py [host] [database] [user] [password] [days]