Those trying out the upcoming PostgreSQL 15 will notice one less background process:
postgres 1710 1 0 04:03 ? 00:00:00 /usr/pgsql-15/bin/postmaster -D /var/lib/pgsql/15/data/
postgres 1711 1710 0 04:03 ? 00:00:00 postgres: logger
postgres 1712 1710 0 04:03 ? 00:00:00 postgres: checkpointer
postgres 1713 1710 0 04:03 ? 00:00:00 postgres: background writer
postgres 1715 1710 0 04:03 ? 00:00:00 postgres: walwriter
postgres 1716 1710 0 04:03 ? 00:00:00 postgres: autovacuum launcher
postgres 1717 1710 0 04:03 ? 00:00:00 postgres: logical replication launcher
Let's compare it with PostgreSQL 14:
postgres 1751 1 0 04:04 ? 00:00:00 /usr/pgsql-14/bin/postmaster -D /var/lib/pgsql/14/data/
postgres 1752 1751 0 04:04 ? 00:00:00 postgres: logger
postgres 1754 1751 0 04:04 ? 00:00:00 postgres: checkpointer
postgres 1755 1751 0 04:04 ? 00:00:00 postgres: background writer
postgres 1756 1751 0 04:04 ? 00:00:00 postgres: walwriter
postgres 1757 1751 0 04:04 ? 00:00:00 postgres: autovacuum launcher
postgres 1758 1751 0 04:04 ? 00:00:00 postgres: stats collector
postgres 1759 1751 0 04:04 ? 00:00:00 postgres: logical replication launcher
Yes, the stats collector process is gone. But getting rid of this process is a good thing, a major bottleneck and headache is gone forever.
What is the job of stats collector?
Novice users may wonder what it is and why it is required for PG 14 and earlier. At least some users will be confused about the collection of table-level statistics (ANALYZE) used for query plans. But this is different. PostgreSQL tracks all activity of each process to obtain cumulative statistics, such as the number of times a table or index was scanned, or the last time vacuum or autovacuum was run on a table, or the number of times autovacuum was run on a table, etc. All data collected by the stats collector is available through the different pg_stat_* views.
question
Since each backend of a session is a separate process in PostgreSQL, gathering statistics and transferring them is not trivial. Each backend sends information about the activities they do to a single "stats collector" process.
This communication used to be via UDP sockets. There are a lot of problems with this approach, it's not a scalable model.
Users often report different types of issues, such as: outdated stats, stats collector not running, autovacuum not working/starting, etc.
It used to be really hard to understand what went wrong if the stats collector had a problem on a particular machine.
Another adverse effect of the "stats collector" is the IO it causes. If you enable DEBUG level 2, you may see messages that keep appearing in the PostgreSQL log, such as:
2022-08-22 03:49:57.153 UTC [736] DEBUG: received inquiry for database 0
2022-08-22 03:49:57.153 UTC [736] DEBUG: writing stats file "pg_stat_tmp/global.stat"
2022-08-22 03:49:57.153 UTC [736] DEBUG: writing stats file "pg_stat_tmp/db_0.stat"
2022-08-22 03:49:57.168 UTC [1278] DEBUG: autovacuum: processing database "postgres"
2022-08-22 03:49:57.168 UTC [736] DEBUG: received inquiry for database 13881
2022-08-22 03:49:57.168 UTC [736] DEBUG: writing stats file "pg_stat_tmp/global.stat"
2022-08-22 03:49:57.168 UTC [736] DEBUG: writing stats file "pg_stat_tmp/db_13881.stat"
2022-08-22 03:49:57.169 UTC [736] DEBUG: writing stats file "pg_stat_tmp/db_0.stat"
This can cause a lot of IO to the mount point where the data directory is located. This is where the value of the parameter stats_temp_directory points to. On many systems it will be pg_stat_tmp in the data directory.
On Ubuntu/Debian it will be in /var/run/postgresql, for example:
postgres=# show stats_temp_directory ;
stats_temp_directory
-----------------------------------------
/var/run/postgresql/14-main.pg_stat_tmp
(1 row)
New in PostgreSQL 15
Start using dynamic shared memory to collect statistics instead of using files and filesystems.
Previously, the statistics collector received statistics updates over UDP and shared statistics by periodically writing the statistics to a temporary file. These files can reach tens of megabytes and are written up to two times per second. This prevents us from adding other useful stats.
Statistics are now stored in shared memory. Statistics for variable numbered objects are stored in a dshash hash table (backed by dynamic shared memory). Fixed-numbered statistics are stored in ordinary shared memory.
The header file for pgstat.c contains an overview of the architecture.
The stats collector is no longer needed, remove it.
Replica now removes stat entries for deleted objects, starting from a cleanly shut down replica no longer needs to reset stats.
Apparently, the parameter stats_temp_directory is missing. Therefore, we don't need the pg_stat_tmp directory, which is in the data directory (or elsewhere), where all statistics files are generated and read. However, this directory is kept because many extensions that depend on it, such as pg_stat_statements, are not broken.
The directory remains empty until the extension library is loaded. For example, if we load the pg_stat_statements library, a file will appear in the directory.
ls pg_stat_tmp/
pgss_query_texts.stat
Of course, extensions are not free, they also have corresponding costs.
In the new architecture, most stat updates are first accumulated as "pending" locally in each process (each backend has a backend local hash table). "pending" means they have been accumulated but not yet committed to the shared stats system. It will then be flushed to shared memory after a commit or timeout.
Read consistency occurs because statistics are updated at the same time someone tries to read. So PostgreSQL 15 introduces a new parameter: stats_fetch_consistency, which can take three values of none, cache or snapshot:
· "none" is the most efficient. However, read consistency will not be provided. But for most usage it should be fine.
"cache" ensures that repeated accesses yield the same value, which is necessary for cases involving e.g. self-joins.
"snapshot" is useful when checking statistics interactively, but is more expensive.
Defaults to "cache"
If it is in shared memory, how to keep it after restart?
Before being shut down, the checkpoint process will write these statistics to the file system, which can be loaded again after restart. Usually, if it crashes, statistics are lost.
Will this change affect my monitoring/scripting?
The monitoring views pg_stat_* will continue to function. However, make sure to get the proper value for stats_fetch_consistency. As mentioned above, the pg_stat_tmp directory is reserved for extensions only.
other
Many people, like me, use postgresql's wait events to understand postgresql and where sessions spend their time. Data collection and analysis tools, such as pg_gather, analyze problems by using wait events.
To better monitor postgresql, three new wait events have been introduced:
PgStatsDSA: Wait for statistics dynamic shared memory allocator access
PgStatsHash: Wait for stats shared memory hash table access
PgStatsData: Waiting for shared memory statistics access
With all the overhead of the statistics collector and its maintenance gone, other subsystems (like autovacuum) have less work to do.
Additionally, monitoring tools that frequently query statistics are expected to significantly reduce system load.