Introduction

Managing performance is an ongoing task when working with database-backed applications. Slow-running queries can cause timeouts, degrade the user experience, consume more resources, and may even impact your budget depending on how you pay for your databases. These issues make it important to understand the performance characteristics of your database so that you can identify and fix problematic queries.

In this guide, we'll talk about different ways to identify poorly performing queries in PostgreSQL databases. Afterwards, we'll discuss different techniques you can use to fix slow queries to maintain your PostgreSQL performance.

Checking active queries and processes

The first place to check when trying to track down poorly performing queries is the list of currently active queries and processes. PostgreSQL makes this data available through the pg_stat_activity view.

The pg_stat_activity view is one of the views available within PostgreSQL's cumulative statistics system. It contains one row per server process, which helps you review what each process is working on at the moment.

To display all of the information within the view, type:

SELECT * FROM pg_stat_activity \gx
-[ RECORD 1 ]----+-------------------------------
datid |
datname |
pid | 1963
leader_pid |
usesysid |
usename |
application_name |
client_addr |
client_hostname |
client_port |
backend_start | 2022-11-01 11:03:44.083043+01
xact_start |
query_start |
state_change |
wait_event_type | Activity
wait_event | AutoVacuumMain
state |
backend_xid |
backend_xmin |
query_id |
query |
backend_type | autovacuum launcher
-[ RECORD 2 ]----+-------------------------------
datid |
datname |
pid | 1965
leader_pid |
usesysid | 10
usename | postgres
application_name |
client_addr |
client_hostname |
client_port |
backend_start | 2022-11-01 11:03:44.083926+01
xact_start |
query_start |
state_change |
wait_event_type | Activity
wait_event | LogicalLauncherMain
state |
backend_xid |
backend_xmin |
query_id |
query |
backend_type | logical replication launcher
-[ RECORD 3 ]----+-------------------------------
datid | 13921
datname | postgres
pid | 836027
leader_pid |
usesysid | 10
usename | postgres
application_name | psql
client_addr |
client_hostname |
client_port | -1
backend_start | 2022-11-06 20:20:18.273218+01
xact_start | 2022-11-06 20:39:01.207078+01
query_start | 2022-11-06 20:39:01.207078+01
state_change | 2022-11-06 20:39:01.207088+01
wait_event_type |
wait_event |
state | active
backend_xid |
backend_xmin | 762
query_id |
query | select * from pg_stat_activity
backend_type | client backend
-[ RECORD 4 ]----+-------------------------------
datid |
datname |
pid | 1961
leader_pid |
usesysid |
usename |
application_name |
client_addr |
client_hostname |
client_port |
backend_start | 2022-11-01 11:03:44.082354+01
xact_start |
query_start |
state_change |
wait_event_type | Activity
wait_event | BgWriterHibernate
state |
backend_xid |
backend_xmin |
query_id |
query |
backend_type | background writer
-[ RECORD 5 ]----+-------------------------------
datid |
datname |
pid | 1960
leader_pid |
usesysid |
usename |
application_name |
client_addr |
client_hostname |
client_port |
backend_start | 2022-11-01 11:03:44.082065+01
xact_start |
query_start |
state_change |
wait_event_type | Activity
wait_event | CheckpointerMain
state |
backend_xid |
backend_xmin |
query_id |
query |
backend_type | checkpointer
-[ RECORD 6 ]----+-------------------------------
datid |
datname |
pid | 1962
leader_pid |
usesysid |
usename |
application_name |
client_addr |
client_hostname |
client_port |
backend_start | 2022-11-01 11:03:44.082653+01
xact_start |
query_start |
state_change |
wait_event_type | Activity
wait_event | WalWriterMain
state |
backend_xid |
backend_xmin |
query_id |
query |
backend_type | walwriter

Note: Using the \gx line termination sequence instead of the traditional semicolon (;) tells psql to use expanded output mode for the current query. This displays the columns and associated values for each record vertically instead of horizontally, which can improve readability in some cases.

There are a number of fields in the output that can be helpful when looking for slower queries. Some of the most relevant ones include:

  • state: The current state of the process. Rows listed as active are currently executing a query. Other states include idle for processes waiting for a new client command, idle in transaction for processes waiting for the next command within a transaction, and idle in transaction (aborted) for transactions in which a statement caused an error.
  • query: The most recently executed query. For active processes, this will be the currently executing query.
  • usename: The name of the user associated with the process.
  • application_name: The name of the application connected to the process.
  • datname: The name of the database the user is connected to.
  • wait_event: The name of the event the process is waiting for, if any. If a process is in the active state and a wait_event is present, the query is currently blocked by some other part of the system.
  • wait_event_type: The category of event the process is waiting for.
  • pid: The process's process ID.
  • query_start: For active queries, the timestamp of when the current query started.
  • xact_start: The timestamp of when the current transaction began, if the process is executing a transaction.

We can filter the query by whatever columns are relevant for our current context. One helpful pattern is to use the age() function to calculate how long the query has been running. For example:

SELECT
    age(clock_timestamp(), query_start),
    usename,
    datname,
    query
FROM pg_stat_activity
WHERE
    state != 'idle'
    AND query NOT ILIKE '%pg_stat_activity%'
ORDER BY age DESC;

This will display the execution time, username, database, and query text for queries that are not idle. We order the results from the longest to shortest running queries and exclude this specific query from the results.

Similarly, you can see all processes that are not idle but do have a wait event:

SELECT
    usename,
    datname,
    query,
    wait_event_type,
    wait_event
FROM pg_stat_activity
WHERE
    state != 'idle'
    AND wait_event IS NOT NULL;

This can help you see queries that are not currently progressing because of other parts of the system (for instance, lock contention).
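
If you identify a query that has been running far too long or is stuck behind a wait event, the pid column gives you a handle for intervening. The following is a rough sketch; the process ID shown is a placeholder taken from the earlier example output, so substitute the pid you actually found:

-- Ask the backend to cancel its current query (the gentler option)
SELECT pg_cancel_backend(836027);

-- Terminate the backend's connection entirely if cancelling isn't enough
SELECT pg_terminate_backend(836027);

Cancelling stops only the current statement and leaves the client connected, while terminating closes the whole connection, so start with the gentler option.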

Check other system statistics

While the pg_stat_activity view will probably provide most of the information you need to identify slower queries, it can be useful to look at other system statistics as well to help identify additional targets for optimization.

Viewing database statistics

The pg_stat_database view contains statistics about each database:

SELECT * FROM pg_stat_database \gx
. . .
-[ RECORD 2 ]------------+------------------------------
datid | 13921
datname | postgres
numbackends | 1
xact_commit | 266
xact_rollback | 9
blks_read | 229
blks_hit | 11263
tup_returned | 118708
tup_fetched | 3563
tup_inserted | 0
tup_updated | 0
tup_deleted | 0
conflicts | 0
temp_files | 0
temp_bytes | 0
deadlocks | 0
checksum_failures |
checksum_last_failure |
blk_read_time | 0
blk_write_time | 0
session_time | 5303626.534
active_time | 200.906
idle_in_transaction_time | 0
sessions | 2
sessions_abandoned | 0
sessions_fatal | 0
sessions_killed | 0
stats_reset | 2022-11-06 20:20:18.279798+01
. . .

Some interesting columns for our purposes include:

  • blks_read: Number of disk blocks read in the database.
  • blks_hit: Number of times disk blocks were found in the buffer cache instead (avoiding a slow read from disk).
  • xact_commit: Number of transactions committed.
  • xact_rollback: Number of transactions rolled back.

As the Data Egret team shows on their blog, you can use these raw values to calculate interesting statistics like your cache hit ratio:

SELECT
    datname,
    100 * blks_hit / (blks_hit + blks_read) AS cache_hit_ratio
FROM
    pg_stat_database
WHERE
    (blks_hit + blks_read) > 0;
  datname  | cache_hit_ratio
-----------+-----------------
           |              99
 postgres  |              98
 template1 |              99
(3 rows)

This information can help you evaluate whether you would benefit from adding RAM to your database cluster so that your most common queries can be served from cache more effectively.
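
You can apply the same approach to the transaction counters. Here is a minimal sketch of a rollback ratio query; a high percentage can point to application errors or aborted workloads that are worth investigating:

SELECT
    datname,
    round(100.0 * xact_rollback / (xact_commit + xact_rollback), 2) AS rollback_ratio
FROM pg_stat_database
WHERE (xact_commit + xact_rollback) > 0;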

Viewing table statistics

Another helpful family of views is pg_stat_all_tables, pg_stat_user_tables, and pg_stat_sys_tables. The pg_stat_all_tables view shows access statistics for every table, while the other two views filter the results based on whether they are user tables or system tables.

SELECT * FROM pg_stat_all_tables \gx
. . .
-[ RECORD 104 ]-----+------------------------
relid | 1262
schemaname | pg_catalog
relname | pg_database
seq_scan | 5168
seq_tup_read | 20655
idx_scan | 20539
idx_tup_fetch | 20539
n_tup_ins | 0
n_tup_upd | 0
n_tup_del | 0
n_tup_hot_upd | 0
n_live_tup | 0
n_dead_tup | 0
n_mod_since_analyze | 0
n_ins_since_vacuum | 0
last_vacuum |
last_autovacuum |
last_analyze |
last_autoanalyze |
vacuum_count | 0
autovacuum_count | 0
analyze_count | 0
autoanalyze_count | 0

Some interesting columns in these views include:

  • seq_scan: The number of sequential scans that were run on the table.
  • seq_tup_read: The number of rows returned from sequential scans.
  • idx_scan: The number of index scans run against the table.
  • idx_tup_fetch: The number of rows retrieved through indexes.

The numbers in these columns can help you evaluate how your indexes are performing and whether they're being effectively used by the queries you're running. If you find that your tables have many sequential scans, you would probably benefit from creating additional indexes that can be used by your most common queries.
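
As a starting point, a query along these lines can surface the tables that are read mostly through sequential scans (the LIMIT is arbitrary and worth adjusting for your schema):

SELECT
    schemaname,
    relname,
    seq_scan,
    seq_tup_read,
    idx_scan
FROM pg_stat_user_tables
WHERE seq_scan > 0
ORDER BY seq_tup_read DESC
LIMIT 10;

Tables near the top with a high seq_scan count and little or no idx_scan activity are good candidates to examine for missing indexes.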

Viewing index hits

If you need more information about indexes you currently have, you can look at the pg_stat_all_indexes, pg_stat_user_indexes, and pg_stat_sys_indexes views:

SELECT * FROM pg_stat_all_indexes \gx
. . .
-[ RECORD 6 ]-+----------------------------------------------
relid | 1249
indexrelid | 2659
schemaname | pg_catalog
relname | pg_attribute
indexrelname | pg_attribute_relid_attnum_index
idx_scan | 822
idx_tup_read | 1670
idx_tup_fetch | 1670
. . .

These provide you with information about how often each of your indexes is used. The idx_scan column shows the number of times the index has been scanned. The idx_tup_read column shows the number of index entries returned by those scans, while idx_tup_fetch shows the number of live table rows fetched by scans of the index.

This information can help you spot indexes that are not being used by your queries. Once you identify them, you can either rewrite your queries to take advantage of the index or remove the unused index to improve write performance.
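
One way to surface candidates is to look for indexes that have not been scanned since the statistics were last reset. A minimal sketch:

SELECT
    schemaname,
    relname,
    indexrelname,
    idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY schemaname, relname;

Keep in mind that idx_scan says nothing about indexes that back primary key or unique constraints, which you will usually want to keep even if they are rarely scanned.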

Viewing lock information

Some of the information you gathered about slow queries might have pointed to a locking issue. You can find out more information about all of the locks that are currently held by querying the pg_locks view:

SELECT * FROM pg_locks \gx
-[ RECORD 1 ]------+----------------
locktype | relation
database | 13921
relation | 12290
page |
tuple |
virtualxid |
transactionid |
classid |
objid |
objsubid |
virtualtransaction | 3/3920
pid | 967262
mode | AccessShareLock
granted | t
fastpath | t
waitstart |
-[ RECORD 2 ]------+----------------
locktype | virtualxid
database |
relation |
page |
tuple |
virtualxid | 3/3920
transactionid |
classid |
objid |
objsubid |
virtualtransaction | 3/3920
pid | 967262
mode | ExclusiveLock
granted | t
fastpath | t
waitstart |

The output will provide information about all locks within PostgreSQL. This can help you diagnose contention issues that can occur when separate processes request control over the same objects.

Some columns that may help you investigate problematic locks include:

  • locktype: The type of lockable object.
  • database/relation/page/tuple: The object ID of the locked item. For databases and relations, these can be cross-referenced against the pg_database and pg_class catalogs.
  • mode: The lock mode that is held or requested.
  • granted: A boolean representing whether the lock was granted.
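
To connect this lock information back to the queries involved, you can join pg_locks against pg_stat_activity on the pid column. Here is a minimal sketch that lists sessions currently waiting for a lock:

SELECT
    locks.pid,
    locks.locktype,
    locks.mode,
    activity.query
FROM pg_locks AS locks
JOIN pg_stat_activity AS activity ON activity.pid = locks.pid
WHERE NOT locks.granted;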

Enable slow query logging

One way to find information about long running queries more easily is to enable slow query logging. With slow query logging enabled, PostgreSQL automatically records any query that takes longer than a given amount of time to execute. This lets you gather information about slow queries even when they aren't running at the moment of your investigation.

Check if PostgreSQL is already logging slow queries

The first thing you should do is verify the current state of slow query logging. If slow query logging is already enabled, you don't have to do anything.

You can check if slow query logging is enabled by typing:

SELECT * FROM pg_settings WHERE name = 'log_min_duration_statement'\gx
-[ RECORD 1 ]---+---------------------------------------------------------------------------
name | log_min_duration_statement
setting | -1
unit | ms
category | Reporting and Logging / When to Log
short_desc | Sets the minimum execution time above which all statements will be logged.
extra_desc | Zero prints all queries. -1 turns this feature off.
context | superuser
vartype | integer
source | default
min_val | -1
max_val | 2147483647
enumvals |
boot_val | -1
reset_val | -1
sourcefile |
sourceline |
pending_restart | f

The short_desc and extra_desc columns explain how to interpret the setting's value. Here, we can see that slow query logging is not enabled because the setting column is set to -1.

Now that you know the current state, you can change it as necessary.

Configure PostgreSQL to log slow queries

Before we move on, it is important to note that while slow query logging is incredibly useful, it comes with its own overhead. PostgreSQL must perform additional work to time each query and record the results to a log, which can affect performance and fill up disk space unexpectedly.

It may not be a good idea to log slow queries at all times. Instead, enable the functionality when you are actively investigating an issue and disable it when you are finished.

Logging slow queries globally

With that in mind, you can configure slow query logging globally by modifying the PostgreSQL server's configuration file. You can also modify these values interactively, but setting good defaults in the configuration will make it easier to tweak interactively later.

Open PostgreSQL's configuration file. You can find the location of the current configuration file by typing:

SHOW config_file;
config_file
-----------------------------------------
/etc/postgresql/14/main/postgresql.conf
(1 row)

Inside the file, search for the log_min_duration_statement setting. As with the default value we saw above, it will be set to -1, indicating that the functionality is currently disabled. There are also a number of other related settings that you can tweak depending on your needs:

. . .
# Query logging configuration
#log_min_duration_statement = -1 # -1 is disabled, 0 logs all statements
# and their durations, > 0 logs only
# statements running at least this number
# of milliseconds
#log_min_duration_sample = -1 # -1 is disabled, 0 logs a sample of statements
# and their durations, > 0 logs only a sample of
# statements running at least this number
# of milliseconds;
# sample fraction is determined by log_statement_sample_rate
#log_statement_sample_rate = 1.0 # fraction of logged statements exceeding
# log_min_duration_sample to be logged;
# 1.0 logs all such statements, 0.0 never logs
#log_transaction_sample_rate = 0.0 # fraction of transactions whose statements
# are logged regardless of their duration; 1.0 logs all
# statements from all transactions, 0.0 never logs
. . .

Here, the log_min_duration_statement setting is commented out and shows its default value of -1. The other settings are well-commented within the file and allow you to log only a sample of the statements that exceed the minimum duration instead of logging all of them. The last setting lets you sample statements that occur within transactions as well.
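
For example, on a busy server you might choose to log only a fraction of the statements that cross a lower threshold rather than every one of them. The values below are purely illustrative:

log_min_duration_sample = 500ms         # consider statements slower than 500 ms
log_statement_sample_rate = 0.25        # log roughly a quarter of those statements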

You can turn on long query logging by uncommenting the log_min_duration_statement line and setting it to another value. For instance, we can set it to 5 seconds to log any statements that take longer than that to complete:

log_min_duration_statement = 5s

After saving the file, you can reload your PostgreSQL server from within PostgreSQL by typing:

SELECT pg_reload_conf();

You can verify that the server is using your new settings by checking the current value again:

SELECT * FROM pg_settings WHERE name = 'log_min_duration_statement'\gx
-[ RECORD 1 ]---+---------------------------------------------------------------------------
name | log_min_duration_statement
setting | 5000
unit | ms
category | Reporting and Logging / When to Log
short_desc | Sets the minimum execution time above which all statements will be logged.
extra_desc | Zero prints all queries. -1 turns this feature off.
context | superuser
vartype | integer
source | configuration file
min_val | -1
max_val | 2147483647
enumvals |
boot_val | -1
reset_val | 5000
sourcefile | /etc/postgresql/14/main/postgresql.conf
sourceline | 506
pending_restart | f

Now, the setting field is set to 5000 and the unit field is set to ms, indicating that our setting of 5 seconds has been translated to 5000 milliseconds and applied. The sourcefile line also confirms that this value is being read from the configuration file we modified.

Logging slow queries per database

Another option when trying to detect slow queries is to limit slow query logging to a specific database. While log_min_duration_statement can be set globally, as we showed in the last section, it can also be configured at the database level.

To turn on slow query logging for a single database, use the ALTER DATABASE command:

ALTER DATABASE helloprisma SET log_min_duration_statement = 2000;
ALTER DATABASE

Note that unlike the global setting, when using the ALTER DATABASE command, the value must be a unit-less integer representing time in milliseconds.

We can verify that the setting has been applied by querying for the per-database role settings:

\drds
              List of settings
 Role |  Database   |            Settings
------+-------------+---------------------------------
      | helloprisma | log_min_duration_statement=2000
(1 row)

We can verify that this hasn't interfered with the global setting that we previously set to a 5-second threshold:

SELECT * FROM pg_settings WHERE name = 'log_min_duration_statement'\gx
-[ RECORD 1 ]---+---------------------------------------------------------------------------
name | log_min_duration_statement
setting | 5000
unit | ms
category | Reporting and Logging / When to Log
short_desc | Sets the minimum execution time above which all statements will be logged.
extra_desc | Zero prints all queries. -1 turns this feature off.
context | superuser
vartype | integer
source | configuration file
min_val | -1
max_val | 2147483647
enumvals |
boot_val | -1
reset_val | 5000
sourcefile | /etc/postgresql/14/main/postgresql.conf
sourceline | 506
pending_restart | f
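
Note that pg_settings reflects the value in effect for your current session, and per-database settings are only applied when a new connection to that database is established. To confirm the lower threshold, you can reconnect to the database and check the value directly:

\c helloprisma
SHOW log_min_duration_statement;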

Testing slow query logging

Test the setting out by issuing a statement that exceeds the minimum logging duration:

SELECT pg_sleep(10);
pg_sleep
----------
(1 row)

Check the logs and you should find statements indicating that a long running query occurred:

2022-11-11 17:58:04.719 CET [1121088] postgres@postgres STATEMENT: select sleep(10);
2022-11-11 17:58:42.635 CET [1121088] postgres@postgres LOG: duration: 10017.171 ms statement: select pg_sleep(10);

Since we have different thresholds for the global limit and for a specific database, we can test that each is being applied correctly by using a query time that should trigger one but not the other.

For example, we can connect to the database that has a lower threshold and sleep for 4 seconds, which should trigger a log line:

\c helloprisma
SELECT pg_sleep(4);

Our logs show:

2022-11-13 14:46:07.361 CET [1252789] postgres@helloprisma STATEMENT: alter database helloprisma set log_min_duration_statement=2s;
2022-11-13 14:53:05.027 CET [1309069] postgres@helloprisma LOG: duration: 4022.546 ms statement: select pg_sleep(4);

Now, we can switch to a different database that should only be affected by the global setting. The same sleep statement should not trigger a log line:

\c postgres
SELECT pg_sleep(4);

No new log lines should be recorded.
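
When you have finished investigating, remember to turn the logging back off so it doesn't keep adding overhead and filling your logs. You can remove the per-database override and restore the global setting to -1 in the configuration file, reloading afterwards:

-- Remove the per-database override
ALTER DATABASE helloprisma RESET log_min_duration_statement;

-- After setting log_min_duration_statement back to -1 in postgresql.conf:
SELECT pg_reload_conf();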

Conclusion

In this article, we covered how to view and understand some of the performance information that PostgreSQL makes available. Viewing this information can give you insight into bottlenecks in your system resources, query patterns, and configuration settings. When you experience slow performance, you can check the information that PostgreSQL provides to begin investigating the problematic behavior.

We also discussed how to use slow query logging to pinpoint exactly which queries are tying up system resources and taking longer to execute than expected. Recording this data and evaluating the resulting logs can help you identify places where you might need additional indexes, a different query structure, or a more efficient query design. Knowing how to identify these expensive operations is the first step towards running more functional database-backed applications.

About the Author(s)
Justin Ellingwood

Justin has been writing about databases, Linux, infrastructure, and developer tools since 2013. He currently lives in Berlin with his wife and two rabbits. He doesn't usually have to write in the third person, which is a relief for all parties involved.