Write Ahead Logging (WAL) is a hugely popular database technique to ensure database atomicity and durability. It operates by logging writes to secure storage before making any permanent changes to the database. In this way, even if the database server crashes, we will have all our writes secured.

💡 If you read carefully, the name Write Ahead Logging (WAL) aptly describes this technique. You log (Logging) before writing (Write Ahead) to the database.

Postgres also makes use of this technique and add log entries to the Write Ahead Log (WAL), aka Transaction Log, before actually performing any modifications to the database. The Write Ahead Log (WAL) is an append only log which means that new entries are always added to the end of the log file and once an entry is added it is not modified again.

Here is a quick overview of how a transaction is committed in Postgres ~

  1. Postgres server receives request to modify parts of the database
  2. A WAL entry is appended to describe the changes needed to be performed in database
  3. Instead of modifying the data files, changes are performed only in the primary memory. These changes are then synced to the data files at a later point in time
  4. Once step 2 and 3 are completed, the transaction is deemed committed.
None
Behind the curtains of committing transaction in Postgres

As records are added to WAL, they are assigned an identifier called the Log Sequence Number (LSN). LSN denotes the byte offset at which a particular record is present in the log file. The difference between the LSN values of two records indicates the volume of data (in bytes) that separates the two.

We need to understand that the WAL is not a single huge file. Instead, the WAL is composed up of a number of small segment files. By default these segments are of 16 MB but this size can be configured. All of these segments are kept under the pg_wal folder inside the data directory of your Postgres installation.

💻 Execute the command SHOW data_directory; as Postgres client to find the data directory. You should find the pg_wal folder there!

None

Having understood WAL, a natural question to ask at this point is ~

"Why not simply write to the database? What is the point of updating the log and then updating the database? Inserting a log entry wouldn't be cheap either!"

Writes to WAL are sequential compared to random writes to the data files. It is thus important to understand that writing to WAL is not as costly as updating the data files. When a large amount of changes are required, writing to WAL would almost always be faster than that in the data files.

It is thus wise to batch the write operations to the data files and perform them in bulk. This is exactly what is performed in Postgres!

Postgres maintains a portion of the database as cache in the shared buffer. This is done to facilitate faster retrieval of frequently accessed data. Postgres' backend process requests the Buffer manager for a page and the Buffer manager either loads the page from disk to shared buffer or directly returns it from the shared buffer.

None
Shared Buffer Pool

Whenever a database modification is performed, it is first kept as modifications to the in-memory pages kept in shared buffer. Modified pages in shared buffer differs from the corresponding pages in the memory. Such pages are called dirty pages.

Buffer manager flushes these dirty pages to disk with the assistance of two subsystems — Checkpointer and Background Writer.

Checkpointer

A checkpoint is a special point in WAL at which all the changes made to the database are guaranteed to be written to the data files. This means that at the time of a checkpoint, the database is in a consistent state, and any data modifications are safely stored on disk.

At every checkpoint, the dirty pages are flushed to the disk. The frequency of checkpointing process depends on values of checkpoint_segment and checkpoint_timeout.

The checkpoint_segment indicates the maximum number of log segments between two checkpoints and the checkpoint_timeout indicates the maximum time between two checkpoints.

None
Checkpoint Process Summary

When the checkpoint is completed, a special checkpoint record is inserted in the WAL. This record is known as the REDO record.

Since log entries before a checkpoint are all guaranteed to be already applied to the database, there is no need to keep holding on to these entries and they can thus be re-cycled and removed. This will free up the WAL storage.

Background Writer

The primary function of the background writer is to ensure a consistent availability of clean buffers (pages in memory) for the database, thus allowing backend processes that handle individual queries to avoid performing their own disk writes.

Background Writer moves some of the modified data to the disk offloading the Checkpointer.

Archiving

As we keep modifying the data in the database on a server, WAL files keep getting generated (and discarded after a while). If we could somehow save a copy of each segment generated, we could replay back the entire set of changes into another server. Doing this in real-time can help us achieve replication.

In PostgreSQL terms, copying out generated WAL files is called archiving, and getting the server to read in a WAL file and apply it is called restoring.

To instruct Postgres on how to archive our WAL files, we can provide it with an archive script. Postgres will invoke this script as and when each WAL file is ready for archiving. The script has to process it (typically copy it to a safe location) and report whether it was successful, via the exit code (that is, exit 0 from your script on successful completion).

Once a WAL segment has been successfully archived, Postgres is free to delete it.

By default, the WAL segments only contain enough information to recover from a crash or immediate shutdown. This level is known as minimal. A higher logging level of replica allows for enough information to support archiving and replication. The highest logging level is called the logical level where the log contains enough information to extract change set from the WAL.

WAL level can be configured using the wal_level configuration.

With this we reach the end of this blog. PG WAL is one of those topics that once mastered can help you a long way in your Software Engineering journey. The importance of PG WAL really kicks in when working with distributed systems where database replication is always necessary to avoid database bottlenecking!

If you enjoyed reading this article, consider clapping and leaving a comment. Also, here are some other blogs that would enjoy reading! So go ahead and don't let the learning stop!

Level Up Coding

Thanks for being a part of our community! Before you go:

🔔 Follow us: Twitter | LinkedIn | Newsletter

🚀👉 Join the Level Up talent collective and find an amazing job