<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
  <!-- Source: https://dev.to/feed/raoufchebri -->
  <channel>
    <title>DEV Community: Raouf Chebri</title>
    <description>The latest articles on DEV Community by Raouf Chebri (@raoufchebri).</description>
    <link>https://siftrss.com/f/yNoZW8R639</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F853339%2F3fe742c9-df9d-4d6a-af4a-4e995a896424.JPG</url>
      <title>DEV Community: Raouf Chebri</title>
      <link>https://dev.to/raoufchebri</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://siftrss.com/f/yNoZW8R639"/>
    <language>en</language>
    <item>
      <title>Autoscaling in Action: Postgres Load Testing with pgbench</title>
      <dc:creator>Raouf Chebri</dc:creator>
      <pubDate>Fri, 23 Feb 2024 09:25:39 +0000</pubDate>
      <link>https://dev.to/neon-postgres/autoscaling-in-action-postgres-load-testing-with-pgbench-5e84</link>
      <guid>https://dev.to/neon-postgres/autoscaling-in-action-postgres-load-testing-with-pgbench-5e84</guid>
      <description>&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyr5pb09yvp93ankmtyfn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyr5pb09yvp93ankmtyfn.png" alt="Blog post cover"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this article, I’ll show Neon autoscaling in action by running a load test using one of Postgres’ most popular benchmarking tools, &lt;code&gt;pgbench&lt;/code&gt;. The test simulates 30 clients running a heavy query.&lt;/p&gt;

&lt;p&gt;While 30 doesn’t sound like a lot, the query involves a mathematical function with high computational overhead, which signals to the autoscaler-agent that it needs to allocate more resources to the VM.&lt;/p&gt;

&lt;p&gt;We will not cover how autoscaling works, but for those interested in knowing the details, you can read more about &lt;a href="https://neon.tech/blog/scaling-serverless-postgres" rel="noopener noreferrer"&gt;how we implemented autoscaling in Neon&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;For this load test, you will need:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;a href="https://console.neon.tech" rel="noopener noreferrer"&gt;A Neon account&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://wiki.postgresql.org/wiki/Homebrew" rel="noopener noreferrer"&gt;pgbench&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The load test
&lt;/h2&gt;

&lt;p&gt;Ensuring your production database can perform under varying loads is crucial. That’s why we added autoscaling to Neon, a feature that dynamically adjusts the resources allocated to a database in real time, based on its current workload.&lt;/p&gt;

&lt;p&gt;However, the effectiveness and efficiency of autoscaling are often taken for granted without thorough testing. To showcase autoscaling in action, we turn to Postgres and &lt;code&gt;pgbench&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;pgbench&lt;/code&gt; is a benchmarking tool included with Postgres, designed to evaluate the performance of a Postgres server. The tool simulates client load on the server and runs tests to measure how the server handles concurrent data requests. &lt;/p&gt;

&lt;p&gt;&lt;code&gt;pgbench&lt;/code&gt; is executed from the command line, and its usage can vary widely depending on the specific tests or benchmarks being run. Here is the command we will use in our test:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pgbench &lt;span class="nt"&gt;-f&lt;/span&gt; test.sql &lt;span class="nt"&gt;-c&lt;/span&gt; 30 &lt;span class="nt"&gt;-T&lt;/span&gt; 120 &lt;span class="nt"&gt;-P&lt;/span&gt; 1 &amp;amp;lt&lt;span class="p"&gt;;&lt;/span&gt;CONNECTION_STRING&amp;amp;gt&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this example, &lt;code&gt;pgbench&lt;/code&gt; executes the query in &lt;code&gt;test.sql&lt;/code&gt;. The parameter &lt;code&gt;-c 30&lt;/code&gt; specifies 30 client connections, and &lt;code&gt;-T 120&lt;/code&gt; runs the test for 120 seconds against your database. &lt;code&gt;-P 1&lt;/code&gt; specifies that pgbench should report the progress of the test every 1 second. The progress report typically includes the number of transactions completed so far and the number of transactions per second.&lt;/p&gt;

&lt;p&gt;Thirty clients might not seem like enough to stress a database. Well, it depends on the query you’re executing, which we’ll see next.&lt;/p&gt;

&lt;h2&gt;
  
  
  Query execution plan
&lt;/h2&gt;

&lt;p&gt;Here is the query we’ll use for our load test:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;factorial&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;32000&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;factorial&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20000&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Mathematically, this query essentially compares the growth rates of the factorials of 32,000 and 20,000 by examining the ratio of their logarithms. &lt;/p&gt;

&lt;p&gt;Remember factorials? The factorial of a number n (denoted as n!) is the product of all positive integers less than or equal to n. For example, the factorial of 5 (5!) is 5 * 4 * 3 * 2 * 1 = 120. Factorials grow very rapidly with increasing numbers. &lt;/p&gt;

&lt;p&gt;To give you a sense of scale, the factorial of just 20 is already a 19-digit number: 20! = 2,432,902,008,176,640,000.&lt;/p&gt;

&lt;p&gt;The logarithm, on the other hand, is the power to which a base must be raised to obtain the value x. (In Postgres, &lt;code&gt;log()&lt;/code&gt; is the base-10 logarithm and &lt;code&gt;ln()&lt;/code&gt; is the natural logarithm with base &lt;em&gt;e&lt;/em&gt; ≈ 2.71828; since our query divides one logarithm by another, the base cancels out.)&lt;/p&gt;
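&lt;p&gt;As a rough sanity check outside Postgres, we can estimate the query’s result in Python via &lt;code&gt;math.lgamma&lt;/code&gt;, using the identity ln(n!) = lgamma(n + 1):&lt;/p&gt;

```python
import math

# ln(n!) equals lgamma(n + 1), so we can estimate the query's result
# without computing the astronomically large factorials themselves.
ratio = math.lgamma(32000 + 1) / math.lgamma(20000 + 1)
print(round(ratio, 3))  # ≈ 1.684
```

&lt;p&gt;The ratio itself is cheap to approximate this way; it’s evaluating the exact, arbitrary-precision factorials that makes the Postgres query so expensive.&lt;/p&gt;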

&lt;p&gt;In other words, this operation should take a long time to process. How long? Let’s examine the query execution plan using EXPLAIN ANALYZE:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;EXPLAIN&lt;/span&gt; &lt;span class="k"&gt;ANALYZE&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;factorial&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;32000&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;factorial&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20000&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;QUERY PLAN                                      

&lt;span class="nt"&gt;-------------------------------------------------------------------------------------&lt;/span&gt;

 Result  &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.00..0.01 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 &lt;span class="nv"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;32&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;actual &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.000..0.001 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 &lt;span class="nv"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1&lt;span class="o"&gt;)&lt;/span&gt;

 Planning Time: 1921.630 ms

 Execution Time: 0.005 ms

&lt;span class="o"&gt;(&lt;/span&gt;3 rows&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This query was executed on ¼ vCPU. &lt;code&gt;EXPLAIN ANALYZE&lt;/code&gt; includes the planner’s estimates and real execution metrics. Execution Time appears to be quite fast. However, Planning Time (the time taken by the Postgres query planner to generate the execution plan) takes almost 2 seconds and suggests that preparing to run this mathematical function involves significant computational overhead.&lt;/p&gt;

&lt;p&gt;Combine 30 of those, and we should stress Postgres enough to trigger autoscaling.&lt;/p&gt;

&lt;h2&gt;
  
  
  Enabling autoscaling
&lt;/h2&gt;

&lt;p&gt;Autoscaling is the process of automatically increasing or decreasing the CPU and memory allocated to a database based on its current load. It dynamically adjusts the compute resources allocated to a Neon compute instance in response to the current load, eliminating the need for manual intervention. &lt;a href="https://neon.tech/docs/introduction/autoscaling" rel="noopener noreferrer"&gt;Learn more about autoscaling in the docs&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;You can enable autoscaling by defining the minimum and maximum compute units (CU) you’d like to allocate to your Postgres instance. This way, you remain in control of your resource consumption. For example, 1 CU allocates 1 vCPU and 4 GB of RAM to your instance.&lt;/p&gt;

&lt;p&gt;You can set your instance size when you create a new project or by navigating to the Branches page on your Neon Console, clicking on the database branch, and setting the CU range.&lt;/p&gt;



&lt;p&gt;I will set the range for this load test from ¼ to 7 CUs.&lt;/p&gt;

&lt;h1&gt;
  
  
  Executing &amp;amp; monitoring the load test
&lt;/h1&gt;

&lt;p&gt;Let’s run our load test now and observe its effect on our Postgres instance. We recently added graphs to monitor the resources allocated to your Postgres instance and its usage, which will come in handy later. After enabling autoscaling, follow these steps to execute the load test:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create your project folder and test.sql file:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir &lt;/span&gt;pgbench-load-test
&lt;span class="nb"&gt;cd &lt;/span&gt;pgbench-load-test
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"SELECT log(factorial(32000)) / log(factorial(20000));"&lt;/span&gt; &amp;amp;gt&lt;span class="p"&gt;;&lt;/span&gt; test.sql&lt;span class="s1"&gt;'
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="2"&gt;
&lt;li&gt;Execute the load test by running the following command:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pgbench &lt;span class="nt"&gt;-f&lt;/span&gt; test.sql &lt;span class="nt"&gt;-c&lt;/span&gt; 8 &lt;span class="nt"&gt;-T&lt;/span&gt; 120 &lt;span class="nt"&gt;-P&lt;/span&gt; 1 &amp;amp;lt&lt;span class="p"&gt;;&lt;/span&gt;YOUR_CONNECTION_STRING&amp;amp;gt&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you don’t have a connection string, you can create a &lt;a href="https://console.neon.tech" rel="noopener noreferrer"&gt;Neon project&lt;/a&gt; to get one.&lt;/p&gt;

&lt;ol start="3"&gt;
&lt;li&gt;Navigate to the autoscaling graph to monitor usage:&lt;/li&gt;
&lt;/ol&gt;



&lt;p&gt;You should observe a rapid change in CPU and memory allocated. The result should look similar to the graph below.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fneondatabase.wpengine.com%2Fwp-content%2Fuploads%2F2024%2F02%2FScreenshot-2024-02-23-at-10.04.45-982x1024.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fneondatabase.wpengine.com%2Fwp-content%2Fuploads%2F2024%2F02%2FScreenshot-2024-02-23-at-10.04.45-982x1024.png" alt="Autoscaling graph"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The performance summary returned by &lt;code&gt;pgbench&lt;/code&gt; should look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;latency average &lt;span class="o"&gt;=&lt;/span&gt; 6000.891 ms
latency stddev &lt;span class="o"&gt;=&lt;/span&gt; 2768.066 ms
initial connection &lt;span class="nb"&gt;time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 3712.770 ms
tps &lt;span class="o"&gt;=&lt;/span&gt; 4.978907 &lt;span class="o"&gt;(&lt;/span&gt;without initial connection &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On average, each operation took slightly over 6 seconds to complete. A standard deviation of 2768.066 ms means that the latencies of individual operations varied quite a bit around the average latency. A higher standard deviation indicates more variability in how long each operation took to complete.&lt;/p&gt;

&lt;p&gt;Establishing the initial connection took approximately 3.7 seconds before any operations could be performed. A TPS of around 4.98 means that, on average, the database completed nearly five transactions every second during the test, excluding the initial connection time.&lt;/p&gt;
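&lt;p&gt;As a quick sanity check, with each client issuing one transaction after another, throughput should roughly equal the number of clients divided by the average latency:&lt;/p&gt;

```python
clients = 30                # the -c 30 flag from the pgbench command
avg_latency_s = 6.000891    # latency average reported by pgbench, in seconds

# Each client completes ~1/latency transactions per second.
approx_tps = clients / avg_latency_s
print(round(approx_tps, 2))  # ≈ 5.0, close to the reported 4.978907 tps
```

&lt;p&gt;The small gap between this estimate and the measured value reflects per-transaction overhead that pgbench accounts for but the back-of-the-envelope formula ignores.&lt;/p&gt;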

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;pgbench&lt;/code&gt; is a simple yet powerful tool to test your database and simulate multiple clients running heavy SQL queries. We also saw how to examine the query execution plan with &lt;code&gt;EXPLAIN ANALYZE&lt;/code&gt;, which provides insights to optimize your SQL queries.&lt;/p&gt;

&lt;p&gt;If you’re running an application subject to varying workloads, autoscaling gives you confidence that your database will keep performing under the stress of real-world demands.&lt;/p&gt;

&lt;p&gt;Thanks for reading. If you are curious about autoscaling, &lt;a href="https://console.neon.tech" rel="noopener noreferrer"&gt;give Neon a try&lt;/a&gt; and join our &lt;a href="https://neon.tech/discord" rel="noopener noreferrer"&gt;Discord&lt;/a&gt;. We look forward to seeing you there and hearing your feedback.&lt;/p&gt;

&lt;p&gt;Happy scaling!&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>test</category>
      <category>scale</category>
    </item>
    <item>
      <title>Point In Time Recovery Under the Hood in Serverless Postgres</title>
      <dc:creator>Raouf Chebri</dc:creator>
      <pubDate>Thu, 22 Feb 2024 12:44:01 +0000</pubDate>
      <link>https://dev.to/neon-postgres/point-in-time-recovery-under-the-hood-in-serverless-postgres-2dhn</link>
      <guid>https://dev.to/neon-postgres/point-in-time-recovery-under-the-hood-in-serverless-postgres-2dhn</guid>
      <description>&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fneondatabase.wpengine.com%2Fwp-content%2Fuploads%2F2024%2F02%2Fimage-28-1024x576.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fneondatabase.wpengine.com%2Fwp-content%2Fuploads%2F2024%2F02%2Fimage-28-1024x576.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Imagine working on a crucial project when suddenly, due to an unexpected event, you lose significant chunks of your database. Whether it’s a human error, a malicious attack, or a software bug, data loss is a nightmare scenario. But fear not! We recently added support for &lt;a href="https://dev.to/evanatneon/announcing-point-in-time-restore-864-temp-slug-3100656"&gt;Point-In-Time Restore (PITR)&lt;/a&gt; to Neon, so you can turn back the clock to a happier moment before things went south.&lt;/p&gt;

&lt;p&gt;In the video below and in the &lt;a href="https://dev.to/evanatneon/announcing-point-in-time-restore-864-temp-slug-3100656"&gt;PITR announcement article&lt;/a&gt;, my friend Evan shows how you can recover your data in a few clicks. He also uses Time Travel Assist to observe the state of the database at a given timestamp so he can run the restore process confidently and safely.&lt;/p&gt;



&lt;p&gt;How is this possible? This article is for those interested in understanding how PITR works under the hood in Neon. To better explain this, we will: &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Cover the basics of PITR in Postgres &lt;/li&gt;
&lt;li&gt;Explore the underlying infrastructure that allows for PITR in Neon. &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;By the end of this post, you’ll be prepared for when disaster strikes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding the basics of Point-In-Time Recovery in Postgres
&lt;/h2&gt;

&lt;p&gt;PITR in Postgres is made possible using two key components:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Write-Ahead Logging:&lt;/strong&gt; Postgres uses &lt;a href="https://www.postgresql.org/docs/current/wal-intro.html" rel="noopener noreferrer"&gt;Write-Ahead Logging&lt;/a&gt; (WAL) to record all changes made to the database. Think of WAL as the database’s diary, keeping track of every detail of its day-to-day activities. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Base backups:&lt;/strong&gt; Base backups are snapshots of your database at a particular moment in time. &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;With these two elements combined, you can define a strategy to restore your database to any point after the base backup was taken, effectively traveling through your database’s timeline. However, you need to do some groundwork first, which consists of the following:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Setting up WAL archiving:&lt;/strong&gt; By defining an &lt;code&gt;archive_command&lt;/code&gt; and setting &lt;code&gt;archive_mode&lt;/code&gt; to &lt;code&gt;on&lt;/code&gt;  in your &lt;code&gt;postgresql.conf&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Creating base backups:&lt;/strong&gt; You can use &lt;code&gt;pg_basebackup&lt;/code&gt; to create daily backups.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh7-us.googleusercontent.com%2FDR4PajEMGMxzyTFgdbCNEUmieSgTLWZsjfaN94aUmc5mdNV1Fa3ZAkr56df29EdFfG-U5kC_8Zg7MDSqP6aJCHf0ZhpjFEfKdKhCXtHlGAUudLiCF4iuXViEXZCZJx7y3pYlo8p5cwvRTiduMn45Xuc" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh7-us.googleusercontent.com%2FDR4PajEMGMxzyTFgdbCNEUmieSgTLWZsjfaN94aUmc5mdNV1Fa3ZAkr56df29EdFfG-U5kC_8Zg7MDSqP6aJCHf0ZhpjFEfKdKhCXtHlGAUudLiCF4iuXViEXZCZJx7y3pYlo8p5cwvRTiduMn45Xuc"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If, for any reason, you need to restore your database, you need to recover the latest backup and replay the WAL on top of it. The same logic applies to restoring from a point in time in the retention period. &lt;/p&gt;

&lt;p&gt;Let’s say we want to restore the database to its state on February 1st at 14:30. We first locate the last backup file created before that target time, restore it, and then replay the WAL up to that time. &lt;/p&gt;
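&lt;p&gt;The selection step can be sketched in a few lines of Python (an illustrative toy, not Postgres internals; the backup names and dates are made up):&lt;/p&gt;

```python
from datetime import datetime

def plan_pitr(backups, target):
    """Pick the newest base backup taken at or before the target time.

    `backups` is a list of (timestamp, name) tuples; WAL replay then
    covers the gap between that backup and the target.
    """
    eligible = [b for b in backups if b[0] <= target]
    if not eligible:
        raise ValueError("no base backup precedes the target time")
    return max(eligible)  # newest backup at or before the target

backups = [
    (datetime(2024, 1, 25), "base_0125"),
    (datetime(2024, 1, 31), "base_0131"),
    (datetime(2024, 2, 2), "base_0202"),  # too late for our target
]
target = datetime(2024, 2, 1, 14, 30)
print(plan_pitr(backups, target))  # base_0131, then replay WAL up to 14:30
```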

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh7-us.googleusercontent.com%2FvTul66-QTVPuOMRscFhgCSpHVLZBUxNENuxuIVl0c9Vd8nvuoeFQiqOqW-TpMQ0-ZcmTffmzs4OF8TwE1on5qVQAhYPSPYK7ub9oKPZIkTPlghMzVQu9U8jQCcjQHGqsik8J9_PcYOBPVH1B2bQansA" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh7-us.googleusercontent.com%2FvTul66-QTVPuOMRscFhgCSpHVLZBUxNENuxuIVl0c9Vd8nvuoeFQiqOqW-TpMQ0-ZcmTffmzs4OF8TwE1on5qVQAhYPSPYK7ub9oKPZIkTPlghMzVQu9U8jQCcjQHGqsik8J9_PcYOBPVH1B2bQansA"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Great! We now know how to perform a PITR in Postgres. However, there are a few limitations to this approach:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;You might notice a drop in performance while performing backups.&lt;/li&gt;
&lt;li&gt;Because you have a finite storage capacity, you must define a limit to your archived WAL. This limit is known as the retention period (a.k.a history retention), which determines how far back in time your data can be restored.&lt;/li&gt;
&lt;li&gt;You have a single point of failure (SPOF) since all base backups and WAL archives are in the same location.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We can enhance our architecture by adopting disaster recovery tools like &lt;a href="https://pgbarman.org/" rel="noopener noreferrer"&gt;Barman&lt;/a&gt; to avoid SPOF and downtime. With Barman, Postgres streams base backups and WAL archives to an external backup server. Or, if you know what you’re doing, you can configure Postgres to stream base backups and WAL archives to an AWS S3 bucket, and add a standby, which serves as an exact copy of your database, to avoid downtime. Your setup would look like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh7-us.googleusercontent.com%2Fhj6272QvVTzQ0MIduTt6MpFoCY7fMJSdDJjo9jE0yzRzokzKaZ4B5A1HymLIIP6g8FbblXxsR5ks73VPWI0yTvHQFCQ8JSiaYIV5YnhFmf4ORS6bwEXS_SCLtMnoHsSZ1mJltkpk13xKRLpnFyZ06nQ" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh7-us.googleusercontent.com%2Fhj6272QvVTzQ0MIduTt6MpFoCY7fMJSdDJjo9jE0yzRzokzKaZ4B5A1HymLIIP6g8FbblXxsR5ks73VPWI0yTvHQFCQ8JSiaYIV5YnhFmf4ORS6bwEXS_SCLtMnoHsSZ1mJltkpk13xKRLpnFyZ06nQ"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To sum it up and to perform a PITR in Postgres without downtime, you need to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Have a backup server&lt;/li&gt;
&lt;li&gt;Set up WAL archiving and stream it to the backup&lt;/li&gt;
&lt;li&gt;Schedule daily backups&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Additionally, you need to install several packages and then configure and maintain this infrastructure, time that could be spent on your application instead. It’s that convenience, simplicity, and confidence in your data that Neon offers.&lt;/p&gt;

&lt;p&gt;So, how do we make it look so easy? Let’s step back and explain how Neon’s storage engine works.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding Neon’s architecture
&lt;/h2&gt;

&lt;p&gt;Neon’s philosophy is that the “database is its logs”. In our case: “Postgres is its WAL records”.&lt;/p&gt;

&lt;p&gt;Neon configures Postgres to stream the WAL to a custom Rust-based storage engine. Neon’s storage engine is composed of three parts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A persistence layer called “&lt;a href="https://github.com/neondatabase/neon/blob/main/docs/rfcs/014-safekeepers-gossip.md" rel="noopener noreferrer"&gt;Safekeepers&lt;/a&gt;” makes sure the written data is never lost, &lt;a href="https://neon.tech/blog/paxos" rel="noopener noreferrer"&gt;using Paxos as a consensus algorithm&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;A storage layer called “Pageservers”: multi-tenant storage that can reconstruct the data from WAL and send it to Postgres.&lt;/li&gt;
&lt;li&gt;A second persistence layer to durably store the WAL in AWS S3.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And since all the data is stored in Neon’s storage engine, Postgres doesn’t need to persist data on the local disk. This turns Postgres into a stateless compute instance that can start in under 500ms, making Neon serverless. &lt;/p&gt;

&lt;p&gt;As a result, we no longer require: &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A standby: because, in the case of a Postgres crash, we can quickly spin up another instance.&lt;/li&gt;
&lt;li&gt;Backups: Neon’s storage engine stores the WAL and performs &lt;a href="https://en.wikipedia.org/wiki/Compaction" rel="noopener noreferrer"&gt;compactions&lt;/a&gt;.
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The data flow would look like the following:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh7-us.googleusercontent.com%2FziTbF_Fwf1anMRcVZwo4f7DbNWmhSFVvJOXqL7x-B2lTZ-zeq6m7eVxwGXFMTg4_8kd8-fociJ-ka4QCKntbS3jj5L7F7HAJ2TXuCCHbixTFo6m0ukn_keRa1ZsLRD0Ryn9vx0Y2xg45-OIQBK7XysI" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh7-us.googleusercontent.com%2FziTbF_Fwf1anMRcVZwo4f7DbNWmhSFVvJOXqL7x-B2lTZ-zeq6m7eVxwGXFMTg4_8kd8-fociJ-ka4QCKntbS3jj5L7F7HAJ2TXuCCHbixTFo6m0ukn_keRa1ZsLRD0Ryn9vx0Y2xg45-OIQBK7XysI"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Check out the &lt;a href="https://neon.tech/blog/architecture-decisions-in-neon" rel="noopener noreferrer"&gt;&lt;em&gt;Architecture decisions in Neon&lt;/em&gt; article by Heikki Linnakangas&lt;/a&gt; to learn more.&lt;/p&gt;

&lt;p&gt;To understand the magic behind PITR in Neon, we’ll explore how the Pageservers work.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pageservers: under the hood
&lt;/h2&gt;

&lt;p&gt;Each transaction in the WAL is associated with a Log Sequence Number (LSN), marking the byte position in the WAL stream where the record of that transaction starts. If we follow our initial analogy of WAL being a detailed diary of everything in the database, then the LSN is the page number in that diary.&lt;/p&gt;
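&lt;p&gt;Postgres prints LSNs in a two-part hexadecimal notation (for example &lt;code&gt;0/16B3748&lt;/code&gt;), where the first part holds the high 32 bits of the byte position and the second the low 32 bits. A minimal sketch of the conversion:&lt;/p&gt;

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert Postgres's textual LSN ('hi/lo' in hex) to a byte offset."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

# Later LSNs are simply larger byte positions in the WAL stream.
a = lsn_to_bytes("0/16B3748")
b = lsn_to_bytes("1/0")     # 4 GiB into the stream
print(b > a)  # True
```

&lt;p&gt;Because LSNs are just monotonically increasing byte offsets, comparing two of them tells you which change came first, which is exactly what makes them useful page numbers for the diary.&lt;/p&gt;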

&lt;p&gt;The Pageserver can be represented by a 2-dimensional graph, where the Y-axis is the &lt;code&gt;LSN&lt;/code&gt;, and the X-axis is the &lt;code&gt;key&lt;/code&gt; that points to the database, relation, and then block number. A key, for example, can point to certain rows in your database.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh7-us.googleusercontent.com%2FfRlrDbEpnnuLSCTH2XwuiuhsU74euugyHI-ebB7EPrvwR0FbuEDSgkG9HvkzeDZwPyIrF_dQTz2hWIXHEl0NgKILbydD5QPMlJz5sKuFuLDneJKsOWrtyx4oRVJk8AJL58zdY5yLxdAJildhuEOMuAI" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh7-us.googleusercontent.com%2FfRlrDbEpnnuLSCTH2XwuiuhsU74euugyHI-ebB7EPrvwR0FbuEDSgkG9HvkzeDZwPyIrF_dQTz2hWIXHEl0NgKILbydD5QPMlJz5sKuFuLDneJKsOWrtyx4oRVJk8AJL58zdY5yLxdAJildhuEOMuAI"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When data is written in Neon, the role of Pageservers is to accumulate WAL records. Then, when these records reach approximately 1GB in size, Pageservers create two types of immutable layer files:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Image layers (bars)&lt;/strong&gt;: contain a &lt;strong&gt;&lt;em&gt;snapshot&lt;/em&gt;&lt;/strong&gt; of a key range for a specific LSN. You can see Image Layers as the state of rows in certain tables or indexes at a given time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Delta layers (rectangles)&lt;/strong&gt;: contain the &lt;strong&gt;&lt;em&gt;incremental changes&lt;/em&gt;&lt;/strong&gt; within a key range. You can see Delta layers as a log of all the changes that happened to your rows.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Does this sound familiar?&lt;/p&gt;

&lt;p&gt;Indeed, it employs the same principle as the traditional Postgres setups for PITR we’ve previously discussed, which include base backups and WAL archiving. The main difference here is that you don’t need to initiate a lengthy and complex restore procedure every time you wish to read data from a previous state of the database. This is because Pageservers inherently know how to reconstruct the state of the page at any given LSN or timeline.&lt;/p&gt;
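&lt;p&gt;A toy model of that reconstruction (purely illustrative; the real Pageserver is far more involved): start from the newest image layer at or below the requested LSN, then replay the matching delta records on top of it.&lt;/p&gt;

```python
def get_page_at_lsn(images, deltas, key, lsn):
    """Reconstruct the value of `key` as of `lsn`.

    `images` maps image_lsn -> {key: value} snapshots; `deltas` is a
    list of (delta_lsn, key, value) change records.
    """
    base_lsn = max(l for l in images if l <= lsn)
    value = images[base_lsn].get(key)
    for d_lsn, d_key, d_value in sorted(deltas):
        if base_lsn < d_lsn <= lsn and d_key == key:
            value = d_value  # replay the change on top of the snapshot
    return value

images = {100: {"row1": "a"}}
deltas = [(150, "row1", "b"), (200, "row1", "c")]
print(get_page_at_lsn(images, deltas, "row1", 180))  # "b"
print(get_page_at_lsn(images, deltas, "row1", 250))  # "c"
```

&lt;p&gt;Every historical state stays readable without a restore procedure: answering “what was this page at LSN 180?” is just a different lookup, not a rebuild of the database.&lt;/p&gt;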

&lt;h2&gt;
  
  
  Ephemeral branches
&lt;/h2&gt;

&lt;p&gt;We mentioned previously that, in Postgres, each WAL record is associated with an LSN. In Neon, Postgres tracks the last evicted LSN in the buffer cache, so Postgres knows at which point in time it should fetch the data. &lt;/p&gt;

&lt;p&gt;When Postgres requests a page from the Pageserver, it triggers the &lt;a href="https://github.com/neondatabase/neon/blob/main/pageserver/pagebench/src/cmd/getpage_latest_lsn.rs" rel="noopener noreferrer"&gt;GetPage@LSN&lt;/a&gt; function, which returns the state of a given key at that specific LSN.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fneondatabase.wpengine.com%2Fwp-content%2Fuploads%2F2024%2F02%2F87-1024x456.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fneondatabase.wpengine.com%2Fwp-content%2Fuploads%2F2024%2F02%2F87-1024x456.jpg"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Read the &lt;a href="https://neon.tech/blog/get-page-at-lsn" rel="noopener noreferrer"&gt;Deep dive in Neon’s storage engine&lt;/a&gt; article to learn more about Neon’s architecture.&lt;/p&gt;

&lt;p&gt;In practice, you can access different timelines through database branches. These branches are copy-on-write clones of your database, representing the state of your data at any point in its history. When you create a branch, you specify the LSN (or a timestamp), and Neon’s control plane generates a timeline associated with your project, keeping track of it.&lt;/p&gt;

&lt;p&gt;We’ve enhanced the Point In Time Recovery (PITR) feature in Neon with Time Travel Assist. This functionality allows you to perform Time Travel queries to review the state of your database at a specific timestamp or LSN, following the same underlying steps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Creating a timeline, and&lt;/li&gt;
&lt;li&gt;Running &lt;a href="mailto:GetPage@LSN"&gt;GetPage@LSN&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;However, these branches are ephemeral, having a Time To Live (TTL) of 10 seconds. We refer to these as ephemeral branches, and they will soon become a crucial part of your development workflows.&lt;/p&gt;

&lt;p&gt;Ephemeral branches enable you to connect to a previous state of your database by merely specifying the LSN or timestamp in your connection string. This capability is natively supported by Pageservers, and Neon’s PITR feature is the first step towards making ephemeral connections available to developers. Stay tuned for more development in this area.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;While Postgres’ features offer powerful options and tools like Barman to help with disaster recovery, Neon’s approach makes PITR reliable, accessible, efficient, and integrated into a seamless database management experience. &lt;/p&gt;

&lt;p&gt;By first exploring how to do PITR in Postgres, we’ve learned about the importance of continuous archiving and creating base backups. &lt;/p&gt;

&lt;p&gt;Neon’s storage engine saves WAL records and snapshots of your database and can natively reconstruct data for any point in time in your history. This capability allows Time Travel Assist to query your database at a given timestamp before you proceed to restoration using short-lived, or ephemeral, branches.&lt;/p&gt;

&lt;p&gt;Ephemeral branches introduce a unique way to interact with your data’s history: by accessing different timelines and running Time Travel queries, developers can review prior states and understand their data’s lifecycle.&lt;/p&gt;

&lt;p&gt;What about you? How often do you use PITR in your projects? Join us on &lt;a href="https://neon.tech/discord" rel="noopener noreferrer"&gt;Discord&lt;/a&gt; and let us know how we can enhance your Postgres experience in the cloud.&lt;/p&gt;

&lt;p&gt;Special thanks to &lt;a href="https://twitter.com/skeptrune" rel="noopener noreferrer"&gt;skeptrune&lt;/a&gt; for reviewing and suggesting adding a mention to Barman.&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>recovery</category>
      <category>disaster</category>
    </item>
    <item>
      <title>PgBouncer: The one with prepared statements</title>
      <dc:creator>Raouf Chebri</dc:creator>
      <pubDate>Thu, 15 Feb 2024 09:43:20 +0000</pubDate>
      <link>https://dev.to/neon-postgres/pgbouncer-the-one-with-prepared-statements-198i</link>
      <guid>https://dev.to/neon-postgres/pgbouncer-the-one-with-prepared-statements-198i</guid>
      <description>&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fneondatabase.wpengine.com%2Fwp-content%2Fuploads%2F2024%2F02%2Fimage-26-1024x576.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fneondatabase.wpengine.com%2Fwp-content%2Fuploads%2F2024%2F02%2Fimage-26-1024x576.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The latest release of &lt;a href="https://github.com/pgbouncer/pgbouncer/releases/tag/pgbouncer_1_22_0" rel="noopener noreferrer"&gt;PgBouncer 1.22.0&lt;/a&gt; increases query throughput by 15% to 250% and includes support for &lt;code&gt;DEALLOCATE ALL&lt;/code&gt; and &lt;code&gt;DISCARD ALL&lt;/code&gt;, as well as protocol-level prepared statements released in &lt;a href="https://github.com/pgbouncer/pgbouncer/releases/tag/pgbouncer_1_21_0" rel="noopener noreferrer"&gt;1.21.0&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In this article, we’ll explore what prepared statements are and how to use PgBouncer to optimize your queries in Postgres.&lt;/p&gt;

&lt;h2&gt;
  
  
  What are Prepared Statements?
&lt;/h2&gt;

&lt;p&gt;In Postgres, a prepared statement is a feature that allows you to create and optimize an SQL query once and then execute it multiple times with different parameters. It’s a template where you define the structure of your query and later fill in the specific values you want to use.&lt;/p&gt;

&lt;p&gt;Here’s an example of creating a prepared statement with &lt;code&gt;PREPARE&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;PREPARE&lt;/span&gt; &lt;span class="n"&gt;user_fetch_plan&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;username&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="nv"&gt;"&amp;gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, &lt;code&gt;user_fetch_plan&lt;/code&gt; is the name of the prepared statement, and &lt;code&gt;$1&lt;/code&gt; is a placeholder for the parameter. &lt;/p&gt;

&lt;p&gt;Here is how to execute the prepared statement:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;EXECUTE&lt;/span&gt; &lt;span class="n"&gt;user_fetch_plan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'alice'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="nv"&gt;"&amp;gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This query will fetch all columns from the &lt;code&gt;users&lt;/code&gt; table where the &lt;code&gt;username&lt;/code&gt; is &lt;code&gt;alice&lt;/code&gt;.&lt;/p&gt;
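&lt;p&gt;Prepared statements are scoped to the current session. If you want to drop one before the session ends, use &lt;code&gt;DEALLOCATE&lt;/code&gt; (the &lt;code&gt;DEALLOCATE ALL&lt;/code&gt; variant mentioned above drops every prepared statement in the session at once):&lt;/p&gt;

```sql
DEALLOCATE user_fetch_plan;  -- or: DEALLOCATE ALL;
```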

&lt;h2&gt;
  
  
  Why Use Prepared Statements?
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Performance&lt;/strong&gt; : Since the SQL statement is parsed and the execution plan is created only once, subsequent executions can be faster. However, this benefit might be more noticeable in databases with heavy and repeated traffic.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Security&lt;/strong&gt; : Prepared statements are a great way to avoid SQL injection attacks. Since data values are sent separately from the query, they aren’t executed as SQL, making injecting malicious SQL code difficult.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
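&lt;p&gt;The security benefit is easy to demonstrate. The sketch below uses Python’s built-in &lt;code&gt;sqlite3&lt;/code&gt; module purely for brevity; the same principle applies to any Postgres client library that binds parameters separately from the query text:&lt;/p&gt;

```python
import sqlite3

# In-memory database standing in for Postgres, just to illustrate the principle.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT)")
conn.execute("INSERT INTO users VALUES ('alice'), ('bob')")

malicious = "alice' OR '1'='1"

# Unsafe: string interpolation lets the attacker's SQL execute, matching every row.
unsafe = conn.execute(
    f"SELECT * FROM users WHERE username = '{malicious}'"
).fetchall()

# Safe: the value is bound as data, not parsed as SQL, so no row matches.
safe = conn.execute(
    "SELECT * FROM users WHERE username = ?", (malicious,)
).fetchall()

print(len(unsafe))  # 2 -- injection succeeded
print(len(safe))    # 0 -- injection neutralized
```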

&lt;h2&gt;
  
  
  What is PgBouncer?
&lt;/h2&gt;

&lt;p&gt;Before diving into what PgBouncer is, let’s take a step back and briefly touch on how Postgres operates. &lt;/p&gt;

&lt;p&gt;Postgres runs on a system of several interlinked processes, with the &lt;code&gt;postmaster&lt;/code&gt; taking the lead. This initial process kicks things off, supervises other processes, and listens for new connections. The &lt;code&gt;postmaster&lt;/code&gt; also allocates a shared memory for these processes to interact.&lt;/p&gt;

&lt;p&gt;Whenever a client wants to establish a new connection, the &lt;code&gt;postmaster&lt;/code&gt; creates a new backend process for that client. This new connection starts a session with the backend, which stays active until the client decides to leave or the connection drops.&lt;/p&gt;

&lt;p&gt;Here’s where it gets tricky: Many applications, such as serverless backends, open numerous connections, and most eventually become inactive. Postgres needs to create a unique backend process for each client connection. When many clients try to connect, more memory is needed. In Neon, for example, the default maximum number of &lt;a href="https://neon.tech/docs/connect/connection-pooling#default-connection-limits" rel="noopener noreferrer"&gt;concurrent direct connections is set to 100&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;The solution to this problem is connection pooling with PgBouncer, which helps keep the number of active backend processes low.&lt;/p&gt;

&lt;p&gt;PgBouncer is a lightweight connection pooler whose primary function is to manage and maintain a pool of database connections to overcome Postgres’ connection limitations. Neon projects come with both direct and pooled connections by default. The latter uses PgBouncer and currently offers up to 10,000 connections.&lt;/p&gt;

&lt;p&gt;Depending on your database provider, you'll have different ways to access PgBouncer. On Neon, you can check the “Pooled connection” box in the connection details widget and make sure the connection string contains the &lt;code&gt;-pooler&lt;/code&gt; suffix:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;postgres://johndoe:mypassword@ep-billowing-wood-25959289-pooler.us-east-1.aws.neon.tech/neondb"&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh7-us.googleusercontent.com%2FRnGdfOkY2GRjsa2zooCkJkdq838AK63X9LvXn2zvuEjbpNz3Hc3rVbwottAaEwQRkZ1NQd5USaFgMiKDJvtL5HUI5sUh058PTSG5NelFpJyJ8uwHjmQEavFjmgxp2BxmOugIrDpf-I1C-MriITe-Lkk" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh7-us.googleusercontent.com%2FRnGdfOkY2GRjsa2zooCkJkdq838AK63X9LvXn2zvuEjbpNz3Hc3rVbwottAaEwQRkZ1NQd5USaFgMiKDJvtL5HUI5sUh058PTSG5NelFpJyJ8uwHjmQEavFjmgxp2BxmOugIrDpf-I1C-MriITe-Lkk"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Using Prepared Statements with PgBouncer in client libraries:
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/pgbouncer/pgbouncer/releases/tag/pgbouncer_1_22_0" rel="noopener noreferrer"&gt;PgBouncer&lt;/a&gt; supports prepared statements at the protocol level, and therefore, the above SQL-level prepared statement using &lt;code&gt;PREPARE&lt;/code&gt; and &lt;code&gt;EXECUTE&lt;/code&gt; will not work with PgBouncer. See &lt;a href="https://www.pgbouncer.org/config.html#max_prepared_statements" rel="noopener noreferrer"&gt;PgBouncer’s documentation&lt;/a&gt; for more information.&lt;/p&gt;

&lt;p&gt;However, you can use prepared statements with pooled connections in a client library. Most PostgreSQL client libraries offer support for prepared statements, often abstracting away the explicit use of &lt;code&gt;PREPARE&lt;/code&gt; and &lt;code&gt;EXECUTE&lt;/code&gt;. Here’s how you might use it in a few popular languages:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="n"&gt;using&lt;/span&gt; &lt;span class="n"&gt;psycopg2&lt;/span&gt;
&lt;span class="n"&gt;cur&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
  &lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;quot&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="n"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="n"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;username&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;quot&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;alice&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,),&lt;/span&gt; &lt;span class="n"&gt;prepare&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetchall&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// using pg  &lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="c1"&gt;// give the query a unique name&lt;/span&gt;
   &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;fetch-user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;SELECT * FROM users WHERE username = $1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;values&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;alice&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;&amp;gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In these client libraries, the actual SQL command is parsed and prepared on the server, and then the data values are sent separately, ensuring both efficiency and security.&lt;/p&gt;

&lt;p&gt;Under the hood, PgBouncer examines all the queries sent as a prepared statement by clients and assigns each unique query string an internal name (e.g. PGBOUNCER_123). PgBouncer rewrites each command that uses a prepared statement to use the matching internal name before forwarding the corresponding command to Postgres.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;                +-------------+
                | Client |
                +------+------+
                       |
                       | Sends Prepared Statement &lt;span class="o"&gt;(&lt;/span&gt;e.g., &amp;amp;quot&lt;span class="p"&gt;;&lt;/span&gt;SELECT &lt;span class="k"&gt;*&lt;/span&gt; FROM &lt;span class="nb"&gt;users &lt;/span&gt;WHERE &lt;span class="nb"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; ?&amp;amp;quot&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                       |
                +------v------+
                | PgBouncer |
                | |
                | 1. Examines and tracks the client&lt;span class="s1"&gt;'s statement. |
                | 2. Assigns an internal name (e.g., PGBOUNCER_123).|
                | 3. Checks if the statement is already prepared |
                | on the PostgreSQL server. |
                | 4. If not, prepares the statement on the server. |
                | 5. Rewrites the client'&lt;/span&gt;s &lt;span class="nb"&gt;command &lt;/span&gt;to use the |
                | internal name. |
                +------^------+
                       |
                       | Forwards Rewritten Statement &lt;span class="o"&gt;(&lt;/span&gt;e.g., &amp;amp;quot&lt;span class="p"&gt;;&lt;/span&gt;SELECT &lt;span class="k"&gt;*&lt;/span&gt; FROM &lt;span class="nb"&gt;users &lt;/span&gt;WHERE &lt;span class="nb"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; ?&amp;amp;quot&lt;span class="p"&gt;;&lt;/span&gt; as PGBOUNCER_123&lt;span class="o"&gt;)&lt;/span&gt;
                       |
                +------v------+
                | PostgreSQL |
                | Server |
                | |
                | Executes the forwarded statement with the internal name. |
                +-------------+&lt;span class="s2"&gt;"&amp;gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
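&lt;p&gt;A toy model of this bookkeeping, written in Python rather than PgBouncer’s actual C implementation, might look like the following; the class and method names are illustrative only:&lt;/p&gt;

```python
# Toy model of PgBouncer's prepared-statement tracking (not the real implementation):
# each unique query string gets one internal name, prepared on the server at most once.

class StatementCache:
    def __init__(self):
        self._names = {}                 # query text -> internal name
        self._prepared_on_server = set() # internal names already sent to Postgres
        self._counter = 0

    def internal_name(self, query: str) -> str:
        # Assign a stable internal name (e.g., PGBOUNCER_1) per unique query string.
        if query not in self._names:
            self._counter += 1
            self._names[query] = f"PGBOUNCER_{self._counter}"
        return self._names[query]

    def ensure_prepared(self, query: str) -> tuple[str, bool]:
        """Return (internal_name, newly_prepared_on_server)."""
        name = self.internal_name(query)
        if name in self._prepared_on_server:
            return name, False           # reuse the server-side plan
        self._prepared_on_server.add(name)  # real PgBouncer would send Parse here
        return name, True

cache = StatementCache()
print(cache.ensure_prepared("SELECT * FROM users WHERE id = $1"))  # ('PGBOUNCER_1', True)
print(cache.ensure_prepared("SELECT * FROM users WHERE id = $1"))  # ('PGBOUNCER_1', False)
```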



&lt;h2&gt;
  
  
  In Summary
&lt;/h2&gt;

&lt;p&gt;PgBouncer bridges the gap between the inherent connection limitations of Postgres and the ever-growing demand for higher concurrency in modern applications. &lt;/p&gt;

&lt;p&gt;Leveraging prepared statements can be a valuable asset to boost your Postgres query performance and adds a layer of security against potential SQL injection attacks when using pooled connections. &lt;/p&gt;

&lt;p&gt;You can try prepared statements in PgBouncer with Neon today. We can’t wait to see what you build using it. Happy querying.&lt;br&gt;&lt;br&gt;
If you have any questions or feedback, don’t hesitate to get in touch with us on &lt;a href="https://neon.tech/discord" rel="noopener noreferrer"&gt;Discord&lt;/a&gt;. We’d love to hear from you.&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>pgbouncer</category>
    </item>
    <item>
      <title>pgvector: 30x Faster Index Build for your Vector Embeddings</title>
      <dc:creator>Raouf Chebri</dc:creator>
      <pubDate>Wed, 07 Feb 2024 15:43:47 +0000</pubDate>
      <link>https://dev.to/neon-postgres/pgvector-30x-faster-index-build-for-your-vector-embeddings-46da</link>
      <guid>https://dev.to/neon-postgres/pgvector-30x-faster-index-build-for-your-vector-embeddings-46da</guid>
      <description>&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5gsml9s8t1p9imi47u93.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5gsml9s8t1p9imi47u93.jpg" alt="Image description" width="800" height="450"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;We are Neon, the serverless Postgres. We power thousands of AI apps with the pgvector extension and separate storage and compute, enabling your database resources to scale independently. In this article, Raouf explains how you can use Neon’s elasticity and parallel HNSW index build in pgvector (0.5.1 for now, and 0.6.0 soon) to scale your AI apps.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Postgres’ most popular vector search extension, pgvector, recently implemented a parallel index build feature, which significantly improves the Hierarchical Navigable Small World (HNSW) index build time by a factor of 30.&lt;/p&gt;

&lt;p&gt;Congratulations to &lt;a href="https://github.com/ankane" rel="noopener noreferrer"&gt;Andrew Kane&lt;/a&gt; and pgvector contributors for this release, which solidifies Postgres’ position as one of the best databases for vector search and allows you to utilize the full power of your database to build the index.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--y8Cdewke--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://lh7-us.googleusercontent.com/JF6GlzjwCLdIxG0PpOh66Q8GgqU60Ea_dXyGbGoKxjMCPQMtMMjzweMs4o9FeCBXY_ZKYNJQ2TuO8F-tUTFypUmN97XtyqhRgBM1ZjHg1wccgN5-IxTH5fpVQ7xrdM7l10lj99cJsmeYcOMPF-QGd0c" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--y8Cdewke--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://lh7-us.googleusercontent.com/JF6GlzjwCLdIxG0PpOh66Q8GgqU60Ea_dXyGbGoKxjMCPQMtMMjzweMs4o9FeCBXY_ZKYNJQ2TuO8F-tUTFypUmN97XtyqhRgBM1ZjHg1wccgN5-IxTH5fpVQ7xrdM7l10lj99cJsmeYcOMPF-QGd0c" alt="Post image" width="800" height="516"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Tests run by Jonathan Katz using a 10M dataset with 1,536-dimension vectors on a 64 vCPU, 512GB RAM instance.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;With Neon’s elastic capabilities and its architecture that separates storage and compute, you can allocate additional resources to your Postgres instance, from the console or via the Neon API, specifically for the HNSW index build, and then scale back down to meet user demand. This makes Neon and pgvector a match made in heaven for efficient AI applications that scale to millions of users.&lt;/p&gt;

&lt;p&gt;This article details how you can use pgvector with Neon.&lt;/p&gt;
&lt;h2&gt;
  
  
  The power of pgvector
&lt;/h2&gt;

&lt;p&gt;Pgvector is Postgres’ most popular extension for vector similarity search. Vector search has become increasingly crucial to semantic search and Retrieval Augmented Generation (RAG) applications, enhancing the long-term memory of large language models (LLMs).&lt;/p&gt;

&lt;p&gt;In both semantic search and RAG use cases, the database contains a knowledge base that the LLM wasn’t trained on, split into a series of texts or chunks. Each text is saved in a row and is associated with a vector generated by an embedding model such as &lt;a href="https://platform.openai.com/docs/guides/embeddings/embedding-models" rel="noopener noreferrer"&gt;OpenAI’s ada-embedding-002&lt;/a&gt; or &lt;a href="https://docs.mistral.ai/platform/client/#embeddings" rel="noopener noreferrer"&gt;Mistral-AI’s mistral-embed&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Vector search is then used to find the texts most similar (closest) to the query vector. This is achieved by comparing the query vector with every row in the database, making vector search hard to scale. This is why pgvector implemented &lt;a href="https://en.wikipedia.org/wiki/(1%2B%CE%B5)-approximate_nearest_neighbor_search" rel="noopener noreferrer"&gt;approximate nearest neighbor (ANN) algorithms&lt;/a&gt; (or indexes), which conduct the vector search over a subset of the database to avoid lengthy sequential scans.&lt;/p&gt;

&lt;p&gt;One of the most efficient ANN algorithms is the Hierarchical Navigable Small World (HNSW) index. Its graph-based and multi-layered nature is designed for billions-of-row vector search. This makes HNSW extremely fast and efficient at scale and one of the most popular indexes in the vector store market.&lt;/p&gt;
&lt;h2&gt;
  
  
  HNSW’s Achilles heel: memory and build time
&lt;/h2&gt;

&lt;p&gt;HNSW was first introduced by Yu. A. Malkov and D. A. Yashunin in their paper “Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs.”&lt;/p&gt;

&lt;p&gt;HNSW is a graph-based approach to indexing high-dimensional data. It constructs a hierarchy of graphs, where each layer is a subset of the previous one, which results in a time complexity of &lt;code&gt;O(log(rows))&lt;/code&gt;. During the search, it navigates through these graphs to quickly find the nearest neighbors.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--RvQVmOyF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://lh7-us.googleusercontent.com/KqNnqzGKbZDgAGUD-Mbv_kUs4igPMlxV2t-L-OnHbMONP-KQ91MhNE1VwMhP9XHCjKGXXxFr6wpsBpGxaTR5z8PfiX4cmZPRs6c4MeU3IfvkliMJOQjjS4ghjdekfft16M2SZq7SNAaIBltie-VH7Mg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--RvQVmOyF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://lh7-us.googleusercontent.com/KqNnqzGKbZDgAGUD-Mbv_kUs4igPMlxV2t-L-OnHbMONP-KQ91MhNE1VwMhP9XHCjKGXXxFr6wpsBpGxaTR5z8PfiX4cmZPRs6c4MeU3IfvkliMJOQjjS4ghjdekfft16M2SZq7SNAaIBltie-VH7Mg" alt="Post image" width="716" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As fast and efficient as HNSW is, the index has two drawbacks:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Memory&lt;/strong&gt;: The index requires significantly more memory than other indexes, such as the Inverted File Index (IVFFlat). You can solve the memory issue by having a larger database instance. But if you use standalone Postgres such as AWS RDS, you will find yourself in a position where you over-provision just for the index build. With Neon scaling capabilities, however, you can scale up, build the HNSW index, and then scale back down to save on cost.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--pKP2VR_3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://lh7-us.googleusercontent.com/amQuw2B6GIie69Mfc_2QH1-13H9oVvur1pMutPy8XjosF8BFYAVtfKFlaqu7hQeE1Z6xU-zjqj_faSelXhj8EzulxztxZdprzCCGFE-HBaqPyvmzz9FZ337Mp-9pAGdWdK4cRq5DlQ7K5J6xRYFzqHA" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--pKP2VR_3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://lh7-us.googleusercontent.com/amQuw2B6GIie69Mfc_2QH1-13H9oVvur1pMutPy8XjosF8BFYAVtfKFlaqu7hQeE1Z6xU-zjqj_faSelXhj8EzulxztxZdprzCCGFE-HBaqPyvmzz9FZ337Mp-9pAGdWdK4cRq5DlQ7K5J6xRYFzqHA" alt="Post image" width="800" height="482"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Build time&lt;/strong&gt;: The HNSW index can take hours to build for million-row datasets. This is mainly due to the time spent calculating the distance among vectors. And this is precisely what pgvector 0.6.0 solves by introducing &lt;a href="https://github.com/pgvector/pgvector/issues/409" rel="noopener noreferrer"&gt;Parallel Index Build&lt;/a&gt;. By allocating more CPU and workers, you build your HNSW index 30x faster.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--gN38rRGc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://lh7-us.googleusercontent.com/-D_PE-Rd0kcV-U6x52yJjdwconeLHodXZ1MXTddB2p1q5-uFONE5Moem9RYmTLrB71uXKlA_sSyiN-viT1c9Xt26qbHHvFEwvGlXNEDgD1AmIgCak4GZPyvYQsX-4mwNWYAfGpc2nj31rp1cMihUKaM" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--gN38rRGc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://lh7-us.googleusercontent.com/-D_PE-Rd0kcV-U6x52yJjdwconeLHodXZ1MXTddB2p1q5-uFONE5Moem9RYmTLrB71uXKlA_sSyiN-viT1c9Xt26qbHHvFEwvGlXNEDgD1AmIgCak4GZPyvYQsX-4mwNWYAfGpc2nj31rp1cMihUKaM" alt="Post image" width="716" height="386"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;But wait! The HNSW index supports updates, so why is a parallel index build necessary if you only need to build the index once?&lt;/p&gt;

&lt;p&gt;Well, there are two cases where you need to create an HNSW index:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;When you want faster queries and to optimize for vector search&lt;/li&gt;
&lt;li&gt;When you already have an HNSW index, and you delete vectors from the table&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The latter might cause the indexed search to return false positives, negatively impacting the quality of the LLM response and the overall performance of your AI application.&lt;/p&gt;
&lt;h2&gt;
  
  
  Scale up and boost index build time
&lt;/h2&gt;

&lt;p&gt;pgvector 0.6.0 speeds up index build time up to 30 times compared to previous versions when using parallel workers. This improvement is especially notable when dealing with large data sets and vector sizes, such as OpenAI 1536 dimension vector embeddings.&lt;/p&gt;

&lt;p&gt;Creating an HNSW index can require significant resources because you need to allocate enough &lt;code&gt;maintenance_work_mem&lt;/code&gt; to fit the index in memory. Otherwise, the HNSW graph will take significantly longer to build.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;NOTICE:  hnsw graph no longer fits into maintenance_work_mem after 100000 tuples
DETAIL:  Building will take significantly longer.
HINT:  Increase maintenance_work_mem to speed up builds.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With Neon, you can scale up your Postgres instance using the Console or the API, configure it to build the index, and then scale back down to save on cost.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--orTawCEn--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://neondatabase.wpengine.com/wp-content/uploads/2024/02/Export-1707305452931.mp4" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--orTawCEn--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://neondatabase.wpengine.com/wp-content/uploads/2024/02/Export-1707305452931.mp4" alt="Neon Console Operation" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To effectively use parallel index build, it’s essential to configure Postgres with suitable settings. Key parameters to consider are:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;maintenance_work_mem&lt;/strong&gt;: This parameter determines the memory allocated for creating or rebuilding indexes. This parameter affects the performance and efficiency of these operations. Setting this to a high value, such as 8GB, allows for more efficient handling of the index build process.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SET maintenance_work_mem = '8GB';
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;max_parallel_maintenance_workers&lt;/strong&gt;: This dictates the number of parallel workers that can be employed. The default value of max_parallel_maintenance_workers is typically set to 2 in Postgres. Setting this to a high number enables the utilization of more computing resources for faster index builds.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SET max_parallel_maintenance_workers = 7; -- plus leader
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
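&lt;p&gt;With those settings in place, the index build itself is a standard pgvector &lt;code&gt;CREATE INDEX&lt;/code&gt; statement. The table and column names below are illustrative:&lt;/p&gt;

```sql
-- m and ef_construction are optional tuning parameters (shown at their defaults).
CREATE INDEX ON items USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
```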



&lt;blockquote&gt;
&lt;p&gt;Note: Neon currently supports pgvector 0.5.1. However, our engineering team is working on adding support for 0.6.0. Stay tuned.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  How does this affect recall performance?
&lt;/h2&gt;

&lt;p&gt;Recall is as important a metric as query execution time in RAG applications. Recall is the percentage of correct answers the ANN search provides. In the HNSW index, &lt;code&gt;ef_search&lt;/code&gt; is the parameter that determines the number of neighbors to scan at search time. The higher &lt;code&gt;ef_search&lt;/code&gt;, the higher the recall and the longer the query execution time.&lt;/p&gt;
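&lt;p&gt;In pgvector, this parameter is exposed as &lt;code&gt;hnsw.ef_search&lt;/code&gt; and can be adjusted per session:&lt;/p&gt;

```sql
-- Default is 40; raising it trades query speed for better recall.
SET hnsw.ef_search = 100;
```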

&lt;p&gt;&lt;a href="https://github.com/pgvector/pgvector/issues/409#issuecomment-1898605567" rel="noopener noreferrer"&gt;The tests conducted by Johnathan Katz&lt;/a&gt; show that using parallel builds has negligible impact on recall, with most changes swinging positively by over 1%. Despite the substantial speed improvements, this remarkable stability in recall rates highlights the effectiveness of pgvector 0.6.0’s parallel build process.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--JsJxTTw4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://lh7-us.googleusercontent.com/P-_6yF4v7-mzDO-_AdCzEQlKNz7KqleEIFz1jje5YNktnWlZ-MU5VyillAUJjdo0CZ-ux2PILd7_llFpE_hawJ_kexmF2b6w9zJ6r2G-mZl0fr3IaKRZgfDiFf5VDmg9y8TKsOK-GdpXXC2ZIPU9c0A" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--JsJxTTw4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://lh7-us.googleusercontent.com/P-_6yF4v7-mzDO-_AdCzEQlKNz7KqleEIFz1jje5YNktnWlZ-MU5VyillAUJjdo0CZ-ux2PILd7_llFpE_hawJ_kexmF2b6w9zJ6r2G-mZl0fr3IaKRZgfDiFf5VDmg9y8TKsOK-GdpXXC2ZIPU9c0A" alt="Recall Performance Graph" width="800" height="477"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;pgvector 0.6.0 represents a significant leap forward, proving that Postgres is an important player in the vector search space. By harnessing the power of parallel index building, developers can now construct HNSW indexes more rapidly and efficiently, significantly reducing the time and resources traditionally required for such tasks.&lt;/p&gt;

&lt;p&gt;Neon’s flexible and scalable serverless Postgres offering complements pgvector’s capabilities perfectly. Users can scale their database resources according to their specific needs for index building and then scale down to optimize costs, ensuring an economical yet powerful solution.&lt;/p&gt;

&lt;p&gt;What AI applications are you currently building? &lt;a href="https://console.neon.tech" rel="noopener noreferrer"&gt;Try pgvector on Neon today&lt;/a&gt;, join us on &lt;a href="https://neon.tech/discord" rel="noopener noreferrer"&gt;Discord&lt;/a&gt;, and let us know how we can improve your experience with serverless PostgreSQL.&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>pgvector</category>
      <category>vector</category>
      <category>ai</category>
    </item>
    <item>
      <title>Bring Your Own Extensions to Serverless PostgreSQL</title>
      <dc:creator>Raouf Chebri</dc:creator>
      <pubDate>Wed, 17 Jan 2024 14:07:36 +0000</pubDate>
      <link>https://dev.to/neon-postgres/bring-your-own-extensions-to-serverless-postgresql-1ba8</link>
      <guid>https://dev.to/neon-postgres/bring-your-own-extensions-to-serverless-postgresql-1ba8</guid>
      <description>&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy04pls5fo5okulg8itif.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy04pls5fo5okulg8itif.png" alt="Bring Your Own Extensions Cover" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Extensions in PostgreSQL are comparable to libraries in programming languages or plugins in web browsers. They are pivotal in the PostgreSQL ecosystem, providing additional functionalities ranging from encryption and AI to handling time series and geospatial data. More complex extensions can transform PostgreSQL into a graph or analytical database, and some companies even create custom private extensions for specific business logic.&lt;/p&gt;

&lt;p&gt;Neon’s compute is stateless PostgreSQL, which runs as a VM or a Kubernetes pod. The compute image comes with a &lt;a href="https://neon.tech/docs/extensions/pg-extensions" rel="noopener noreferrer"&gt;list of supported extensions&lt;/a&gt;. However, supporting a wide range of PostgreSQL extensions can pose performance and security risks in a multi-tenant serverless environment like Neon. This is why we are excited to announce we added &lt;a href="https://neon.tech/docs/extensions/pg-extensions#custom-built-extensions" rel="noopener noreferrer"&gt;support for private and custom extensions&lt;/a&gt; using Dynamic Extension Loading. &lt;/p&gt;

&lt;p&gt;This feature is currently in beta on request only. You can contact support if you want to bring your own extensions to Neon. In this article, we’ll introduce Dynamic Extension Loading, its implementation, its benefits, and our future plans.&lt;/p&gt;

&lt;h2&gt;
  
  
  Extensions in PostgreSQL
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9b4wtxtczmev9fq5d3gr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9b4wtxtczmev9fq5d3gr.png" alt="PostgreSQL Extension Ecosystem" width="800" height="398"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;PostgreSQL is a robust and versatile database system that is further enhanced by its support for extensions. Some of the most popular extensions are &lt;a href="https://postgis.net/" rel="noopener noreferrer"&gt;PostGIS&lt;/a&gt; for geospatial data, &lt;a href="https://www.postgresql.org/docs/current/pgstatstatements.html" rel="noopener noreferrer"&gt;pg_stat_statements&lt;/a&gt; for tracking query execution statistics, and &lt;a href="https://github.com/pgvector/pgvector" rel="noopener noreferrer"&gt;pgvector&lt;/a&gt; for vector similarity search. &lt;/p&gt;

&lt;p&gt;Extensions in PostgreSQL come in various forms:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SQL Object Packages&lt;/strong&gt;: These are the most common, comprising domain-specific data types, functions, triggers, etc.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Procedural Languages&lt;/strong&gt;: Extensions like PLPython or PLV8 enable the use of different programming languages within PostgreSQL.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Internal API Enhancements&lt;/strong&gt;: Written in C, these powerful extensions can introduce new storage methods, volume replication, background jobs, and configuration parameters.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Extensions in Other Languages&lt;/strong&gt;: Beyond C, extensions can be developed in languages like C++ or Rust, broadening the scope of functionality.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To use an extension, it must be built against the correct major version of PostgreSQL. Installation involves placing the extension’s control and SQL script files in the shared extension directory and its library files in the libdir, paths that vary across platforms. After placing the files, the &lt;code&gt;CREATE EXTENSION&lt;/code&gt; command is executed in the database, prompting PostgreSQL to locate and run the installation scripts for the extension.&lt;/p&gt;
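&lt;p&gt;For illustration, with a hypothetical extension named &lt;code&gt;my_ext&lt;/code&gt;, installing and inspecting it once the files are in place looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Hypothetical extension: my_ext.control and my_ext--1.0.sql are in
-- the shared extension directory, my_ext.so is in the libdir.
CREATE EXTENSION my_ext;

-- List the extensions installed in the current database:
SELECT extname, extversion FROM pg_extension;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;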

&lt;h2&gt;
  
  
  Extension support limitations in serverless environments
&lt;/h2&gt;

&lt;p&gt;In Neon's serverless PostgreSQL environment, each compute runs as an ephemeral Kubernetes pod or VM. A compute instance can be scaled up, down, and descheduled whenever the workload changes. Therefore, supporting a wide range of PostgreSQL extensions presents significant challenges such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Compatibility&lt;/strong&gt;: Many extensions are not designed for serverless architectures, particularly those needing persistent storage or deep system integration, such as &lt;a href="https://github.com/citusdata/pg_cron" rel="noopener noreferrer"&gt;pg_cron&lt;/a&gt; and &lt;a href="https://www.postgresql.org/docs/current/file-fdw.html" rel="noopener noreferrer"&gt;file_fdw&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance Issues&lt;/strong&gt;: Embedding all extensions in the compute image significantly increases its size, leading to slower start times and reduced performance.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Maintenance Overhead&lt;/strong&gt;: Traditional methods require frequent updates to the entire compute image for each extension update, causing potential service disruptions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security Risks&lt;/strong&gt;: A larger set of extensions in the base image increases the potential attack surface, especially with extensions that remain unused by many users.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Limited Customization&lt;/strong&gt;: The open-source nature of compute images restricts the inclusion of custom or closed-source extensions, limiting tailored solutions for specific customer needs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Therefore, the conventional method of bundling extension files into compute images is impractical due to the sheer number of extensions and the varied needs of users. This led us to rethink how we provide extensions with Dynamic Extension Loading.&lt;/p&gt;

&lt;h2&gt;
  
  
  Dynamic Extension Loading: A New Approach
&lt;/h2&gt;

&lt;p&gt;At Neon, we've addressed these challenges with our dynamic extension loading mechanism. Here's how it works:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Building and Storing Extensions&lt;/strong&gt;: We build extensions in a separate repository and store the resulting files in an S3 bucket.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Configuring Extensions&lt;/strong&gt;: Extensions are configured per user in the Neon control plane, enhancing customization.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;On-Demand Loading&lt;/strong&gt;: Compute instances download control files at startup, and library files are fetched as needed when extension functions are called.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiszbg91ueiv8onrg4fwu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiszbg91ueiv8onrg4fwu.png" alt="Custom Extension download diagram on Neon" width="800" height="176"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With Dynamic Extension Loading, private and default extensions can be added to compute instances without restarting, reducing maintenance overhead. Additionally, it brings performance benefits to Neon. Our plans with Dynamic Extension Loading include moving all default-supported extensions to the extension storage, resulting in a smaller compute image size and faster start times.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to bring your own extension to Neon
&lt;/h2&gt;

&lt;p&gt;To request support for a Postgres extension, paid plan users can &lt;a href="https://console.neon.tech/app/projects?modal=support" rel="noopener noreferrer"&gt;open a support ticket&lt;/a&gt;. Free plan users can submit a request via the feedback channel on our &lt;a href="https://discord.com/invite/92vNTzKDGp" rel="noopener noreferrer"&gt;Discord Server&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Our engineers will then evaluate your extension’s compatibility with Neon, build it, and upload the artifacts to the extension storage once it passes all the security tests.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This feature is currently in beta, with plans for general availability in the near future. This development marks a significant step forward in making PostgreSQL more adaptable and efficient in a serverless environment.&lt;/p&gt;

&lt;p&gt;What about you? Do you use PostgreSQL extensions in your projects? Join us on Discord and let us know which extensions you use and how we can enhance your PostgreSQL experience in the cloud.&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>extensions</category>
      <category>cloud</category>
      <category>database</category>
    </item>
    <item>
      <title>Change Data Capture with Serverless Postgres</title>
      <dc:creator>Raouf Chebri</dc:creator>
      <pubDate>Thu, 21 Dec 2023 12:23:07 +0000</pubDate>
      <link>https://dev.to/neon-postgres/change-data-capture-with-serverless-postgres-823</link>
      <guid>https://dev.to/neon-postgres/change-data-capture-with-serverless-postgres-823</guid>
      <description>&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk40ge716hp6gprsd0d0a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk40ge716hp6gprsd0d0a.png" alt="Cover Image" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Modern applications often rely on loosely coupled components and services that help teams and systems scale. These architectures generate continuous data streams that need to be replicated, processed, or analyzed. &lt;/p&gt;

&lt;p&gt;However, moving data between different data stores can compromise the quality and reliability of your decisions, because data can become inconsistent or corrupted during transformation. This is why &lt;a href="https://en.wikipedia.org/wiki/Change_data_capture" rel="noopener noreferrer"&gt;Change Data Capture (CDC)&lt;/a&gt; has emerged as one of the most popular methods to synchronize data across multiple data stores. One way to use CDC in Postgres is with &lt;a href="https://www.postgresql.org/docs/current/logical-replication.html#:~:text=Logical%20replication%20is%20a%20method,byte%2Dby%2Dbyte%20replication." rel="noopener noreferrer"&gt;logical replication&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Today, we’re excited to announce the release of logical replication in beta on Neon. This feature lets you stream your data hosted on Neon to external data stores, allowing for change data capture and real-time analytics.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why CDC matters
&lt;/h2&gt;

&lt;p&gt;CDC refers to the process of capturing changes made to data in a database – such as inserts, updates, and deletes – and then delivering these changes to downstream processes or systems. &lt;/p&gt;

&lt;p&gt;CDC operates by monitoring and capturing data changes in a source database as they occur. This is a departure from traditional batch processing, where data updates are transferred at scheduled intervals. CDC ensures that every change is captured and can be acted upon almost instantaneously.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Data Synchronization&lt;/strong&gt;: In a distributed system architecture, keeping data synchronized across various platforms and services is critical. CDC facilitates this by providing a mechanism for real-time data replication.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Minimizing Latency&lt;/strong&gt;: By capturing changes as they happen, CDC minimizes the latency in data transfer. This is essential for applications where even a slight delay in data availability can lead to significant issues, such as financial trading systems.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enabling Event-Driven Architectures&lt;/strong&gt;: CDC is a cornerstone for building event-driven systems. In such architectures, actions are triggered in response to data changes, making real-time data capture essential.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data Warehousing and Real-Time Analytics&lt;/strong&gt;: For organizations relying on data warehouses and analytics tools for decision-making, CDC ensures that the data in these systems is current, enhancing the accuracy of insights.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now that we understand it better, let’s explore the technical mechanics of how CDC is implemented in Postgres through logical replication. &lt;/p&gt;

&lt;h2&gt;
  
  
  Logical replication: under the hood
&lt;/h2&gt;

&lt;p&gt;In Postgres, logical replication is one of the methods of implementing CDC and streaming data from your database to an external source. It uses a publisher-subscriber model. &lt;/p&gt;

&lt;p&gt;Your Neon database works as a publisher, copying first a snapshot of the data and then streaming changes to one or more target data stores (subscribers). This model allows for selective replication, where only specified tables or even specific columns within a table can be replicated.&lt;br&gt;
&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F72iu7yda24nhm4jw8tfy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F72iu7yda24nhm4jw8tfy.png" alt="Neon as publisher" width="686" height="479"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Learn more about connecting &lt;a href="https://neon.tech/docs/guides/logical-replication-guide" rel="noopener noreferrer"&gt;Neon to different data stores&lt;/a&gt; in the documentation.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://www.postgresql.org/docs/current/wal-intro.html" rel="noopener noreferrer"&gt;Write-Ahead-Log (WAL)&lt;/a&gt; is a fundamental component in Postgres, designed to ensure data integrity and facilitate recovery. It records every change made to the database, including transactions and their states.&lt;/p&gt;

&lt;p&gt;For logical replication, the WAL serves as the primary data source. The WAL captures the comprehensive sequence of data changes, which are then decoded for replication purposes. Logical replication transforms the WAL to a format accepted by the subscriber through logical decoding, and the &lt;code&gt;walsender&lt;/code&gt; then streams the transformed data using the replication protocol. &lt;/p&gt;

&lt;p&gt;The &lt;code&gt;walsender&lt;/code&gt; initiates the logical decoding of the WAL using an output plugin. Postgres ships with several logical decoding plugins that can output the data in various formats. In addition, new plugins can be developed.&lt;/p&gt;

&lt;p&gt;For instance, in a Postgres-to-Postgres logical replication, the standard &lt;code&gt;pgoutput&lt;/code&gt; plugin transforms the data changes to the logical replication protocol. The transformed data is subsequently streamed using the replication protocol, which maps it to local tables and applies the changes in the exact sequence of the original transactions. However, integrations with non-Postgres systems require an output format different from the standard one specifically designed for Postgres-to-Postgres logical replication. &lt;/p&gt;

&lt;p&gt;Today’s data pipelines involve more than one data store type. For example, you can integrate all your Postgres databases into a data warehouse or streaming platform, such as &lt;a href="https://materialize.com/" rel="noopener noreferrer"&gt;Materialize&lt;/a&gt; or &lt;a href="https://kafka.apache.org/" rel="noopener noreferrer"&gt;Kafka&lt;/a&gt;, to process and analyze data at higher scales. This is why, with the release of logical replication on Neon, we added support for &lt;a href="https://github.com/eulerto/wal2json" rel="noopener noreferrer"&gt;wal2json&lt;/a&gt;, which outputs changes in the JSON format to be easily consumed by other systems and data stores.&lt;/p&gt;
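&lt;p&gt;To see what logical decoding produces, you can create a slot that uses the &lt;code&gt;wal2json&lt;/code&gt; output plugin and peek at pending changes (the slot name is illustrative, and &lt;code&gt;wal2json&lt;/code&gt; must be available on the server):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Create a logical replication slot that decodes changes with wal2json:
SELECT * FROM pg_create_logical_replication_slot('demo_slot', 'wal2json');

-- Peek at decoded changes without consuming them from the slot:
SELECT * FROM pg_logical_slot_peek_changes('demo_slot', NULL, NULL);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;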

&lt;p&gt;You can read more on &lt;a href="https://neon.tech/blog/cdc-with-materialize" rel="noopener noreferrer"&gt;Change Data Capture using Neon and Materialize&lt;/a&gt; by our friend Marta Paes, to learn how to integrate your database with external systems.&lt;/p&gt;
&lt;h2&gt;
  
  
  Logical vs. physical replication
&lt;/h2&gt;

&lt;p&gt;Logical replication differs from physical replication in that it replicates changes at the data level (row-level changes) rather than copying entire database blocks. This allows for more selective replication and reduces the amount of data transferred. Unlike snapshot replication, which provides a full copy of the data at a specific point in time, logical replication ensures continuous streaming of changes, making it more suitable for applications that require near real-time data availability.&lt;/p&gt;

&lt;p&gt;This comparison highlights the distinct characteristics, advantages, and applications of logical and physical replication.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Logical Replication&lt;/th&gt;
&lt;th&gt;Physical Replication&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Row-Level Changes&lt;/strong&gt;: focuses on replicating specific row-level changes (INSERT, UPDATE, DELETE) in selected tables.&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Block-Level Replication&lt;/strong&gt;: replicates the entire database at the block level. It creates an exact copy of the source database, including all tables and system catalogs.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Flexibility&lt;/strong&gt;: Offers the flexibility to replicate specific tables and even specific columns within tables.&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Limitations&lt;/strong&gt;: Doesn’t allow for selective table replication and requires the same PostgreSQL version on both the primary and standby servers.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;WAL-based&lt;/strong&gt;: Uses the WAL for capturing changes, but with logical decoding to convert these changes into a readable format for the subscriber.&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Streaming Replication&lt;/strong&gt;: Changes are streamed as they are written to the WAL, minimizing lag.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Use Cases&lt;/strong&gt;: Ideal for situations requiring selective replication, minimal impact on the source database, or cross-version compatibility.&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Use Cases&lt;/strong&gt;: Best suited for creating read-only replicas for load balancing, high availability, and disaster recovery solutions.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;h2&gt;
  
  
  Get started with logical replication
&lt;/h2&gt;

&lt;p&gt;To enable logical replication, navigate to your project’s settings in the console, click the “Beta” tab, locate Logical Replication, and click the “Enable” button. &lt;/p&gt;

&lt;p&gt;Note that enabling logical replication will restart your compute instance, which will drop existing connections. A subscriber may also keep the connection to your Neon database active, preventing your Neon instance from scaling to zero.&lt;/p&gt;

&lt;p&gt;This action is also irreversible, and you will not be able to disable logical replication for your project.&lt;/p&gt;

&lt;p&gt;Ensure logical replication is enabled by running the following query in the SQL Editor within the Neon console or using psql on your terminal.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SHOW wal_level;

 wal_level 
-----------
 logical
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Create a publication
&lt;/h2&gt;

&lt;p&gt;Let’s assume you have the following &lt;code&gt;users&lt;/code&gt; table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE TABLE users (

  id SERIAL PRIMARY KEY,

  username VARCHAR(50) NOT NULL,

  email VARCHAR(100) NOT NULL

);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Execute the following query to create a publication for the &lt;code&gt;users&lt;/code&gt; table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE PUBLICATION users_publication FOR TABLE users;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
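
&lt;p&gt;On the subscriber side, which must be a non-Neon Postgres instance since Neon currently acts only as a publisher, a matching subscription would look roughly as follows (the connection string is a placeholder, and the &lt;code&gt;users&lt;/code&gt; table must already exist on the subscriber):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Run on the subscriber database:
CREATE SUBSCRIPTION users_subscription
  CONNECTION 'postgresql://user:password@host/dbname'
  PUBLICATION users_publication;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;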



&lt;p&gt;Learn more about &lt;a href="https://neon.tech/docs/guides/logical-replication-guide" rel="noopener noreferrer"&gt;how to connect Neon to different data stores&lt;/a&gt; in the documentation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Limitations
&lt;/h2&gt;

&lt;p&gt;While logical replication in Neon Postgres offers numerous benefits for real-time data synchronization and flexibility, it has some limitations:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Publisher, not a subscriber&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This release of logical replication on Neon is in beta, and for security reasons, it does not include subscriber capabilities at the moment. We are currently working on these security constraints, which should be supported in future releases.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Logical replication and Auto-suspend&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In a logical replication setup, a subscriber may keep the connection to your Neon publisher database active to poll for changes or perform sync operations, preventing your Neon compute instance from scaling to zero. Some subscribers allow you to configure connection or sync frequency, which may be necessary to continue taking advantage of Neon’s Auto-suspend feature. Please refer to your subscriber’s documentation or contact their support team for details.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data Definition Language (DDL) Operations&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Logical replication in Postgres primarily handles Data Manipulation Language (DML) operations like INSERT, UPDATE, and DELETE. However, it does not automatically replicate Data Definition Language (DDL) operations such as CREATE TABLE, ALTER TABLE, or DROP TABLE. This means that schema changes in the publisher database are not directly replicated to the subscriber database.&lt;/p&gt;

&lt;p&gt;Manual intervention is required to replicate DDL changes. This can be done by applying the DDL changes separately in both the publisher and subscriber databases or by using third-party tools that can handle DDL replication.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Replication Lag&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In high-volume transaction environments, there is potential for replication lag. This is the time delay between a transaction being committed on the publisher and the same transaction being applied on the subscriber.&lt;/p&gt;

&lt;p&gt;It’s important to monitor replication lag and understand its impact, especially for applications that require near-real-time data consistency. Proper resource allocation and optimizing the network can help mitigate this issue.&lt;/p&gt;
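&lt;p&gt;One way to approximate lag on the publisher is to compare each slot’s confirmed position with the current WAL position (the exact columns available vary slightly across Postgres versions):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Approximate replication lag, in bytes, per replication slot:
SELECT slot_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS lag_bytes
FROM pg_replication_slots;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;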

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Logical replication is undoubtedly one of the most important features for modern applications. As we continue to develop its capabilities, we encourage you to test, experiment, and push the boundaries of what logical replication can do. Join us on Discord, and share your experiences, suggestions, and challenges with us. &lt;/p&gt;

&lt;p&gt;We can’t wait to see what you build with Neon.&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>database</category>
      <category>cdc</category>
      <category>streaming</category>
    </item>
  </channel>
</rss>
