What is Postgres Row Level Security?

Row Level Security (RLS for short) is a security feature of the open source PostgreSQL database. It allows database administrators to define policies that control which rows of data specific user roles can read and modify.

RLS was added to Postgres in version 9.5 and has received several performance improvements since. It is used by frameworks like PostGraphile to provide serverless-style experiences backed by the power of a Postgres database.

RLS Performance

Depending on the way you construct your RLS policies, performance can vary significantly. In the best case, Postgres can optimize the RLS policy to be as cheap as an additional WHERE clause on your query, or a simple function call before a write. This is cheap and scalable. In a much worse case, a policy can run a subquery for every row returned by a query, which makes query cost grow rapidly with row count. We have to create rules and best practices for our teams to ensure we stay away from the latter.

1. Never Pass Row Data to a Function in an RLS Policy

This is one of the most common mistakes I see when people write slow RLS policies. If you pass row data to a function, even a function that only does a simple boolean check, every single row that passes the WHERE filter requires a separate function call. Function calls in Postgres are slow, so calling a function n times, once per row returned, is even slower.
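
To illustrate the anti-pattern, here’s a minimal sketch (the notes table and is_owner() helper are hypothetical, and you would create only one of these two policies):

CREATE FUNCTION is_owner(p_owner_id INT) RETURNS BOOLEAN AS $$
SELECT p_owner_id = current_setting('user.id')::INT;
$$ LANGUAGE SQL STABLE;

-- Slow: is_owner() receives row data, so it runs once per candidate row.
CREATE POLICY notes_select_slow ON notes FOR SELECT
    USING (is_owner(owner_user_id));

-- Fast: the same check written as a plain expression the planner can optimize.
CREATE POLICY notes_select_fast ON notes FOR SELECT
    USING (owner_user_id = current_setting('user.id')::INT);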

2. Functions used in an RLS policy should be marked STABLE

Functions are fine to use in an RLS policy, as long as those functions are STABLE and don’t receive row data as parameters. This allows Postgres to treat the function’s result as constant for the duration of a statement, executing the function once rather than once per row and using the in-memory value to evaluate each row, which is significantly faster.
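
As a minimal sketch (the app.user_id setting name is an assumption for illustration):

-- Without a volatility marker, a function defaults to VOLATILE and may be
-- re-evaluated for every row the policy examines:
CREATE FUNCTION current_app_user_volatile() RETURNS INT AS $$
SELECT current_setting('app.user_id')::INT;
$$ LANGUAGE SQL;

-- Marked STABLE, Postgres is free to evaluate the function once per
-- statement and reuse the result for every row:
CREATE FUNCTION current_app_user() RETURNS INT AS $$
SELECT current_setting('app.user_id')::INT;
$$ LANGUAGE SQL
    STABLE;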

3. Understand and Use SECURITY DEFINER Functions

Postgres superusers and roles with the BYPASSRLS attribute always bypass the row security system when accessing a table. If the Postgres role that runs your migrations has BYPASSRLS applied to it, you can mark a function SECURITY DEFINER, which runs the function as the role that created / defined it, and therefore doesn’t trigger any RLS checks. While this means you need to be careful not to expose data users shouldn’t have access to, it also means these functions can be used in performance-critical situations to avoid unnecessary checks.

RLS policies respect other RLS policies, so using subqueries in an RLS policy may invoke additional (chained) RLS policies, which can get expensive quickly. If you instead call a SECURITY DEFINER function in your RLS policy, you can perform a query without incurring the cost of any other RLS policies. This is particularly useful when the tables you are querying have subquery-based RLS policies, or when the user may not have access to the data needed to validate access to the table.

This approach is also useful when querying data across several tables, particularly if those tables have RLS policies based on subqueries. SECURITY DEFINER functions are an easy way to avoid compounding RLS policies in complex joins.
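
As a minimal sketch of the shape such a function takes (the document_members table is hypothetical):

-- The body runs with the privileges of the function's owner, so the SELECT
-- below is not subject to the caller's RLS policies. Pinning search_path is
-- a standard precaution for SECURITY DEFINER functions.
CREATE OR REPLACE FUNCTION visible_document_ids() RETURNS SETOF INT AS $$
SELECT document_id
FROM document_members
WHERE member_id = current_setting('user.id')::INT;
$$ LANGUAGE SQL
    SECURITY DEFINER
    STABLE
    SET search_path = public;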

General Patterns for RLS Performance

There are two general patterns for consistent RLS performance, and depending on the situation, you can use one or both of them.

1. Denormalization

While it goes against what we would normally do in a relational database, denormalization is a solid, high-performance strategy for RLS. It’s my default strategy for securing data.

Denormalization Example

Let’s create simple users and blogs tables, and use denormalization and RLS to secure the blogs.

CREATE TABLE users
(
    id SERIAL PRIMARY KEY
);

CREATE TABLE blogs
(
    id SERIAL PRIMARY KEY,
    owner_user_id INT REFERENCES users(id) ON DELETE CASCADE,
    name VARCHAR(30) NOT NULL,
    private BOOLEAN NOT NULL
);

CREATE INDEX ON blogs(owner_user_id);
CREATE INDEX ON blogs(private);

Let’s create a helper function for interacting with the properties set on our database connection. For each database connection, we’ll set the user.id property so that it can be accessed for RLS. Let’s create a function user_id() for accessing the property in a standard way.

CREATE OR REPLACE FUNCTION user_id() RETURNS INT AS $$
SELECT current_setting('user.id')::INT;
$$ LANGUAGE SQL
    STABLE;
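
For example, the application can set the property at the start of each transaction (the value 42 is illustrative):

-- The third argument makes the setting local to the current transaction.
SELECT set_config('user.id', '42', true);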

Let’s create our RLS policies:

ALTER TABLE blogs ENABLE ROW LEVEL SECURITY;

CREATE POLICY blogs_authenticated_user_select ON blogs FOR SELECT TO authenticated_user
    USING (private = false OR owner_user_id = user_id());

CREATE POLICY blogs_authenticated_user_insert ON blogs FOR INSERT TO authenticated_user
    WITH CHECK (owner_user_id = user_id());

CREATE POLICY blogs_authenticated_user_update ON blogs FOR UPDATE TO authenticated_user
    USING (owner_user_id = user_id());

CREATE POLICY blogs_authenticated_user_delete ON blogs FOR DELETE TO authenticated_user
    USING (owner_user_id = user_id());

Now when we make a simple query as the role authenticated_user:

SELECT * FROM blogs;

we’ll get performance very similar to running the query:

SELECT * FROM blogs WHERE private = false OR owner_user_id = user_id();

When evaluating denormalization-based approaches like this, I think of performance as if the RLS policy were appended to the query’s WHERE clause, and that mental model generally holds true.
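
You can verify this with EXPLAIN while connected as authenticated_user; the plan below is a sketch of the shape you should expect, not verbatim output:

EXPLAIN SELECT * FROM blogs;
-- Seq Scan on blogs
--   Filter: ((NOT private) OR (owner_user_id = user_id()))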

2. SECURITY DEFINER Functions

While I find that denormalization is a great approach, in some cases it can start to feel unnecessary as you get farther away from your core tables, towards leaf tables that other tables are less likely to reference.

An example might be a messaging system, where you have the tables: users, conversations, conversation_users, and messages:

CREATE TABLE users
(
    id SERIAL PRIMARY KEY,
    username VARCHAR(50) NOT NULL UNIQUE
);

CREATE TABLE conversations
(
    id SERIAL PRIMARY KEY
);

CREATE TABLE conversation_users
(
    conversation_id INT REFERENCES conversations(id) ON DELETE CASCADE,
    user_id INT REFERENCES users(id) ON DELETE CASCADE,
    PRIMARY KEY (conversation_id, user_id)
);

CREATE INDEX ON conversation_users(user_id);

CREATE TABLE messages
(
    id SERIAL PRIMARY KEY,
    conversation_id INT REFERENCES conversations(id) ON DELETE CASCADE,
    from_user_id INT REFERENCES users(id) ON DELETE SET NULL,
    content VARCHAR(5000) NOT NULL
);

CREATE INDEX ON messages(conversation_id);

While we could use denormalization in an RLS policy for each message, it would be more natural to decide whether or not the user has access to the conversation that the message belongs to. The best way to do that is with a SECURITY DEFINER function.

CREATE OR REPLACE FUNCTION conversations_for_user() RETURNS TABLE(conversation_id INT) AS $$
SELECT DISTINCT conversation_id
FROM conversation_users
WHERE user_id = user_id();
$$ LANGUAGE SQL
    SECURITY DEFINER
    STABLE;

ALTER TABLE messages ENABLE ROW LEVEL SECURITY;

CREATE POLICY messages_authenticated_user_select ON messages FOR SELECT TO authenticated_user
    USING (conversation_id IN (SELECT * FROM conversations_for_user()));

CREATE POLICY messages_authenticated_user_insert ON messages FOR INSERT TO authenticated_user
    WITH CHECK (from_user_id = user_id() AND conversation_id IN (SELECT * FROM conversations_for_user()));

CREATE POLICY messages_authenticated_user_update ON messages FOR UPDATE TO authenticated_user
    USING (from_user_id = user_id());

CREATE POLICY messages_authenticated_user_delete ON messages FOR DELETE TO authenticated_user
    USING (false);

Now, for each SELECT and INSERT statement on the messages table, our RLS policy will query the set of all conversation ids our user has access to, and only allow messages that are in that set of conversations.

Isn’t This Inefficient?

Continuing the SECURITY DEFINER example from above, let’s dig in a little more.

When running a simple query:

SELECT * FROM messages WHERE conversation_id = 9000;

we’ll get performance roughly equivalent to this query:

SELECT * FROM messages WHERE conversation_id = 9000 AND conversation_id IN (SELECT * FROM conversations_for_user());

If we unwrap the function, we get

SELECT * FROM messages
WHERE conversation_id = 9000
  AND conversation_id IN (SELECT DISTINCT conversation_id
                          FROM conversation_users
                          WHERE user_id = user_id());

I’m being explicit here to show that we would probably not write a query like this ourselves, and that it’s less efficient than any single query we would otherwise write in this situation.

There are a few points I’d like to make:

  1. I’ve profiled approaches like this in production and generally don’t see performance issues. Most users would have hundreds of conversations or fewer, and this approach would scale to users having thousands. Even in the case of a user having thousands of conversations, queries would still measure in the tens of milliseconds or less.
  2. Limits can and should be placed on the number of conversations a user can have (see the sketch after this list), which would allow you to measure the worst case scenario for your own use case.
  3. If you measure a performance issue with your approach, you can always optimize with a SECURITY DEFINER function.
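
Here’s a sketch of such a limit (the 5,000 threshold and the trigger and function names are illustrative):

CREATE OR REPLACE FUNCTION check_conversation_limit() RETURNS TRIGGER AS $$
BEGIN
    -- Reject the insert once the user is already in 5,000 conversations.
    IF (SELECT count(*) FROM conversation_users WHERE user_id = NEW.user_id) >= 5000 THEN
        RAISE EXCEPTION 'conversation limit reached for user %', NEW.user_id;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER conversation_users_limit
    BEFORE INSERT ON conversation_users
    FOR EACH ROW EXECUTE FUNCTION check_conversation_limit();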

As I mention above, we can always optimize by using a SECURITY DEFINER function to query for a specific use case directly, rather than using it in an RLS policy. Here’s what that might look like:

CREATE OR REPLACE FUNCTION messages_for_conversation_id(p_conversation_id INT)
   RETURNS SETOF messages AS $$
BEGIN
   -- Check if the user has access to the conversation
   IF EXISTS (
      SELECT 1
      FROM conversation_users cu
      WHERE cu.conversation_id = p_conversation_id
        AND cu.user_id = user_id()
   ) THEN
      -- Return messages if the user has access
      RETURN QUERY
         SELECT *
         FROM messages
         WHERE conversation_id = p_conversation_id;
   ELSE
      -- Return an empty set if the user doesn't have access
      RETURN;
   END IF;
END;
$$ LANGUAGE plpgsql
   SECURITY DEFINER
   STABLE;
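
Usage is then an ordinary function call (the conversation id 9000 is illustrative):

-- Returns the conversation's messages, or an empty set if the user lacks access.
SELECT * FROM messages_for_conversation_id(9000);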

Test Performance For Yourself

At the end of the day, you need to make the decision that’s best for you and your team. You should be able to see which of your queries are performing well and which need your attention.

Look how simple it is to open up a psql console and run some tests for yourself:

[deleteme_test] # create table blogs (id serial primary key, name text not null); create table blog_members (blog_id int not null, person_id int not null, primary key (blog_id, person_id)); create index on blog_members (person_id);
CREATE TABLE
Time: 10.124 ms
CREATE TABLE
Time: 3.972 ms
CREATE INDEX
Time: 3.676 ms
[deleteme_test] # insert into blogs (id, name) select i, 'Blog ' || i::text from generate_series(1, 1000000) i;
INSERT 0 1000000
Time: 1630.274 ms (00:01.630)
[deleteme_test] # insert into blog_members (blog_id, person_id) select blogs.id, i from blogs, generate_series(1, 3) i where i = 1 or (i = 2 and blogs.id % 2 = 0) or (i = 3 and blogs.id % 3 = 1);
INSERT 0 1833334
Time: 4321.856 ms (00:04.322)
[deleteme_test] # select count(*) from blog_members where person_id = 3;
┌────────┐
│ count  │
├────────┤
│ 333334 │
└────────┘
(1 row)

Time: 31.162 ms
[deleteme_test] # select count(*) from blogs where id in (select blog_id from blog_members where person_id = 3);
┌────────┐
│ count  │
├────────┤
│ 333334 │
└────────┘
(1 row)

Time: 106.628 ms