What is Postgres Row Level Security?
Row Level Security (RLS for short) is a security feature of the open source PostgreSQL database. It allows database administrators to define policies that control how specific rows of data are read and written by one or more user roles.
RLS was added in PostgreSQL 9.5 and has received several performance improvements since. It is used by frameworks like Postgraphile to provide serverless-style experiences backed by the power of a Postgres database.
RLS Performance
Depending on the way that you construct your RLS policies, performance can vary significantly. In the best case, Postgres can optimize the RLS policy to be as cheap as an additional `WHERE` clause on your query, or a simple function call before a write. This is cheap and scalable. In a much worse case, it can cause a sub-query to run for every row returned from a database query, so query cost grows with the number of rows returned. We have to create rules and best practices for our teams to ensure we stay away from the latter.
1. Never Pass Row Data to a Function in an RLS Policy
This is one of the most common mistakes I see when people write slow RLS policies. By passing row data to a function, even if that function only does a simple boolean check, every single row that passes the `WHERE` filter requires a separate function call. Function calls in Postgres are slow, so calling a function once for each of `n` returned rows is even slower.
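As a sketch of the difference, compare the two policies below. The `is_visible` helper is hypothetical and only exists for this illustration; `blogs`, `owner_user_id`, and `user_id()` mirror the examples later in this post.

```sql
-- Slow: row data (owner_user_id) is passed to the function, so Postgres
-- must call is_visible() once for every candidate row.
CREATE POLICY blogs_slow ON blogs FOR SELECT TO authenticated_user
    USING (is_visible(owner_user_id));

-- Faster: the function takes no row data, so a STABLE user_id() can be
-- evaluated once and compared against each row like a plain WHERE clause.
CREATE POLICY blogs_fast ON blogs FOR SELECT TO authenticated_user
    USING (owner_user_id = user_id());
```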
2. Functions Used in an RLS Policy Should Be Marked STABLE
Functions are fine to use in an RLS policy, as long as those functions are `STABLE` and don't receive row data as parameters. This allows Postgres to cache the result of the function, executing it only once per statement, and use the in-memory value to evaluate each row, which is significantly faster.
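The difference is just the volatility label on the function. As a sketch (the `app.tenant_id` setting name here is illustrative):

```sql
-- VOLATILE (the default): Postgres assumes the result can change at any
-- moment and may call the function once per row.
CREATE FUNCTION current_tenant_volatile() RETURNS INT AS $$
    SELECT current_setting('app.tenant_id')::INT;
$$ LANGUAGE SQL;

-- STABLE: the result is constant within a single statement, so Postgres
-- can evaluate it once and reuse the value for every row.
CREATE FUNCTION current_tenant() RETURNS INT AS $$
    SELECT current_setting('app.tenant_id')::INT;
$$ LANGUAGE SQL STABLE;
```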
3. Understand and Use SECURITY DEFINER Functions
Postgres superusers and roles with the `BYPASSRLS` attribute always bypass the row security system when accessing a table. If the Postgres role that runs your migrations has `BYPASSRLS` applied to it, you can use `SECURITY DEFINER` for a function, which will run the function with the role that created / defined the function, and will thus not trigger any RLS checks. While this means you need to be careful about exposing data that users shouldn't have access to, it also means these functions can be used in performance-critical situations to avoid unnecessary checks.
RLS policies respect other RLS policies, so using subqueries in an RLS policy may invoke additional (chained) RLS policies, which can get expensive quickly. If you instead call a `SECURITY DEFINER` function in your RLS policy, you can perform a query without incurring the cost of any other RLS policies. This is particularly useful when the tables you are querying have subquery-based RLS policies, or if the user may not have access to the data needed to validate access to the table.
This approach is also useful when querying data across several tables, particularly if those tables have RLS policies based on subqueries. `SECURITY DEFINER` functions are an easy way to avoid compounding RLS policies in complex joins.
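As a minimal sketch (the `team_members` table and column names are made up for this example; `user_id()` mirrors the helper defined later in this post), a `SECURITY DEFINER` membership check might look like:

```sql
-- Runs with the privileges of the role that defined it; if that role has
-- BYPASSRLS (or owns the table), this lookup skips any RLS policies on
-- team_members instead of chaining them.
CREATE FUNCTION is_team_member(p_team_id INT) RETURNS BOOLEAN AS $$
    SELECT EXISTS (
        SELECT 1 FROM team_members
        WHERE team_id = p_team_id AND member_id = user_id()
    );
$$ LANGUAGE SQL SECURITY DEFINER STABLE;
```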
General Patterns for RLS Performance
There are a few general patterns for consistent RLS performance, and depending on the situation, you can use one or both of them.
1. Denormalization
While it goes against what we would normally do in a relational database, denormalization is a solid, high-performance strategy for RLS. It is my default strategy for securing data.
Denormalization Example
Let’s create a simple `users` and `blogs` table, and let’s use denormalization and RLS to secure the blogs.
CREATE TABLE users
(
id SERIAL PRIMARY KEY
);
CREATE TABLE blogs
(
id SERIAL PRIMARY KEY,
owner_user_id INT REFERENCES users(id) ON DELETE CASCADE,
name VARCHAR(30) NOT NULL,
private BOOLEAN NOT NULL
);
CREATE INDEX ON blogs(owner_user_id);
CREATE INDEX ON blogs(private);
Let’s create a few helper functions for interacting with the properties set on our database connection. For each database connection, we’ll set the `user.id` property so that it can be accessed for RLS. Let’s create a function `user_id()` for accessing that property in a standard way.
CREATE OR REPLACE FUNCTION user_id() RETURNS INT AS $$
SELECT current_setting('user.id')::INT;
$$ LANGUAGE SQL
STABLE;
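For this to work, the application has to set `user.id` on each connection before querying, for example after authenticating the request. How you do this depends on your stack; at the SQL level it is one of:

```sql
-- Per-session:
SET user.id = '42';

-- Or scoped to the current transaction (the third argument, is_local,
-- makes the setting revert at commit/rollback):
SELECT set_config('user.id', '42', true);
```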
Let’s create our RLS policies:
ALTER TABLE blogs ENABLE ROW LEVEL SECURITY;
CREATE POLICY blogs_authenticated_user_select ON blogs FOR SELECT TO authenticated_user
USING (private = false OR owner_user_id = user_id());
CREATE POLICY blogs_authenticated_user_insert ON blogs FOR INSERT TO authenticated_user
WITH CHECK (owner_user_id = user_id());
CREATE POLICY blogs_authenticated_user_update ON blogs FOR UPDATE TO authenticated_user
USING (owner_user_id = user_id());
CREATE POLICY blogs_authenticated_user_delete ON blogs FOR DELETE TO authenticated_user
USING (owner_user_id = user_id());
Now when we make a simple query with the role `authenticated_user`:
`SELECT * FROM blogs;`
we’ll get very similar performance to running the query:
SELECT * FROM blogs WHERE private = false OR owner_user_id = user_id();
When evaluating approaches like this based on denormalization, I generally think of performance as the RLS policy being appended to the `WHERE` clause. This generally holds true.
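If you want to confirm this for yourself, `EXPLAIN` shows the policy folded into the scan as an ordinary filter. The output below is abbreviated and will vary with your data and plan:

```sql
EXPLAIN SELECT * FROM blogs;
-- Seq Scan on blogs
--   Filter: ((NOT private) OR (owner_user_id = user_id()))
```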
2. SECURITY DEFINER Functions
While I find that denormalization is a great approach, in some cases it can start to feel unnecessary as you get farther away from your core tables, towards leaf tables that other tables are less likely to reference.
An example might be a messaging system, where you have the tables: `users`, `conversations`, `conversation_users`, and `messages`:
CREATE TABLE users
(
id SERIAL PRIMARY KEY,
username VARCHAR(50) NOT NULL UNIQUE
);
CREATE TABLE conversations
(
id SERIAL PRIMARY KEY
);
CREATE TABLE conversation_users
(
conversation_id INT REFERENCES conversations(id) ON DELETE CASCADE,
user_id INT REFERENCES users(id) ON DELETE CASCADE,
PRIMARY KEY (conversation_id, user_id)
);
CREATE INDEX ON conversation_users(user_id);
CREATE TABLE messages
(
id SERIAL PRIMARY KEY,
conversation_id INT REFERENCES conversations(id) ON DELETE CASCADE,
from_user_id INT REFERENCES users(id) ON DELETE SET NULL,
content VARCHAR(5000) NOT NULL
);
CREATE INDEX ON messages(conversation_id);
While we could use denormalization on an RLS policy for each `message`, it would be more natural to decide whether or not the user has access to the `conversation` that the message belongs to. The best way to do that is with a `SECURITY DEFINER` function.
CREATE OR REPLACE FUNCTION conversations_for_user() RETURNS TABLE(conversation_id INT) AS $$
SELECT DISTINCT conversation_id
FROM conversation_users
WHERE user_id = user_id();
$$ LANGUAGE SQL
SECURITY DEFINER
STABLE;
ALTER TABLE messages ENABLE ROW LEVEL SECURITY;
CREATE POLICY messages_authenticated_user_select ON messages FOR SELECT TO authenticated_user
    USING (conversation_id IN (SELECT conversation_id FROM conversations_for_user()));
CREATE POLICY messages_authenticated_user_insert ON messages FOR INSERT TO authenticated_user
    WITH CHECK (from_user_id = user_id() AND conversation_id IN (SELECT conversation_id FROM conversations_for_user()));
CREATE POLICY messages_authenticated_user_update ON messages FOR UPDATE TO authenticated_user
USING (from_user_id = user_id());
CREATE POLICY messages_authenticated_user_delete ON messages FOR DELETE TO authenticated_user
USING (false);
Now for each `INSERT` and `SELECT` statement on the `messages` table, our RLS policy will query a set of all the `conversation` ids that our user has access to, and only return messages that are in that set of `conversation`s.
Isn’t This Inefficient?
Continuing the `SECURITY DEFINER` example from above, let’s dig in a little more.
When running a simple query:
SELECT * FROM messages WHERE conversation_id = 9000;
we’ll get performance more equivalent to this query:
SELECT * FROM messages WHERE conversation_id = 9000
  AND conversation_id IN (SELECT conversation_id FROM conversations_for_user());
If we unwrap the function, we get:
SELECT * FROM messages WHERE conversation_id = 9000 AND conversation_id in
(SELECT DISTINCT conversation_id FROM conversation_users WHERE user_id = user_id());
I’m being explicit here to show that we probably would not write a query like this by hand, and it’s less efficient than any single query we would otherwise write in this situation.
There are a few points I’d like to make:
- I’ve profiled approaches like this in production, and generally don’t see performance issues. Most users have hundreds of conversations or fewer, and this approach would scale to users having thousands. Even in the case of a user having thousands of conversations, queries would still complete within tens of milliseconds.
- Limits can and should be placed on the number of conversations a user can have, which would allow you to measure the worst case scenario for your own use case.
- If you measure a performance issue with your approach, you can always optimize with a `SECURITY DEFINER` function.
As I mention above, we can always optimize using a `SECURITY DEFINER` function built for a specific use case, rather than one used in an RLS policy. Here’s what that might look like:
CREATE OR REPLACE FUNCTION messages_for_conversation_id(p_conversation_id INT)
RETURNS SETOF messages AS $$
BEGIN
-- Check if the user has access to the conversation
IF EXISTS (
SELECT 1
FROM conversation_users cu
WHERE cu.conversation_id = p_conversation_id
AND cu.user_id = user_id()
) THEN
-- Return messages if the user has access
RETURN QUERY
SELECT *
FROM messages
WHERE conversation_id = p_conversation_id;
ELSE
-- Return an empty set if the user doesn't have access
RETURN;
END IF;
END;
$$ LANGUAGE plpgsql
SECURITY DEFINER
STABLE;
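Callers then use this function instead of querying `messages` directly, so the access check runs exactly once per call rather than being woven into an RLS policy:

```sql
-- Returns rows only if the current user is a member of conversation 9000;
-- otherwise the function returns an empty set.
SELECT * FROM messages_for_conversation_id(9000);
```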
Test Performance For Yourself
At the end of the day, you need to make the decision that’s best for you and your team. You should be able to see what queries you run that are performing well, and what needs your attention.
Look how simple it is to open up a psql console and run some tests for yourself:
[deleteme_test] # create table blogs (id serial primary key, name text not null); create table blog_members (blog_id int not null, person_id int not null, primary key (blog_id, person_id)); create index on blog_members (person_id);
CREATE TABLE
Time: 10.124 ms
CREATE TABLE
Time: 3.972 ms
CREATE INDEX
Time: 3.676 ms
[deleteme_test] # insert into blogs (id, name) select i, 'Blog ' || i::text from generate_series(1, 1000000) i;
INSERT 0 1000000
Time: 1630.274 ms (00:01.630)
[deleteme_test] # insert into blog_members (blog_id, person_id) select blogs.id, i from blogs, generate_series(1, 3) i where i = 1 or (i = 2 and blogs.id % 2 = 0) or (i = 3 and blogs.id % 3 = 1);
INSERT 0 1833334
Time: 4321.856 ms (00:04.322)
[deleteme_test] # \timing on
Timing is on.
[deleteme_test] # select count(*) from blog_members where person_id = 3;
┌────────┐
│ count │
├────────┤
│ 333334 │
└────────┘
(1 row)
Time: 31.162 ms
[deleteme_test] # select count(*) from blogs where id in (select blog_id from blog_members where person_id = 3);
┌────────┐
│ count │
├────────┤
│ 333334 │
└────────┘
(1 row)
Time: 106.628 ms
[deleteme_test] #