Performance of stored functions

2019-03-09

Something that users coming from other database systems easily overlook is, that Postgres supports different languages for writing stored functions (and procedures). Choosing the right language can influence the performance of your functions.

The most popular language for stored functions (or procedures) is probably PL/pgSQL. But very often a function can just as well written in plain SQL which makes the it more efficient, because you get rid of the overhead of calling the PL/pgSQL engine (or any other PL language).

The overhead can be quite substantial. Consider this simple function that does some calculation with two numbers:

create function do_stuff(p_one int, p_two int)
  returns integer
as
$$
begin
  return p_one + (p_two/2);
end;
$$
language plpgsql
immutable;

This is a pretty straightforward implementation, let’s see how fast it is by calling it 10 million times:

explain analyze
select do_stuff(i::int, i::int + 42)
from generate_series(1,10e6) as g(i);

This returns the following on my computer:

QUERY PLAN                                                                                                                      
--------------------------------------------------------------------------------------------------------------------------------
Function Scan on generate_series g  (cost=0.00..267.50 rows=1000 width=4) (actual time=1508.189..7787.481 rows=10000000 loops=1)
Planning Time: 0.037 ms                                                                                                         
Execution Time: 8098.882 ms

However the same function implemented as a SQL function:

create function do_stuff2(p_one int, p_two int)
  returns integer
as
$$
  select p_one + (p_two/2);
$$
language sql
immutable;

This now runs substantially faster:

QUERY PLAN                                                                                                                     
-------------------------------------------------------------------------------------------------------------------------------
Function Scan on generate_series g  (cost=0.00..22.50 rows=1000 width=4) (actual time=1579.476..4234.626 rows=10000000 loops=1)
Planning Time: 0.063 ms                                                                                                        
Execution Time: 4438.009 ms

Which is essentially the same runtime as if we had used the expression directly inside a SELECT statement. So the overhead of calling a SQL function is extremely low.

So, functions that don’t need loops or other procedural constructs are better implemented as a SQL function, rather than with PL/pgSQL.