This article introduces the very basics of writing stored procedures with PostgreSQL (aka Postgres), an open-source database system. It is intended for both Postgres newbies and MS SQL Server/MySQL stored procedure veterans who are interested learning more about Postgres.
I originally come from the world of MS SQL Server where stored procedures (“sprocs”) were a fact of life. I routinely used/created them for use in classic ASP, and later ASP.Net, mainly just to move messy SQL statements out of the application and into the database. But besides making for cleaner code, sprocs are also handy because they:
- Reduce roundtrips across the network – Stored procedures take zero or more input parameters, do something, and return a result, much like a function in any language. By “black boxing” that functionality on the database server, we avoid the need to move large datasets across the wire to the application server for processing. This can decrease network utilization and application latency.
- Can make security easier to manage – While views can help simplify the permissions needed for complex queries, stored procedures make it even easier. Instead of giving a user/application the correct rights to specific tables (or columns), a person only needs execution rights to the stored procedure.
- Are precompiled – Stored procedures are stored in a precomplied state on the server, meaning that the query optimizer doesn’t have to recalculate the most efficient execution path each time the query is run. This reduces server overhead and can speed things up a bit.
Like pretty much everything in this industry though, there are almost as many opinions about “best practices” as there are developers. The most common objection to using stored procedures is that they off-load too much of the application’s job to the database. Some architects feel that a RDBMS ought to stick to just being a bit bucket and the logic that “does stuff” should be the realm of the application. I normally hear this line of reasoning from three-tier purists. They may have a valid point when planning enterprise-level apps that will have whole teams responsible for supporting different pieces of the puzzle. But for projects that aren’t as complex or ambitious, I feel that the benefits of stored procedures outweigh the semantics. Of course, common sense dictates that you ought not stick so much logic into sprocs that server performance is affected.
The usual suspects
Before I plow ahead directly into Postgres, I want to rewind a bit for the benefit of those who are already familiar with stored procedures in Microsoft SQL Server (versions 7.0 and 2000–I’m guessing 2005 isn’t much different, but I haven’t used it all that much) and/or MySQL (version 5.0). If this doesn’t interest you, feel free to skip ahead to the next section.
Postgres has a deserved reputation for being more complex than MySQL and is considerably less documented than SQL Server. On the other hand, Postgres is arguably more advanced than MySQL and a whole lot cheaper than SQL Server. Hence, a good number of people in both camps are beginning to take a harder look at it. So, to ease the ascent up the learning curve: a bit of review…
Microsoft SQL Server and MySQL (the later which, oddly, didn’t have sprocs until its most recent release little more than a year ago) both share a similar stored procedure syntax. First, SQL Server:
CREATE PROCEDURE [getQtyOrders] @customerID int, @qtyOrders int OUT AS SET NOCOUNT ON SELECT @qtyOrders = COUNT(*) FROM Orders WHERE accnum = @customerID
Executing a MS SQL stored procedure is largely dependant on your programming environment, but directly using T-SQL it looks a bit like this:
DECLARE @qty int EXEC getQtyOrders @customerID=12677, @qtyOrders=@qty OUT SELECT @qty (1 row(s) affected) 42
With MySQL, the same stored procedure is defined as:
DELIMITER $$ CREATE PROCEDURE getQtyOrders ( customerID INT, OUT qtyOrders INT) BEGIN SELECT COUNT(*) INTO qtyOrders FROM Orders WHERE accnum=customerID; END; $$ DELIMITER ;
The “DELIMITER” directive is a bit of a kludge that wasn’t needed with SQL Server. We use it to temporarily change the command delimiter so that the semicolons inside the definition don’t prematurely end the CREATE PROCEDURE command. Other than that, the BEGIN/END block, and a slightly different way of assigning the result to the variable, things look quite similar to SQL Server. Executing it directly isn’t much different either:
CALL getQtyOrders(12677, @qty); SELECT @qty; (1 row(s) returned) (0 ms taken) 42
Enter the dark horse
Note: I’m using PostgreSQL 8.0. If you’re not, this example won’t work. Ya might as well upgrade the development server now anyhow…
With Postgres things get a tad more complex; though not overly so. Flexibility is a feature, not a bug.
Both SQL Server and MySQL use a proprietary extension of SQL to provide stored procedure functionality (amoungst other features). Microsoft calls it Transact SQL, or “T-SQL”, and as far as I know, MySQL just calls it “MySQL”. Both variants are different enough to occasionally trip up a developer but are written around the common SQL-92/2003 standards. The net result is that the syntax used inside stored procedures resembles SQL code used without sprocs; the same “SELECTs”, “INSERTs”, etc. but with a few extra commands added for control flow and the like. Postgres stored procedures can also be written with a SQL-like language called “PL/pgSQL”. But, as an added twist, a developer could instead chose to write it in Perl, Python, or Tcl using PL/Perl, PL/Python, or PL/Tcl. Perl/Python/Tcl hackers may rejoice at this thought and it does open up a great many possibilities, but for now we’ll just stick with using the basic PL/pgSQL language.
Out-of-the-box, a Postgres database doesn’t have any procedural languages loaded. Assuming that your friendly, neighborhood DBA hasn’t already loaded in PL/pgSQL, we can easily do this with the following command:
/usr/local/pgsql/bin/createlang -U postgres plpgsql <database name>
The “-U” argument specifies which user account is used to execute the command. Since normal users can’t load languages, a privileged account will have to be used. Here I specified the account that the Postgres server utilizes, postgres. Now, onwards to setting up our getQtyOrders sproc:
CREATE FUNCTION getQtyOrders(customerID int) RETURNS int AS $$ DECLARE qty int; BEGIN SELECT COUNT(*) INTO qty FROM Orders WHERE accnum = customerID; RETURN qty; END; $$ LANGUAGE plpgsql;
Yup–pretty much the same as MS’s T-SQL and MySQL. The mentionable differences (besides the self-explanatory LANGUAGE command):
Microsoft SQL Server has distinct roles for both user-defined functions (UDFs) and stored procedures. A UDF is typically used within the context of a query, while a sproc takes input and/or output from/to an application. MySQL also has both procedures and functions, but they are defined almost identically and work almost identically. Postgres dispenses with the difference and everything is simply defined as a “function.”
As with MySQL, we also see the double dollar-sign (“$$”). However, here we’re using it for a different kind of work-around. With Postgres, the function definition is just one long string and if single quotes are contained somewhere within it, they’d have to be escaped using a pair of single quotes: ”. If there were already escaped quotes within the function, they’d then need to be escaped again, giving us a total of four quotes. To make life easier (and our code more legible), we can surround the definition with double dollar signs which negates the need for escaping.
Calling the spro… er, function is also a bit different, but not horribly so:
SELECT getQtyOrders(12677) AS qty; 42
And that’s about the extent of the basics… you’re now on your way to fruitful career programming PostgreSQL.
While there are differences to be expected between the control-of-flow structures, the basic framework of a stored procedure in Postgres is very similar those of the more familiar MS SQL Server and MySQL. I held off on exploring Postgres because I was so heavily invested in other database systems, both experience-wise and with many thousands of lines of stored procedures. However, I’m finding that I appreciate (and actually use) the added features of Postgres: sequences and very flexible authentication options are my current favorites. It also has “free-er” licensing than MySQL (BSD vs. GPL), making it an easier sell to overly-concerned management-types. Admittedly the docs could be better, but there are a number of good books available. I recommend “Beginning Databases with PostgreSQL” as a good starting point.