Overview
I'm in the process of reviewing a large number of SQL stored procedures, and occasionally I find that a table join isn't properly expressed. This can result in hard to find bugs, but how can you make your SQL code simpler? Isn't using a stored procedure best-practice anyway?
Scenario
It is quite common to have a series of related tables. Consider a stock control system. Perhaps you have tables for customers, orders, order lines and stock items. In this simple scenario, you have four tables and clearly there are joins between several of the tables.
Common solution
A common solution is to describe the table joins in SQL code in each stored procedure. This way the database is only loosely-coupled to the application code, but the stored procedure is tightly-coupled to the database schema. Suppose the stored procedure returns the number of times a customer has ordered each of the items he/she has ordered. The SQL might look like the following
SELECT items.item_description, COUNT(order_lines.item_id)
FROM customers
INNER JOIN orders ON customers.customer_id = orders.customer_id
INNER JOIN order_lines ON orders.order_id = order_lines.order_id
INNER JOIN items ON order_lines.item_id = items.item_id
WHERE customers.customer_name = @customer_name
GROUP BY order_lines.item_id, items.description
This isn't wrong, but it is easy to forget a join field, particularly where a join is on multiple fields, and you have to remember the joins every time. Essentially, the stored procedure encompasses both schema knowledge and business logic. Wouldn't it be better to separate the two?
A better solution
Key Point: Views allow you to express table relationships without affecting server performance. They allow for reusable table expressions.
Let's build some useful views, getting closer to our original SQL as we go along:
Note: I've omitted several columns in the resulting view for brevity, but there is no reason to do so unless you are also using a clustered index on the view (in which case the columns determine the size of the index)
CREATE VIEW all_customer_orders AS
SELECT customers.customer_name, orders.*
FROM customers INNER JOIN orders
ON customers.customer_id = orders.customer_id
CREATE VIEW all_customer_lines AS
SELECT customer_id, customer_name, order_lines.*
FROM all_customer_orders INNER JOIN order_lines
ON all_customer_orders.order_id = order_lines.order_id
CREATE VIEW all_line_item_details AS
SELECT order_lines.order_line_id,
order_lines.item_id,
items.item_description
FROM order_lines INNER JOIN items
ON order_lines.item_id = items.item_id
CREATE VIEW all_customer_line_item_details AS
SELECT all_customer_lines.*,
all_line_item_details.item_description
FROM all_customer_lines INNER JOIN all_line_item_details
ON all_customer_lines.order_line_id = all_line_item_details.order_line_id
Finally, the stored procedure code can be expressed as
SELECT item_description, COUNT(item_id)
FROM all_customer_line_item_details
GROUP BY item_id, item_description
WHERE customer_name = @customer_name
This SQL no longer assumes any underlying knowledge of the table structure, it simply refers to a single view. Also, each view only has a single join and is therefore much easier to both test and understand.
Which views to create is a matter of experience, but whenever you are joining several tables, consider using a view. The view is easier to maintain and easier to test. It is also reusable. If you need to parameterise the view, consider using a table-valued function to represent it.
Remember as well that views can be indexed. This can introduce significant performance benefits without the penalties of denormalised data. The indexes aren't free from a performance perspective, but judicious use can result in optimal and simple queries.
If you are using SQL Server 2005 Enterprise edition you also get the benefit of having the SQL Server optimiser consider your views (and the associated indices) automatically.
References
Versions
Metadata
- Categories: SQL Server, IT Management, Software Development, Views, Stored Procedures
- Additional keywords: simplicity, queries, clustered index, views, best practice, primer, user-defined functions, UDF
- Technorati Tags:
SQL,
SQL Server,
software development,
architecture,
database design,
database views,
best practice