Although Microsoft SQL Server provides a great set
of built-in functions, you can also create your own function to
encapsulate a routine that performs an important action in your
application, such as a complex calculation. For example, you can create
a function that calculates the number of hours an employee worked in a
month, based on his or her work schedule.
When
you create a user-defined function, Microsoft SQL Server stores its
definition (the code) inside the database. Every time you need to
execute that piece of code, you just need to call the desired UDF: the
routine inside this function is automatically executed, and the result
is returned to you. Like the built-in functions, UDFs also can accept
parameters. You define them at the creation of the UDF, and the user
inputs them upon execution.
According to the return type, UDFs are classified in two groups: scalar functions and table-valued functions. Scalar functions return a single scalar value, such as a number or a date. Table-valued functions return a set of rows, like a table. A function always returns a result in one of these types.
Advantages and Limitations of UDFs
User-defined
functions have many advantages for database developers like you and me.
First of all, they allow you to create a function and to call it as
many times as you want from your application. As a module of your
application, UDFs can be modified independently of the application
source code, and their changes will be propagated for all scripts that
call these functions. They also reduce the compilation cost of the
T-SQL code and accelerate the execution time by caching the execution
plan and reusing it for repeated executions. As a result, UDFs don’t
need to be reparsed and reoptimized for every use. Also, when a
function is invoked in the WHERE clause of a SELECT statement, it reduces the number of rows sent to the application, thereby cutting down on the network traffic.
Although
user-defined functions have great advantages, they can only read data.
Thus, the UDF cannot be used to insert, update, or delete data; nor can
it be used to create, alter, or drop objects. Another limitation is the
UDF’s availability for only the database where it is stored. If you
need to use an existing function for another database, you must
re-create it in the desired database.
Scalar Functions
As
you learned earlier, scalar functions are routines that return a single
scalar value. This return can be any data type except text, ntext,
image, cursor, and timestamp. To create a scalar function, you use the CREATE FUNCTION T-SQL statement. The syntax of the statement is as follows:
CREATE FUNCTION [ schema_name. ] function_name
( [ { @parameter_name [ AS ][ type_schema_name. ] parameter_data_type
[ = default ] [ READONLY ] }
[,...n]
]
)
RETURNS return_data_type
[ WITH <function_option> [,...n] ]
[ AS ]
BEGIN
function_body
RETURN scalar_expression
END
[ ; ]
<function_option>::=
{
[ ENCRYPTION ]
| [ SCHEMABINDING ]
| [ RETURNS NULL ON NULL INPUT | CALLED ON NULL INPUT ]
| [ EXECUTE_AS_Clause ]
}
Basically, the CREATE FUNCTION statement has a two-part structure: header and body.
The header defines the function name, which must be unique within its schema and conform to the rules for identifiers in SQL Server. The header also
defines the input parameter names and data types, and the return
data type (plus, for table-valued functions, the name of the return variable). Input parameters are optional: you don’t have to
define any to have your function. For each input parameter, you can
define a default value and the READONLY option. When you define a default value, the caller can pass the DEFAULT keyword in place of a value for that parameter; unlike with stored procedures, the argument cannot simply be omitted. The READONLY option indicates that a parameter cannot be updated or modified within the definition of your function; it is required for table-valued parameters.
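As a minimal sketch of how a default parameter value behaves, consider the following example. The function udfMinutesToHours and its parameters are hypothetical names introduced only for illustration, and GO is the batch separator used by SSMS and sqlcmd.

```sql
-- Hypothetical function, for illustration only: converts worked minutes to hours.
CREATE FUNCTION dbo.udfMinutesToHours
(
    @Minutes  INT,
    @Decimals INT = 2   -- default used when the caller passes the DEFAULT keyword
)
RETURNS DECIMAL(10, 4)
AS
BEGIN
    RETURN ROUND(@Minutes / 60.0, @Decimals);
END;
GO

-- The DEFAULT keyword is written explicitly for the second argument;
-- leaving the argument out altogether raises an error.
SELECT dbo.udfMinutesToHours(135, DEFAULT) AS HoursWithDefault,
       dbo.udfMinutesToHours(135, 1)       AS HoursOneDecimal;
```

The explicit DEFAULT keyword is the main difference from stored procedures, where a parameter with a default can be left out of the call.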
In
the real world, it’s rare to encounter a function without input
parameters in your database. Parameters extend the functionality of
functions, allowing you to create complex calculations and to customize
the return, according to the user’s input.
Usually, you will define one or more input parameters in your functions along with their respective data types.
Still talking about the header part, we have the RETURNS clause, for which you specify the data type of the scalar value that the function will return, and the optional clause WITH, for which you define some function options.
In the WITH clause, four options are available to you: ENCRYPTION, SCHEMABINDING, RETURNS NULL ON NULL INPUT (or CALLED ON NULL INPUT), and EXECUTE AS. You specify the ENCRYPTION option when you want to encrypt the function definition. The SCHEMABINDING
option prevents any object that your function depends on from being
dropped or altered in a way that would break the function. The behavior of your function when it receives a null input is
defined by the RETURNS NULL ON NULL INPUT and CALLED ON NULL INPUT options. If you specify the first option, it indicates that SQL Server will return NULL
when any of the input values received is null, without executing the
body of the function. However, if you specify the second option, SQL
Server will execute the body of the function, regardless of the null
input. By default, when you don’t specify anything, SQL Server will set
the CALLED ON NULL INPUT option. The EXECUTE AS
option specifies the security context under which the function is
executed. Therefore, you can control which user account SQL Server will
use to validate permissions on objects referenced in your function.
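The null-input options can be illustrated with a short sketch. The function udfSafeDivide is a hypothetical name chosen for this example and is not part of AdventureWorks2008.

```sql
-- Hypothetical function: with RETURNS NULL ON NULL INPUT, SQL Server skips
-- the body and returns NULL as soon as any argument is NULL.
CREATE FUNCTION dbo.udfSafeDivide
(
    @Numerator   DECIMAL(18, 4),
    @Denominator DECIMAL(18, 4)
)
RETURNS DECIMAL(18, 4)
WITH RETURNS NULL ON NULL INPUT
AS
BEGIN
    -- NULLIF turns a zero denominator into NULL to avoid a divide-by-zero error.
    RETURN @Numerator / NULLIF(@Denominator, 0);
END;
GO

-- The second argument is NULL, so the result is NULL and the body never runs.
SELECT dbo.udfSafeDivide(10, NULL) AS Result;
```

If the WITH clause were omitted, the default CALLED ON NULL INPUT would apply and the body would run even for NULL arguments.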
The body of the CREATE FUNCTION
statement is the main part of it. It is here where you define the
routine of actions that the function will perform. It contains one or
more T-SQL statements that perform the function logic. This part is delimited by a BEGIN...END statement, where you place all the code, and the RETURN
clause, which is responsible for outputting the value that the function
returns as its result. Be aware of the read-only restriction that
we discussed before. Although you can’t use functions to change data and objects, you can use a function as part of a SET clause in an UPDATE statement, or a WHERE clause in a DELETE statement. Also, you can change objects local to the function, such as variables.
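The following sketch shows a read-only function taking part in data-changing statements. The function udfApplyDiscount and the temporary table #Orders are hypothetical and exist only for this demonstration.

```sql
-- Hypothetical function: it only computes a value and changes no data itself.
CREATE FUNCTION dbo.udfApplyDiscount
(
    @Price   MONEY,
    @Percent DECIMAL(5, 2)
)
RETURNS MONEY
AS
BEGIN
    DECLARE @Result MONEY;   -- local variables can be modified freely
    SET @Result = @Price * (1 - @Percent / 100);
    RETURN @Result;
END;
GO

-- Hypothetical scratch table used only for this demonstration.
CREATE TABLE #Orders (OrderID INT PRIMARY KEY, Price MONEY);
INSERT INTO #Orders (OrderID, Price) VALUES (1, 100), (2, 4);

-- The UPDATE statement changes the data; the function supplies the new value.
UPDATE #Orders
SET Price = dbo.udfApplyDiscount(Price, 10);

-- The function decides which rows the DELETE statement removes.
DELETE FROM #Orders
WHERE dbo.udfApplyDiscount(Price, 10) < 5;

SELECT OrderID, Price FROM #Orders;
DROP TABLE #Orders;
```

In both statements the modification is performed by UPDATE or DELETE; the function stays within its read-only limits.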
Example: Creating and Consuming a Scalar Function
In the AdventureWorks2008 database, you have the table Product, which is responsible for product data, such as name, color, and size. This table includes a column called SafetyStockLevel, which stores the minimal stock of a product (see Figure 1).
Imagine that you need to retrieve this stock level in many parts of your application. A single SELECT could solve the problem, as shown in the following:
SELECT SafetyStockLevel
FROM [Production].[Product]
WHERE ProductID = @ProductID
But
what if you need to use this query ten times? Will you copy and paste
it ten times? And if you need to change the name of the column SafetyStockLevel to StockLevel, will you alter your code ten times? And what if you need to use the return of this query as an argument of another?
As a database developer, you will use your knowledge and create a function that executes this SELECT and returns the desired value. Figure 2 shows how you can define a scalar-valued function that returns the safety stock level of a given product.
[Figure 2: A scalar-valued function that returns the safety stock level of a product]
As you can see, the new function has the name udfGetSafetyStockLevel and the variable @ProductID as an input parameter and returns a value with the INT data type. It has the SCHEMABINDING option, and the routine is a SELECT statement.
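For readers without access to the figure, here is a minimal sketch of what a function matching this description could look like. The function name, parameter, return type, and SCHEMABINDING option come from the description; the dbo schema, the INT type of the parameter, and the local variable name are assumptions.

```sql
CREATE FUNCTION [dbo].[udfGetSafetyStockLevel]
(
    @ProductID INT
)
RETURNS INT
WITH SCHEMABINDING
AS
BEGIN
    -- Assumed local variable that holds the value to return.
    DECLARE @SafetyStockLevel INT;

    -- SCHEMABINDING requires two-part names for referenced objects.
    SELECT @SafetyStockLevel = SafetyStockLevel
    FROM [Production].[Product]
    WHERE ProductID = @ProductID;

    RETURN @SafetyStockLevel;
END;
```

If no product matches the given ID, the variable is never assigned and the function returns NULL.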
Once created, you can use this function in many ways. You can call it using a SELECT statement, a PRINT command, or inside a WHERE clause of a SELECT statement, for example (see Figure 3).
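The three kinds of call might look like the sketch below. It assumes that udfGetSafetyStockLevel exists in the dbo schema and that a product with ProductID 1 exists; both are assumptions made for illustration.

```sql
-- 1. As a column of a SELECT statement
SELECT [dbo].[udfGetSafetyStockLevel](1) AS SafetyStockLevel;

-- 2. With the PRINT command (the INT result is converted to a string)
PRINT CAST([dbo].[udfGetSafetyStockLevel](1) AS VARCHAR(10));

-- 3. Inside the WHERE clause of a SELECT statement:
--    products with the same safety stock level as product 1
SELECT ProductID, Name
FROM [Production].[Product]
WHERE SafetyStockLevel = [dbo].[udfGetSafetyStockLevel](1);
```

Note that a scalar UDF is always called with at least a two-part name (schema and function).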
When you create a scalar-valued function, it is stored in the active database, inside the Scalar-valued functions folder. This folder is set under Programmability | Functions (see Figure 4).
Table-Valued Functions
Table-valued
functions have the same rules and options as the scalar function. The
difference is that the table-valued functions return a table as output,
allowing you to create complex calculations and return a set of rows.
The general syntax of a table-valued function is as follows:
CREATE FUNCTION [ schema_name. ] function_name
( [ { @parameter_name [ AS ][ type_schema_name. ] parameter_data_type
[ = default ] [ READONLY ] }
[,... n]
]
)
RETURNS @return_variable TABLE < table_type_definition >
[ WITH <function_option> [,...n ] ]
[ AS ]
BEGIN
function_body
RETURN
END
[ ; ]
<function_option>::=
{
[ ENCRYPTION ]
| [ SCHEMABINDING ]
| [ RETURNS NULL ON NULL INPUT | CALLED ON NULL INPUT ]
| [ EXECUTE_AS_Clause ]
}
The TABLE data type and the table_type_definition part in the RETURNS clause are the differences from the syntax of a scalar function. You must inform SQL Server that the return will be a TABLE data type, and you must define its columns and data types, just as you do when you are creating a common table.
One
of the most common areas of confusion for database developers
involves the definition and use of temporary tables and table variables.
Temporary tables are table structures, together with their data, that are stored in the tempdb
system database. A local temporary table is removed when the
session that created it is closed, when the stored procedure that
created it finishes, or when a user executes the DROP TABLE
statement. You use temporary tables to store rows that must remain
available to later statements in the same session or procedure. For example, you can create a temporary
table to store the data of a report. Note that a temporary table can be created inside a
stored procedure, but it cannot be created or referenced inside a
user-defined function.
On the other hand, table variables are table structures that are declared like local variables. Although they are often described as living in memory, their rows are also backed by tempdb. Unlike temporary tables, table variables only exist in
the local context of a batch, function, stored procedure, or trigger, and they
are removed automatically when the object that created them has exited. You
use table variables to temporarily store rows inside a function or a
procedure until its execution is complete. For example, you can create
a table variable to store data of a SELECT statement that will be used inside the object later.
The
return of a table-valued function is a table variable. Because
temporary tables cannot be used inside a function, any intermediate rows
the routine needs must also be held in table variables.
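A table variable can be sketched as follows. The variable name @FirstProducts and the choice of the first five products are assumptions made for illustration; the example assumes the AdventureWorks2008 database.

```sql
-- Declare a table variable; it exists only for the duration of this batch.
DECLARE @FirstProducts TABLE
(
    ProductID INT PRIMARY KEY,
    Name      NVARCHAR(50)
);

-- Store the result of a SELECT statement for later use.
INSERT INTO @FirstProducts (ProductID, Name)
SELECT TOP (5) ProductID, Name
FROM [Production].[Product]
ORDER BY ProductID;

-- Read the rows back, as a function or procedure would later in its body.
SELECT ProductID, Name
FROM @FirstProducts;
```

No DROP statement is needed: the variable disappears when the batch, function, or procedure that declared it ends.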
A
special type of table-valued function, one that returns a table data
type, can be used to achieve the functionality of parameterized views.
This type is called the Inline user-defined function. You can use an Inline UDF to support parameters in the search conditions specified in the WHERE
clause, reducing the number of rows that SQL Server manipulates. The
difference between Inline UDFs and views is that a view doesn’t support
parameters in the WHERE clause.
For
example, imagine that you need to create a report that lists the
clients of a given city. If you have seven cities in all, you will
create seven views, one for each city; or you will create a single view
with all clients, and filter them using the WHERE clause of a SELECT
statement. The problem with the first approach is that you must maintain seven
objects. The second one will first select all the clients, and only
after that, will it filter data, consuming resources. Using an Inline
UDF allows you to have a single object to maintain, and it will only
retrieve the data according to the city input by users.
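The city report can be sketched with an Inline UDF as follows. The table dbo.Client, its columns, the function name udfGetClientsByCity, and the city value are all hypothetical; the table is created here only so that the example is self-contained.

```sql
-- Hypothetical table, created only so the example is self-contained.
CREATE TABLE dbo.Client
(
    ClientID INT PRIMARY KEY,
    Name     NVARCHAR(100) NOT NULL,
    City     NVARCHAR(50)  NOT NULL
);
GO

-- One object to maintain: the city is a parameter instead of a hard-coded filter.
CREATE FUNCTION dbo.udfGetClientsByCity
(
    @City NVARCHAR(50)
)
RETURNS TABLE
AS
RETURN
(
    SELECT ClientID, Name, City
    FROM dbo.Client
    WHERE City = @City
);
GO

-- Used like a parameterized view in the FROM clause.
SELECT ClientID, Name
FROM dbo.udfGetClientsByCity(N'Seattle');
```

The same function serves every city, so adding an eighth city requires no new object.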
Example: Creating and Consuming a Table-Valued Function and an Inline UDF
Returning to the AdventureWorks2008 database, let’s now collect the product name and number from the table Product. The table structure and its columns are the same as those shown in Figure 1.
Your goal is to retrieve, given a product ID, its name and product
number. In this case, you need to return a row of two values. Scalar
functions cannot be used here, since they return a single scalar value.
So, you conclude that the solution will be using a table-valued
function. Figure 5 shows how you can define a table-valued function that returns the name and number of a given product.
[Figure 5: A table-valued function that returns the name and number of a product]
Let’s take a closer look at the code in Figure 5. After the definition of the name and the parameter, you see that the RETURNS clause has changed from the last example. Now it shows the definition of a return variable, called @table, whose data type is TABLE
and the specifications of the table variable that the function will
return to you. Be careful about the data type of columns: they must
match the data they will receive.
In the body part of the function, there is nothing special until an INSERT statement appears. This statement takes the values of the two variables selected before and stores them inside the table variable @table. Then, the RETURN clause returns this variable and ends the function. You don’t need to specify the variable that will be returned in the RETURN clause: it is already defined in the RETURNS clause in the header of the function.
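For readers without access to the figure, here is a minimal sketch of a function that fits this walkthrough. The return variable @table comes from the text and the function name udfGetProductNameNumber appears later in the permission examples; the local variable names, the column data types, and ProductID 1 in the test call are assumptions.

```sql
CREATE FUNCTION [dbo].[udfGetProductNameNumber]
(
    @ProductID INT
)
RETURNS @table TABLE
(
    Name          NVARCHAR(50),
    ProductNumber NVARCHAR(25)
)
AS
BEGIN
    -- Assumed local variables that receive the two selected values.
    DECLARE @Name NVARCHAR(50), @ProductNumber NVARCHAR(25);

    SELECT @Name = Name, @ProductNumber = ProductNumber
    FROM [Production].[Product]
    WHERE ProductID = @ProductID;

    -- Store both values as one row of the return variable.
    INSERT INTO @table (Name, ProductNumber)
    VALUES (@Name, @ProductNumber);

    -- RETURN takes no argument: the variable is named in the RETURNS clause.
    RETURN;
END;
GO

-- A table-valued function is queried in the FROM clause.
SELECT Name, ProductNumber
FROM [dbo].[udfGetProductNameNumber](1);
```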
Once created, you can call the function using a SELECT statement, specifying the function as the object of the FROM clause. When you create a table-valued function, it is stored in the active database, inside the Table-valued functions folder. Like the scalar-valued functions folder, the Table-valued functions folder is set under Programmability | Functions, as shown in Figure 6.
[Figure 6: The Table-valued functions folder and the execution of the new function]
You can solve this problem using the Inline user-defined function too. Figure 7
shows how you can define an Inline UDF that returns the name and number
of a given product, exactly the same result as the table-valued
function.
[Figure 7: An Inline UDF that returns the name and number of a product]
As you can see, the RETURNS clause contains only the TABLE
data type. You don’t have to define the structure of the table variable
because it’s set by the format of the result set of the SELECT statement in the RETURN clause. Also, observe that there is no BEGIN...END statement delimiting the function. The whole function consists of a single SELECT statement, and the result set of this query forms the table returned by the function.
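An Inline UDF along these lines could be sketched as follows. The name udfGetProductNameNumberInline is hypothetical, chosen so that it does not clash with any existing function, and ProductID 1 is an assumed test value.

```sql
CREATE FUNCTION [dbo].[udfGetProductNameNumberInline]
(
    @ProductID INT
)
RETURNS TABLE
AS
RETURN
(
    -- The columns of this SELECT define the structure of the returned table.
    SELECT Name, ProductNumber
    FROM [Production].[Product]
    WHERE ProductID = @ProductID
);
GO

SELECT Name, ProductNumber
FROM [dbo].[udfGetProductNameNumberInline](1);
```

Compared with the multi-statement version, there is no return variable, no INSERT, and no BEGIN...END block.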
Managing User-Defined Functions
Managing
the existing UDFs is a task as important as creating them. As a
database developer, you need to learn some basic important operations,
such as altering a function and viewing its definition. Basically,
there are two ways to manage functions: using the SQL Server Management Studio (SSMS) or using Transact-SQL statements.
In SQL Server Management Studio, the functions are stored under Programmability | Functions of the selected database. You can see the properties of a function by right-clicking it and choosing Properties to open a new window. In the Function Properties window, you can see some information about the selected function, such as whether its definition is encrypted (see Figure 8).
You can also alter the code of the function by right-clicking it and choosing Modify to open a new query tab (see Figure 9).
This tab shows the actual definition of the UDF and allows you to make
changes to the function as you wish. After the changes are made, you
can commit them by clicking the Execute button. Also, to drop a function, right-click the desired function and choose the Delete
option. You can see which objects depend on a function and which
objects a function depends on by right-clicking it and choosing the
View Dependencies option.
You can also manage functions using Transact-SQL statements. You can alter and remove a function using the ALTER FUNCTION and the DROP FUNCTION statements. The ALTER FUNCTION alters an existing UDF that you previously created by executing the CREATE FUNCTION
statement, without changing permissions and without affecting any
dependent functions, stored procedures, or triggers. Its syntax and
options are the same as the CREATE statement. The DROP FUNCTION removes one or more user-defined functions from the current database. The syntax is as follows:
DROP FUNCTION { [ schema_name. ] function_name } [, ... n ]
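As a hedged sketch of both statements, the following example alters the scalar function from the first example and drops a function that is no longer needed. It assumes udfGetSafetyStockLevel already exists in the dbo schema; the change to return 0 instead of NULL and the function name udfObsolete are assumptions made for illustration.

```sql
-- ALTER FUNCTION repeats the full definition with the desired changes.
ALTER FUNCTION [dbo].[udfGetSafetyStockLevel]
(
    @ProductID INT
)
RETURNS INT
WITH SCHEMABINDING
AS
BEGIN
    DECLARE @SafetyStockLevel INT;

    SELECT @SafetyStockLevel = SafetyStockLevel
    FROM [Production].[Product]
    WHERE ProductID = @ProductID;

    -- Assumed change: return 0 instead of NULL for an unknown product.
    RETURN ISNULL(@SafetyStockLevel, 0);
END;
GO

-- Drop a hypothetical scalar function only if it exists ('FN' = scalar function).
IF OBJECT_ID(N'dbo.udfObsolete', N'FN') IS NOT NULL
    DROP FUNCTION dbo.udfObsolete;
```

Because ALTER FUNCTION keeps the same object, permissions granted on the function remain in place, which would not be the case after a DROP followed by a CREATE.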
You can also use system stored procedures and catalog views
to provide information about functions and their definitions. Some of
the most common system procedures that you can use to retrieve the
function definition and properties are the sp_help and sp_helptext.
The first procedure lets you view information about a user-defined
function. The second allows you to view the definition of a
user-defined function.
The following example shows you how to retrieve the definition of and information on the scalar-valued function udfGetSafetyStockLevel, created in the first example:
--View the definition
EXEC sp_helptext udfGetSafetyStockLevel
--View the information
EXEC sp_help udfGetSafetyStockLevel
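The catalog views offer the same information in query form. The sketch below lists every user-defined function in the current database with its definition, and then reads one definition directly; it assumes udfGetSafetyStockLevel exists in the dbo schema.

```sql
-- List all user-defined functions with their type and definition.
-- FN = scalar, IF = inline table-valued, TF = multi-statement table-valued.
SELECT o.name, o.type_desc, o.create_date, o.modify_date, m.definition
FROM sys.objects AS o
JOIN sys.sql_modules AS m
    ON m.object_id = o.object_id
WHERE o.type IN ('FN', 'IF', 'TF')
ORDER BY o.name;

-- Read the definition of a single function.
SELECT OBJECT_DEFINITION(OBJECT_ID(N'dbo.udfGetSafetyStockLevel')) AS Definition;
```

For a function created with the ENCRYPTION option, the definition is returned as NULL.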
Managing User-Defined Function Security
As
with all objects in SQL Server, you must specify the security context
of functions. Every time you call a UDF, SQL Server first verifies if
you have the permission to execute it. Once verified and approved, SQL
Server then checks whether you have permission to access the objects
involved in the routine of the function. Therefore, you should
establish the security context of a function and set up the permission
for users; these are two essential tasks that you, as a database
developer, should execute to protect your functions and the objects
involved. Table 1 shows the permissions available for a user-defined function.
Table 1. User-Defined Functions Permissions
Permission | Description |
---|---|
EXECUTE | Execute a scalar-valued function. |
SELECT | Select the data returned by a table-valued function. |
VIEW DEFINITION | View the information and definition of a function. |
ALTER | Alter the properties of a function. |
CONTROL | Assign the permission of a function to other users. |
TAKE OWNERSHIP | Take ownership of a function. |
You can set the user’s permission using the GRANT, REVOKE, or DENY T-SQL statements. These three statements are components of the Data Control Language (DCL)
and are responsible for setting the permission of users in objects. The
first statement grants permissions on a securable to a principal; the
second statement removes granted permissions; and the last statement
prevents a user from gaining a specific permission through a GRANT. The following two examples show how to grant a user permission on a function:
--Grant EXECUTE to a user (Scalar-valued Function)
GRANT EXECUTE ON [dbo].[udfGetSafetyStockLevel] TO Priscila
--Grant SELECT to a user (Table-valued Function)
GRANT SELECT ON [dbo].[udfGetProductNameNumber] TO Herleson
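The other two statements follow the same pattern. The sketch below reuses the users and functions from the GRANT examples and assumes that they exist in the current database.

```sql
--Remove a previously granted EXECUTE permission (Scalar-valued Function)
REVOKE EXECUTE ON [dbo].[udfGetSafetyStockLevel] FROM Priscila;

--Explicitly deny SELECT to a user (Table-valued Function)
DENY SELECT ON [dbo].[udfGetProductNameNumber] TO Herleson;
```

REVOKE only removes an earlier GRANT or DENY, so the user may still gain the permission through a role. DENY takes precedence over any GRANT the user would otherwise receive.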
Working with Deterministic and Nondeterministic Functions
As
a database developer, it’s important for you to know whether a function
you are using is deterministic or nondeterministic. A function is deterministic when it always returns the same value for a specific set of input values. On the other hand, a function is nondeterministic when it can return different values for the same set of input values each time you call it.
For example, the built-in function DAY,
which returns an integer that represents the day of a date, is a
deterministic function: it will always return the same value if you
input the same date. However, the built-in function GETDATE, which returns the current database system timestamp as a datetime value, is a nondeterministic function: it will generate a different value every time you call it.
Several
properties of the UDF determine SQL Server’s ability to index the
results of your function, and the determinism of a function is one of
these properties. For example, you can’t create a clustered index on a
view if this view calls a nondeterministic function. Therefore, to
maintain and optimize database transactions, it’s important that you
keep these two concepts in mind.
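You can ask SQL Server how it classifies a function with the OBJECTPROPERTY metadata function. The sketch below wraps the two built-in functions mentioned above in hypothetical UDFs (udfDayOfDate and udfDayOfToday) and checks their IsDeterministic property, which is 1 for a deterministic function and 0 otherwise.

```sql
-- Hypothetical function built only on the deterministic built-in DAY.
CREATE FUNCTION dbo.udfDayOfDate (@Date DATETIME)
RETURNS INT
WITH SCHEMABINDING
AS
BEGIN
    RETURN DAY(@Date);
END;
GO

-- Hypothetical function that depends on the nondeterministic built-in GETDATE.
CREATE FUNCTION dbo.udfDayOfToday ()
RETURNS INT
WITH SCHEMABINDING
AS
BEGIN
    RETURN DAY(GETDATE());
END;
GO

-- Compare how SQL Server classifies the two functions.
SELECT OBJECTPROPERTY(OBJECT_ID(N'dbo.udfDayOfDate'),  'IsDeterministic') AS DayOfDate,
       OBJECTPROPERTY(OBJECT_ID(N'dbo.udfDayOfToday'), 'IsDeterministic') AS DayOfToday;
```

Both functions are created WITH SCHEMABINDING because SQL Server does not mark a user-defined function as deterministic without it.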
Exercise 1. Creating Functions
In
this exercise, you will create a scalar function to return the average
due value of a given customer. You will then create an Inline UDF that
selects the sales according to their values. At the end, you will
review these functions’ information and definitions.
Launch SQL Server Management Studio (SSMS), connect to the instance, open a new query window, and change the context to the AdventureWorks2008 database.
Create the scalar function udfGetAvgCustomer by executing the following code:
CREATE FUNCTION [dbo].[udfGetAvgCustomer]
(
@CustomerID INT
)
RETURNS MONEY
AS
BEGIN
DECLARE @Amount MONEY
SELECT @Amount = AVG(TotalDue)
FROM [Sales].[SalesOrderHeader]
WHERE CustomerID = @CustomerID
RETURN @Amount
END
Test the created function by calling it in a SELECT statement, as follows:
SELECT [dbo].[udfGetAvgCustomer](29825)
Now, let’s create an Inline UDF to see the information about the highest sales. Execute the following code to create the udfGetSales function:
CREATE FUNCTION [dbo].[udfGetSales]
(
@Amount MONEY
)
RETURNS TABLE
AS
RETURN
SELECT SalesOrderID, CustomerID, TotalDue
FROM [Sales].[SalesOrderHeader]
WHERE TotalDue > @Amount
Test the created function by calling it in a SELECT statement, as follows:
SELECT * FROM dbo.udfGetSales (170000)
To finish, view the information and definition of the udfGetSales function using the following code:
EXEC sp_help udfGetSales
EXEC sp_helptext udfGetSales