Why should I consider using an auxiliary calendar table?
A calendar table can make it much easier to develop solutions around any business model which involves dates. Last I checked, this encompasses pretty much any business model you can think of, to some degree.
Constant problems that end up requiring verbose, complicated and inefficient methods include the following questions:
A common practice is to have a calendar table pre-populated with some or all of the information we need, enabling us to accomplish most programming tasks with a much simpler set-based solution. The following is an example of a table that could be used in a production environment, though it has all the bells and whistles—you may not need to populate all of the columns I define here.
Now, let's populate it with data. Let's assume we want 30 years of data, from 2000-01-01 through 2029-12-31. You can do it the old fashioned way:
However, this will take a significant amount of time. A better way, if you create a numbers table starting at 1 (as described in Article #2516), is to run this query without the looping:
Which executes almost instantly. Now, I went ahead beforehand and determined that 10,957 was the number of days I was interested in (I did not just pick that number out of the sky). You can experiment a bit with the following query, in order to determine what your highest value of n should be (just change the start and end dates):
Yes, this all seems a little complex, but keep in mind that this is going to reduce your after-the-fact workload and programming logic significantly.
And yes, I understand that the time it takes to generate the numbers table overrides the time you save in generating the calendar table. However, I included the logic here for two reasons:
(I describe all of these columns in the chart below.)
Before that, I'd like to add a column to account for UTC offset calculations, which many of us have to deal with (on the East Coast of the U.S., for example, we set the clocks forward one hour on the first Sunday of April, and set them back one hour on the last Sunday in October).
The easiest way I've found to do this is run through a cursor. Sounds painful, but you will only have to do it once, and it beats checking every single date in the range to determine whether it falls inside or outside the range. Keep in mind that this specific portion of the script is geared to the Eastern time zone in the U.S.; you may need to make adjustments to the @offset values, as well as the logic checks, depending on what part of the world you live in (and whether or not you care about UTCOffset at all).
Now that the data is populated (for the most part), let's talk about each column we've created, what it means, and how it can be used. Remember that not every column is required: the example I'm giving here is really a superset of options you can use, but you can certainly feel free to only use a subset for your particular needs.
Note that I didn't include any indexes, other than the one that is created by default on the dt column. You will need to experiment with the types of queries you are running to see where indexes make the most sense.
Many, many, many people are dissatisfied with the way SQL Server calculates week number. Its strange algorithm used in DATEPART(WEEK, <date>) does not always yield the same number as the ISO standard, which dictates that week 1 is the first week with 4 days. To deal with that, we can borrow from Books Online and create the following function:
Then we can update the table:
Next you will need to make sure you update your calendar table appropriately with your company's holidays. I included a brief example in the chart above, but you will have to repeat for all of your company's holidays. Some holidays are a little more difficult to calculate than others, because they do not fall on the same day every year. Here are some sample scripts, thanks to John Gilson and Michael C#. First, I added a HolidayDescription column so that when you review the holidays in the table, it will be less cryptic than just a date:
Now we can focus on updating the data. Notice that some of the calculations are quite straight forward, but others are more difficult.
I'll leave it as an exercise to the reader to figure out when the Monday or the Friday should be updated to reflect a holiday... this often depends on your company's rules. Some holidays, like Presidents' Day, are up to the states, so that's another case you'll have to account for individually. And finally, I left Easter until last, because us crazy Catholics come up with so many rules to make our lives miserable! :-)
To handle Easter, first I created this function, which returns Easter Sunday for any given year:
Then I created two functions to get the real days you get off work (Good Friday and Easter Monday) based on that single calculation. (Might be a slight performance hit, but it makes sense to keep that logic in one single piece of code. If you only have 30 or 50 years in your table, it won't be noticeable; besides, this is only something you have to do when the calendar table is created, not in real time.)
Now, we can easily update the table to apply the Easter holidays. Obviously, if your company only observes the Monday, then don't run the script for Friday. :-)
Okay, I have a table; now what???
All right, that's all great and everything, but how can we use this information to help solve real problems?
Getting a date range as a set
If someone asked you to list all the days in January 2004, you would probably write a quick script in VBScript or C# that iterated through the days manually. Or, you would rhyme them off on your fingers. With a calendar table, you can bring back these dates (and any extraneous information we've stored) as a contiguous set, without any looping.
To make this even easier, you can create a table-valued function in SQL Server 2000:
Now, you can obtain this range simply by calling:
You can create different functions that take different arguments, for example if you are frequently requesting a month at a time (e.g. for a calendar application), you can create a table-valued function or stored procedure that accepts month and year as inputs.
Getting recurring events across months
Here's one that would require complex looping, conditional logic, and maintaining state between loop iterations. Let's say your club meets on the third Friday of every month, and the president wants a quick printout of the dates of all the meetings in 2004. Do you even want to think about that in a loop in T-SQL or the language of your choice? Are you going to get out your Dilbert calendar and jot them down? Of course not. With a calendar table, you can do this in a query with a subquery:
Another thing you might want is to get information about the first day of each month, in the past year. This is a simple query against the calendar table:
Finally, what if we want all the Mondays in the second quarter of 2004? Even if you have a standard fiscal year, this calculation would normally involve looping through all of the days and applying logic that determines if the day is a Monday, and what constitutes being in the second quarter. You would also have to store those results in a #temp table during the loop if you wanted to return the set as a single resultset. With a calendar table, this is so much easier:
Getting the number of business days between two dates
This question comes up quite often, and was the catalyst for this article. Many people would consider writing a UDF that loops through all the days in a range, determines whether each one is a working day, apply business logic to determine if the day is considered a holiday, and keep a running count. Luckily, we can do this much more easily against a calendar table. Let's say we wanted all the business days in April 2004:
An added bonus of the set-based solution is that you can easily change the SELECT list to obtain the actual set of days:
For more treatment on this topic, see Article #2453.
Getting all dates, even those not represented
A calendar table makes it easy to return a full set of dates from a query, regardless of whether or not those dates exist in the actual data. We can use an outer join against the calendar table instead of using a cursor to loop through all of the relevant dates. For example, let's say we wanted a sales breakdown for every day in April 2004, when we only have certain dates populated:
Pre-determine delivery dates
You can easily determine scheduled dates in advance, for example, the arrival date for a car, or the delivery time of a piece of mail. Let's say you know that it takes 9 business days for a particular process; well, we can ask the database what the estimated date is based on how long we assume it takes. Without a calendar table, we'd have to loop from today until some future date, and hope that as we loop through the days and apply logic to determine if the day counts toward delivery, that we don't reach the end and run out of days. :-)
Getting missing dates in overlapping date ranges
This is a problem that happens often. You've got insurance policies, or car rental contracts, and you want to find out which days are not covered. But some of the days overlap. How do you do it? With a resultset as small as this sample, it is easy to determine visually which days are not covered for each contract. With a larger resultset, however, you'd have to get out your cursors and/or a legal pad and a very sharp pencil, if it weren't for the calendar table. Let's assume this sample data:
So, the information the pointy-haired boss is looking for is the following result, since he asked for all dates after the first date of a contract and before the last day that weren't covered:
Even with a calendar table, this wasn't the most intuitive solution, but it is much faster than any cursor-based scheme you will work out.
When you're done:
Calculating 'days to go' for recurring events
This example uses a Birthdays table to determine, given a current date, how many days to go before each person's birthday.
So, we have three birthdays, and we want to determine (given a specific date, usually today) how many days before we hit their birthdays. So we create this stored procedure:
Now, the results when running this stored procedure:
Note that if the birthday has already passed this year, they will not be included in the resultset.
Hopefully I have demonstrated that there are many uses for a calendar table, and even if you think that 36,000 or 72,000 rows are "way more" than you need to solve a particular problem, it will end up saving your butt several times over in the long run. Remember than you should only populate the calendar table with the years of data you need. If you only need to go 10 years out, that's less than 4,000 rows—which should yield acceptable performance even on the most beat-up systems.
The only date-related topic I *don't* recommend a calendar table for is birthdays, because (a) leap years cause funny things to happen, and (b) there are already decent solutions out there (for example, see Article #2233). It can be done with a calendar table, but I'll leave that as an exercise for the reader.
If you have any comments, questions or suggestions about the use of a calendar table, let us know.
Related ArticlesAre there tools available for auditing changes to SQL Server data?
Can I create an index on a BIT column?
Can I have optional parameters to my stored procedures?
Can I implement an input mask in SQL Server?
Can I make SQL Server format dates and times for me?
Can I start IDENTITY values at a new seed?
Can SQL Server tell me which row was inserted most recently?
How can I learn more about undocumented SQL Server stored procedures?
How can I make my SQL queries case sensitive?
How do I audit changes to SQL Server data?
How do I connect to a non-default instance of SQL Server?
How do I connect to SQL Server on a port other than 1433?
How do I create a cross-tab (or "pivot") query?
How do I determine if a table exists in a SQL Server database?
How do I drop a SQL Server database?
How do I find all the available SQL Servers on my network?
How do I get a list of SQL Server tables and their row counts?
How do I get rid of Named Pipes / DBNMPNTW errors?
How do I get the correct date/time from the msdb.sysjob* tables?
How do I get the nth row in a SQL Server table?
How do I get the result of dynamic SQL into a variable?
How do I handle REPLACE() within an NTEXT column in SQL Server?
How do I hide system tables in SQL Server's Enterprise Manager?
How do I know which version of SQL Server I'm running?
How do I limit the number of rows returned in my resultset?
How do I load text or csv file data into SQL Server?
How do I manage changes in SQL Server objects?
How do I monitor SQL Server performance?
How do I prevent linked server errors?
How do I reclaim space in SQL Server?
How do I recover data from SQL Server's log files?
How do I search for special characters (e.g. %) in SQL Server?
How do I start SQL Server Agent from ASP?
How do I time my T-SQL code?
How do I upsize from Access to SQL Server?
How do I upsize my MSDE database to full-blown SQL Server 2000?
How do I use a variable in a TOP clause in SQL Server?
How do I use GETDATE() within a User-Defined Function (UDF)?
How should I store an IP address in SQL Server?
Schema: How do I find all the foreign keys in a database?
SQL Server & MSDE
What are reserved Access, ODBC and SQL Server keywords?
What are the capacities of Access, SQL Server, and MSDE?
What are the main differences between Access and SQL Server?
What do I need to know about SQL Server 2000 SP4?
Where else can I learn about SQL Server?
Where is SP4 for SQL Server 2000?
Why am I having problems with SQL Server 2000 SP3 / SP3a?
Why can't I install SQL Server on Windows Server 2003?
Why can't I install SQL Server on Windows XP?
Why can't I use LIKE '%datepart%' queries?
Why do I get "Login failed for user '\'."?
Why do I get 'object could not be found' or 'invalid object name'?
Why do I get errors about master..spt_values?
Why do I get script errors in Enterprise Manager's 'taskpad' view?
Why do I get SQLSetConnectAttr Failed errors?
Why do I have problems with views after altering the base table?
Why does EM crash when I get an error in a stored procedure?
Why does Enterprise Manager return 'Invalid cursor state'?
Why does my DELETE query not work?
Why does sp_spaceused return inaccurate values?
Why is Enterprise Manager slow at expanding my database list?
Why is my app slow after upgrading from SQL Server 7 to 2000?
Why is tempdb full, and how can I prevent this from happening?
Why should I consider using an auxiliary numbers table?