SELECT knowledge FROM sql_resources WHERE category='sql' AND slug='dates'

Dates

Introduction

Dates and times can often trip up new SQL users due to the surprising amount of functionality exposed by most SQL databases. Getting the manipulation of dates and times correct when writing a query is often crucial in generating the right results, so it's important to understand the basics.

Date and time functionality and syntax differs between databases. In this article the code snippets are written in the Google BigQuery Standard SQL syntax

The basics

Data types

Most databases support a variety of date and time data types appropriate for different situations.

Most often you'll see

  • DATE : calendar date (e.g. 2020-01-01)
  • DATETIME : calendar date and time (e.g. 2020-01-01 13:04:11)
  • TIMESTAMP : a particular moment in time, can include timezone but defaults to UTC (e.g. 2020-01-01 13:04:11-5:00)
  • TIME : a time as seen on a watch (e.g. 13:04:11)

For example,

Loading code...
datedatetimetimestamptime
2020-01-012020-01-01T00:00:002020-01-01T00:00:00.000Z00:00:00

The 'Z' you see in a timestamp refers to the 'Zulu' timezone, the old nautical name for GMT or UTC.

Converting between date & time data types

Loading code...

There are many ways to convert between these data types, often a source of confusion. Here are the common categories of conversion functions.

The CAST function is an explicit instruction to the database to interpret a value as the given type (as opposed to implicit type conversions which commonly occur, for example, when using the SQLite database).

Loading code...
datedatetimetimestampdate_2
2020-01-012020-01-01T00:00:002020-01-01T00:00:00.000Z2020-01-01

Converting from a string

Loading code...

This function has different names in different databases, e.g. PARSE / PARSE_DATE / CONVERT / DATE_PARSE. These functions all have something in common though - they take a string and return some form of date / time type.

Some also require the explicit date/time format string too - note how the following all return the same date:

Loading code...
iso_dateuk_dateill_allow_itjust_no
2020-01-012020-01-012020-01-012020-01-01

Converting from a number

Loading code...

Dates and times are most commonly represented as a number in the form of unix time - the number of seconds (or milliseconds) since the start of 1970. Again, these conversion functions have a variety of different names, such as TIMESTAMP_SECONDS / FROM_UNIXTIME / UNIX_TIMESTAMP.

It is important to know what format the arguments need to take (e.g. seconds / milliseconds / days), and what format the output takes (e.g. date / timestamp).

timestampdate
2020-01-01T00:00:00.000Z2020-01-01

Formatting dates and times

Loading code...

This is often the opposite of the process above - converting a date/time object into a human-readable string. These functions will have different names (e.g. TO_CHAR / DATE_FORMAT / FORMAT) but have one thing in common - you'll have to provide a format string to tell the database exactly how to format the date/time.

Some common parts of format strings include:

Loading code...
  • %Y - the full year
  • %d - the day of the month
  • %m - the month as a number
  • %b - the month as an abbreviated name
year_onlyyear_monthmonth_dayoh_do_grow_up
20202020-01Jan-01January
1 (💩)

Date arithmetic

Dates and times represent a single point in time, so directly adding or subtracting two dates doesn't really make sense. Instead, you should think of adding or subtracting intervals

Date + Interval = DateDate - Interval = DateInterval + Interval = IntervalDate + Date = ??

The function names you may see include DATE_ADD / DATETIME_ADD / TIMESTAMP_ADD  / DATEADD / ADDDATE / DATE_SUB / DATESUB / SUBDATE etc.

tomorrowyesterdayall_our_troubles_seemed_so_far_away
2020-01-022019-12-312030-01-01

Grouping dates and times

When grouping by a date/time column in a SQL query, it is often appropriate to round the date/time to the nearest day/month/week etc - there tend not to be that many events occurring in the same millisecond!

There are multiple ways to do this:

  • Truncation - removing the extra, finer time information and creating a new date/time object
  • Extraction - extracting the relevant part of the date/time as a number

Truncating

Truncation rounds date/times down to the nearest 'datetime part' that you specify. These functions often have 'TRUNC' in the name, for example DATE_TRUNC / DATETIME_TRUNC / TRUNC.

Note in the following that the output timestamp retains the same precision to the millisecond, but has been rounded down.

to_yearto_monthto_day
2007-01-01T00:00:00.000Z2007-06-01T00:00:00.000Z2007-06-05T00:00:00.000Z

Extracting

Extraction takes a part of a date/time and returns it as a simple number which has no notion of coming from a date/time object. For this reason it is best to leave any use of the EXTRACT / DATEPART function until the end of your analysis.

yearmonthday
202021