BigQuery offers no shortage of functionality to help you get the most out of date and time data, but it can be hard to know what to use when. This notebook will review and compare BigQuery's date and time functionality.
In BigQuery, dates and times can be one of the following datatypes:
In general, if you want to work with time zones, you'll need to stick with TIMESTAMPs. Otherwise DATETIME is the most flexible since you can take advantage of date and time functionality.
If your data isn't in one of these data types, you can convert them by using CAST , one of the PARSE functions, or unix time conversion functions.
To convert your STRING to one of the Date data types, your STRING must be in the following formats:
SELECT
CAST('2017-06-04 14:44:00' AS DATETIME) AS datetime,
CAST('2017-06-04 14:44:00 Europe/Berlin' AS TIMESTAMP) AS timestamp,
CAST('2017-06-04' AS DATE) AS date,
CAST( '14:44:00' as TIME) time
To use one of the PARSE functions, your STRING can be formatted any way you like, you'll just tell the function how it should read it. There is a PARSE function for each Date/Time Data type:
.You can see the full list of format stringshere
For example, if my date was in this format: Thursday, January 7, 2021 12:44:02, I could use the following:
SELECT PARSE_DATETIME('%A %B %d, %Y %H:%M:%S', 'Thursday January 7, 2021 12:04:33') AS parsed_datetime
BigQuery offers several helper functions to get dates represented as numbers converted to a Date/Time date type.
These include:
SELECT TIMESTAMP_SECONDS(1577836800)
If you need to BigQuery also offers a series of functions to convert your Date/Time data types into Unix time.
When working with dates and times, it's often handy (and helpful for anyone else looking at the data) so see the data in a more approachable format. To do that we can make use of the FORMAT functions in BigQuery.
There is a FORMAT function for each Date/Time Datatype:
The format strings here are the same as for the PARSE function here.
So if I want to show my dates in a more common format of: YYYY/DD/MM then I could do FORMAT_DATETIME('%Y/%d/%m', datetime_column
SELECT
FORMAT_DATETIME('%Y/%d/%m', CAST('2017-06-30 14:44:00' AS DATETIME)) AS day_first,
FORMAT_DATETIME('%A', CAST('2017-06-30 14:44:00' AS DATETIME)) AS weekday,
FORMAT_DATETIME('%r', CAST('2017-06-30 14:44:00' AS DATETIME)) AS time_of_day
⚠️ It's worth noting that FORMAT functions return STRINGs so if you wanted to use the result of FORMAT as a DATE, it won't work.
For example, this query:
SELECT
DATE_DIFF(CAST('2017-06-04' AS DATE), FORMAT_DATE('%Y/%m/%d', CAST('2017-06-04' AS DATE)), DAY)
will return:
No matching signature for function DATE_DIFF for argument types: DATE, STRING, DATE_TIME_PART. Supported signatures: DATE_DIFF(DATE, DATE, DATE_TIME_PART); DATE_DIFF(DATETIME, DATETIME, DATE_TIME_PART); DATE_DIFF(TIMESTAMP, TIMESTAMP, DATE_TIME_PART)
One of the most common things we use dates for is to compare them. We'll do this to filter for the last week of data, or only for data that was within certain dates.
The easiest way to compare dates is to use a comparison operator:
SELECT
date
FROM
(
SELECT
CAST('2021-01-01' AS DATE) AS date
UNION ALL
( SELECT
CAST('2021-01-15' AS DATE) AS date)
UNION ALL
( SELECT
CAST('2021-02-01' AS DATE) AS date)
) AS table_3
WHERE
(date BETWEEN '2021-01-01' AND '2021-01-31')
Alternatively, you can use >=,< to achieve the same result:
SELECT
date
FROM
(
SELECT
CAST('2021-01-01' AS DATE) AS date
UNION ALL
( SELECT
CAST('2021-01-15' AS DATE) AS date)
UNION ALL
( SELECT
CAST('2021-02-01' AS DATE) AS date)
) AS table_3
WHERE
((date >= '2021-01-01') AND (date < '2021-02-01'))
Sometimes we want our query to always pull the last n days so comparing to a single date means we'll have to keep updating our query. For that we can use dynamic comparisons instead.
The most common functions we'll use for dynamic comparisons are CURRENT_[date_part] and [date_part]_DIFF
To return the current date, datetime, time, or timestamp, you can use the CURRENT_[date part] function in BigQuery. They will return the type you've specified, so you can use it to compare against other dates.
SELECT
CURRENT_DATE() AS current_date,
CURRENT_DATETIME() AS current_datetime,
CURRENT_TIMESTAMP() AS current_timestamp,
CURRENT_TIME() AS current_time
To find the difference in two dates, use [date_part]_DIFF:
What if we wanted to filter our data for the last 12 months? To do that, we'll compare our date column to the CURRENT_DATE and make sure the difference is 12 or fewer months.
WHERE date_diff(current_date(),date,DAY) <= X
SELECT
date
FROM
(
SELECT
CAST('2021-04-06' AS DATE) AS date
UNION ALL
( SELECT
CAST('2021-06-05' AS DATE) AS date)
UNION ALL
( SELECT
CAST('2020-04-04' AS DATE) AS date)
) AS table_3
WHERE
(DATE_DIFF(CURRENT_DATE(), date, MONTH) < 12)
What if we wanted to find the difference between 2 dates- the expected arrival date, and the actual arrival date, but we wanted to exclude Sundays when no deliveries take place?
SELECT
(DATE_DIFF(actual_delivery, est_delivery, DAY) - DATE_DIFF(actual_delivery, est_delivery, WEEK)) AS days_late
FROM
(
SELECT
CAST('2021-01-22' AS DATE) AS est_delivery,
CAST('2021-01-25' AS DATE) AS actual_delivery
) AS table_1
To do this we can take the total days (3) and subtract the weeks difference (1) which will include Sundays.
To perform any kind of data transformations like adding a year to a date, or subtracting 2 weeks, then we can use the [date_part]_ADD and [date_part]_SUB functions.
To add an interval to a date/time in BigQuery we can use any of:
SELECT
CAST('2020-08-05 12:00:00' AS DATETIME) AS original_date,
DATETIME_ADD(CAST('2020-08-05 12:00:00' AS DATETIME), INTERVAL 1 HOUR) AS one_hour_later,
DATETIME_ADD(CAST('2020-08-05 12:00:00' AS DATETIME), INTERVAL 1 WEEK) AS one_week_later,
DATETIME_ADD(CAST('2020-08-05 12:00:00' AS DATETIME), INTERVAL 1 QUARTER) AS one_quarter_later
What if we had contract start dates, contract durations, and we wanted to calculate the contract end date?
SELECT
*,
DATE_ADD(CAST(contract_start AS DATE), INTERVAL duration_months MONTH) AS contract_end
FROM
(
SELECT
'2019-05-20' AS contract_start,
8 AS duration_months
UNION ALL
( SELECT
'2019-09-18' AS contract_start,
3 AS duration_months)
UNION ALL
( SELECT
'2019-12-11' AS contract_start,
18 AS duration_months)
) AS table_3
To subtract an interval from a date/time in BigQuery, we can use any of:
SELECT
CAST('2020-08-05 12:00:00' AS DATETIME) AS original_date,
DATETIME_SUB(CAST('2020-08-05 12:00:00' AS DATETIME), INTERVAL 1 HOUR) AS one_hour_earlier,
DATETIME_SUB(CAST('2020-08-05 12:00:00' AS DATETIME), INTERVAL 1 WEEK) AS one_week_earlier,
DATETIME_SUB(CAST('2020-08-05 12:00:00' AS DATETIME), INTERVAL 1 QUARTER) AS one_quarter_earlier
How to subtract 3 business days from a date?
For this we'll use different DATE_SUB intervals depending on what day of the week it is.
SELECT
due_date,
FORMAT_DATE('%A', due_date) AS Weekday,
CASE WHEN EXTRACT(DAYOFWEEK FROM due_date) IN (2, 3, 4) THEN DATE_SUB(due_date, INTERVAL 5 DAY) WHEN (EXTRACT(DAYOFWEEK FROM due_date) = 1) THEN DATE_SUB(due_date, INTERVAL 4 DAY) ELSE DATE_SUB(due_date, INTERVAL 3 DAY) END AS three_days_ago,
FORMAT_DATE('%A', CASE WHEN EXTRACT(DAYOFWEEK FROM due_date) IN (2, 3, 4) THEN DATE_SUB(due_date, INTERVAL 5 DAY) WHEN (EXTRACT(DAYOFWEEK FROM due_date) = 1) THEN DATE_SUB(due_date, INTERVAL 4 DAY) ELSE DATE_SUB(due_date, INTERVAL 3 DAY) END) AS weekday_3d_ago
FROM
(
SELECT
CAST('2019-05-20' AS DATE) AS due_date
UNION ALL
( SELECT
CAST('2019-09-18' AS DATE) AS due_date)
UNION ALL
( SELECT
CAST('2019-12-15' AS DATE) AS due_date)
) AS table_3
When we're analysing date/time data we want to group our data by a different date part (e.g. Yearly, Quarterly, etc.). There are a few ways to group our Date/Times BigQuery.
Truncating a date / time means you group the date by a specific date part. For example truncating Tuesday 15 December 2020 to the WEEK would return the first day of the week: Sunday 13 December 2020, to the YEAR would return Wednesday 1 Jan 2020, etc.
SELECT
CAST('2020-12-15' AS DATE) AS original_date,
DATE_TRUNC(CAST('2020-12-15' AS DATE), WEEK) AS first_day_of_week,
DATE_TRUNC(CAST('2020-12-15' AS DATE), MONTH) AS first_day_of_month
⚠️ It's important to note that TRUNC functions return another Date / Time object. So you can use the results to compare to other date / times.
To get the LAST day of each date_part you can use the LAST_DAY function instead of the TRUNC functions above.
SELECT
CAST('2020-12-15' AS DATE) AS original_date,
LAST_DAY(CAST('2020-12-15' AS DATE), WEEK) AS last_day_of_week,
LAST_DAY(CAST('2020-12-15' AS DATE), MONTH) AS last_day_of_month
Alternatively, you can EXTRACT a date_part from a Date / Time. This is helpful if you want to do some arithmetic with your date_part, like comparing the number of visitors on your website regardless of the date.
SELECT
date,
EXTRACT(YEAR FROM date) AS year,
EXTRACT(WEEK FROM date) AS week,
EXTRACT(DAYOFWEEK FROM date) AS weekday
FROM
UNNEST(GENERATE_DATE_ARRAY('2015-12-23', '2016-01-09')) AS date
ORDER BY
date ASC
The key difference is the data types returned. TRUNC will return a DATE, DATETIME, TIMESTAMP, TIME object, and in most cases EXTRACT returns an INT64.
SELECT
CAST('2020-04-02 13:22:44' AS DATETIME) AS original,
EXTRACT(HOUR FROM CAST('2020-04-02 13:22:44' AS DATETIME)) AS hour_extacted,
DATETIME_TRUNC(CAST('2020-04-02 13:22:44' AS DATETIME), HOUR) AS hour_truncated