MEDIAN function. Definition, syntax, examples and common errors using BigQuery Standard SQL. There is no MEDIAN function in BigQuery, but it can be calculated using the PERCENTILE_CONT function.
There is no MEDIAN function in BigQuery, but it can be calculated using the PERCENTILE_CONT function.
Using this function, you can calculate the median with the following:
PERCENTILE_CONT (value_expression, percentile [{RESPECT | IGNORE} NULLS])By default NULL values are ignored in the calculation but they can be included with the key words RESPECT NULLS. If you include NULL values then the following behaviour is important to know.
SELECT
PERCENTILE_CONT(x, 0.5) OVER () AS median
FROM
UNNEST([1, 2, 4, 5, NULL]) AS x
LIMIT
1NULL values returns NULL.NULL and a non-NULL value returns the non-NULL value. (NULL is not just simply treated as 0).In the example below the results are different because we now take the NULL value in the list of numbers into account, shifting the result to 2.
SELECT
PERCENTILE_CONT(x, 0.5 RESPECT NULLS) OVER () AS median
FROM
UNNEST([1, 2, 4, 5, NULL]) AS x
LIMIT
1PERCENTILE_CONT is a window function. You can learn more about window functions here.value_expression must be either NUMERIC, BIGNUMERIC, FLOAT64.percentile must be a literal between 0 and 1.percentile to 0 or 1 you can use percentile_cont to calculate the minimum and maximum values respectively.PERCENTILE_DISC has the same arguments as PERCENTILE_CONT but provides a percentile value for a discrete set of values including strings and any data type that can be ordered.MEDIAN function in BigQuery?There's no official explanation why MEDIAN function isn't supported. BigQuery has a number of approximate aggregate functions which are designed to give approximate results but are much less computationally expensive across big data sets.
This error occurs when the table you're analysing is too big for BiqQuery to fit in memory. This can happen with big data tables and particularly if the window function has no partition and it is therefore running the calculation across the whole table. Instead you can use the approx_quantiles function to give you an estimated value.