SQL Resources
SQL
Arrays Explained

Arrays Explained

Introduction

Arrays are ordered lists in BigQuery. They are very powerful once you know how to use them, and can help with:

  • Transforming non-normalized data (e.g. Survey responses)
  • Advanced string manipulation
  • Optimising storage and performance
  • And more!
  • The Basics

    ARRAYs are their own Data Type in BigQuery. They can be comprised of any data type EXCEPT for ARRAYs. So no ARRAY of ARRAYs, which is a relief, honestly. To do that kind of logic you'll need to use STRUCTs.

    Otherwise, the elements in an ARRAY must all be of the same Supertype. For example:

    SELECT
      [1,2,3] int_array,
      [1.5, cast(2 as INT64), 6.7] mixed_numerical_array,
      ['2020-01-01' , '2020-01-01 12:01:00'] mixed_date_array
    int_arraymixed_numerical_arraymixed_date_array
    [1,2,3][1.5,2,6.7]["2020-01-01","2020-01-01 12:01:00"]

    In the query above there is an ARRAY of INT64, an ARRAY of INT64 and FLOAT64, and an ARRAY with a DATE and a DATETIME.

    You can also explicitly define the data types in an ARRAY using ARRAY<DATA TYPE> before the ARRAY is specified:

    SELECT ARRAY<FLOAT64>[1,2,3] float_array
    float_array
    [1,2,3]

    Let's get into the good stuff now!

    Creating Arrays

    Generating an array using literals

    As in the examples above the simplest way to generate an array is using Literals, or using brackets [ , ] to build an ARRAY.

    SELECT
      ['hello', ' ', 'there'] AS welcome_array
    welcome_array
    ["hello"," ","there"]

    Generating an array using GENERATE functions

    Using Literals is fine for a few arrays, but doesn't work when you're trying to do something with more than a few elements and a few rows. For faster and more scalable ARRAY creation you can use one of the GENERATE functions:

    GENERATE_ARRAY(start_expression, end_expression[, step_expression])
    GENERATE_DATE_ARRAY(start_date, end_date[, INTERVAL INT64_expr date_part])
    GENERATE_TIMESTAMP_ARRAY(start_timestamp, end_timestamp, INTERVAL step_expression date_part)
    SELECT
      GENERATE_ARRAY(0, 10, 2) AS even_numbers,
      GENERATE_DATE_ARRAY('2020-01-01', '2020-03-01', INTERVAL 1 MONTH) AS month_starts
    even_numbersmonth_starts
    [0,2,4,6,8,10]["2020-01-01","2020-02-01","2020-03-01"]

    Turn rows to arrays using ARRAY_AGG

    To create an array from data in your table, you can use ARRAY_AGG:

    ARRAY_AGG(
    [DISTINCT] expression [{IGNORE|RESPECT} NULLS] [ORDER BY key [{ASC|DESC}] [, ... ]] [LIMIT n]
    ) [OVER (...)]

    You can read more about ARRAY_AGG here.

    SELECT
      ARRAY_AGG(fruit) AS fruit_basket
    FROM
      (
        SELECT
          "apple" AS fruit
        UNION ALL
    (    SELECT
          "pear" AS fruit)
        UNION ALL
    (    SELECT
          "banana" AS fruit)
      ) AS table_3
    fruit_basket
    ["apple","pear","banana"]

    How to combine columns into an array?

    To use different columns of data into an ARRAY, we can do the following:

    SELECT
      id,
      ARRAY_AGG(STRUCT(value1, value2)) arr
    FROM (
      SELECT 1 id, 'a' value1, 'x' value2
      UNION ALL (
        SELECT 1 , 'b' value1, 'y' value2
      ) UNION ALL (
        SELECT 2 id, 'c' value1, 'z' value2
      )
    )
    GROUP BY id
    idarr
    1[{"value1":"a","value2":"x"},{"value1":"b","value2":"y"}]
    2[{"value1":"c","value2":"z"}]

    Accessing Array Elements

    Indexing an array

    To access particular elements in an ARRAY we can use either OFFSET (0-based) or ORDINAL (1-based).

    SELECT
      interests[OFFSET(1)] AS offset_1,
      interests[ORDINAL(1)] AS ordinal_1
    FROM
      (
        SELECT
          ['football', 'basketball', 'music'] AS interests
      ) AS table_1
    offset_1ordinal_1
    basketballfootball

    There's no advantage to using either OFFSET or ORDINAL, it simply comes down to personal preference.

    Finding the length of an array

    Sometimes to do more dynamic indexing of Arrays, you may want to know the total number of elements, or length, of an array. To do that, you can use ARRAY_LENGTH:

  • ARRAY_LENGTH(array_expression)
  • SELECT
      ARRAY_LENGTH([1, 2, NULL])
    f0_
    3

    Reversing arrays

    It's also handy to reverse Arrays if you want to work with elements at the end of your ARRAY. To do that you can use ARRAY_REVERSE:

    SELECT
      ARRAY_REVERSE(['a', 'b', 'c']) AS reversed
    
    reversed
    ["c","b","a"]

    How to get the last element of an array?

    There are 2 options here:

  • Use OFFSET/ORDINAL with ARRAY_LENGTH
  • Use OFFSET/ORDINAL with ARRAY_REVERSE
  • SELECT
      interests[ORDINAL(ARRAY_LENGTH(interests))] AS last_with_array_length,
      ARRAY_REVERSE(interests)[ORDINAL(1)] AS last_with_reverse_array
    FROM
      (
        SELECT
          ['football', 'basketball', 'music'] AS interests
      ) AS table_1
    last_with_array_lengthlast_with_reverse_array
    musicmusic

    Transforming Arrays

    Flattening an array

    To convert an array to a set of rows we can use UNNEST:

    SELECT
      *
    FROM
      UNNEST(['A', 'B', 'C']) AS element WITH OFFSET AS `offset`
    elementoffset
    A0
    B1
    C2

    How to create a column with every day in a year?

    To do this we'll make use to UNNEST and GENERATE_DATE_ARRAY

    SELECT
      *
    FROM
      UNNEST(GENERATE_DATE_ARRAY(CAST('2020-01-01' AS DATE), CAST('2020-12-31' AS DATE))) AS date_in_2020
    date_in_2020
    2020-01-01
    2020-01-02
    2020-01-03
    2020-01-04
    2020-01-05
    2020-01-06
    2020-01-07
    2020-01-08
    2020-01-09
    2020-01-10
    2020-01-11
    2020-01-12
    2020-01-13
    2020-01-14
    2020-01-15
    2020-01-16
    2020-01-17
    2020-01-18
    2020-01-19
    2020-01-20
    2020-01-21
    2020-01-22
    2020-01-23
    2020-01-24
    2020-01-25
    2020-01-26
    2020-01-27
    2020-01-28
    2020-01-29
    2020-01-30
    2020-01-31
    2020-02-01
    2020-02-02
    2020-02-03
    2020-02-04
    2020-02-05
    2020-02-06
    2020-02-07
    2020-02-08
    2020-02-09
    2020-02-10
    2020-02-11
    2020-02-12
    2020-02-13
    2020-02-14
    2020-02-15
    2020-02-16
    2020-02-17
    2020-02-18
    2020-02-19
    2020-02-20
    2020-02-21
    2020-02-22
    2020-02-23
    2020-02-24
    2020-02-25
    2020-02-26
    2020-02-27
    2020-02-28
    2020-02-29
    2020-03-01
    2020-03-02
    2020-03-03
    2020-03-04
    2020-03-05
    2020-03-06
    2020-03-07
    2020-03-08
    2020-03-09
    2020-03-10
    2020-03-11
    2020-03-12
    2020-03-13
    2020-03-14
    2020-03-15
    2020-03-16
    2020-03-17
    2020-03-18
    2020-03-19
    2020-03-20
    2020-03-21
    2020-03-22
    2020-03-23
    2020-03-24
    2020-03-25
    2020-03-26
    2020-03-27
    2020-03-28
    2020-03-29
    2020-03-30
    2020-03-31
    2020-04-01
    2020-04-02
    2020-04-03
    2020-04-04
    2020-04-05
    2020-04-06
    2020-04-07
    2020-04-08
    2020-04-09

    There are times when you want to unnest an ARRAY without affecting other columns in your data. To do that we can use UNNEST in combination with a CROSS JOIN:

    SELECT
      id,
      flattened_numbers
    FROM
      (
        SELECT
          1 AS id,
          [0, 1, 1, 2, 3, 5] AS some_numbers
        UNION ALL
    (    SELECT
          2 AS id,
          [2, 4, 8, 16, 32] AS some_numbers)
        UNION ALL
    (    SELECT
          3 AS id,
          [5, 10] AS some_numbers)
      ) AS sequences
      CROSS JOIN UNNEST(sequences.some_numbers) AS flattened_numbers
    idflattened_numbers
    10
    11
    11
    12
    13
    15
    22
    24
    28
    216
    232
    35
    310

    Here, we've replicated our id for each flattened ARRAY row.

    Filtering an array

    To filter an ARRAY, we can make use of the ARRAY function:

  • ARRAY(subquery)
  • In the subquery argument, we can put in an entire SELECT... statement with our own criteria including:

  • DISTINCT
  • WHERE
  • SELECT ARRAY(
      SELECT DISTINCT x
      FROM UNNEST((
        SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
      )) AS x WHERE x <= 3
    ) AS unique_numbers
    unique_numbers
    [0,1,2,3]

    Here, we've filtered our ARRAY [0,1,1,2,3,5] to only unique elements that are less than or equal to 3.

    Aggregating Arrays

    Many times you want to consolidate an array - either by aggregating them into a single value or string, or by combining it with other arrays.

    ARRAY to STRING

    To convert an ARRAY of STRINGs to a single STRING, you can use ARRAY_TO_STRING:

  • ARRAY_TO_STRING(array_expression, delimiter[, null_text])
  • SELECT
      ARRAY_TO_STRING(greeting, " ") AS greetings
    FROM
      (
        SELECT
          ['Hello', 'World'] AS greeting
      ) AS table_1
    greetings
    Hello World

    ARRAY to VALUE

    To turn an ARRAY of numerical values to a single value, you can use any Aggregation function with the UNNEST function:

    SELECT
      SUM(x) AS total,
      AVG(x) AS avg,
      MAX(x) AS max
    FROM
      UNNEST([1, 2, 3]) AS x
    
    totalavgmax
    623

    Aggregate arrays across rows

    To combine ARRAYs in several rows to a single ARRAY, you can use ARRAY_CONCAT_AGG:

    ARRAY_CONCAT_AGG(expression  [ORDER BY key [{ASC|DESC}] [, ... ]]  [LIMIT n])
    SELECT
      `group`,
      ARRAY_CONCAT_AGG(numbers) AS count_to_six_agg
    FROM
      (
        SELECT
          'A' AS `group`,
          [1, 2] AS numbers
        UNION ALL
    (    SELECT
          'A' AS `group`,
          [3, 4] AS numbers)
        UNION ALL
    (    SELECT
          'B' AS `group`,
          [5, 6] AS numbers)
      ) AS table_3
    GROUP BY
      `group`
    groupcount_to_six_agg
    A[1,2,3,4]
    B[5,6]

    Combining arrays

    To merge several ARRAYs into a single ARRAY, you can use ARRAY_CONCAT:

    ARRAY_CONCAT(array_expression_1 [, array_expression_n])
    SELECT
      ARRAY_CONCAT([1, 2], [6, 4]) AS count_to_six_agg
    count_to_six_agg
    [1,2,6,4]
    Subscribe to newsletter

    Subscribe to receive the latest blog posts to your inbox every week.

    By subscribing you agree to our Privacy Policy.
    Thank you! Your submission has been received!
    Oops! Something went wrong while submitting the form.

    Start solving your organization's biggest problems with Count today.

    Stay up to date with all things count, data and problem solving.