SUBSTR
Definition
The SUBSTR
function allows you to extract a section of a longer string in BigQuery.
Syntax
SUBSTR(value, position[, length])
SELECT
location,
SUBSTR(location, (-2), 2) AS state
FROM
(
SELECT
'New York, NY' AS location
UNION ALL
( SELECT
'Miami, FL' AS location)
) AS table_2
The query above gets the last 2 characters of the location
string.
Practical Info
SUBSTR
works with STRING
or BYTES
data typesposition
argument is 1-based, so the first character is at position 1
-1
to indicate the last character positionlength
< 0 then the function returns an errorCommon Questions
How to dynamically extract sub-strings?
What if we have a series of email addresses: [email protected]
, [email protected]
, etc. and we want to find the domain (e.g. gmail.com
, yahoo.com
), we can use SUBSTR in combination with STRPOS
and LENGTH
to dynamically extract everything after the @
.
SELECT
email,
SUBSTR(email, (STRPOS(email, '@') + 1), (LENGTH(email) - STRPOS(email, '@'))) AS domain
FROM
(
SELECT
'[email protected]' AS email
UNION ALL
( SELECT
'[email protected]' AS email)
) AS table_2
Alternatively, you can use REGEXP_EXTRACT
to achieve the same result:
SELECT
email,
REGEXP_EXTRACT(email, r'@([a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$)') AS domain
FROM
(
SELECT
'[email protected]' AS email
UNION ALL
( SELECT
'[email protected]' AS email)
) AS table_2
Troubleshooting Common Errors
It's rare SUBSTR
returns an error unless you call it with the wrong data types (e.g. pass a STRING
to the position parameter), but it's often that you don't get what you want.
The best thing to do here is to either use REGEXP
functions, or start get your query working with 1-2 example rows before trying to apply it to the entire column.