Understanding FROM
The FROM
clause is one of the fundamental building blocks of a SQL SELECT
statement. In this article we'll explain how it works.
-- Generic SQL SELECT statement
select
-- (some columns)
from
-- (some tables)
where
-- (A 'predicate' expression to be used as a filter)
group by
-- (some columns)
The FROM
clause is formed of the reserved 'from' keyword followed by either:
- A table name (most common)
- A subquery
- A JOIN clause (read more about the JOIN clause)
The sources defined in the FROM
clause determine the set of rows that the SELECT
statement can operate on, and so is the first part of the query to be executed.
In this article, we'll look at some examples of different FROM
clauses.
In this article the code snippets are written in the Google BigQuery Standard SQL syntax.
FROM
a table
The simplest FROM
clause just lists a single table name - then the entire SELECT
statement is operating on a single table.
with my_table as (select * from unnest([
struct('a' as strings, 1 as numbers),
struct('b' as strings, 2 as numbers),
struct('c' as strings, 3 as numbers)
]))
-- ^ The SELECT above is just defining some data to use
select
*
from
my_table
FROM
a subquery
Instead of referring to a table by name, we can also refer to it as the result of a SELECT
statement (since SELECT
statements return tables). For example, we can rewrite the query above as
select
*
from
(select * from unnest([
struct('a' as strings, 1 as numbers),
struct('b' as strings, 2 as numbers),
struct('c' as strings, 3 as numbers)
]))
or indeed as
with my_table as (select * from unnest([
struct('a' as strings, 1 as numbers),
struct('b' as strings, 2 as numbers),
struct('c' as strings, 3 as numbers)
]))
select
*
from
(
select * from (
select * from (
select * from my_table
)
)
)
These subqueries can be as complex as you'd like, though for the sake of readability you should try to avoid putting too much logic into a subquery. In a Count notebook we encourage splitting complex queries into separate cells for this reason.
Including a JOIN
clause
When you want to select rows from multiple tables, you'll need to tell the database how to merge the tables together (read more here). These instructions take the form of one or more JOIN
clauses after the initial FROM
clause. For example:
with table_1 as (select * from unnest([
struct('a' as strings, 1 as numbers),
struct('b' as strings, 2 as numbers),
struct('c' as strings, 3 as numbers)
])),
table_2 as (select * from unnest([
struct('a' as strings_2, 4 as numbers_2),
struct('b' as strings_2, 5 as numbers_2),
struct('c' as strings_2, 6 as numbers_2)
]))
select
numbers,
numbers_2
from
table_1
left join table_2 on table_1.strings = table_2.strings_2
In the query above, our SELECT
statement says that:
- We want the
numbers
andnumbers_2
columns (the database is smart enough to know which tables those columns come from)
FROM
table_1
andtable_2
- The rows of
table_2
should be associated with those fromtable_1
by comparing the columnsstrings
andstrings_2