Understanding FROM
The FROM clause is one of the fundamental building blocks of a SQL SELECT statement. In this article we'll explain how it works.
-- Generic SQL SELECT statement
select
-- (some columns)
from
-- (some tables)
where
-- (A 'predicate' expression to be used as a filter)
group by
-- (some columns)The FROM clause is formed of the reserved 'from' keyword followed by either:
- A table name (most common)
- A subquery
- A JOIN clause (read more about the JOIN clause)
The sources defined in the FROM clause determine the set of rows that the SELECT statement can operate on, and so is the first part of the query to be executed.
In this article, we'll look at some examples of different FROM clauses.
In this article the code snippets are written in the Google BigQuery Standard SQL syntax.
FROM a table
The simplest FROM clause just lists a single table name - then the entire SELECT statement is operating on a single table.
with my_table as (select * from unnest([
struct('a' as strings, 1 as numbers),
struct('b' as strings, 2 as numbers),
struct('c' as strings, 3 as numbers)
]))
-- ^ The SELECT above is just defining some data to use
select
*
from
my_tableFROM a subquery
Instead of referring to a table by name, we can also refer to it as the result of a SELECT statement (since SELECT statements return tables). For example, we can rewrite the query above as
select
*
from
(select * from unnest([
struct('a' as strings, 1 as numbers),
struct('b' as strings, 2 as numbers),
struct('c' as strings, 3 as numbers)
]))
or indeed as
with my_table as (select * from unnest([
struct('a' as strings, 1 as numbers),
struct('b' as strings, 2 as numbers),
struct('c' as strings, 3 as numbers)
]))
select
*
from
(
select * from (
select * from (
select * from my_table
)
)
)These subqueries can be as complex as you'd like, though for the sake of readability you should try to avoid putting too much logic into a subquery. In a Count notebook we encourage splitting complex queries into separate cells for this reason.
Including a JOIN clause
When you want to select rows from multiple tables, you'll need to tell the database how to merge the tables together (read more here). These instructions take the form of one or more JOIN clauses after the initial FROM clause. For example:
with table_1 as (select * from unnest([
struct('a' as strings, 1 as numbers),
struct('b' as strings, 2 as numbers),
struct('c' as strings, 3 as numbers)
])),
table_2 as (select * from unnest([
struct('a' as strings_2, 4 as numbers_2),
struct('b' as strings_2, 5 as numbers_2),
struct('c' as strings_2, 6 as numbers_2)
]))
select
numbers,
numbers_2
from
table_1
left join table_2 on table_1.strings = table_2.strings_2In the query above, our SELECT statement says that:
- We want the
numbersandnumbers_2columns (the database is smart enough to know which tables those columns come from) FROMtable_1andtable_2- The rows of
table_2should be associated with those fromtable_1by comparing the columnsstringsandstrings_2
