The optional JOIN clause can be placed in the FROM part of a SQL SELECT statement (read more about the FROM clause). In this article we'll explain how it works.
-- Generic SQL SELECT statement select -- (some columns) from -- (some tables) -- (zero or more JOIN clauses) where -- (A 'predicate' expression to be used as a filter) group by -- (some columns)
You need a JOIN clause whenever you want your SELECT statement to retrieve rows from multiple tables at the same time - it tells the database exactly how it should match those tables together.
There are several different types of JOIN clause, common ones are
In this article, we'll look at some examples to see how they work.
A basic LEFT JOIN statement looks like
select
*
from
table_1
left join
table_2 on table_1.column_1 = table_2.column_2
This query will return
This is why it's called a LEFT join - the table on the left of the join clause keeps all of its rows, even if there isn't a matching row in the table on the right.
If there is no matching row in the table on the right, the columns from the right will be filled with NULLs.
with left_table as (select * from unnest([
struct('a' as strings, 1 as numbers),
struct('b' as strings, 2 as numbers),
struct('c' as strings, 3 as numbers)
])),
right_table as (select * from unnest([
struct('a' as strings, 4 as numbers),
struct('b' as strings, 5 as numbers)
]))
select
*
from
left_table
left join
right_table on left_table.strings = right_table.strings
A basic RIGHT JOIN statement looks like
select
*
from
table_1
right join
table_2 on table_1.column_1 = table_2.column_2
This query will return
This is why it's called a RIGHT join - the table on the right of the join clause keeps all of its rows, even if there isn't a matching row in the table on the left.
If there is no matching row in the table on the left, the columns from the left will be filled with NULLs.
with left_table as (select * from unnest([
struct('a' as strings, 1 as numbers),
struct('b' as strings, 2 as numbers)
])),
right_table as (select * from unnest([
struct('a' as strings, 4 as numbers),
struct('b' as strings, 5 as numbers),
struct('c' as strings, 6 as numbers)
]))
select
*
from
left_table
right join
right_table on left_table.strings = right_table.strings
A basic INNER JOIN statement looks like
select
*
from
table_1
inner join
table_2 on table_1.column_1 = table_2.column_2
This query will return
This is why it's called an INNER join - only the rows which match from both tables will be included in the result.
Note in the following that the values 'c' and 'd' don't appear in the output, as those values don't exist in both tables.
with left_table as (select * from unnest([
struct('a' as strings, 1 as numbers),
struct('b' as strings, 2 as numbers),
struct('c' as strings, 3 as numbers)
])),
right_table as (select * from unnest([
struct('a' as strings, 4 as numbers),
struct('b' as strings, 5 as numbers),
struct('d' as strings, 6 as numbers)
]))
select
*
from
left_table
inner join
right_table on left_table.strings = right_table.strings
A basic FULL OUTER JOIN statement looks like
select
*
from
table_1
full outer join
table_2 on table_1.column_1 = table_2.column_2
This query will return
This is why it's called an OUTER join - all rows will be included in the result, even if they don't exist in one of the tables.
If there is no matching row in one of the tables, the missing columns will be filled with NULLs.
Note in the following that the values 'c' and 'd' appear in the output, even though they only exist in one table each.
with left_table as (select * from unnest([
struct('a' as strings, 1 as numbers),
struct('b' as strings, 2 as numbers),
struct('c' as strings, 3 as numbers)
])),
right_table as (select * from unnest([
struct('a' as strings, 4 as numbers),
struct('b' as strings, 5 as numbers),
struct('d' as strings, 6 as numbers)
]))
select
*
from
left_table
full outer join
right_table on left_table.strings = right_table.strings
A basic CROSS JOIN statement looks like
select
*
from
table_1
cross join
table_2 -- Note - no 'on' expression required
This query will return
This is why it's called a CROSS join - the number of rows returned is the product of the number of rows of the two tables. You should be careful when using CROSS JOINs as they can return really massive tables, and are often not what you require.
Note in the following that the number of rows in the output is 2 x 3 = 6.
with left_table as (select * from unnest([
struct('a' as strings, 1 as numbers),
struct('b' as strings, 2 as numbers)
])),
right_table as (select * from unnest([
struct('a' as strings, 4 as numbers),
struct('b' as strings, 5 as numbers),
struct('c' as strings, 6 as numbers)
]))
select
*
from
left_table
cross join
right_table