Pivot Tables

We're all familiar with the power of Pivot tables, but building them in SQL tends to be tricky.

Snowflake has some bespoke features that make this a little easier though. In this context:

Pivoting a table means turning rows into columns

In the example below I've looked at the total hours I've watched some of my favorite programs on Netflix by year, then I've pivoted the data to show each year as a column:

SELECT
  SUM(DURATION) / (60 * 60) HOURS_WATCHED, TITLE, YEAR(START_TIME) YEAR
FROM PUBLIC.NETFLIX 
where TITLE IN ('Friends', 'GLOW', 'Arrested Development', 'The Good Place')
GROUP BY TITLE, YEAR

HOURS_WATCHED	TITLE	YEAR
15.726388889	GLOW	2020
78.595555556	Friends	2019
19.2725	Friends	2015
192.049166667	Friends	2018
58.134166667	Friends	2020
2.197777778	Friends	2016
4.356388889	Friends	2017
9.827777778	The Good Place	2020
9.428888889	The Good Place	2018
2.958888889	The Good Place	2019
4.449444444	The Good Place	2021
0.3686111111	The Good Place	2017
23.560277778	Arrested Development	2012
1.806388889	Arrested Development	2014
8.503888889	Arrested Development	2017
0.4994444444	Arrested Development	2016
0.9927777778	Arrested Development	2020
1.868611111	Arrested Development	2018
2.521111111	Arrested Development	2019
14.368055556	Arrested Development	2013
2.156388889	Arrested Development	2015

We'll take our above summary table and create a new column for each year:

SELECT * FROM SUMMARIZED_DATA 
PIVOT(sum(hours_watched) for year in (2018, 2019, 2020)) 
ORDER BY "2019" DESC

TITLE	2018	2019	2020
GLOW	null	null	15.726388889
Friends	192.049166667	78.595555556	58.134166667
The Good Place	9.428888889	2.958888889	9.827777778
Arrested Development	1.868611111	2.521111111	0.9927777778

Displaying the data in this way helps emphasise changes across the years for each show.

Step-by-Step

Option 1: CASE

In most SQL dialects, to construct a pivot table, you'd be forced to use a string of CASE statements:

SELECT 
  <INDEX COLS>, 
  AGGREGATE_FUNCTION(CASE WHEN <CONDITION> THEN <VALUE> END) <COL_1>, 
  AGGREGATE_FUNCTION(CASE WHEN <CONDITION> THEN <VALUE> END) <COL_2>
FROM
  <TABLE>

SELECT TITLE, 
  SUM(CASE WHEN YEAR(START_TIME) = 2018 THEN DURATION END ) / (60 * 60) "2018",
  SUM(CASE WHEN YEAR(START_TIME) = 2019 THEN DURATION END ) / (60 * 60) "2019",
  SUM(CASE WHEN YEAR(START_TIME) = 2020 THEN DURATION END ) / (60 * 60) "2020"
FROM PUBLIC.NETFLIX
WHERE TITLE IN ('Friends', 'GLOW', 'Arrested Development', 'The Good Place')
GROUP BY TITLE

TITLE	2018	2019	2020
GLOW	null	null	15.726388889
Friends	192.049166667	78.595555556	58.134166667
The Good Place	9.428888889	2.958888889	9.827777778
Arrested Development	1.868611111	2.521111111	0.9927777778

Option 2: PIVOT

In Snowflake there is a bespoke function, PIVOT, which simplifies this behavior.

SELECT ...
FROM ...
   PIVOT ( <aggregate_function> ( <pivot_column> )
            FOR <value_column> IN ( <pivot_value_1> [ , <pivot_value_2> ... ] ) )

[ ... ]

SELECT 
  YEAR , 
  "'Friends'" as "Friends", 
  "'GLOW'" AS GLOW, 
  "'The Good Place'" as "The Good Place" 
FROM 
  SUMMARIZED_DATA
  PIVOT(SUM(HOURS_WATCHED) FOR TITLE IN ('Friends', 'GLOW','The Good Place'))
ORDER BY YEAR desc

YEAR	Friends	GLOW	The Good Place
2021	null	null	4.449444444
2020	58.134166667	15.726388889	9.827777778
2019	78.595555556	null	2.958888889
2018	192.049166667	null	9.428888889
2017	4.356388889	null	0.3686111111
2016	2.197777778	null	null
2015	19.2725	null	null
2014	null	null	null
2013	null	null	null
2012	null	null	null

The downside of both of these approaches is that you have to specify the resulting pivoted columns before you create the table, meaning you can't have a dynamic pivot table.

There is a post here that details a way to make dynamic pivot tables with stored procedures.

Unpivot

Naturally, you may want to turn columns into rows, which involves unpivoting. Snowflake, handily, has an UNPIVOT function that will do just that.

SELECT ...
FROM ...
   UNPIVOT ( <value_column>
             FOR <name_column> IN ( <column_list> ) )

[ ... ]

SELECT * FROM PIVOT_WAY UNPIVOT(HOURS_WATCHED FOR TITLE IN ("Friends","GLOW","The Good Place"))

YEAR	TITLE	HOURS_WATCHED
2021	The Good Place	4.449444444
2020	Friends	58.134166667
2020	GLOW	15.726388889
2020	The Good Place	9.827777778
2019	Friends	78.595555556
2019	The Good Place	2.958888889
2018	Friends	192.049166667
2018	The Good Place	9.428888889
2017	Friends	4.356388889
2017	The Good Place	0.3686111111
2016	Friends	2.197777778
2015	Friends	19.2725

How We Built This

This page was built using Count. It combines the best features of a SQL IDE, Data Visualization Tool, and Computational Notebooks. In the Count notebook, each cell acts like a CTE, meaning you can reference any other cell in your queries.

This makes not only for far more readable reports (like this one) but also a much faster and more powerful way to do your analysis, essentially turning your analysis into a connected graph of data frames rather than one-off convoluted queries and CSV files. And with a built-in visualization framework, you won't have to export your data to make your charts. Go from raw data to interactive report in one document.

Business Improvement

Product

Use cases

Pricing

Professional Services

Customers

Resources

Step-by-Step

Option 1: CASE

Option 2: PIVOT

Unpivot

How We Built This

Start solving your organization's biggest problems with Count today.