SELECT knowledge FROM sql_resources WHERE category='snowflake' AND slug='pivot-tables'

Pivot Tables

We're all familiar with the power of Pivot tables, but building them in SQL tends to be tricky.

Snowflake has some bespoke features that make this a little easier though. In this context:

Loading code...

Pivoting a table means turning rows into columns

In the example below I've looked at the total hours I've watched some of my favorite programs on Netflix by year, then I've pivoted the data to show each year as a column:

Loading code...
HOURS_WATCHEDTITLEYEAR
15.726388889GLOW2020
78.595555556Friends2019
19.2725Friends2015
192.049166667Friends2018
58.134166667Friends2020
2.197777778Friends2016
4.356388889Friends2017
9.827777778The Good Place2020
9.428888889The Good Place2018
2.958888889The Good Place2019
4.449444444The Good Place2021
0.3686111111The Good Place2017
23.560277778Arrested Development2012
1.806388889Arrested Development2014
8.503888889Arrested Development2017
0.4994444444Arrested Development2016
0.9927777778Arrested Development2020
1.868611111Arrested Development2018
2.521111111Arrested Development2019
14.368055556Arrested Development2013
2.156388889Arrested Development2015

We'll take our above summary table and create a new column for each year:

Loading code...
TITLE201820192020
GLOWnullnull15.726388889
Friends192.04916666778.59555555658.134166667
The Good Place9.4288888892.9588888899.827777778
Arrested Development1.8686111112.5211111110.9927777778

Displaying the data in this way helps emphasise changes across the years for each show.

Step-by-Step

Option 1: CASE

In most SQL dialects, to construct a pivot table, you'd be forced to use a string of CASE statements:

Loading code...
TITLE201820192020
GLOWnullnull15.726388889
Friends192.04916666778.59555555658.134166667
The Good Place9.4288888892.9588888899.827777778
Arrested Development1.8686111112.5211111110.9927777778

Option 2: PIVOT

In Snowflake there is a bespoke function, PIVOT, which simplifies this behavior.

YEARFriendsGLOWThe Good Place
2021nullnull4.449444444
202058.13416666715.7263888899.827777778
201978.595555556null2.958888889
2018192.049166667null9.428888889
20174.356388889null0.3686111111
20162.197777778nullnull
201519.2725nullnull
2014nullnullnull
2013nullnullnull
2012nullnullnull

The downside of both of these approaches is that you have to specify the resulting pivoted columns before you create the table, meaning you can't have a dynamic pivot table.

There is a post here that details a way to make dynamic pivot tables with stored procedures.

Naturally, you may want to turn columns into rows, which involves unpivoting. Snowflake, handily, has an UNPIVOT function that will do just that.

To unpivot our pivot table of YEAR | Friends | GLOW | The Good Place into a table of: YEAR | TITLE | HOURS WATCHED, we can do:

Loading code...
YEARTITLEHOURS_WATCHED
2021The Good Place4.449444444
2020Friends58.134166667
2020GLOW15.726388889
2020The Good Place9.827777778
2019Friends78.595555556
2019The Good Place2.958888889
2018Friends192.049166667
2018The Good Place9.428888889
2017Friends4.356388889
2017The Good Place0.3686111111
2016Friends2.197777778
2015Friends19.2725

How We Built This

This page was built using Count. It combines the best features of a SQL IDE, Data Visualization Tool, and Computational Notebooks. In the Count notebook, each cell acts like a CTE, meaning you can reference any other cell in your queries.

This makes not only for far more readable reports (like this one) but also a much faster and more powerful way to do your analysis, essentially turning your analysis into a connected graph of data frames rather than one-off convoluted queries and CSV files. And with a built-in visualization framework, you won't have to export your data to make your charts. Go from raw data to interactive report in one document.

Loading code...
Loading code...
Loading code...