Running Totals
Running totals, also called cumulative sums, show not only how a value changes from one period to the next but also how much has accumulated so far.
For example, if you want to monitor your monthly sales while also checking that you're on track to achieve your annual goal, a running total shows both at once.
Step-by-Step
Running totals rely on window functions, which calculate a value for each row from a set of related rows.
To find a running total we'll use the SUM window function, which takes the general form SUM(expr1) OVER (PARTITION BY expr2 ORDER BY expr3), where:
expr1: An expression that evaluates to a numeric data type (INTEGER, FLOAT, DECIMAL, etc.).
expr2: The optional expression to partition by.
expr3: The optional expression to order by within each partition. (This does not control the order of the entire query output.)
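Or, in layman's terms: add up expr1 row by row in the order given by expr3, starting over for each group defined by expr2.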
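To make the role of the optional ORDER BY concrete, here is a minimal sketch that runs the SUM window function with and without it. It assumes SQLite 3.25 or later (via Python's built-in sqlite3 module) rather than any particular data warehouse, and the sales table and its figures are hypothetical, invented only for illustration.

```python
import sqlite3

# Hypothetical monthly sales figures, used only to illustrate the syntax.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (month INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [(1, 10), (2, 20), (3, 30)])

query = """
SELECT
    month,
    amount,
    SUM(amount) OVER ()               AS grand_total,   -- no ORDER BY: every row sees all rows
    SUM(amount) OVER (ORDER BY month) AS running_total  -- ORDER BY: sum up to the current row
FROM sales
ORDER BY month  -- orders the final output; the window's ORDER BY does not
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Without ORDER BY inside OVER, every row gets the same overall sum; adding it is what turns the sum into a running total.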
In the following example we'll take the running total of hours of Netflix I watched in a given week:
| WEEK | DAY | DAILY_HOURS_WATCHED |
|---|---|---|
| 2018-11-26T00:00:00.000Z | 0 | 4.412777778 |
| 2018-11-26T00:00:00.000Z | 1 | 1.467222222 |
| 2018-11-26T00:00:00.000Z | 2 | 0.6561111111 |
| 2018-11-26T00:00:00.000Z | 4 | 0.5063888889 |
| 2018-11-26T00:00:00.000Z | 5 | 0.5261111111 |
| 2018-11-26T00:00:00.000Z | 6 | 0.8688888889 |
Now if I want to find the running total of hours watched that week, I can sum DAILY_HOURS_WATCHED over the rows ordered by DAY, which gives the following:
| DAY | DAILY_HOURS_WATCHED | RUNNING_TOTAL |
|---|---|---|
| 0 | 4.412777778 | 4.412777778 |
| 1 | 1.467222222 | 5.88 |
| 2 | 0.6561111111 | 6.536111111 |
| 4 | 0.5063888889 | 7.0425 |
| 5 | 0.5261111111 | 7.568611111 |
| 6 | 0.8688888889 | 8.4375 |
We can see our new RUNNING_TOTAL column increase each day.
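A query along the following lines should reproduce the table above. This is a sketch rather than the exact query behind this page: it assumes SQLite 3.25 or later through Python's sqlite3 module, and the table name netflix_daily is a hypothetical stand-in for wherever the viewing data lives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; the columns mirror the example data above.
conn.execute(
    "CREATE TABLE netflix_daily (week TEXT, day INTEGER, daily_hours_watched REAL)"
)
week = "2018-11-26T00:00:00.000Z"
conn.executemany(
    "INSERT INTO netflix_daily VALUES (?, ?, ?)",
    [
        (week, 0, 4.412777778),
        (week, 1, 1.467222222),
        (week, 2, 0.6561111111),
        (week, 4, 0.5063888889),
        (week, 5, 0.5261111111),
        (week, 6, 0.8688888889),
    ],
)

running_total_sql = """
SELECT
    day,
    daily_hours_watched,
    SUM(daily_hours_watched) OVER (ORDER BY day) AS running_total
FROM netflix_daily
ORDER BY day
"""
running_rows = conn.execute(running_total_sql).fetchall()
for day, hours, total in running_rows:
    print(day, hours, round(total, 6))
```

There is no PARTITION BY here, so the whole table is treated as a single group and the total never resets.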
This example had no partition, but if we wanted to compare my viewing habits across two different weeks, we would need to PARTITION BY week:
| WEEK | DAY | DAILY_HOURS_WATCHED |
|---|---|---|
| 2018-11-19T00:00:00.000Z | 0 | 1.326666667 |
| 2018-11-19T00:00:00.000Z | 1 | 0.4775 |
| 2018-11-19T00:00:00.000Z | 2 | 0.8708333333 |
| 2018-11-19T00:00:00.000Z | 6 | 0.3 |
| 2018-11-26T00:00:00.000Z | 0 | 4.412777778 |
| 2018-11-26T00:00:00.000Z | 1 | 1.467222222 |
| 2018-11-26T00:00:00.000Z | 2 | 0.6561111111 |
| 2018-11-26T00:00:00.000Z | 4 | 0.5063888889 |
| 2018-11-26T00:00:00.000Z | 5 | 0.5261111111 |
| 2018-11-26T00:00:00.000Z | 6 | 0.8688888889 |
And now to find the running total within each of the two weeks, we partition by WEEK and order by DAY:
| WEEK | DAY | DAILY_HOURS_WATCHED | WEEKLY_RUNNING_TOTAL |
|---|---|---|---|
| 2018-11-19T00:00:00.000Z | 0 | 1.326666667 | 1.326666667 |
| 2018-11-19T00:00:00.000Z | 1 | 0.4775 | 1.804166667 |
| 2018-11-19T00:00:00.000Z | 2 | 0.8708333333 | 2.675 |
| 2018-11-19T00:00:00.000Z | 6 | 0.3 | 2.975 |
| 2018-11-26T00:00:00.000Z | 0 | 4.412777778 | 4.412777778 |
| 2018-11-26T00:00:00.000Z | 1 | 1.467222222 | 5.88 |
| 2018-11-26T00:00:00.000Z | 2 | 0.6561111111 | 6.536111111 |
| 2018-11-26T00:00:00.000Z | 4 | 0.5063888889 | 7.0425 |
| 2018-11-26T00:00:00.000Z | 5 | 0.5261111111 | 7.568611111 |
| 2018-11-26T00:00:00.000Z | 6 | 0.8688888889 | 8.4375 |
Now we can see the WEEKLY_RUNNING_TOTAL restart for the second week.
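The per-week totals above could be produced by a query like the one sketched here. As before, it assumes SQLite 3.25 or later through Python's sqlite3 module, and netflix_daily is a hypothetical table name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; the columns mirror the two-week example data above.
conn.execute(
    "CREATE TABLE netflix_daily (week TEXT, day INTEGER, daily_hours_watched REAL)"
)
week1 = "2018-11-19T00:00:00.000Z"
week2 = "2018-11-26T00:00:00.000Z"
conn.executemany(
    "INSERT INTO netflix_daily VALUES (?, ?, ?)",
    [
        (week1, 0, 1.326666667),
        (week1, 1, 0.4775),
        (week1, 2, 0.8708333333),
        (week1, 6, 0.3),
        (week2, 0, 4.412777778),
        (week2, 1, 1.467222222),
        (week2, 2, 0.6561111111),
        (week2, 4, 0.5063888889),
        (week2, 5, 0.5261111111),
        (week2, 6, 0.8688888889),
    ],
)

weekly_sql = """
SELECT
    week,
    day,
    daily_hours_watched,
    SUM(daily_hours_watched) OVER (
        PARTITION BY week   -- restart the sum for each week
        ORDER BY day        -- accumulate day by day within the week
    ) AS weekly_running_total
FROM netflix_daily
ORDER BY week, day
"""
weekly_rows = conn.execute(weekly_sql).fetchall()
for row in weekly_rows:
    print(row)
```

PARTITION BY week is the only change from the single-week query: each week becomes its own group, so the sum starts again from that week's first day.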
How We Built This
This page was built using Count. It combines the best features of a SQL IDE, Data Visualization Tool, and Computational Notebooks. In the Count notebook, each cell acts like a CTE, meaning you can reference any other cell in your queries.
This makes for not only far more readable reports (like this one) but also a much faster and more powerful way to do your analysis, turning it into a connected graph of data frames rather than one-off convoluted queries and CSV files. And with a built-in visualization framework, you won't have to export your data to make your charts. Go from raw data to interactive report in one document.