1
00:00:01,360 --> 00:00:05,880
There are a bajillion different charts available
in software like Excel or Numbers or Sheets or
2
00:00:05,880 --> 00:00:09,800
Mathematica or Plotly, but I recently discovered
that pretty much all plotting apps - as far as I
3
00:00:09,800 --> 00:00:15,280
can tell - are missing an incredibly basic type of
chart! Consider this video my attempt to convince
4
00:00:15,280 --> 00:00:18,680
you it's a scandal these charts are missing,
and my attempt to convince software developers
5
00:00:18,680 --> 00:00:22,360
to add them into their plotting software.
OK, so suppose you want to compare the climates
6
00:00:22,360 --> 00:00:26,960
of London, England; London, Ontario; and London,
Kentucky. Here's a table showing the average and
7
00:00:26,960 --> 00:00:30,840
seasonal high and low temperatures for each
London. But tables are boring. If you wanted
8
00:00:30,840 --> 00:00:34,520
to plot this data by hand, you might imagine
using a chart like this to show the range of
9
00:00:34,520 --> 00:00:38,440
temperatures and allow you to see at a glance that
the climate in London, England is less extreme than
10
00:00:38,440 --> 00:00:39,440
the other Londons (which are each similarly
variable but with Ontario shifted colder) - a
11
00:00:39,440 --> 00:00:42,800
bar chart makes more sense than, say,
a point chart since the temperatures in
12
00:00:42,800 --> 00:00:47,280
these Londons can by definition be in between
the minimum and maximum ones in our table.
13
00:00:47,280 --> 00:00:50,280
The problem is, you can't easily
make a chart like this on a computer.
14
00:00:50,280 --> 00:00:53,360
You can get close with an area chart,
which plots the temperatures correctly,
15
00:00:53,360 --> 00:00:57,760
adds nice shading in between them, but also fills
the area all the way down (or up) to the X axis,
16
00:00:57,760 --> 00:01:00,960
which isn't right. You can get around the
filling problem by creating a new table of
17
00:01:00,960 --> 00:01:03,880
the differences between the temperatures
and plotting that as a stacked area
18
00:01:03,880 --> 00:01:05,080
chart... oops - we need to remove the fill from
the lowest value - but it still doesn't make
19
00:01:05,080 --> 00:01:08,280
sense to use a chart that draws connecting
lines in a situation like this where we're
20
00:01:08,280 --> 00:01:12,680
talking about discrete geographic locations.
You can also get close with a stacked bar chart
21
00:01:12,680 --> 00:01:16,080
(or "stacked column chart" depending on whether
you think bars can be both horizontal and vertical
22
00:01:16,080 --> 00:01:20,280
or if you think vertical bars have to be called
columns...) anyway, a stacked bar-column chart
23
00:01:20,280 --> 00:01:24,640
separates out the cities correctly but incorrectly
adds the temperatures together - hence,
24
00:01:24,640 --> 00:01:28,560
"stacked" - the columns are stacked together,
appropriate to the name, but it's not what we
25
00:01:28,560 --> 00:01:32,440
want. You can get around the stacking problem by
again creating a table of the differences between
26
00:01:32,440 --> 00:01:35,800
the temperatures, but then it still fills down
to zero even if you want the bars to start at
27
00:01:35,800 --> 00:01:39,560
some other number, and the negative values are
a total nightmare, since stacked column charts
28
00:01:39,560 --> 00:01:43,240
by default plot things from zero and if you enter
a negative number for the first value to try to
29
00:01:43,240 --> 00:01:47,640
get it to start below zero, it plots that amount
below zero, but then measures the positive numbers
30
00:01:47,640 --> 00:01:51,520
up from zero anyway and not from the negative
number - this is because stacked column charts are
31
00:01:51,520 --> 00:01:56,040
meant for things like keeping inflows and outflows
separate, like, you bought 5 cats and sold 2, and
32
00:01:56,040 --> 00:02:00,240
you want the total length of bars to represent the
total number of transactions (7) and not the final
33
00:02:00,240 --> 00:02:04,360
number of cats (3). The only reasonable way I've
found to deal with negative numbers in stacked
34
00:02:04,360 --> 00:02:08,640
column charts is just to add a big number to ALL
the temperatures so they're not negative anymore,
35
00:02:08,640 --> 00:02:13,160
make the chart, then photoshop the y axis labels
to remove the big number. Which is ridiculous.
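Here's a rough sketch of that workaround in Python, just to show the arithmetic. The function names (`stacked_workaround`, `axis_label`) and the temperatures are made up for illustration; they aren't the numbers from the table in the video, and no particular spreadsheet app works exactly like this.

```python
def stacked_workaround(boundaries, offset):
    """Turn range boundaries into values a stacked column chart can plot.

    boundaries: ascending values for one city, e.g. [low, average, high].
    offset: the "big number" added so nothing is negative (an assumption
            you pick yourself; it must be large enough for your data).
    Returns (base, diffs): an invisible base segment, plus the
    differences between neighbouring boundaries to stack on top of it.
    """
    shifted = [b + offset for b in boundaries]
    if min(shifted) < 0:
        raise ValueError("offset is too small: some values are still negative")
    base = shifted[0]  # plotted with no fill so the bar appears to float
    diffs = [hi - lo for lo, hi in zip(shifted, shifted[1:])]
    return base, diffs


def axis_label(tick, offset):
    """Undo the offset for a y-axis tick (the 'photoshop' step)."""
    return tick - offset


# Hypothetical temperatures, NOT the real climate data.
base, diffs = stacked_workaround([-10, 5, 20], offset=100)
print(base, diffs)           # 90 [15, 15]
print(axis_label(90, 100))   # -10
```

The awkward part is exactly what the code makes visible: the chart only ever sees 90, 15 and 15, so every label on the y axis has to be corrected afterwards.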
36
00:02:13,160 --> 00:02:16,360
What we're actually looking for is something
called a "range chart" where you enter the
37
00:02:16,360 --> 00:02:20,480
top and bottom values of the range and then
the program draws that for you. Specifically,
38
00:02:20,480 --> 00:02:24,520
we want a "stacked range bar chart" where you
can enter multiple numbers, including negative
39
00:02:24,520 --> 00:02:30,000
numbers, and then draw bars between them. Is
that too much to ask? I guess, I guess it is.
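For what it's worth, the data side of a "stacked range bar chart" is tiny. Below is a minimal sketch, assuming you give it the boundary values directly; `range_segments` is a hypothetical helper name and the numbers are placeholders, not real climate data.

```python
def range_segments(boundaries):
    """Convert boundary values into (bottom, height) pairs, one per segment.

    Works with negative numbers because each segment starts at its own
    lower boundary rather than at zero.
    """
    ordered = sorted(boundaries)
    return [(lo, hi - lo) for lo, hi in zip(ordered, ordered[1:])]


# Hypothetical low / average / high values for one location.
segments = range_segments([-12, 8, 27])
print(segments)  # [(-12, 20), (8, 19)]
```

Each pair can then be handed to any drawing routine that accepts an explicit baseline for a bar. In matplotlib, for instance, `Axes.bar(x, height, bottom=...)` takes one, so every segment is drawn from its own bottom value instead of from zero.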
40
00:02:30,000 --> 00:02:33,840
P.S. Here are a few more examples of data that
really would benefit from being visualized
41
00:02:33,840 --> 00:02:50,869
with a stacked range bar chart. But first, a big
thank you to my Patreon supporters who help make
42
00:02:50,869 --> 00:02:51,400
these videos possible - please consider supporting
MinutePhysics at patreon.com/minutephysics. Now,
43
00:02:51,400 --> 00:02:55,720
back to the benefits of stacked range bar charts!
Really, any chart where the x axis is a discrete
44
00:02:55,720 --> 00:02:59,840
set of things and the y axis is a range that can
go below and above zero is a candidate for this
45
00:02:59,840 --> 00:03:03,440
type of chart. Like, if you wanted to plot the
range of latitudes of each of the continents,
46
00:03:03,440 --> 00:03:07,800
or show when twilight, dusk, and daytime are
at a given location. Or compare percentiles
47
00:03:07,800 --> 00:03:11,440
for student grades in different classes. Or
the local climate: plotting the average daily
48
00:03:11,440 --> 00:03:15,280
low and average daily high temperatures for
each month along with the average monthly low
49
00:03:15,280 --> 00:03:18,960
and high temperatures and the record monthly low
and high temperatures across a year gives you a
50
00:03:18,960 --> 00:03:22,640
picture of the seasonal variation in temperature
for a particular location. Yes, you can do this
51
00:03:22,640 --> 00:03:26,800
with a stacked area chart, but that makes less
sense than stacked columns because area implies
52
00:03:26,800 --> 00:03:32,760
some sort of continuity when to create the data
you are literally binning or averaging by month.
[en] The Chart Missing From ALL Spreadsheet Software.srt