1
00:00:01,360 --> 00:00:05,880
There are a bajillion different charts available 
in software like Excel or Numbers or Sheets or

2
00:00:05,880 --> 00:00:09,800
Mathematica or Plotly, but I recently discovered 
that pretty much all plotting apps - as far as I

3
00:00:09,800 --> 00:00:15,280
can tell - are missing an incredibly basic type of 
chart! Consider this video my attempt to convince

4
00:00:15,280 --> 00:00:18,680
you it's a scandal these charts are missing, 
and my attempt to convince software developers

5
00:00:18,680 --> 00:00:22,360
to add them into their plotting software.
Ok, so suppose you want to compare the climates

6
00:00:22,360 --> 00:00:26,960
of London, England; London, Ontario; and London, 
Kentucky. Here's a table showing the average and

7
00:00:26,960 --> 00:00:30,840
seasonal high and low temperatures for each 
London. But tables are boring. If you wanted

8
00:00:30,840 --> 00:00:34,520
to plot this data by hand, you might imagine 
using a chart like this to show the range of

9
00:00:34,520 --> 00:00:38,440
temperatures and allow you to see at a glance that 
the climate in London, England is less extreme than

10
00:00:38,440 --> 00:00:39,440
the other Londons (which are each similarly 
variable but with Ontario shifted colder) - a

11
00:00:39,440 --> 00:00:42,800
bar chart makes more sense than, say, 
a point chart since the temperatures in

12
00:00:42,800 --> 00:00:47,280
these Londons can by definition be in between 
the minimum and maximum ones in our table.

13
00:00:47,280 --> 00:00:50,280
The problem is, you can't easily 
make a chart like this on a computer.

14
00:00:50,280 --> 00:00:53,360
You can get close with an area chart, 
which plots the temperatures correctly,

15
00:00:53,360 --> 00:00:57,760
adds nice shading in between them, but also fills 
the area all the way down (or up) to the X axis,

16
00:00:57,760 --> 00:01:00,960
which isn't right. You can get around the 
filling problem by creating a new table of

17
00:01:00,960 --> 00:01:03,880
the differences between the temperatures 
and plotting that as a stacked area

18
00:01:03,880 --> 00:01:05,080
chart... oops - we need to remove the fill from 
the lowest value - but it still doesn't make

19
00:01:05,080 --> 00:01:08,280
sense to use a chart that draws connecting 
lines in a situation like this where we're

20
00:01:08,280 --> 00:01:12,680
talking about discrete geographic locations.
You can also get close with a stacked bar chart

21
00:01:12,680 --> 00:01:16,080
(or "stacked column chart" depending on whether 
you think bars can be both horizontal and vertical

22
00:01:16,080 --> 00:01:20,280
or if you think vertical bars have to be called 
columns...) anyway, a stacked bar-column chart

23
00:01:20,280 --> 00:01:24,640
separates out the cities correctly but incorrectly 
adds the temperatures together - hence,

24
00:01:24,640 --> 00:01:28,560
"stacked" - the columns are stacked together, 
appropriate to the name, but it's not what we

25
00:01:28,560 --> 00:01:32,440
want. You can get around the stacking problem by 
again creating a table of the differences between

26
00:01:32,440 --> 00:01:35,800
the temperatures, but then it still fills down 
to zero even if you want the bars to start at

27
00:01:35,800 --> 00:01:39,560
some other number, and the negative values are 
a total nightmare since stacked column charts

28
00:01:39,560 --> 00:01:43,240
by default plot things from zero and if you enter 
a negative number for the first value to try to

29
00:01:43,240 --> 00:01:47,640
get it to start below zero, it plots that amount 
below zero, but then measures the positive numbers

30
00:01:47,640 --> 00:01:51,520
up from zero anyway and not from the negative 
number - this is because stacked column charts are

31
00:01:51,520 --> 00:01:56,040
meant for things like keeping inflows and outflows 
separate, like, you bought 5 cats and sold 2, and

32
00:01:56,040 --> 00:02:00,240
you want the total length of bars to represent the 
total number of transactions (7) and not the final

33
00:02:00,240 --> 00:02:04,360
number of cats (3). The only reasonable way I've 
found to deal with negative numbers in stacked

34
00:02:04,360 --> 00:02:08,640
column charts is just to add a big number to ALL 
the temperatures so they're not negative anymore,

35
00:02:08,640 --> 00:02:13,160
make the chart, then photoshop the y axis labels 
to remove the big number. Which is ridiculous.

36
00:02:13,160 --> 00:02:16,360
What we're actually looking for is something 
called a "range chart" where you enter the

37
00:02:16,360 --> 00:02:20,480
top and bottom values of the range and then 
the program draws that for you. Specifically,

38
00:02:20,480 --> 00:02:24,520
we want a "stacked range bar chart" where you 
can enter multiple numbers, including negative

39
00:02:24,520 --> 00:02:30,000
numbers, and then draw bars between them. Is 
that too much to ask? I guess, I guess it is.

40
00:02:30,000 --> 00:02:33,840
P.S. Here are a few more examples of data that 
really would benefit from being visualized

41
00:02:33,840 --> 00:02:50,869
with a stacked range bar chart but first, a big 
thank you to my Patreon supporters who help make

42
00:02:50,869 --> 00:02:51,400
these videos possible - please consider supporting 
MinutePhysics at patreon.com/minutephysics. Now,

43
00:02:51,400 --> 00:02:55,720
back to the benefits of stacked range bar charts! 
Really, any chart where the x axis is a discrete

44
00:02:55,720 --> 00:02:59,840
set of things and the y axis is a range that can 
go below and above zero is a candidate for this

45
00:02:59,840 --> 00:03:03,440
type of chart. Like, if you wanted to plot the 
range of latitudes of each of the continents,

46
00:03:03,440 --> 00:03:07,800
or show when twilight, dusk, and daytime are 
at a given location. Or compare percentiles

47
00:03:07,800 --> 00:03:11,440
for student grades in different classes. Or 
the local climate: plotting the average daily

48
00:03:11,440 --> 00:03:15,280
low and average daily high temperatures for 
each month along with the average monthly low

49
00:03:15,280 --> 00:03:18,960
and high temperatures and the record monthly low 
and high temperatures across a year gives you a

50
00:03:18,960 --> 00:03:22,640
picture of the seasonal variation in temperature 
for a particular location. Yes, you can do this

51
00:03:22,640 --> 00:03:26,800
with a stacked area chart, but that makes less 
sense than stacked columns because area implies

52
00:03:26,800 --> 00:03:32,760
some sort of continuity when to create the data 
you are literally binning or averaging by month.

[en] The Chart Missing From ALL Spreadsheet Software.srt
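
Editor's sketch (not part of the video): the transcript asks for a chart where you enter the boundary values of each range, negative numbers included, and the program draws bars between them. One possible way to express that idea is to turn each list of boundaries into (bottom, height) pairs, so that every segment starts where the previous one ended instead of at zero. The function name `range_segments` and the temperatures below are hypothetical, made up for illustration; they are not real climate data and not an existing spreadsheet feature.

```python
def range_segments(boundaries):
    """Turn boundary values into (bottom, height) bar segments.

    Each segment starts at one boundary and reaches up to the next,
    so nothing is measured from zero and negative values need no offset.
    """
    ordered = sorted(boundaries)
    return [(low, high - low) for low, high in zip(ordered, ordered[1:])]


# Made-up illustrative temperatures in degrees C: low, average, high.
# These are placeholders, NOT real data for any city.
places = {
    "Place A": [2, 11, 19],
    "Place B": [-10, 8, 27],
}

for name, temps in places.items():
    print(name, range_segments(temps))
```

Each (bottom, height) pair could then be handed to any plotting call that accepts a bar's starting position as well as its length, for example the `bottom` argument of matplotlib's `bar`, which sidesteps both the "add a table of differences" and the "add a big number and relabel the axis" workarounds described in the transcript.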