```{r include=FALSE}
knitr::opts_chunk$set(cache = TRUE, warning = FALSE, message = FALSE,
                      class.output = "output", out.width = '100%',
                      fig.asp = 0.5, fig.align = 'center')
library(tidyverse)
```
# TIME & TRENDS
## Highlighting: Unemployed Population
This example is adapted from DataCamp's [Introduction to data visualization with ggplot2](https://www.datacamp.com/courses/introduction-to-data-visualization-with-ggplot2).
### The economics data
This dataset contains US economic time series; the data come from <https://fred.stlouisfed.org/>. `economics` is stored as a "wide" table, while the `economics_long` data frame is stored as a "long" table. The `date` in each row is the month in which the data were collected.
- `pce`: personal consumption expenditures, in billions of dollars, from <https://fred.stlouisfed.org/series/PCE>
- `pop`: total population, in thousands, from <https://fred.stlouisfed.org/series/POP>
- `psavert`: personal savings rate, from <https://fred.stlouisfed.org/series/PSAVERT/>
- `uempmed`: median duration of unemployment, in weeks, from <https://fred.stlouisfed.org/series/UEMPMED>
- `unemploy`: number of unemployed, in thousands, from <https://fred.stlouisfed.org/series/UNEMPLOY>
```{r}
economics %>% head()
```
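To make the wide versus long distinction concrete, the following is a minimal sketch that reshapes `economics` into a long table with `tidyr::pivot_longer()`. The names `econ_long`, `variable`, and `value` are chosen here for illustration; the packaged `economics_long` also carries an extra rescaled column, so the two are similar but not identical.

```{r}
library(tidyverse)

# Wide form: one row per month, one column per series.
dim(economics)

# Long form: one row per (month, series) pair.
# `econ_long` is a hypothetical name used only in this sketch.
econ_long <- economics %>%
  pivot_longer(cols = -date, names_to = "variable", values_to = "value")

dim(econ_long)
head(econ_long)
```

The long form is convenient when several series should be mapped to one aesthetic, for example `facet_wrap(~ variable)`.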
### Setting the marking area
```{r}
# Recession periods to shade, with a label and a label height (y) for each
recess <- data.frame(
  begin = c("1969-12-01","1973-11-01","1980-01-01","1981-07-01","1990-07-01","2001-03-01", "2007-12-01"),
  end = c("1970-11-01","1975-03-01","1980-07-01","1982-11-01","1991-03-01","2001-11-01", "2009-07-30"),
  event = c("Fiscal & Monetary\ntightening", "1973 Oil crisis", "Double dip I","Double dip II", "Oil price shock", "Dot-com bubble", "Sub-prime\nmortgage crisis"),
  y = c(.01415981, 0.02067402, 0.02951190, 0.03419201, 0.02767339, 0.02159662, 0.02520715)
)

library(lubridate)

# Convert the character dates to Date so they share the x scale with economics$date
recess <- recess %>%
  mutate(begin = ymd(begin),
         end = ymd(end))

economics %>%
  ggplot() +
  aes(x = date, y = unemploy/pop) +
  ggtitle("The percentage of unemployed Americans\nincreases sharply during recessions") +
  geom_line() +
  # ymin/ymax = -Inf/+Inf stretch each rectangle over the full height of the panel
  geom_rect(data = recess,
            aes(xmin = begin, xmax = end, ymin = -Inf, ymax = +Inf, fill = "Recession"),
            inherit.aes = FALSE, alpha = 0.2) +
  geom_label(data = recess, aes(x = end, y = y, label = event), size = 3) +
  scale_fill_manual(name = "", values = "red", labels = "Recessions")
```
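The plot title claims that the unemployed share rises during recessions. As a rough numerical check, the sketch below compares `unemploy/pop` at the first and last monthly observation inside each shaded window. The helper `window_rates()` and the table `rate_change` are hypothetical names introduced here, and the recession dates are simply copied from the chunk above.

```{r}
library(tidyverse)

# Recession windows, copied from the plotting chunk
begin <- as.Date(c("1969-12-01","1973-11-01","1980-01-01","1981-07-01",
                   "1990-07-01","2001-03-01","2007-12-01"))
end <- as.Date(c("1970-11-01","1975-03-01","1980-07-01","1982-11-01",
                 "1991-03-01","2001-11-01","2009-07-30"))

# Hypothetical helper: unemployed share at the first and last month in a window
window_rates <- function(b, e) {
  w <- economics %>%
    filter(date >= b, date <= e) %>%
    mutate(rate = unemploy / pop)
  tibble(begin = b, end = e,
         start_rate = first(w$rate),
         end_rate = last(w$rate))
}

rate_change <- seq_along(begin) %>%
  lapply(function(i) window_rates(begin[i], end[i])) %>%
  bind_rows() %>%
  mutate(change = end_rate - start_rate)

rate_change
```

A positive `change` in a row means the unemployed share was higher at the end of that window than at its start.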
## Smoothing: Unemployed
- Smooth by [bin smoothing](http://rafalab.dfci.harvard.edu/dsbook/smoothing.html#bin-smoothing)
```{r}
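# Dates are stored as day counts, so bandwidth = 210 is a 210-day window
# (the box kernel averages observations within 105 days on either side).
# By default ksmooth() evaluates the fit at equally spaced x values;
# monthly dates are close to evenly spaced, so fit$y lines up approximately with the rows.
fit <- with(economics,
            ksmooth(date, unemploy, kernel = "box", bandwidth = 210))

economics %>%
  mutate(smooth = fit$y) %>%
  ggplot() + aes(date, unemploy) +
  geom_point(alpha = 0.5, color = "skyblue") +  # alpha must lie between 0 and 1
  geom_line(aes(date, smooth), color = "red") + theme_minimal()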
```
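To show what the box kernel is doing, here is a minimal sketch that computes the bin smoother by hand: for each date, it averages every observation within half a bandwidth on either side. The function `bin_smooth()` is a hypothetical helper written for this illustration, and the dates are converted to plain day counts so that the window arithmetic is explicit. Passing `x.points = x` asks `ksmooth()` to evaluate at the observed dates rather than on its default evenly spaced grid.

```{r}
library(ggplot2)  # provides the economics data

x <- as.numeric(economics$date)  # days since 1970-01-01
y <- economics$unemploy
bandwidth <- 210                 # total window width in days

# Hypothetical helper: mean of y over the window centred on each x0
bin_smooth <- function(x, y, bandwidth) {
  vapply(x, function(x0) mean(y[abs(x - x0) <= bandwidth / 2]), numeric(1))
}

manual <- bin_smooth(x, y, bandwidth)

# Same smoother via ksmooth(), evaluated at the observed dates
fit <- ksmooth(x, y, kernel = "box", bandwidth = bandwidth, x.points = x)

# Compare the two results
all.equal(manual, fit$y)
```

Each smoothed value is an unweighted local mean, which is why the red line looks jagged compared with smoothers that down-weight distant points.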
### Polls_2008
The second example comes from [Rafael Irizarry's online book](http://rafalab.dfci.harvard.edu/dsbook/smoothing.html).
```{r}
library(dslabs)

span <- 7  # window width in days
polls_2008

fit <- with(polls_2008,
            ksmooth(day, margin, kernel = "box", bandwidth = span))

polls_2008 %>%
  mutate(smooth = fit$y) %>%
  ggplot(aes(day, margin)) +
  geom_point(size = 3, alpha = .5, color = "grey") +
  geom_line(aes(day, smooth), color = "red") + theme_minimal()
```
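The choice of `span` controls how smooth the fitted line is. As a rough illustration, the sketch below refits the same box-kernel smoother with three window widths and overlays the results. The values in `spans` and the table name `smooths` are arbitrary choices for this sketch, and `x.points = day` is added so that each fit is evaluated at the observed days.

```{r}
library(tidyverse)
library(dslabs)

spans <- c(3, 7, 21)  # illustrative window widths, in days

# One smoothed series per span, stacked into a long table
smooths <- spans %>%
  lapply(function(s) {
    fit <- with(polls_2008,
                ksmooth(day, margin, kernel = "box", bandwidth = s, x.points = day))
    tibble(day = fit$x, smooth = fit$y, span = s)
  }) %>%
  bind_rows()

ggplot(polls_2008, aes(day, margin)) +
  geom_point(size = 3, alpha = .5, color = "grey") +
  geom_line(data = smooths, aes(day, smooth, color = factor(span))) +
  labs(color = "span") +
  theme_minimal()
```

A narrow window follows individual polls closely, while a wide window averages over more of them and gives a flatter line.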