---
title: "Graphics with ggplot2"
author: "Eva-K"
date: "19 June 2019"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## R Markdown
This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see <http://rmarkdown.rstudio.com>.
When you click the **Knit** button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
```{r cars}
summary(cars)
```
## Including Plots
You can also embed plots, for example:
```{r pressure, echo=FALSE}
plot(pressure)
```
Note that the `echo = FALSE` parameter was added to the code chunk to prevent printing of the R code that generated the plot.
# Learning ggplot2
We're learning ggplot2. It's going to be amazing!
Even if your global options display plots in R Markdown, the plots are not pushed to GitHub along with the document.
Load tidyverse:
```{r tidyverse}
library(tidyverse)
```
Load some data from Github:
```{r data}
ohi_data <- read_csv("https://raw.githubusercontent.com/OHI-Science/data-science-training/master/data/OHI_global_data.csv")
```
Make some plots: (aes means aesthetics)
```{r}
ggplot(data = ohi_data, aes(x = georegion_one, y = OHI_score))
```
To actually plot some data you need a geom:
```{r}
ggplot(data = ohi_data, aes(x = georegion_one, y = OHI_score)) + geom_point()
```
Plots should now be turned off!
Use geom_jitter to spread the points out instead of having them all concentrated on a single vertical line.
```{r}
ggplot(data = ohi_data, aes(x = georegion_one, y = OHI_score)) + geom_jitter()
```
To reduce the horizontal spread, set `width` to a small fractional number:
```{r}
ggplot(data = ohi_data, aes(x = georegion_one, y = OHI_score)) + geom_jitter(width=0.2)
```
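By default `geom_jitter()` adds noise in both directions, so the plotted scores shift slightly too. A minimal sketch of jittering only sideways, using a small made-up data frame (`toy` and its values are illustrative assumptions, not the OHI data):

```{r}
library(ggplot2)

# Made-up illustration data (not the OHI data)
toy <- data.frame(
  region = rep(c("A", "B"), each = 5),
  score  = c(60, 62, 61, 63, 60, 70, 71, 69, 72, 70)
)

# width spreads the points sideways; height = 0 leaves the y values untouched
p_jitter <- ggplot(toy, aes(x = region, y = score)) +
  geom_jitter(width = 0.2, height = 0)
p_jitter
```

Setting `height = 0` is a reasonable choice when the y axis carries the measured value and only the x axis is categorical.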
To explore a bit more: how many data points are there for each value of `georegion_one` (the distribution across countries)?
```{r}
ggplot(data = ohi_data, aes(x = georegion_one)) + geom_bar()
```
Distribution of the Human D... Index (`HDI`) across the countries:
```{r}
ggplot(data = ohi_data, aes(x = HDI)) +
geom_histogram()
```
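`geom_histogram()` falls back to 30 bins when no bin size is given and prints a message suggesting a better choice. A small sketch of setting the bin width explicitly, on made-up values (`toy_hdi` is an illustrative assumption, not the OHI data):

```{r}
library(ggplot2)

# Made-up illustration values on a 0-1 scale (not the OHI data)
toy_hdi <- data.frame(
  HDI = c(0.35, 0.42, 0.48, 0.55, 0.61, 0.66, 0.72, 0.78, 0.85, 0.91)
)

# binwidth is given in the units of the x variable
p_hist <- ggplot(toy_hdi, aes(x = HDI)) +
  geom_histogram(binwidth = 0.1)
p_hist
```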
Import a second dataframe:
```{r}
ohi_summary <- read_csv("https://raw.githubusercontent.com/OHI-Science/data-science-training/master/data/OHI_scores_georegion_summary.csv")
ohi_summary
```
Layer multiple plots:
```{r}
ggplot(data = ohi_summary, aes(x = georegions, y = OHI_score_average)) +
geom_bar(stat="identity") +
geom_jitter(data=ohi_data, aes(x=georegion_one, y=OHI_score))
```
Any arguments given in the `ggplot()` call above are global, but if you specify different data or `aes()` in a geom, these will override the `ggplot()` settings for that layer.
```{r}
ggplot(data = ohi_data, aes(y=OHI_score, x = HDI, color=georegion_one)) +
geom_point()
```
```{r}
ggplot(data = ohi_data) +
geom_point(aes(y = OHI_score, x = HDI, color=georegion_one))
```
These plots look the same, but the latter (aesthetics set inside the geom) can cause problems downstream as more layers are added, because other layers do not inherit those aesthetics. However, the latter can be easier to understand and avoids mixing up which data and aesthetics are global and which are local.
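The difference between global and local aesthetics shows up once a second layer is added. A minimal sketch on made-up data (`toy`, `p_global` and `p_local` are illustrative names, not part of the OHI data):

```{r}
library(ggplot2)

# Made-up illustration data with two groups
toy <- data.frame(
  x     = c(1, 2, 3, 4, 1, 2, 3, 4),
  y     = c(1.0, 2.1, 2.9, 4.2, 4.0, 3.1, 2.2, 0.9),
  group = rep(c("a", "b"), each = 4)
)

# Global mapping: every layer inherits color, so one line is fitted per group
p_global <- ggplot(toy, aes(x = x, y = y, color = group)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Local mapping: only the points are colored; the smoother sees no grouping
p_local <- ggplot(toy, aes(x = x, y = y)) +
  geom_point(aes(color = group)) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

p_global
p_local
```

The first plot draws two fitted lines and the second draws one, even though the point layers look identical.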
Aesthetics: `x` and `y` can be assigned to variables in a dataset, and so can lots of other properties.
Anything placed inside `aes()` is treated as data, so it should refer to a variable in the dataset.
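One consequence is that a fixed value belongs outside `aes()`. A short sketch on made-up data (`toy` is an illustrative assumption): inside `aes()`, the string `"blue"` is treated as a one-level variable and gets the default palette color, while outside it sets the actual color.

```{r}
library(ggplot2)

# Made-up illustration data
toy <- data.frame(x = c(1, 2, 3), y = c(2, 4, 3))

# Mapped: "blue" is treated as a data value, so a legend appears
# and the points take the default discrete color, not blue
p_mapped <- ggplot(toy, aes(x = x, y = y)) +
  geom_point(aes(color = "blue"))

# Set: the points are drawn in blue and no legend is created
p_set <- ggplot(toy, aes(x = x, y = y)) +
  geom_point(color = "blue")

p_mapped
p_set
```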
Adding a third variable.
Size:
```{r}
ggplot(data = ohi_data, aes(x = OHI_score, y = HDI, size = coastal_pop)) +
geom_point()
```
Colour: (for a continuous variable)
```{r}
ggplot(data = ohi_data, aes(x = OHI_score, y = HDI, color = coastal_pop)) +
geom_point()
```
Colour: (for a discrete variable)
```{r}
ggplot(data = ohi_data, aes(x = OHI_score, y = HDI, color = georegion_one)) +
geom_point()
```
Change the shape of the points:
```{r}
ggplot(data = ohi_data, aes(x = OHI_score, y = HDI, shape = georegion_one)) +
geom_point()
```
Add labels:
```{r}
ggplot(data = ohi_data, aes(x = OHI_score, y = HDI, label=country)) +
geom_point(aes(x = OHI_score, y = HDI))
```
This doesn't show any labels, because `geom_point()` does not draw the `label` aesthetic. You have to add `geom_text()`:
```{r}
ggplot(data = ohi_data, aes(x = OHI_score, y = HDI, label=country)) +
geom_point(aes(x = OHI_score, y = HDI)) +
geom_text()
```
Messy!!
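One way to reduce the clutter is the `check_overlap` argument of `geom_text()`, which skips labels that would overlap ones already drawn. A minimal sketch on made-up data (`toy` and its country names are placeholders, not the OHI data):

```{r}
library(ggplot2)

# Made-up illustration data; the first two points sit almost on top of each other
toy <- data.frame(
  OHI_score = c(60, 60.2, 75),
  HDI       = c(0.70, 0.70, 0.90),
  country   = c("Country A", "Country B", "Country C")
)

# check_overlap = TRUE drops labels that would overlap an earlier label;
# vjust nudges the text above the points
p_labels <- ggplot(toy, aes(x = OHI_score, y = HDI, label = country)) +
  geom_point() +
  geom_text(check_overlap = TRUE, vjust = -0.8)
p_labels
```

Note that dropped labels are simply not shown, so this suits exploration better than a final figure where every country must be named.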
# Preloaded ggplot Themes
```{r}
ggplot(data = ohi_data, aes(x = OHI_score, y = HDI)) +
geom_point() +
theme_bw()
```
Custom themes can be stored on GitHub and loaded with `source()`. Try this one:
```{r}
source('https://raw.githubusercontent.com/OHI-Science/ohiprep/master/src/R/scatterTheme.txt')
ggplot(data = ohi_data, aes(x = OHI_score, y = HDI)) +
geom_point() +
scatterTheme
```
To change labels, use the `labs()` function:
```{r}
ggplot(data = ohi_data, aes(y = OHI_score, x = HDI, color=georegion_one)) +
geom_point() +
labs(y = "OHI score, 2017",
x = "Human Development Index",
title = "Countries with high human development have more sustainable oceans",
color = "Georegion") + # legend title; use fill = "Georegion" instead if the legend comes from the fill aesthetic
theme_bw()
```
Changes to plots:

- `color`: color of lines/points
- `fill`: color within polygons
- `label`: text label, if points are drawn as characters
- `linetype`: type of line
- `shape`: style of point
- `alpha`: transparency (0-1)
- `size`: size of shape

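
The split between `color` and `fill` is easiest to see on bars, where `color` draws the outline and `fill` the interior. A small sketch on made-up data (`toy` and the chosen colors are illustrative assumptions):

```{r}
library(ggplot2)

# Made-up illustration data
toy <- data.frame(region = c("A", "B", "C"), score = c(60, 70, 65))

# color = outline, fill = interior, alpha = transparency, linetype = outline style
p_bar <- ggplot(toy, aes(x = region, y = score)) +
  geom_col(color = "black", fill = "steelblue", alpha = 0.6, linetype = "dashed")
p_bar
```
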
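Colour palettes are available in the RColorBrewer package.
```{r, eval=FALSE}
# Run once in the console; installing on every knit is slow and can fail without a CRAN mirror
install.packages("RColorBrewer")
```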
```{r}
library(RColorBrewer)
```
```{r}
display.brewer.all()
```
```{r}
mypalette <- brewer.pal(n=9, "YlOrRd")
```
Adding a continuous colour scale:
```{r}
ggplot(data = ohi_data, aes(x = OHI_score, y = OHI_trend, color = HDI)) +
geom_point(size =3) +
scale_colour_gradientn(colors = mypalette)
```
Now using a discrete colour scale:
```{r}
mypalette <- brewer.pal(n=12, "Set3")
ggplot(data = ohi_data, aes(x = OHI_score, y = HDI, color = georegion_one)) +
geom_point(size = 3) +
scale_color_manual(values = mypalette)
```
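Saving plots (example only; `fake_data` is not defined in this document, so the chunk is not evaluated):
```{r, eval=FALSE}
library(ggthemes) # provides theme_tufte()
my_plot <- ggplot(data = fake_data, aes(x = as.factor(year), y = values, group = animal, color = animal)) +
  geom_point(size = 3) +
  geom_line(size = 2, alpha = 0.5) +
  labs(x = "year", color = "") +
  theme_tufte()
# width and height are in inches unless `units` is given
ggsave("name_of_file.png", my_plot, width = 15, height = 10, dpi = 300)
```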
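A runnable variant of the same idea, as a sketch only: the data frame `toy` is made up, the theme is left at the default, and the file is written to a temporary directory so nothing in the project is overwritten.

```{r}
library(ggplot2)

# Made-up illustration data standing in for fake_data
toy <- data.frame(
  year   = rep(2015:2017, times = 2),
  values = c(3, 4, 6, 5, 5, 7),
  animal = rep(c("cat", "dog"), each = 3)
)

my_plot <- ggplot(toy, aes(x = as.factor(year), y = values, group = animal, color = animal)) +
  geom_point(size = 3) +
  geom_line(alpha = 0.5) +
  labs(x = "year", color = "")

# Write to a temporary location; units = "cm" makes the size explicit
out_file <- file.path(tempdir(), "name_of_file.png")
ggsave(out_file, my_plot, width = 15, height = 10, units = "cm", dpi = 300)
```

Passing the plot object explicitly avoids relying on the last plot displayed, which is what `ggsave()` saves when no plot is given.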
Arranging plots:
```{r}
# install.packages("cowplot") # run once in the console, not on every knit
library(cowplot)
score_vs_trend <- ggplot(data=ohi_data, aes(x=OHI_score, y=OHI_trend)) +
geom_point(size=3, alpha=0.4)
score_vs_trend # notice that the default theme has been changed (cowplot versions before 1.0 do this on load). I really like this theme!
```
```{r}
score_vs_HDI <- ggplot(data=ohi_data, aes(x=OHI_score, y=HDI)) +
geom_point(size=3, alpha=0.4) +
geom_smooth()
plot_grid(score_vs_trend, score_vs_HDI, labels = c('A', 'B'))
```
Sweet!
# Data Wrangling