# Project setup
## Create a project
We recommend creating one directory for each project. All the data and files for that project will live in this one directory.
Ideally, you'll organize your files in this directory in a principled way. We've created a project template with our suggested folder organization. If you haven't already, install the dcl package:
```{r, eval=FALSE}
# install.packages("remotes")
remotes::install_github("stanford-datalab/dcl")
```
Now, you can run
```{r eval=FALSE}
dcl::create_data_project(path = "PATH/TO/PROJECT")
```
to create a directory at your supplied path with the following files and directories:
- **data**: cleaned data
- **data-raw**: raw data
- **docs**: data documentation and notes
- **eda**: exploratory data analysis on your cleaned data
- **scripts**: data manipulation scripts
- **reports**: findings to present to others
- **Makefile**
- **.gitignore**
- **README.md**
We'll discuss how to use these directories and files in the next chapter.
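To make the layout above concrete, here is a minimal base R sketch that builds the same set of folders and empty placeholder files. The helper name `sketch_data_project()` is hypothetical, and this is not how `dcl::create_data_project()` is implemented. It only illustrates the resulting structure, assuming you want nothing more than empty directories and files.

```{r eval=FALSE}
# Hypothetical helper: builds the folder layout listed above with base R only.
# This is NOT the implementation of dcl::create_data_project().
sketch_data_project <- function(path) {
  dirs <- c("data", "data-raw", "docs", "eda", "scripts", "reports")
  files <- c("Makefile", ".gitignore", "README.md")

  # Create the project directory and each subdirectory
  for (d in file.path(path, dirs)) {
    dir.create(d, recursive = TRUE, showWarnings = FALSE)
  }

  # Create empty top-level files, skipping any that already exist
  new_files <- file.path(path, files)
  file.create(new_files[!file.exists(new_files)])

  # Return the top-level contents, including hidden files such as .gitignore
  sort(list.files(path, all.files = TRUE, no.. = TRUE))
}

# Try it out in a temporary location
project_path <- file.path(tempdir(), "example-project")
sketch_data_project(project_path)
```

Unlike the real function, this sketch leaves the Makefile, .gitignore, and README.md empty and does not create an RStudio project.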
By default, `dcl::create_data_project()` creates an RStudio project for the directory. If you don't want to create an RStudio project, set the `project` argument to `FALSE`:
```{r eval=FALSE}
dcl::create_data_project(path = "PATH/TO/PROJECT", project = FALSE)
```
Note that it's generally a bad idea to nest RStudio projects. If you find yourself wanting to use our folder organization inside a different RStudio project, you'll probably want `project = FALSE`.
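If you're not sure whether a new project would end up nested inside an existing RStudio project, you can check the parent directories for .Rproj files first. The following is a minimal sketch in base R. The helper name `find_parent_rproj()` is hypothetical, and it assumes the directory that will contain your new project already exists.

```{r eval=FALSE}
# Hypothetical helper: returns the paths of any .Rproj files found in the
# directories above `path`. Assumes the parent directory of `path` exists.
find_parent_rproj <- function(path) {
  # Start from the directory that would contain the new project
  current <- normalizePath(dirname(path), mustWork = TRUE)
  found <- character(0)

  repeat {
    found <- c(
      found,
      list.files(current, pattern = "\\.Rproj$", full.names = TRUE)
    )
    parent <- dirname(current)
    # Stop once we reach the filesystem root
    if (identical(parent, current)) break
    current <- parent
  }

  found
}

# An empty result means no enclosing RStudio project was found
parent_projects <- find_parent_rproj(file.path(tempdir(), "new-project"))
length(parent_projects) == 0
```

If the result is not empty, that's a sign you probably want `project = FALSE`.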
## Set up GitHub
We recommend using GitHub for all your data work. Generally, you'll want one repository per project.
Here, we'll explain how to set up Git and GitHub for your new project. The following steps only work if you kept `project = TRUE` (the default) in the previous section, because they require a .Rproj file. If you chose not to create an RStudio project, though, you likely don't want a GitHub repository either.
### GitHub token
You will need a GitHub personal access token to set up Git and GitHub from RStudio. Open [GitHub](https://github.com/) in your browser. Then:
- Click on your profile picture in the upper right-hand corner, then click on *Settings*.
```{r echo=FALSE}
knitr::include_graphics(
"images/project-workflow/github-pat-1.png",
dpi = image_dpi_small
)
```
- Then, go to *Developer settings* \> *Personal access tokens* \> *Tokens (classic)*.
- Click *Generate new token*.
```{r echo=FALSE}
knitr::include_graphics(
"images/project-workflow/github-pat-2.png",
dpi = image_dpi_small
)
```
- Name your token something like *RStudio* or *R*. Under *Scopes*, select *repo*. (You can select other scopes if you anticipate using the GitHub API in more scenarios.)
```{r echo=FALSE}
knitr::include_graphics(
"images/project-workflow/github-pat-3.png",
dpi = image_dpi_small
)
```
- Scroll down to the bottom, then click *Generate token*.
```{r echo=FALSE}
knitr::include_graphics(
"images/project-workflow/github-pat-4.png",
dpi = image_dpi_small
)
```
- Copy the resulting token to your clipboard and return to RStudio.
- From the RStudio console, open your .Renviron file with
```{r eval=FALSE}
usethis::edit_r_environ()
```
- Add the following line to your .Renviron file, replacing `YOUR_TOKEN` with the token you copied earlier.
```{r eval=FALSE}
GITHUB_PAT=YOUR_TOKEN
```
- Save the file.
- Restart R (*Cmd* + *Shift* + *0* on macOS, *Ctrl* + *Shift* + *F10* on Windows and Linux).
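
After restarting, you may want to confirm that R can see the token. The sketch below uses only base R. The helper name `github_pat_is_set()` is hypothetical, and it checks only that the `GITHUB_PAT` environment variable is non-empty, not that GitHub accepts the token.

```{r eval=FALSE}
# Hypothetical helper: TRUE if the GITHUB_PAT environment variable is set
# to a non-empty value in the current R session.
# It deliberately does not print the token itself.
github_pat_is_set <- function() {
  nzchar(Sys.getenv("GITHUB_PAT"))
}

github_pat_is_set()
```

If this returns `FALSE`, check that you saved your .Renviron file and restarted R.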
### `use_git()`
Now, we can create a Git repository for your project.
If you haven't already, open your project in RStudio. Then, in the console, run
```{r eval=FALSE}
usethis::use_git()
```
`use_git()` will set up a Git repository for your project, then ask you if you want to make an initial commit:
```{r echo=FALSE}
knitr::include_graphics(
"images/project-workflow/use-git-1.png",
dpi = image_dpi_small
)
```
Enter the number that corresponds to the *Yes* option. Here, that's `3`, but it might be different for you.
Next, you'll be prompted to restart RStudio. Select the *Yes* option.
### `use_github()`
`use_git()` initializes a Git repository, but you'll still need to connect that repository to GitHub. To do so, run
```{r eval=FALSE}
usethis::use_github()
```
Note that `use_github()` has multiple optional arguments that allow you to, for example, create the repository under an organization or make the repository private.
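For example, the sketch below uses the `organisation` and `private` arguments of `usethis::use_github()` to create a private repository under an organization. The wrapper name `create_private_org_repo()` and the organization name `"my-org"` are placeholders for illustration. The call is wrapped in a function so that nothing is created on GitHub until you call it yourself.

```{r eval=FALSE}
# Hypothetical wrapper: creates a private GitHub repository for the current
# project under the organization `org`. Note the British spelling of the
# `organisation` argument.
create_private_org_repo <- function(org) {
  usethis::use_github(organisation = org, private = TRUE)
}

# Replace "my-org" with an organization you are allowed to create
# repositories in, then uncomment:
# create_private_org_repo("my-org")
```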
You'll be prompted for a Git protocol. You'll probably want SSH.
```{r echo=FALSE}
knitr::include_graphics(
"images/project-workflow/use-github-1.png",
dpi = image_dpi_small
)
```
Next, you'll be prompted to verify the repository name and description. Say *Yes* unless you're unhappy with them (you can always change them later).
You might get the following error:
```{r echo=FALSE}
knitr::include_graphics(
"images/project-workflow/use-github-2.png",
dpi = image_dpi_small
)
```
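- If so, copy the recommended command. It will look something like the following, although your branch may be named `main` rather than `master`.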
```{bash eval=FALSE}
git push --set-upstream origin master
```
- Then, open Terminal, navigate to your project directory, paste in your copied command, and press Enter. This command sets the upstream branch, which is the default location that `git push` will push to from now on.
- Earlier, `use_github()` opened your GitHub repository in the browser. To see the result of your push, refresh the page. Your files should appear.
Your setup is complete! In the next chapter, we'll go into more detail about how to use the folders and Makefile created by `dcl::create_data_project()`.