Coya Data Engineering challenge

We have a small task for you. At Coya, we want to collect public data to assess the plausibility of claims. The data team has found the following dataset:

https://data.sfgov.org/City-Infrastructure/Case-Data-from-San-Francisco-311-SF311-/vw6y-z8j6

First mini-task

Create a document (or edit this one) telling us your ideas on the following:

As a general concept, how would you design a pipeline that extracts this dataset? And how would you extend the pipeline to include further datasets from other data sources?

Second mini-task:

Please create a procedure to achieve the following: evaluate the daily trend of the Damage Property category in particular, and the distribution of cases across categories. What data quality considerations should be taken into account?
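As a rough illustration of what such a procedure might look like, the sketch below counts cases per day for one category, tallies the overall category distribution, and records a few basic quality issues along the way. The column names (`CaseID`, `Opened`, `Category`), the timestamp format, the category label `Damaged Property`, and the inline sample rows are all assumptions made for the example and should be checked against the actual export.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Made-up sample rows; column names and timestamp format are assumptions.
SAMPLE = """CaseID,Opened,Category
1,01/02/2019 08:15:00 AM,Damaged Property
2,01/02/2019 09:30:00 PM,Street and Sidewalk Cleaning
3,01/03/2019 10:00:00 AM,Damaged Property
3,01/03/2019 10:00:00 AM,Damaged Property
4,,Damaged Property
5,01/03/2019 11:45:00 AM,
"""


def summarize(rows, target="Damaged Property"):
    """Return (daily counts for target, category counts, quality issue counts)."""
    seen = set()
    daily = Counter()
    categories = Counter()
    issues = Counter()
    for row in rows:
        case_id = row.get("CaseID", "")
        if case_id in seen:  # same case exported more than once
            issues["duplicate_case_id"] += 1
            continue
        seen.add(case_id)
        category = (row.get("Category") or "").strip()
        if not category:
            issues["missing_category"] += 1
            continue
        try:
            opened = datetime.strptime(row.get("Opened") or "", "%m/%d/%Y %I:%M:%S %p")
        except ValueError:  # empty or malformed timestamp
            issues["bad_opened_timestamp"] += 1
            continue
        categories[category] += 1
        if category == target:
            daily[opened.date().isoformat()] += 1
    return daily, categories, issues


daily, categories, issues = summarize(csv.DictReader(io.StringIO(SAMPLE)))
for day in sorted(daily):
    print(day, daily[day])
```

Rows that fail a check are counted in `issues` rather than silently dropped, so the share of unusable data stays visible next to the trend itself.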

Deliverable:

Please include everything as commits in a Git repository starting from this one. Don't worry too much about making it all nice or perfect; we'll discuss it with you later. Please send us back this repository as a Git archive or a link to a Git repository.

Good luck!

