Potential Wikipedia City Population Typos #1

@jamesfeigenbaum

Not sure if this is the right place for this, Ben, but over the years I've collected (from PDFs of census city population tables) city populations for 1890 to 1940 for a decent set of cities. Of the 24,711 city x year observations that both you and I have data for, only 1,325 disagree. The disagreements are some combination of data entry errors on my part, data entry errors in Wikipedia, and just weird or bad merging of city names (I did this quick and dirty, so city names showing up multiple times in a state might be an issue). Plus, a bunch of CT cities are listed as cities and towns in the raw PDF, and I think I punched in a different row than what is on Wikipedia.
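
For anyone who wants to reproduce this kind of check, here is a minimal Python sketch of the comparison described above. It is not the code that produced the numbers in this issue. The function name `find_disagreements`, the `(state, city, year, population)` record layout, and the sample rows are all hypothetical, and the populations are made-up placeholders. It matches records on a normalized state, city, and year key, sets aside keys that appear more than once in either source (the repeated-city-name problem), and reports the matched keys whose populations differ.

```python
from collections import Counter


def find_disagreements(mine, wiki):
    """Compare two lists of (state, city, year, population) records.

    Returns (disagreements, ambiguous), where disagreements is a sorted list
    of (key, my_population, wiki_population) and ambiguous is the set of keys
    that occur more than once in either source and so cannot be matched safely.
    """

    def key(rec):
        state, city, year, _population = rec
        # Assumed normalization: trim whitespace and ignore case differences.
        return (state.strip().upper(), city.strip().lower(), int(year))

    counts_mine = Counter(key(r) for r in mine)
    counts_wiki = Counter(key(r) for r in wiki)
    ambiguous = {k for k, n in counts_mine.items() if n > 1}
    ambiguous |= {k for k, n in counts_wiki.items() if n > 1}

    # Index each source by key, skipping keys that are ambiguous in either one.
    mine_by_key = {key(r): r[3] for r in mine if key(r) not in ambiguous}
    wiki_by_key = {key(r): r[3] for r in wiki if key(r) not in ambiguous}

    disagreements = []
    # Only keys present in both sources can agree or disagree.
    for k in sorted(mine_by_key.keys() & wiki_by_key.keys()):
        if mine_by_key[k] != wiki_by_key[k]:
            disagreements.append((k, mine_by_key[k], wiki_by_key[k]))
    return disagreements, ambiguous


# Hypothetical placeholder rows, not real census or Wikipedia figures.
sample_mine = [
    ("XX", "Springfield", 1910, 1000),
    ("XX", "Springfield", 1920, 1200),
    ("XX", "Dupville", 1910, 5),
    ("XX", "Dupville", 1910, 6),  # same name twice in one state
]
sample_wiki = [
    ("xx", " springfield ", 1910, 1000),
    ("XX", "Springfield", 1920, 1250),
    ("XX", "Dupville", 1910, 5),
]

disagreements, ambiguous = find_disagreements(sample_mine, sample_wiki)
```

Setting ambiguous keys aside instead of guessing which row is the right one keeps merge artifacts out of the disagreement list, at the cost of not checking those cities at all.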

The list of disagreements is here: https://www.dropbox.com/s/w8wisqt27mir2hh/wiki_edits.csv?dl=0 and links to the raw PDFs are below. I wonder if we could get some interested (and compulsive) Wikipedia editors to correct as many of the Wikipedia city tables as possible (some fraction of the 1,325, but I'm not sure what %). My understanding of this project is that any edits on Wikipedia will eventually flow through to this data, right?

Raw PDFs

1910 pdf with city populations: https://www.dropbox.com/s/4vfuwzkh3hmysfp/census_1910.pdf?dl=0

1930 and 1940 pdfs with city populations: https://www.dropbox.com/sh/ia56uz1bs13oaep/AADjHoxKJ1N3WS5vkNEGGoRla?dl=0
