Count number of authors #13

@nichtich

The number of identified authors can be counted with P50:

$ zcat 20171120/wikidata-20171120-publications.ndjson.gz | \
    jq -r '.claims.P50[]?' | sort -u | wc -l
120821

This takes 7 minutes to run on my machine. Indexing the whole dataset in a database should be faster and more flexible for additional analytics. For instance, the number of identified author statements (P50 values counted without deduplication) is:

$ zcat 20171120/wikidata-20171120-publications.ndjson.gz | jq -r '.claims.P50[]?' | wc -l
974191

The number of unidentified author statements with P2093 can be counted in the same way:

$ zcat 20171120/wikidata-20171120-publications.ndjson.gz | jq -r '.claims.P2093[]?' | wc -l
43206518
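
The database idea could be tried with something like the following sketch, which loads the P50 and P2093 values into SQLite once so that all three counts come from a single query. It assumes each line of the dump is a JSON object with a `claims` object mapping property ids to arrays of plain values (as the jq filters above imply) and an `id` field. The `id` field, the table layout and the function names `index_authors` and `author_counts` are illustrative assumptions, not part of the dataset or the project.

```python
import gzip
import json
import sqlite3


def index_authors(ndjson_gz_path, db_path=":memory:"):
    """Load P50 and P2093 values into SQLite, one row per statement."""
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS author (item TEXT, property TEXT, value TEXT)"
    )
    with gzip.open(ndjson_gz_path, "rt", encoding="utf-8") as lines:
        for line in lines:
            if not line.strip():
                continue  # skip blank lines
            record = json.loads(line)
            claims = record.get("claims") or {}
            rows = []
            for prop in ("P50", "P2093"):
                values = claims.get(prop)
                if not isinstance(values, list):
                    continue  # property missing or not an array (assumed layout)
                for value in values:
                    # Assumption: values are plain strings; anything else is
                    # stored as its JSON text so it can still be compared.
                    text = value if isinstance(value, str) else json.dumps(value, sort_keys=True)
                    rows.append((record.get("id"), prop, text))
            db.executemany("INSERT INTO author VALUES (?, ?, ?)", rows)
    db.execute(
        "CREATE INDEX IF NOT EXISTS author_property_value ON author (property, value)"
    )
    db.commit()
    return db


def author_counts(db):
    """Return (distinct P50 values, P50 statements, P2093 statements)."""
    return db.execute(
        """
        SELECT COUNT(DISTINCT CASE WHEN property = 'P50' THEN value END),
               COUNT(CASE WHEN property = 'P50' THEN 1 END),
               COUNT(CASE WHEN property = 'P2093' THEN 1 END)
        FROM author
        """
    ).fetchone()
```

Calling `index_authors("20171120/wikidata-20171120-publications.ndjson.gz", "authors.sqlite")` followed by `author_counts(db)` should reproduce the three numbers above if the assumptions about the record layout hold. Passing a file path instead of the default `:memory:` keeps the index around for later queries.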
