This code is currently under development. The goal is to bring together APIs and scrapers for working with publication websites and related material.
When I have some functional examples of this in action, I'll add them!
Roadmap:
- expose Pubmed (started)
- CrossRef
- Elsevier API
- other publishers?
- Mendeley Client
Currently, Pypub supports information retrieval from ScienceDirect, Springer, Wiley Online, and Nature Reviews Genetics. Taylor & Francis recently moved to a new article page format, so the corresponding scraper needs to be updated before it works again.
The easiest way to use this repo is the top-level function `get_paper_info`. It takes two optional keyword arguments, `doi` and `url`, and can be called as `paper_info = pypub.get_paper_info(doi='enter_doi_here', url='or_enter_url_here')`.
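As a rough illustration of how a dispatcher like this could resolve those two keyword arguments, here is a minimal self-contained sketch. The DOI-to-URL resolution via the dx.doi.org redirect service is an assumption for illustration, not necessarily how pypub actually implements it:

```python
def get_paper_info(url=None, doi=None):
    # Hypothetical sketch of the dispatch logic; pypub's real function
    # scrapes the resolved page and returns a PaperInfo object.
    if url is None and doi is None:
        raise ValueError("Either url or doi must be provided")
    if url is None:
        # DOIs can be resolved through the dx.doi.org redirect service
        url = 'https://dx.doi.org/' + doi
    return url  # the real function would go on to scrape this URL

print(get_paper_info(doi='10.1000/example'))
# the real call would instead be: pypub.get_paper_info(doi='10.1000/example')
```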
The `get_paper_info` function returns a `PaperInfo` object, the details of which can be found in paper_info.py. A `PaperInfo` object has three main attributes of interest: `entry`, `references`, and `pdf_link`. `entry` contains all descriptive information about the paper, such as title, authors, journal, year, etc. `references` is a list of the paper's cited references, and `pdf_link` is a string giving the direct link to the paper PDF, if it could be retrieved.
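The shape described above can be sketched roughly like this; the field types here are assumptions for illustration (the real class lives in paper_info.py and may differ):

```python
from dataclasses import dataclass, field

@dataclass
class PaperInfo:
    # Hypothetical sketch of the PaperInfo attributes described above.
    entry: dict = field(default_factory=dict)   # title, authors, journal, year, ...
    references: list = field(default_factory=list)
    pdf_link: str = None  # direct PDF URL, or None if not retrievable

info = PaperInfo(entry={'title': 'An Example Paper', 'year': '2015'})
print(info.entry['title'], info.pdf_link)
```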
Within the scrapers/base_objects.py file, there are several base classes that each publisher inherits from to return information. The `entry` attribute will be a `[Publisher]Entry` class, which inherits from `BaseEntry`. Similarly, the `references` attribute is a list of `[Publisher]Ref` class instances that inherit from `BaseRef`.
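The inheritance pattern might look something like the following sketch. The field names and the `WileyEntry` subclass are illustrative assumptions; see scrapers/base_objects.py for the real classes:

```python
class BaseEntry:
    # Fields shared by every publisher's entry (hypothetical sketch).
    def __init__(self):
        self.title = None
        self.authors = []

class WileyEntry(BaseEntry):
    # A publisher-specific subclass fills in the shared fields by
    # parsing that publisher's particular page layout.
    def __init__(self):
        super().__init__()
        self.publisher = 'Wiley'
```

This keeps per-publisher parsing quirks isolated in the subclass while downstream code only relies on the shared `BaseEntry` interface.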
Documentation standards: I'm trying to follow the NumPy docstring guide: https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt
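For reference, a function documented in that NumPy style looks like this (the function itself is a made-up example, not part of the codebase):

```python
def resolve_doi(doi):
    """
    Resolve a DOI to its redirect URL.

    Parameters
    ----------
    doi : str
        Digital Object Identifier, e.g. '10.1000/example'.

    Returns
    -------
    str
        URL that the dx.doi.org resolver would redirect to.
    """
    return 'https://dx.doi.org/' + doi
```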
Encapsulation! Encapsulation! Encapsulation! Ideally each module should have a well-defined purpose and should not operate on data that is not its own.
Within the tests folder, there are separate test modules for each of the scrapers. They are written for nosetests.
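Nose collects plain functions whose names start with `test_`, so a scraper test module can be as simple as the following sketch (`fake_scrape` is a stand-in for a real scraper call; an actual test would parse a saved or live article page):

```python
def fake_scrape(url):
    # Stand-in for a publisher scraper; returns entry-like data.
    return {'title': 'An Example Paper'}

def test_entry_has_title():
    # nosetests discovers and runs functions named test_*
    entry = fake_scrape('http://example.com/article')
    assert entry['title'] == 'An Example Paper'
```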