pyOpenSci Software Peer Review Goals#

Python is a flexible programming language that is used across numerous disciplines and domains. It is flexible enough that it is commonly used to wrap code written in other languages.

If you are building a pure Python package, your setup may be simple. However, some scientific packages have complex requirements as they may need to support extensions or tools written in other languages such as C or C++.

To support the myriad uses of Python, there are many ways to create a Python package.

Python supports extensions written in other languages

The spatial data tool stack is a common example: its packages often have complex packaging requirements because they rely on tools like GDAL to support spatial operations.

What makes Python unique and valuable can also make packaging complex#

The diversity of packaging options can be confusing, particularly if you are new to Python. However, we are working on a Python packaging guide that will make the packaging landscape far easier to navigate.

Working towards standardized packaging practices in the Scientific Python community#

While there is no single solution to the diverse needs of developers in the Python scientific community, pyOpenSci strives to encourage a standard approach to packaging through tutorials, documentation and guides as well as its peer review process. We do this by:

  1. Following and encouraging Python packaging best practices that align with modern Python Enhancement Proposals (PEPs). PEPs are standards written for the broader Python community to follow.

  2. Reinforcing best practices accepted by the scientific community. This community most often develops packages that are not pure Python. Thus, the scientific community has additional layers of complexity in their tool builds that we need to consider.

  3. Enforcing documentation best practices in our reviews that support both usability and accessibility. Great documentation is critical for a package to gain more users from varying backgrounds.

As such, pyOpenSci embraces community-driven standards created by organizations such as Scientific Python.

We also endeavor to help maintainers use a similar infrastructure for their packages. In the long term, consistent infrastructure and packaging approaches will:

  • Make it easier for those who are new to packaging to get started (and in turn push open science forward),

  • Make it easier for new contributors to participate given similar infrastructure setup across the ecosystem.

Where it makes sense, we recommend (but do not require) packaging approaches implemented by existing packages in the scientific Python ecosystem.

Specific pyOpenSci goals#

In addition to the broader goals laid out above, our specific goals for open peer review of Python packages are as follows.

1. A catalog of vetted, maintained tools#

We hope that scientists will look to our online catalog of maintained and vetted tools to help them find tools that they can depend on. Over time as our catalog grows, this will reduce the need for searching and sorting through the many packages in various states of maintenance on PyPI.

Through our peer review, we also help maintainers with outreach about their packages through our website, blog and social media presence. Currently, the most popular content on our website is the blog posts that maintainers have written about their tools.

2. Improved package usability through documentation#

Clear and useful documentation makes it easier for scientists to use your tools. During our reviews, we look carefully at:

  • documentation,

  • quick start tutorials and code examples to help users get started,

  • and code documentation (docstrings)

to ensure that the package is user friendly. If you need help with documentation, we also have an entire section of our packaging guide devoted to documentation.
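As a rough illustration of the kind of code documentation we mean, here is a minimal sketch of a documented function. It assumes the NumPy docstring style, which is one common convention in the scientific Python ecosystem; the function name `celsius_to_fahrenheit` is hypothetical and used only for this example.

```python
def celsius_to_fahrenheit(temp_c):
    """Convert a temperature from Celsius to Fahrenheit.

    Parameters
    ----------
    temp_c : float
        Temperature in degrees Celsius.

    Returns
    -------
    float
        Temperature in degrees Fahrenheit.

    Examples
    --------
    >>> celsius_to_fahrenheit(100.0)
    212.0
    """
    # Standard conversion formula: F = C * 9/5 + 32
    return temp_c * 9 / 5 + 32
```

A docstring like this tells users what the function expects and returns, and the short example doubles as a quick start snippet that they can copy and run.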

3. Reduce the number of packages with overlapping functionality#

It is easy to find multiple packages on PyPI that perform the same (or overlapping) tasks. When you submit a package to us, we ask that you cross-reference other packages in our system that serve similar purposes. This check will help us connect maintainers who are working towards similar functionality goals. In the long term, it will also help us reduce the number of overlapping packages in various states of maintenance on PyPI.

In some cases, just adding information to your README.md about how your package differs from others in the ecosystem is extremely valuable to users.

4. Support long term maintenance of vetted tools#

Most maintainers are not remunerated for their efforts and thus work on tools in their free time. So what happens when a maintainer of a commonly used tool wants to step down? We help maintainers identify someone new to take over the reins. If that doesn’t make sense, we will help the maintainer gracefully sunset the package.

One key element in this process is a clear development guide that outlines the tools and workflows that you are using to build and maintain your package.

5. Support implementation of standards and best practices#

Where possible, we encourage and help maintainers to update and enhance their package infrastructure. One example is moving package metadata out of the setup.py file and into the modern pyproject.toml standard format.

6. Build a community support system for maintainers#

Many packages that are commonly used by scientists are being maintained by a single person. As devoted as this single maintainer may be, it is always easier to work when you have community support.

Our online community, both on Discourse and more privately on Slack, is available for maintainers to interact with each other, ask questions and get support. Core to our mission is building a welcoming and diverse community for people with differing technical backgrounds and skills as well as varying cultural backgrounds.