Guidelines for healthy use of Open Source

Introduction

The purpose of this document is to provide concrete and actionable guidelines, hints and tips on how to achieve healthy use of third-party libraries and frameworks in your application. This covers how to get started, how to stay in control, and how to act when health deteriorates. Where applicable, this document explains how Sigrid can be used to achieve this.

Structure and overview

The guidelines and best practices have been structured based on the actual tasks that stakeholders need to conduct (also called ‘jobs to be done’). Jobs can embed, or refer to, other jobs.

The jobs have been grouped in three categories. First, some general guidelines for healthy open source usage in development, which help you get started and may need to be revisited or updated on a regular basis. Second, the various types of tasks that are needed to ensure continued health of libraries (note that the health of open source libraries may diminish even when the application code is not actively maintained for a while). Third, a set of practical tasks for handling libraries as a developer:

Be equipped for healthy open source usage

1) Define OSH-related policies
2) How to improve portfolio and system-level OSH
3) General guidelines for your application development

Ensuring your open source stays healthy

4) Scan the software for health issues
5) Handling vulnerabilities
6) Handling license issues
7) Handling lack of freshness
8) Handling lack of activity

Handling libraries

9) Updating a library
10) Selecting a new library
11) When a library does not meet requirements
12) Replacing a library
13) Reviewing a library

A word of caution: The guidelines and steps that follow are intended to be helpful in making decisions and taking proper actions; since every application context is unique, these guidelines and steps should never replace logical thinking, taking your unique situation into account!

About the SIG Open Source Health (OSH) model

The SIG Open Source Health model is described here in the documentation: OSH guidance for producers.

Be equipped for healthy open source usage

This section prescribes a typical way of working for ensuring healthy open source usage. In specific situations, you may adapt this approach, but it is best to follow a comply-or-explain approach.

1. Define OSH-related policies

There are a number of policies on how to address open source libraries during development. For most of these policies, minimal requirements should be set for all teams. Individual teams may agree on more stringent rules.

Policy I: Define the usage of a package manager

Choose the package manager(s) to be used, at least per system, preferably shared across the organization. Depending on the technologies that are used, you may need multiple package managers.

The package managers need to be integrated in your CI/CD pipeline.

Policy II: Set the thresholds for library risks

Set the thresholds for library risks that are (not) acceptable: this is applicable to all types of risks. Set these goals in the Sigrid objectives.

SIG advises the following objectives:

  • No library vulnerabilities: having vulnerabilities of medium or higher risk is generally not acceptable as a goal, and since there are relatively few low-risk vulnerabilities in practice, a ‘clean sweep’ of all vulnerabilities is preferred.
  • No unacceptable licenses: for a typical context this means no licenses that come with obligations or restrictions for commercial usage (see the OSH Guidelines for producers for more details). Licenses without such obligations are classified in Sigrid as no-risk, and include the MIT, BSD, and Apache licenses.
  • Ensure overall OSH quality rating is 4.0 stars or more.

Policy III: Define how frequently to check for risks

Preferably check daily for vulnerabilities and quarterly for other OSH risks. See section 4. Scan the software for health issues for more details.

Policy IV: Define how fast new vulnerabilities have to be resolved

This will depend on the criticality. See the section on Handling vulnerabilities for details.

Policy V: Declare which libraries should not be checked

Do note that if a library is put on the ignore-list since the reported vulnerability is a false positive, that does not necessarily mean that the other types of risk for that library should be ignored as well.

Policy VI: (Optional) Define a shared permitted-list

A shared list of libraries that have been reviewed and approved for usage can be useful (and in some organizations required).

2. How to improve portfolio and system-level OSH

Especially when a system or portfolio is new in Sigrid, there can be an abundance of OSH-related issues to fix at the system and portfolio level. This section provides some advice on how to tackle those jobs incrementally, starting with the most critical and highest-ROI topics. Sigrid is designed specifically to help you focus on the highest-priority issues.

When many libraries require (major) version updates, the level of test coverage of a system can be used as an additional factor for prioritization: systems with high test coverage have a lower risk of running into defects due to incompatible updates.

3. General guidelines for your application development

There are a number of topics to consider that are not directly related to the libraries themselves, but to the way you organize the development of the application itself. The following guidelines should be considered as compliance rules for framework and library management:

Keep application source code separate from frameworks/libraries.

  1. Do not change the source code of used frameworks/libraries: depending on the technology used, you often do not need source code at all, but will use binaries of the libraries. Changing the source code prevents you from updating later on. In effect, you will have taken on maintenance of the entire library. If you want to fix a bug or add a feature to a library, try to contribute them to the open-source project directly so that anyone can benefit.
  2. Only a single version of each library or framework should be used directly. Also, do not have multiple copies of the same library installed. It may well be that one or more of your libraries imports another version of a library that your application uses; such indirect use is mostly out of scope.

Regression tests and maintainability of the application code are key to updating frameworks/libraries

If it is hard to update a library, chances are the problem lies in your codebase.

  1. Keep module coupling low and component independence high, to make it easier to change code implementation (such as dealing with new versions of a library) while keeping the same behavior/requirements.
  2. Develop, maintain and run regression tests. These help to identify breaking changes in updates.
  3. Create an abstraction layer between the dependency and your code if you can foresee needing to replace it in the future. This isolates changes coming from a library update, and also makes it easier to replace the library completely.
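
The abstraction layer from guideline 3 can be sketched as follows. This is a minimal illustration, not a prescribed design: the names `Serializer`, `JsonSerializer` and `save_settings` are hypothetical, and the standard-library `json` module stands in for a third-party dependency.

```python
import json
from typing import Any, Protocol


class Serializer(Protocol):
    """Interface the application code depends on (hypothetical name)."""

    def dumps(self, data: Any) -> str: ...

    def loads(self, text: str) -> Any: ...


class JsonSerializer:
    """The only class that touches the underlying library.

    The standard-library json module stands in for a third-party dependency.
    """

    def dumps(self, data: Any) -> str:
        return json.dumps(data, sort_keys=True)

    def loads(self, text: str) -> Any:
        return json.loads(text)


def save_settings(settings: dict, serializer: Serializer) -> str:
    """Application code: knows the interface, not the library."""
    return serializer.dumps(settings)
```

With this layout, a library update or replacement should only require changes inside the adapter class, provided the interface itself can stay the same.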

Ensuring your open source stays healthy

This section describes guidelines, hints and tips on how to maintain healthy use of open source libraries in your application. Where applicable, we explain how Sigrid can be used to achieve this.

4. Scan the software for health issues

For timely handling of open source health risks, there are two concerns:

A good time to triage scan results is during refinement for the next sprint. You need to decide whether to address the detected risks during the upcoming sprint or to create a backlog item. In some cases, the detected risk is considered a false positive, or an acceptable risk that can be ignored. The most common mitigation will be updating a library.

5. Handling vulnerabilities

When to remediate vulnerabilities

Security risks, and hence the urgency of fixing a vulnerability, of a certain framework or library should be determined based on at least the following aspects:

Additional considerations for prioritizing vulnerability handling can also be business criticality, lifecycle phase and the privacy sensitivity of the data that an application handles.

The table below is a proposal for how fast you should resolve vulnerabilities, depending on the risk level and the connectedness of the system:

CVSSv3 Range | Risk Label | Remediation Deadline (public facing) | Remediation Deadline (not public facing)
9.0 – 10.0   | Critical   | Within 1 working day                 | Within 14 days
7.0 – 8.9    | High       | Within 14 days                       | Within 30 days
4.0 – 6.9    | Medium     | Within 30 days                       | Within 60 days
0.1 – 3.9    | Low        | Within 60 days                       | Within 90 days
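
As an illustration, the table above can be expressed as a small lookup. This is a sketch only: the function name `remediation_deadline` is hypothetical, the values are copied from the table, and a score of 0.0 (which the table does not cover) is treated here as "no deadline".

```python
from typing import Optional, Tuple

# (lower CVSS bound, risk label, deadline if public facing, deadline otherwise)
# Values copied from the table above, ordered from highest to lowest risk.
_DEADLINES = [
    (9.0, "Critical", "within 1 working day", "within 14 days"),
    (7.0, "High", "within 14 days", "within 30 days"),
    (4.0, "Medium", "within 30 days", "within 60 days"),
    (0.1, "Low", "within 60 days", "within 90 days"),
]


def remediation_deadline(cvss: float, public_facing: bool) -> Optional[Tuple[str, str]]:
    """Return (risk label, deadline), or None for a score below 0.1."""
    if not 0.0 <= cvss <= 10.0:
        raise ValueError("CVSS v3 scores range from 0.0 to 10.0")
    for lower, label, public, internal in _DEADLINES:
        if cvss >= lower:
            return (label, public if public_facing else internal)
    return None  # Assumption: the table defines no deadline for 0.0.
```

For example, `remediation_deadline(7.5, public_facing=False)` yields the "High" label with the 30-day deadline.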

How to remediate vulnerabilities

The primary means of remediating a vulnerability is to update the library: in most cases, vulnerabilities (especially critical ones) are only published once a patch is available in a new version of the library. See section 9. Updating a library for more details. Do check that the vulnerability is indeed solved in the newer version of the library.

If no such remediation is available, do a risk assessment which will have one of these outcomes:

6. Handling license issues

SIG assesses whether a license is generally considered a risk for use within commercial software. Contact an IT lawyer to discuss license risks specifically for the code analyzed as well as the way it will be used.

Assess license risk

The following table shows how various types of licenses are (not) suitable for different distribution policies, and explains how these common licenses are mapped to general risk levels. But do note that if your distribution model is clear, and the value listed for the particular license-distribution model is ‘ok’ in the table, then your actual licensing risk is minimal:

Risk level | License category | Common licenses           | Distribute modified code | Distribute linked libraries | Linked libs through network | Internal use only
none       | Permissive       | Apache / MIT / BSD        | Ok                       | Ok                          | Ok                          | Ok
low        | Weak copy-left   | LGPL / MPL / CC-BY-ND     | prohibited               | Ok                          | Ok                          | Ok
medium     | Strong copy-left | GPL                       | prohibited               | prohibited                  | Ok                          | Ok
high       | Viral            | AGPL / CC-BY-NC / EUPL    | prohibited               | prohibited                  | prohibited                  | Ok
critical   | Commercial       | EULA / non-OSS / custom   | prohibited               | prohibited                  | prohibited                  | prohibited
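
The table above can be transcribed into a lookup that answers "is this license category ok for my distribution model?". This is a sketch for illustration: the short model keys (`modified`, `linked`, `network`, `internal`) and the function name are hypothetical, and the data is only as complete as the table.

```python
# Distribution models, in the column order of the table above (hypothetical keys).
MODELS = ("modified", "linked", "network", "internal")

# License category -> distribution models for which the table says "Ok".
ALLOWED = {
    "permissive": {"modified", "linked", "network", "internal"},
    "weak copy-left": {"linked", "network", "internal"},
    "strong copy-left": {"network", "internal"},
    "viral": {"internal"},
    "commercial": set(),
}


def is_distribution_ok(category: str, model: str) -> bool:
    """Return True if the table lists 'Ok' for this category and model."""
    if model not in MODELS:
        raise ValueError(f"unknown distribution model: {model}")
    return model in ALLOWED[category.lower()]
```

For instance, a strong copy-left library is flagged when distributing linked libraries but not for internal use only, which mirrors the GPL row.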

Possible actions

Depending on the circumstances, one or more of the following actions can be taken to remediate detected licensing issues

7. Handling lack of freshness

Lack of freshness occurs when there is a newer version of a library available, but that version is not used in the application.

Development teams are responsible for keeping libraries up-to-date to a recent version: this may be part of How to remediate vulnerabilities, to make sure that bug fixes and improvements are incorporated, for compatibility with other libraries, or to ensure that future updates will not be too complicated or require a large effort all at once.

The remedy for lack of freshness is always Updating a library; possible exceptions are:

8. Handling lack of activity

Lack of activity in the development of a library is not an urgent problem, but it is a long-term concern, in particular since it precludes detecting and patching security vulnerabilities. This issue cannot be resolved by application developers, except by Replacing a library.

Handling your libraries

9. Updating a library

There can be several reasons for updating a library:

Updating to a newer version will always also improve the freshness rating. Using a package manager, updates may be installed automatically, or require updating the version constraints in the configuration file (sometimes called ‘manifest’) of the package manager.

The effort involved in updating a library can be estimated based on the release notes, and also semantic versioning: here are some rules of thumb w.r.t. the effort needed for updating:

One important factor is how well the automated test suite for the application (unit tests and/or system tests) will cover all possible cases: a need for manual testing can add substantial time to the above efforts.
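
To show how semantic versioning can feed such an estimate, the sketch below classifies an update as major, minor or patch. It assumes versions of the form MAJOR.MINOR.PATCH, where a major bump signals incompatible API changes, a minor bump backward-compatible features, and a patch bump backward-compatible fixes; pre-release and build suffixes are ignored. The function names are hypothetical, and the sketch deliberately attaches no effort figures.

```python
from typing import Tuple


def parse(version: str) -> Tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH', ignoring '-prerelease' and '+build' suffixes."""
    core = version.split("-", 1)[0].split("+", 1)[0]
    major, minor, patch = (int(part) for part in core.split("."))
    return major, minor, patch


def update_kind(current: str, target: str) -> str:
    """Classify the step from current to target as major, minor, patch or none."""
    cur, new = parse(current), parse(target)
    if new <= cur:
        return "none"  # Target is not newer than what is already in use.
    if new[0] != cur[0]:
        return "major"
    if new[1] != cur[1]:
        return "minor"
    return "patch"
```

A team could use the result to decide how closely to read the release notes before scheduling the update.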

Scheduling library updates:

Ground rule: never postpone updating

  • The longer you postpone updating, the bigger the eventual pain. As your system grows and evolves, the costs and risks of upgrading an old library increase. Such an accumulation of maintenance debt may lead to a much larger effort than in the case of smaller, incremental updates.
  • If a new, stable version comes out, don’t wait: start testing. If the library is really core and really important, start testing with release candidates already.
  • Do not adopt the “If it ain’t broke, don’t fix it” strategy.
    • This strategy implies that you do not update unless you have to. You stay with the current version of the third-party library until you notice something wrong in your application, no matter how often the vendor publishes an update.
    • Whilst easier in the short term, with this strategy you will end up with a system that depends on outdated and unmaintained libraries, where you cannot use some other libraries because they require a newer version of a library that you cannot upgrade. At some point, you may lose the ability to fix some issues at all.
  • Postponing may be warranted only when a new version breaks the behavior of the application.

10. Selecting a new library

The main reason for selecting a new library is that a non-trivial amount of commonplace behavior is needed within the application: implementing such behavior from scratch is typically more time-consuming and error-prone than predicted, hence reuse from an (open source) library may be the better option.

Often, libraries are part of an ecosystem, or work within a certain application framework, such as Eclipse, or Apache, where it makes a lot of sense (consistency, frictionless compatibility) to pick a library from the same ecosystem, unless it violates any of the other recommendations provided in this document (e.g. you adopted a library that is no longer maintained, etc.).

See section Reviewing a library for a detailed checklist of properties to consider before selecting a new library.

A more extensive discussion of selecting (including reviewing) open source libraries can be found in this talk.

11. When a library does not meet requirements

There are several possible cases where a library does not support the needs and requirements:

The basic rule is that library implementations should not be modified or customized: one of the main benefits of libraries and frameworks is that they provide functionality without the duty of maintaining it. After customizing a library implementation, you lose this benefit while being dependent on the changes that the community makes.

How to address failing requirements:

  1. First, check whether a newer version of the library may solve the issue, then consider updating (Updating a library); you may also wait a bit until a fix has been released, especially when the issue is being worked on.
  2. If the issue is a bug or missing feature, you can file an issue with the maintainer of the library. If you have the time, work with the maintainer and contribute your own fix for the issue you are having.
  3. If the issue is a bug or missing feature, you can also look at the implementation of the library and develop a fix around it:
    • You may be able to wrap the relevant library call(s) with extra code that corrects or hides the bug.
    • Alternatively, you can temporarily use the modified library while merging back the bug fix into the library (be aware of the license, and make sure you have clearance from your employer to do so). Once the community has accepted the fix, you can update and remove the local code.
  4. Consider whether another library that implements similar functionality is available, and the costs of adopting that library are acceptable. Check Replacing a library for more details.
  5. If no other solution is feasible and a modification is absolutely required (or: the costs of alternative solutions are very high), the source code can be forked and put into a designated area in a version control system. In this case carefully consider the possible legal ramifications, e.g. should you make the modified version open source as well. The modification should be documented so that it can be re-applied whenever a newer version of the library is made available.
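
The wrapping approach from step 3 can be sketched as follows. Everything here is hypothetical: `library_parse_amount` stands in for a third-party function with a simulated bug, and `parse_amount` is the application-side wrapper that hides it without touching the library itself.

```python
def library_parse_amount(text: str) -> float:
    """Stand-in for a third-party function (hypothetical).

    Simulated bug: it rejects input with surrounding whitespace.
    """
    if text != text.strip():
        raise ValueError("unexpected whitespace")
    return float(text)


def parse_amount(text: str) -> float:
    """Application-side wrapper that hides the bug.

    The workaround lives in one place, so it can be removed once a
    fixed library version is in use.
    """
    return library_parse_amount(text.strip())
```

Application code calls `parse_amount` everywhere, so the library stays unmodified and can still be updated normally.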

12. Replacing a library

There can be multiple reasons that require discarding a library and replacing it with another; see 11. When a library does not meet requirements for such situations.

In most cases, rebuilding a common functionality is not the best option: just assume that doing that will take much more time than expected, and will also require you to maintain the code in the future. So looking for an alternative library is most likely the best choice, unless there is only very specific and limited behavior that you need now and in the foreseeable future.

See section 10. Selecting a new library for guidelines on how to pick a new library.

One major concern when replacing a library with a new one is that a new library will most likely come with a new API; this means that all the locations in the application that use that library may need to be identified and adjusted. This can be more than just the identifiers of method calls, but also the data types that are passed back and forth to the library API can be different, which may impact the calling code substantially. An approach in this case can be to encapsulate the new library and in this way provide an interface that is equal, or more similar, to the previous library.
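
A sketch of that encapsulation, under stated assumptions: `new_fetch` and `NewResult` stand in for the API of a hypothetical replacement library (no real network access happens here), and `fetch_text` keeps the signature that callers were assumed to use with the previous library.

```python
from dataclasses import dataclass


@dataclass
class NewResult:
    """Return type of the hypothetical new library."""

    status_code: int
    body: bytes


def new_fetch(url: str) -> NewResult:
    """Stand-in for the new library's call (hypothetical, no network access)."""
    return NewResult(status_code=200, body=b"hello from " + url.encode("utf-8"))


def fetch_text(url: str) -> str:
    """Keep the shape the old library offered: URL in, text out.

    Callers stay unchanged; the data type conversion lives here.
    """
    result = new_fetch(url)
    if result.status_code != 200:
        raise RuntimeError(f"request failed with status {result.status_code}")
    return result.body.decode("utf-8")
```

The differences in data types are absorbed in one function instead of at every call site.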

13. Reviewing a library

Whenever choosing a new library or updating to a new version, consider the following review criteria:

  1. Are there currently known vulnerabilities?
  2. Is the license acceptable? (and/or is the library in the shared permitted-list). See also 6. Handling license issues.
  3. Is the library actively maintained, preferably by multiple developers? (Also discussed here in the Guidance for Producers.)
  4. Is the code quality (esp. maintainability) of the library acceptable (>3.0 stars)?
  5. How mature is the version? (e.g. an x.0 version tends to be a bit more immature). Are there still -relevant- open issues? Use a stable version unless there is a real reason not to do so. An example might be that a Release Candidate fixes a vulnerability, and you do not want to wait for the stable version to come out.
  6. Are there enough users of the library? (Check, for example, the number of downloads or the number of GitHub stars.)
  7. Is an updated version compatible with the previous version? (Release notes should indicate any breaking changes.)

The workflow between all jobs to be done

The following chart visualizes how the various tasks are interrelated: often one task contains multiple steps, one or more of which have been described separately in another task:


Frequently Asked Questions

Q: Is Open Source Health the most important software quality concern?
A: No, not necessarily (depending on your situation), but in our opinion OSH is typically the quality aspect that has a really high return on investment: with a relatively low amount of effort and expertise, large steps towards reducing the risk associated with open source usage can be made. The majority of risks can be addressed by consistently updating the libraries in use to their latest stable versions. Also, risks such as vulnerable libraries and licensing risks can be rather impactful.

Q: Why not adopt a ‘if it ain’t broke, don’t fix it’ strategy for updating libraries? (i.e. do not update unless you have to).
A: We also addressed this in section 9. Updating a library: whilst easier in the short term, with this strategy you will end up with a system that depends on outdated and unmaintained libraries, where you cannot use some other libraries because they require a newer version of a library that you cannot upgrade. At some point, you may lose the ability to fix some issues at all.

Q: When a library has a known vulnerability, why do something about it, if you are not even sure that this vulnerability is actually exploitable (e.g. we may not be using the vulnerable method)?
A: This is our proposed strategy to deal with this uncertainty: in 90% of the cases, it takes less time to update, than to figure out whether you are vulnerable or not. So just do it. The other 10% contains frameworks that are used everywhere, or a major update, or systems that have no test code and a rigid pre-release manual testing setup. In those 10% of the situations, it can be worthwhile to do the investigation into whether the system is actually vulnerable.
Also: you can use information about deployment to assess the exploitability of a vulnerability. Typically, public-facing and connected systems need to be considered more carefully; for internal systems, vulnerabilities are more relevant as a second line of defense against attacks, which is not unimportant either. See also section 5. Handling vulnerabilities.

Q: How can we update our libraries frequently, given that it takes so much time due to the need for manually re-testing our entire system, to get approval, to wait for the next quarterly release, because we are not using a package manager?
A: You have bigger problems than outdated libraries. Update the most important vulnerable dependencies, and invest in test automation and your development process.

Q: Is it not a bad practice to be on the latest version all the time, since these tend to be unstable and contain bugs?
A: The general advice is that it makes most sense to apply all security patches and minor revisions: those tend to make small, focused changes to the code and typically do not introduce new bugs. For major revisions with new features and possibly even changed APIs, an update requires more careful consideration (see also 9. Updating a library).

Q: How does Sigrid OSH collect its information?
A: First, the entire codebase is scanned for configuration files of common dependency management systems (e.g., NuGet, Maven, NPM) to find explicitly managed libraries.
Information about each library is then queried from public sources to determine the version currently used, the date of this version, the number and date of the newest published version, information about its license, and whether it is known to contain security vulnerabilities.
In addition, the entire codebase is scanned for unmanaged libraries following some heuristics: (1) JavaScript files are scanned for a version identifier in their name or contents. If one is found, the file is assumed to be third-party and is looked up in public databases to determine freshness, license and vulnerabilities. (2) The contents of Windows DLL files and Java JAR files are considered third-party and are scanned for name and version number and then looked up in public databases to determine freshness, license and vulnerabilities.
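
To make the first step more tangible, the sketch below walks a directory tree and reports dependency manifests by file name. It is not Sigrid's implementation: the function name is hypothetical and the file-name mapping is a small illustrative subset for the three ecosystems mentioned above.

```python
from pathlib import Path
from typing import List, Tuple

# Manifest file names per ecosystem; an illustrative subset chosen for this sketch.
MANIFESTS = {
    "pom.xml": "Maven",
    "package.json": "NPM",
    "packages.config": "NuGet",
}


def find_manifests(root: str) -> List[Tuple[str, str]]:
    """Return (ecosystem, relative path) pairs for manifests found under root."""
    base = Path(root)
    found = []
    for path in sorted(base.rglob("*")):
        if path.is_file() and path.name in MANIFESTS:
            found.append((MANIFESTS[path.name], path.relative_to(base).as_posix()))
    return found
```

A real scanner would go on to parse each manifest for library names and versions; this sketch stops at locating the files.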

Q: Can I use Sigrid for libraries that I develop and maintain internally?
A: No, currently not: Sigrid has no access to your internal package repository, and also it exploits a lot of the information that is provided by open source ecosystems such as Maven, NuGet and NPM.

Q: Can I use Sigrid to assess a library before I decide to incorporate it into my project?
A: No (and yes). Sigrid only looks at the libraries that are in use in the source code that you pass through sigridci or upload. Having said that, if you include some libraries explicitly in your project (even before you decide on actually using them), you will see the results the next time you trigger a Sigrid scan.