What if you have an open source software package licensed under a permissive license like the Apache or MIT license, but inside that package are dependencies licensed under a restrictive license like the General Public License (GPL)? What are some best practices to follow?
This scenario arises often. Open source packages licensed under a permissive license can include dependencies licensed under the GPL, the Lesser General Public License (LGPL), or some other so-called “copyleft” license or other restrictive license. In these cases, the question is: What are the user’s obligations? Does the code licensed under the restrictive license affect your ability to use the package as a whole? How far do you really want to peel back this onion and identify the many parts and pieces within a package to figure out what all the various—and sometimes competing or contradictory—obligations may be?
Generally, as the name suggests, permissive licenses permit broad use of the software code licensed under them. Further, code licensed under a permissive license can be used and shared with any license type, including commercial proprietary licenses. In contrast, code licensed under copyleft licenses like the GPL presents a more complex range of compliance challenges and is often subject to greater scrutiny. Given the complex nature of the GPL and the possible implications arising from reusing and distributing code licensed under it, lawyers and compliance professionals are keen to identify instances where their proprietary code may interact with code licensed under the GPL and consider the ramifications of that.
But how do we account for dependencies? This blog post explores how our customers and their lawyers think about and deal with this issue.
Dependencies can be lumped into two general categories: direct dependencies and transitive dependencies. Direct dependencies are the libraries your code directly calls and otherwise utilizes. Transitive dependencies are the libraries or other software that the direct dependencies utilize. Transitive dependencies are, essentially, dependencies of dependencies.
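To make the distinction concrete, here is a minimal Python sketch that splits a dependency graph into direct and transitive dependencies. The package names and the graph itself are hypothetical, invented purely for illustration; in practice this data would come from your package manager or a scanning tool.

```python
from collections import deque

# Hypothetical dependency graph: each package maps to the packages it uses directly.
DEPENDENCY_GRAPH = {
    "my-app": ["web-framework", "json-parser"],
    "web-framework": ["http-client", "template-engine"],
    "http-client": ["tls-lib"],
    "json-parser": [],
    "template-engine": [],
    "tls-lib": [],
}


def classify_dependencies(graph, root):
    """Return (direct, transitive) sets of dependency names for `root`."""
    direct = set(graph.get(root, []))
    transitive = set()
    seen = set(direct) | {root}
    queue = deque(direct)
    # Walk outward from the direct dependencies; everything reached from
    # them that is not itself a direct dependency is transitive.
    while queue:
        package = queue.popleft()
        for child in graph.get(package, []):
            if child not in seen:
                seen.add(child)
                transitive.add(child)
                queue.append(child)
    return direct, transitive


direct, transitive = classify_dependencies(DEPENDENCY_GRAPH, "my-app")
print("direct:", sorted(direct))
print("transitive:", sorted(transitive))
```

In this sketch, a package that is used both directly and by another dependency is counted as direct, on the assumption that the closer relationship is the one that matters for review.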
For example, Android is licensed under the permissive Apache license, version 2.0. However, buried in Android, among other dependencies, is a modified version of the GPL-licensed Linux kernel. Someone digging into the Android stack could see some Linux elements and may wonder why the copyleft effect of the GPL doesn’t extend to Android as a whole.
As with so many things in law, and especially in the arena of open source license interpretation, the answer is, “it depends.” First, it should be said that as a user of software, you have an obligation to adhere to the applicable license, regardless of what the software is or what channel or means you accessed it through. You are technically legally responsible for complying with the obligations of all the licenses applicable to all the dependencies included with the code you obtained. There is no “umbrella coverage” provided by the fact that a product may be licensed under a permissive license. It’s expected that any user identifies all the applicable licenses and abides by the obligations. So the top-level license typically does not provide you with protections or a shield from the licenses applicable to other included components.
However, producing a complete Bill of Materials for every open source package that you intend to repurpose, and conducting a complete analysis of the obligations of all the licenses applicable to all the elements contained therein, can be daunting and at times impractical. Accordingly, a risk-based, triage-style approach is often used to weigh the effort against the risk.
The risk analysis that our customers engage in usually starts with deprioritizing the transitive dependencies. This is because there tend to be many transitive dependencies and, in practice, they tend to be considered relatively low risk from a license compliance perspective. The further down the engineering chain the transitive dependencies are, the less likely those dependencies, even those licensed under the GPL, are seen as interacting with your proprietary code in a manner that invokes any copyleft obligations.
That’s why, despite often being fewer in number, the real focus should immediately fall on the direct dependencies. From an engineering perspective, the interactions between the direct dependencies and your product are far more likely to raise compliance concerns under the GPL. Because of this, we see most customers focusing their compliance energies on making sure that direct dependencies are being used in compliance with their licenses.
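One way to picture this triage is to order dependencies by how far they sit from your own code. The following Python sketch does that with a breadth-first walk; the graph and package names are hypothetical, and treating depth as a proxy for review priority is an assumption drawn from the approach described above, not a legal rule.

```python
from collections import deque

# Hypothetical dependency graph: each package maps to the packages it uses directly.
GRAPH = {
    "my-app": ["web-framework", "json-parser"],
    "web-framework": ["http-client", "template-engine"],
    "http-client": ["tls-lib"],
    "json-parser": [],
    "template-engine": [],
    "tls-lib": [],
}


def dependency_depths(graph, root):
    """Map each dependency to its shortest distance from `root` (direct = 1)."""
    depths = {}
    queue = deque((child, 1) for child in graph.get(root, []))
    while queue:
        package, depth = queue.popleft()
        if package == root or package in depths:
            continue  # already recorded at an equal or shallower depth
        depths[package] = depth
        for child in graph.get(package, []):
            queue.append((child, depth + 1))
    return depths


def review_order(graph, root):
    """List dependencies with the closest (direct) ones first."""
    depths = dependency_depths(graph, root)
    return sorted(depths, key=lambda name: (depths[name], name))


for name in review_order(GRAPH, "my-app"):
    print(name)
```

A real triage would combine this ordering with the license of each component, so that a GPL-licensed direct dependency is reviewed before a permissively licensed one at the same depth.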
Alongside the potential copyleft effect of dependencies, there are issues arising from the need to provide adequate attribution notices. Adhering to notice requirements is a necessary element of license compliance and can present a significant administrative burden. Again, prioritizing direct dependencies helps. If you are using direct dependencies from a well-run project, that project should include a notices file covering its own dependencies (your indirect dependencies), which may offer some assistance as well.
As with all license compliance challenges, identifying open source components and the corresponding license is made much easier by deploying automated scanning and compliance tools. Likewise, organizing dependencies, direct or indirect, is made much easier by tools that can automatically identify those dependencies and automatically categorize their risk based on the nature of the license governing those dependencies. For instance, an automated tool may be used to parse out only the dependencies governed by the GPL to enable a focused analysis on only the handful of dependencies that may ultimately pose an issue.
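As a rough illustration of that filtering step, the Python sketch below takes a small dependency inventory and pulls out only the GPL-licensed entries. The inventory, the package names, and the category labels are all hypothetical; the license strings follow SPDX identifier conventions, and the grouping of licenses into categories is an illustrative assumption rather than a legal classification.

```python
# Hypothetical inventory, as a scanning tool might report it.
DEPENDENCIES = [
    {"name": "web-framework", "license": "MIT", "direct": True},
    {"name": "image-codec", "license": "GPL-3.0-only", "direct": True},
    {"name": "http-client", "license": "Apache-2.0", "direct": False},
    {"name": "compression-lib", "license": "LGPL-2.1-or-later", "direct": False},
    {"name": "charset-tables", "license": "GPL-2.0-only", "direct": False},
]

# Illustrative grouping by SPDX identifier prefix (an assumption for this sketch).
STRONG_COPYLEFT_PREFIXES = ("GPL-", "AGPL-")
WEAK_COPYLEFT_PREFIXES = ("LGPL-", "MPL-")


def license_category(license_id):
    """Bucket a license identifier into a coarse risk category."""
    if license_id.startswith(STRONG_COPYLEFT_PREFIXES):
        return "strong-copyleft"
    if license_id.startswith(WEAK_COPYLEFT_PREFIXES):
        return "weak-copyleft"
    return "permissive-or-other"


def gpl_only(dependencies):
    """Keep only the dependencies whose license identifier is a GPL variant."""
    return [dep for dep in dependencies if dep["license"].startswith("GPL-")]


for dep in gpl_only(DEPENDENCIES):
    kind = "direct" if dep["direct"] else "transitive"
    print(dep["name"], dep["license"], kind, license_category(dep["license"]))
```

Matching on the `GPL-` prefix deliberately excludes LGPL and AGPL identifiers, which begin with different letters, so they can be reviewed under their own categories.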
Having hopefully reduced the universe of potentially problematic dependencies to a relatively small number, the next step is usually to apply a deeper analysis and a more conservative approach to a company’s flagship products to make sure that they have a clean open source Bill of Materials before those products enter the stream of commerce. Other products, like embedded products, may also be high on the list for deeper diligence, since remediating open source license compliance issues in these products can be difficult. Another element of this risk-based approach is to evaluate how active community enforcement is and how likely a competitor is to use any license noncompliance, real or perceived, as a point of leverage over you.
All that said, sometimes the transitive dependencies can matter a great deal. It’s rare, but there was a very recent case in which the license type assigned to a transitive dependency mattered very much. See: GitHub rails issues.
Finally, while the focus of this blog post has been on managing open source license compliance risk, the risks associated with security vulnerabilities can’t be dismissed. From the perspective of managing security risk, sifting out transitive dependencies may actually increase potential risks if those transitive dependencies aren’t otherwise subject to the organization’s patch management process. Accordingly, a complete Bill of Materials that accounts for all components in the dependency chain may ultimately be required for reasons outside of license compliance.