Re: Some hard numbers on licenses used by elisp packages
From: Jonas Bernoulli
Subject: Re: Some hard numbers on licenses used by elisp packages
Date: Wed, 12 Jul 2017 14:49:52 +0200
User-agent: mu4e 0.9.19; emacs 25.2.1
Richard has asked me privately (by accident, I suspect) for some
clarifications. Many of his questions were already addressed by the
page I linked to, and most others were already answered by the code
that that page in turn linked to.
I have now improved the introductory text on the linked page and I am
including that text here for your convenience:
> This page contains statistics about the licenses used by known Emacs
> packages. *These statistics are not legal advice. They are
> distributed in the hope that they will be useful, but WITHOUT ANY
> WARRANTY; without even the implied warranty of MERCHANTABILITY or
> FITNESS FOR A PARTICULAR PURPOSE.*
>
> The information used here is available from the Emacsmirror database
> (also known as the Epkg database). For more information about the
> Emacsmirror see these
> [[https://emacsair.me/2016/04/16/re-introducing-the-emacsmirror][blog]]
> [[https://emacsair.me/2016/05/17/assimilate-emacs-packages-as-git-submodules][posts]].
>
> I have created this page to accompany
> [[http://lists.gnu.org/archive/html/emacs-devel/2017-07/msg00341.html][this]]
> conversation on
> ~emacs-devel~.
>
> I will periodically update these statistics. If you want to do so
> yourself, then read the relevant documentation. You may also ask me
> for guidance.
>
> This information is extracted using the function ~elx-license~, which is
> provided by my package [[https://github.com/tarsius/elx][elx]] (~git clone
> https://github.com/tarsius/elx.git~).
>
> The license is determined from the contents of the "main library" of
> the package alone (the library whose name matches the name of the
> package). First this function looks for a permission statement for a
> license published by the Free Software Foundation, if any. If that
> fails, then the value of the "License" header keyword is considered.
> Finally it searches for brief, and potentially ambiguous, permission
> statements for non-FSF licenses. For FSF licenses a "+" is appended
> if the text "or (at your option) any later version", or similar, was
> found. An effort is made to normalize the returned value. This
> function also accounts for some commonly used variations in wording,
> typos, and other complications.
>
> However, the returned value is sometimes wrong or ambiguous. In
> particular note that if a license is "unknown", then that merely means
> that it is /not known/ what license applies. This may be because the
> library lacks a permission statement altogether (possibly because an
> accompanying ~LICENSE~ file is considered sufficient by the upstream),
> but it may also be because ~elx-license~ does not attempt to detect the
> non-standard and/or non-FSF permission statement that is used, or
> because of typos in the statement, or for a number of other reasons.
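To illustrate the second step described above (the "License" header keyword), here is a minimal sketch. It is not part of elx; `my-demo-license-keyword' is a hypothetical helper, and the sample header text is made up. It only relies on `lm-header' from lisp-mnt.el, which is also what `elx-license' calls.

```elisp
;; Sketch only: `my-demo-license-keyword' is a hypothetical helper,
;; not part of elx.  `lm-header' is provided by lisp-mnt.el.
(require 'lisp-mnt)

(defun my-demo-license-keyword (text)
  "Return the value of the \"License\" header keyword found in TEXT.
TEXT is the contents of a library, given as a string."
  (with-temp-buffer
    (insert text)
    ;; `lm-header' searches from the beginning of the buffer and
    ;; treats its argument as a regexp, so both spellings match.
    (lm-header "Licen[sc]e")))

;; Made-up library header, used for illustration only.
(my-demo-license-keyword
 ";;; foo.el --- demo library\n\n;; License: GPL v3 or later\n\n;;; Code:\n")
```

With this input the call should return the raw keyword value, which still has to be normalized before it can be counted.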
I have also improved the code used to extract this information and made
a new `elx' release. This is the relevant code, including doc-strings:
> (defconst elx-gnu-permission-statement-regexp
>   (replace-regexp-in-string
>    "\s" "[\s\t\n;]+"
>    ;; is free software[.,:;]? \
>    ;; you can redistribute it and/or modify it under the terms of the \
>    "\
> GNU \\(?1:Lesser \\|Library \\|Affero \\|Free \\)?\
> General Public Licen[sc]e[.,:;]? \
> \\(?:as published by the \\(?:Free Software Foundation\\|FSF\\)[.,:;]? \\)?\
> \\(?:either \\)?\
> \\(?:GPL \\)?\
> version \\(?2:[0-9.]*[0-9]\\)[.,:;]?\
> \\(?: of the Licen[sc]e[.,:;]?\\)?\
> \\(?3: or \\(?:(at your option) \\)?any later version\\)?"))
>
> (defconst elx-gnu-license-keyword-regexp "\
> \\(?:GNU \\(?1:Lesser \\|Library \\|Affero \\|Free \\)?General Public Licen[sc]e\
> \\|\\(?4:[laf]?gpl\\)[- ]?\
> \\)\
> \\(?:\\(?:v\\|version \\)?\\(?2:[0-9.]*[0-9]\\)\\)?\
> \\(?3: or \\(?:(at your option) \\)?\\(?:any \\)?later\\(?: version\\)?\\)?")
>
> (defconst elx-non-gnu-license-keyword-alist
>   '(("Apache-2.0" . "apache-2\\.0")
>     ("MIT" . "mit")
>     ("as-is" . "as-?is")
>     ("public-domain" . "public[- ]domain")))
>
> (defconst elx-non-gnu-license-keyword-regexp "\
> \\`\\(?4:[a-z]+\\)\\(?:\\(?:v\\|version \\)?\\(?2:[0-9.]*[0-9]\\)\\)?\\'")
>
> (defconst elx-non-gnu-permission-statement-alist
>   `(("Apache-2.0" . "^;.* Apache License, Version 2\\.0")
>     ("MIT" . "^;.* mit license")
>     ("public-domain" . "^;.*in\\(to\\)? the public[- ]domain")
>     ("public-domain" . "^;+ +Public domain\\.")
>     ("as-is" . "^;.* \\(provided\\|distributed\\) \
> \\(by the author \\)?[\"`']\\{0,2\\}as[- ]is[\"`']\\{0,2\\}")))
>
> (defun elx-license (&optional file)
>   "Attempt to return the license used for the file FILE.
> Or the license used for the file that is being visited in the
> current buffer if FILE is nil.
>
> *** A value is returned in the hope that it will be useful, but
> *** WITHOUT ANY WARRANTY; without even the implied warranty of
> *** MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>
> This function completely ignores any \"LICENSE\" or similar file
> in the proximity of FILE. The returned value is solely based on
> the contents of FILE itself.
>
> The license is determined from the permission statement, if any.
> Otherwise the value of the \"License\" header keyword is
> considered. An effort is made to normalize the returned value.
>
> *** However this function does not always return the correct
> *** value and the returned value is not legal advice.
>
> Note in particular that if this function returns nil, then that
> merely means that it is not known what license applies.
> This may be because the library lacks a permission statement
> altogether (possibly because an accompanying \"LICENSE\" file
> is considered sufficient by the upstream), but it may also be
> because this function does not attempt to detect the
> non-standard and/or non-FSF permission statement that is used,
> or because of typos in the statement, or for a number of other
> reasons."
>   (lm-with-file file
>     (cl-flet ((format-gnu-abbrev
>                (&optional object)
>                (let ((abbrev (match-string 1 object))
>                      (version (match-string 2 object))
>                      (later (match-string 3 object))
>                      (prefix (match-string 4 object)))
>                  (concat (if prefix
>                              (upcase prefix)
>                            (pcase abbrev
>                              ("Lesser " "LGPL")
>                              ("Library " "LGPL")
>                              ("Affero " "AGPL")
>                              ("Free " "FDL")
>                              (`nil "GPL")))
>                          (and version (concat "-" version))
>                          (and later "+")))))
>       (let ((bound (lm-code-start))
>             (case-fold-search t))
>         (or (and (re-search-forward elx-gnu-permission-statement-regexp bound t)
>                  (format-gnu-abbrev))
>             (-when-let (license (lm-header "Licen[sc]e"))
>               (or (and (string-match elx-gnu-license-keyword-regexp license)
>                        (format-gnu-abbrev license))
>                   (car (cl-find-if (pcase-lambda (`(,_ . ,re))
>                                      (string-match re license))
>                                    elx-non-gnu-license-keyword-alist))
>                   (and (string-match elx-non-gnu-license-keyword-regexp license)
>                        (format-gnu-abbrev license))))
>             (and (re-search-forward
>                   "^;\\{1,4\\} Licensed under the same terms as Emacs" bound t)
>                  "GPL-3+")
>             (and ;; Some libraries are released "under the *GPL and
>                  ;; <other license>", while the GPL is mentioned in
>                  ;; a way the above code does not recognize. Return
>                  ;; nil instead of "<other license>" in such cases.
>                  (not (re-search-forward elx-gnu-license-keyword-regexp bound t))
>                  (car (cl-find-if (pcase-lambda (`(,_ . ,re))
>                                     (re-search-forward re bound t))
>                                   elx-non-gnu-permission-statement-alist))))))))
Note that this function now returns e.g. "GPL-3+" if the "or (at your
option) any later version" pattern was detected. I also made some other
changes to avoid false positives (at the cost of no longer matching
some patterns that were previously matched correctly).
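As a rough, self-contained illustration of how a keyword value ends up as a string like "GPL-3+", here is a simplified sketch. It does not use elx; the names `my-demo-gpl-keyword-regexp' and `my-demo-normalize-gpl-keyword' are hypothetical, and the regexp is a cut-down variant of the abbreviation branch of `elx-gnu-license-keyword-regexp', so it recognizes fewer spellings than the real code.

```elisp
;; Sketch only: both names below are hypothetical and the regexp is
;; a simplified variant of the one used by elx.
(defconst my-demo-gpl-keyword-regexp
  (concat "\\(?1:[laf]?gpl\\)[- ]?"
          "\\(?:\\(?:v\\|version \\)?\\(?2:[0-9.]*[0-9]\\)\\)?"
          "\\(?3: or \\(?:any \\)?later\\(?: version\\)?\\)?")
  "Match an abbreviation, an optional version, and an optional \"or later\".")

(defun my-demo-normalize-gpl-keyword (value)
  "Normalize a \"License\" keyword VALUE such as \"GPL v3 or later\".
Return nil if VALUE does not contain a GPL-style abbreviation."
  (let ((case-fold-search t))
    (when (string-match my-demo-gpl-keyword-regexp value)
      (concat (upcase (match-string 1 value))
              ;; Append "-N" only if a version was found.
              (and (match-string 2 value)
                   (concat "-" (match-string 2 value)))
              ;; Append "+" only if "or later" was found.
              (and (match-string 3 value) "+")))))

(my-demo-normalize-gpl-keyword "GPL v3 or later")
(my-demo-normalize-gpl-keyword "LGPL-2.1 or any later version")
```

The explicitly numbered groups keep the meaning of each `match-string' index stable even when optional parts of the pattern are added or removed, which is presumably why the real regexps use them as well.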
I can provide lists of packages that fall into a particular "category".
These lists can contain the names and email addresses of the maintainers,
links to the homepage and repository and many other things you might
find useful.
I would also be willing to contribute this code to the `lisp-mnt.el'
library, which is part of Emacs. It certainly could still be improved
a lot, but it is a start.
Oh, and I almost forgot - here is an updated table:
| License | Count | Percent |
|---------------+-------+---------|
| GPL-3+ | 2230 | 61 |
| GPL-2+ | 611 | 17 |
| (unknown) | 511 | 14 |
| as-is | 91 | 2 |
| MIT | 70 | 2 |
| public-domain | 52 | 1 |
| GPL-3 | 41 | 1 |
| GPL-2 | 31 | 1 |
| Apache-2.0 | 18 | 0 |
| GPL-1+ | 4 | 0 |
| BSD | 3 | 0 |
| GPL | 2 | 0 |
| LGPL | 2 | 0 |
| AGPL-3 | 1 | 0 |
| AGPL-3+ | 1 | 0 |
| BSD-3 | 1 | 0 |
| EPL | 1 | 0 |
| LGPL-3+ | 1 | 0 |
| LGPL-3.0 | 1 | 0 |
|---------------+-------+---------|
| total GNU | 2925 | 80 |
|---------------+-------+---------|
| total | 3672 | 100 |
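For anyone who wants to check the arithmetic, the following sketch recomputes the "total GNU" row and a few of the percentages from the counts in the table. The names are hypothetical, and it assumes that "total GNU" is the sum of all GPL, LGPL and AGPL rows and that percentages are rounded to the nearest integer; both assumptions are consistent with the numbers shown.

```elisp
;; Sketch only: names are hypothetical, counts are copied from the
;; table above.  Assumes "total GNU" covers the *GPL rows.
(defconst my-demo-gnu-counts
  '(("GPL-3+" . 2230) ("GPL-2+" . 611) ("GPL-3" . 41) ("GPL-2" . 31)
    ("GPL-1+" . 4) ("GPL" . 2) ("LGPL" . 2) ("AGPL-3" . 1)
    ("AGPL-3+" . 1) ("LGPL-3+" . 1) ("LGPL-3.0" . 1))
  "Counts of the rows that name a GNU license.")

(defconst my-demo-total 3672
  "Total number of packages according to the table.")

(defun my-demo-gnu-total ()
  "Return the sum of all counts in `my-demo-gnu-counts'."
  (apply #'+ (mapcar #'cdr my-demo-gnu-counts)))

(defun my-demo-percent (count total)
  "Return COUNT as a percentage of TOTAL, rounded to an integer."
  (round (/ (* 100.0 count) total)))

(my-demo-gnu-total)
(my-demo-percent (my-demo-gnu-total) my-demo-total)
```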
And to briefly answer the questions posed:
> > | (unknown) | 509 | 14 |
>
> Could you explain what "unknown" means? If a program
> does not explicitly state a license, it is proprietary.
Either the license was not specified OR the code was unable to find
a permission statement that actually is present.
> > | as-is | 117 | 3 |
>
> Could you tell me what "as-is" means, here? Is "as-is" meant to
> identify a specific license? If so, could you please show it to me? I
> need to determine whether it is a free license and GPL-compatible.
Essentially the string "as-is" was found in the header. I do agree
that this is ambiguous and problematic, but I decided to provide
this information anyway, because it is at least less ambiguous than
"unknown".
> > | MIT | 45 | 1 |
>
> "MIT" as the name of a license is ambiguous; see
Merely reporting that the string "MIT license" was found.
> > | GPL | 29 | 1 |
>
> What does that mean, concretely?
> Do these packages say, "any version of the GNU GPL"?
> That would be peculiar but not a substantive problem.
>
> > | GPL-1 | 4 | 0 |
>
> Do these packages carry "GPL version 1 only"
> or "GPL version 1 or later"?
This has been improved now:
* "GPL" => the GPL was mentioned, no version was mentioned
  (or possibly was just not detected)
* "GPL-N" => the GPL and version N were mentioned
* "GPL-N+" => ... additionally "or (at your option) any later version"
  was found (or a variation thereof).
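Put differently, each label in the table can be read back as three parts. The following sketch decodes a label under that reading; `my-demo-describe-license-label' is a hypothetical helper that is not part of elx and only handles the GPL, LGPL and AGPL labels.

```elisp
;; Sketch only: hypothetical helper, not part of elx.
(defun my-demo-describe-license-label (label)
  "Split LABEL, e.g. \"GPL-3+\", into family, version and or-later flag.
Return nil if LABEL is not a GPL, LGPL or AGPL label."
  (let ((case-fold-search nil))
    (when (string-match
           "\\`\\([LA]?GPL\\)\\(?:-\\([0-9.]+\\)\\)?\\(\\+\\)?\\'" label)
      (list :family (match-string 1 label)
            ;; nil means no version was mentioned (or detected).
            :version (match-string 2 label)
            ;; t means an "or any later version" clause was found.
            :or-later (and (match-string 3 label) t)))))

(my-demo-describe-license-label "GPL-3+")
(my-demo-describe-license-label "GPL")
```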
> > | EPL | 1 | 0 |
>
> Does that mean the Eclipse Public License?
My guess is as good as yours; the string ";; License: EPL" was found.
Best regards,
Jonas