release versioning best practice conventions

release versioning best practice conventions

Hilmar Lapp-2
Hi all,

I’m wondering whether the OBO community has developed commonly followed conventions for versioning releases of ontologies.

It seems that in the past one of the practices followed was to use the release date (such as in the format ‘YYYY-MM-DD’) as the release version, and consequently as a prefix in owl:versionIRI. I haven’t yet surveyed the OBO ontologies to see whether that’s still widely used.

The OBO Foundry principle on the subject (http://www.obofoundry.org/principles/fp-004-versioning.html) is unfortunately rather unhelpful (it seems to say “anything goes as long as you can state it”).

Whether or not the date-based YYYY-MM-DD is still the most commonly used convention, it strikes me as a poor choice (because – not without irony – it is bare of any semantics), and hence I’d like to move the ontologies I’m involved with (which have been using this convention) to a better one. In the software (including scientific software) realm, semantic versioning (http://semver.org) is being increasingly widely adopted, and as a result will likely be already familiar in some way or another to many people involved with ontology engineering. I’m wondering whether arguments for or against adopting semantic versioning have been discussed for OBO ontologies in the past.
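To make the contrast concrete, here is a minimal Python sketch of what a reader can infer from two version strings alone under each scheme. It is not an OBO tool: the function names are made up for illustration, and it assumes plain MAJOR.MINOR.PATCH identifiers without the pre-release or build suffixes that semver also allows.

```python
import re
from datetime import date

SEMVER = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")
DATED = re.compile(r"^(\d{4})-(\d{2})-(\d{2})$")


def parse_semver(version):
    """Return (major, minor, patch) for a plain MAJOR.MINOR.PATCH string."""
    match = SEMVER.match(version)
    if match is None:
        raise ValueError(f"not a plain semantic version: {version!r}")
    return tuple(int(part) for part in match.groups())


def parse_dated(version):
    """Return a date for a YYYY-MM-DD release identifier."""
    match = DATED.match(version)
    if match is None:
        raise ValueError(f"not a YYYY-MM-DD version: {version!r}")
    return date(*(int(part) for part in match.groups()))


def semver_signals_breaking_change(old, new):
    """Under semver, a higher MAJOR number announces incompatible changes."""
    return parse_semver(new)[0] > parse_semver(old)[0]


def days_between(old, new):
    """A dated version only tells us how much time passed between releases."""
    return (parse_dated(new) - parse_dated(old)).days


print(semver_signals_breaking_change("2.3.1", "3.0.0"))  # True
print(semver_signals_breaking_change("2.3.1", "2.4.0"))  # False
print(days_between("2017-01-15", "2017-05-31"))          # 136
```

The dated identifiers order releases in time but say nothing about compatibility, while the semver identifiers carry a (human-asserted) compatibility signal.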

Obviously, OWL gives us the language to expressly assert compatibility semantics between different versions of an ontology, so one might ask why bother with a convention for release versions that leaves those semantics implicit, and thus much more ambiguous and open to individual interpretation. My response is that it’s not an either/or: version naming and expressly asserting compatibility semantics can, and IMHO should, be complementary. The name of a version is primarily for communication between humans, not machines, and precision in human communication matters too. At present, ontology versions used in publications, to the extent that I’m aware, are _at best_ cited by date. Not all dated release versions of all ontologies are archived in perpetuity (in fact most are not). As a result, after only a few years the ontology or ontologies used in such papers become essentially irreproducible, and nothing can be recovered about the extent to which an archived version would or would not be compatible with the one used in the paper. This is dramatically different from software used in papers (setting aside the issue that authors frequently forget to include the version of a software tool they used).

I’m curious about thoughts about this in the community, and am wondering what the OBO Foundry’s position is on developing a much stronger principle on release versioning.

  -hilmar 

-- 
Hilmar Lapp -:- lappland.io




------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Obo-discuss mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/obo-discuss

Re: release versioning best practice conventions

Chris Mungall


On 31 May 2017, at 7:47, Hilmar Lapp wrote:

> Hi all,
>
> I’m wondering whether the OBO community has developed commonly
> followed conventions for versioning releases of ontologies.
>
> It seems that in the past one of the practices followed was to use the
> release date (such as in the format ‘YYYY-MM-DD’) as the release
> version, and consequently as a prefix in owl:versionIRI.

Correct.

Anyone who uses the ontology starter kit
https://github.com/cmungall/ontology-starter-kit
will get a configured Makefile and a README describing how to manage
their releases via GitHub; see the template:
https://github.com/cmungall/ontology-starter-kit/blob/master/template/src/ontology/README-editors.md#release-manager-notes

> I haven’t broadly surveyed the OBO ontologies yet to see whether
> that’s still broadly used or not.

Not universally. For example, CHEBI uses a single incremented number
https://github.com/OBOFoundry/OBOFoundry.github.io/issues/214

BFO uses major.minor

Current versionIRIs:

http://sparql.hegroup.org/sparql?default-graph-uri=&query=SELECT+%3Fv+WHERE+%7B%3Fx+owl%3AversionIRI+%3Fv%7D&format=text%2Fhtml&timeout=0&debug=on

Ontologies lacking a versionIRI:

http://sparql.hegroup.org/sparql?default-graph-uri=&query=SELECT+%3Font+WHERE+%7B%3Font+a+owl%3AOntology+.+FILTER+NOT+EXISTS+%7B%3Font+owl%3AversionIRI+%3Fv%7D+%7D&format=text%2Fhtml&timeout=0&debug=on

Oops, at least one I'm partly responsible for there...
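For readers who do not want to decode the links by eye, the following Python sketch extracts the SPARQL text from the two URLs above using only the standard library. It makes no network request and assumes nothing about whether the endpoint is still available; the constant and function names are illustrative.

```python
from urllib.parse import parse_qs, urlsplit

# The two query URLs quoted above, split only to keep the lines short.
CURRENT_VERSION_IRIS = (
    "http://sparql.hegroup.org/sparql?default-graph-uri=&query="
    "SELECT+%3Fv+WHERE+%7B%3Fx+owl%3AversionIRI+%3Fv%7D"
    "&format=text%2Fhtml&timeout=0&debug=on"
)
MISSING_VERSION_IRI = (
    "http://sparql.hegroup.org/sparql?default-graph-uri=&query="
    "SELECT+%3Font+WHERE+%7B%3Font+a+owl%3AOntology+.+FILTER+NOT+EXISTS+"
    "%7B%3Font+owl%3AversionIRI+%3Fv%7D+%7D"
    "&format=text%2Fhtml&timeout=0&debug=on"
)


def sparql_from_url(url):
    """Return the decoded value of the 'query' parameter of a SPARQL GET URL."""
    params = parse_qs(urlsplit(url).query)
    return params["query"][0]


print(sparql_from_url(CURRENT_VERSION_IRIS))
print(sparql_from_url(MISSING_VERSION_IRI))
```

The first query lists every owl:versionIRI in the store; the second lists things typed as owl:Ontology for which no owl:versionIRI triple exists.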

>
> The OBO Foundry principle on the subject
> (http://www.obofoundry.org/principles/fp-004-versioning.html) is
> unfortunately rather unhelpful (it seems to say “anything goes as
> long as you can state it”).
>
> Whether or not the date-based YYYY-MM-DD is still the most commonly
> used convention, it strikes me as a poor choice (because – not
> without irony – it is bare of any semantics), and hence I’d like
> to move the ontologies I’m involved with (which have been using this
> convention) to a better one. In the software (including scientific
> software) realm, semantic versioning (http://semver.org) is being
> increasingly widely adopted, and as a
> result will likely be already familiar in some way or another to many
> people involved with ontology engineering. I’m wondering whether
> arguments for or against adopting semantic versioning have been
> discussed for OBO ontologies in the past.

Not deeply, this is the only record I can find:
https://github.com/OBOFoundry/OBOFoundry.github.io/issues/214#issuecomment-187888472

I personally think semver makes more sense for schema-type ontologies, I
don't really see a use case for it on most OBO ontologies (outside of
upper ontologies, for which there seems to be a strong case for it).

To me the most important thing is that versionIRIs are used, used
consistently within an ontology, and resolvable; that some kind of
coherent, well-thought-out system is followed; and that if an
alternative to the recommended YYYY-MM-DD is chosen, there is at least
some brief justification.

Opinions amongst OBOists may differ. As seen in the issue thread linked
above, Alan would prefer that we all adhere to the same versionIRI
scheme with no exceptions.

> Obviously, OWL gives us the language to expressly assert compatibility
> semantics between different versions of an ontology, so one might ask
> why use or bother with a convention for release versions that leaves
> those semantics implicit and thus much more ambiguous and up to
> individual interpretation.
> For me, my response to that is that it’s not an either/or –
> version naming and expressly asserting compatibility semantics can,
> and IMHO should be complementary. The name of a version is primarily
> for communication between humans, not machines, and the precision in
> human communication matters too. At present, ontology versions used in
> publications, to the extent that I’m aware, are _at best_ cited by
> date. Not all dated release versions of all ontologies are archived
> for perpetuity (in fact most are not),

This is indeed a problem.

Where possible, I always make a snapshot of the version on Zenodo (at
CERN). This is super-easy if your ontology releases are small enough to
manage in GitHub. I have not had luck using the Zenodo API to release
larger files (but haven't had the resources to persist with this).

> and therefore after only a few years the ontology or ontologies used
> in such papers become not only essentially irreproducible, but also no
> notion can be recovered at all about the extent an archived version
> would or not be compatible with one used in the paper. This is
> dramatically different from software used in papers (setting aside the
> issue that authors frequently forget to include the version of a
> software tool they used).
>
> I’m curious about thoughts about this in the community, and am
> wondering what the OBO Foundry’s position is on developing a much
> stronger principle on release versioning.

+1


Re: release versioning best practice conventions

Darren Natale
In reply to this post by Hilmar Lapp-2
The public version of principle 4 (the page you linked) shows the
original text. The Editorial Working Group of the OBO Foundry has been
(very slowly) going through these to provide more guidance. Such has
been done for principle 4, but the new text has not yet been vetted.
Nonetheless, in light of this discussion, I reproduce what we have so
far in hopes that a final decision can be made:

Name: versioning

Summary:
The ontology provider has documented procedures for versioning the
ontology, and different versions of the ontology are marked, stored, and
officially released.

Purpose:
Ontologies are developed and refined over time. Similarly, they are
consumed at different points in time. The latter could lead to
differences when comparing different sets of results. Accordingly, it
becomes important to track precisely which version of an ontology has
been used, and be able to retrieve past versions. Note that this applies
only to those versions which have been officially released.

Recommendation:
Version identifiers can be of the form “YYYY-MM-DD” (that is, a date),
or use a numbering system, but in any case each must associate with a
distinct official release. The date versioning system is preferred, as
it meshes with the OBO Foundry ID policy that also governs PURLs (see
below).

Implementation:
Regardless of the versioning system chosen (see below), the PURL must
use a date. In cases where there are multiple releases on the same day,
the PURL points to the newest, and the previous release stays in the
same folder or a subfolder, named in such a way as to distinguish the
releases. Ontology developers must maintain--and make publicly
accessible in perpetuity--every official release. That is, the PURLs
pointing to each version must resolve, even if the file was moved
physically.

If terms are imported from an external ontology, the “IAO:imported from”
annotation (see Principle 1) may specify a dated version of the ontology
from which they are imported.

Examples:
For an OBO format ontology use the metadata tag:

data-version: 2015-03-31
data-version: 44.0

For an OWL format ontology, owl:versionInfo identifies the version and
versionIRI identifies the resource:

<owl:versionInfo
rdf:datatype="http://www.w3.org/2001/XMLSchema#string">2014-12-03</owl:versionInfo>
<owl:versionIRI
rdf:resource="http://purl.obolibrary.org/obo/obi/2014-12-03/obi.owl"/>
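As a rough illustration of how a consumer might read these two values, here is a Python sketch using only the standard library. The draft text gives just the two elements; the enclosing rdf:RDF and owl:Ontology wrapper, the ontology IRI in rdf:about, and the function names are assumptions added so that the fragment parses. The consistency check (the versionInfo date appearing as a path segment of the versionIRI) mirrors the example above and is not a stated OBO rule.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The two elements from the draft, wrapped in an assumed minimal RDF/XML document.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:owl="{OWL}">
  <owl:Ontology rdf:about="http://purl.obolibrary.org/obo/obi.owl">
    <owl:versionInfo rdf:datatype="http://www.w3.org/2001/XMLSchema#string">2014-12-03</owl:versionInfo>
    <owl:versionIRI rdf:resource="http://purl.obolibrary.org/obo/obi/2014-12-03/obi.owl"/>
  </owl:Ontology>
</rdf:RDF>"""


def read_version(rdf_xml):
    """Return (versionInfo text, versionIRI) from the first owl:Ontology element."""
    root = ET.fromstring(rdf_xml)
    ontology = root.find(f"{{{OWL}}}Ontology")
    info = ontology.find(f"{{{OWL}}}versionInfo")
    iri = ontology.find(f"{{{OWL}}}versionIRI")
    return info.text.strip(), iri.get(f"{{{RDF}}}resource")


def iri_matches_info(version_info, version_iri):
    """Check that the version string appears as a path segment of the IRI."""
    return f"/{version_info}/" in version_iri


info, iri = read_version(DOC)
print(info, iri, iri_matches_info(info, iri))
```

A real pipeline would more likely use an RDF or OWL library than raw XML parsing; this only shows where the two values live in the file.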

===

An alternative proposal, again not yet vetted:

Purpose:
OBO projects share their ontologies using files in OWL or OBO format
(see OBO Principle 2). Ontologies are expected to change over time as
they are developed and refined (see OBO Principle 16 on maintenance).
This will lead to a series of different files. Consumers of ontologies
must be able to specify exactly which ontology files they used to encode
their data or build their applications, and be able to retrieve
unaltered copies of those files in perpetuity.

Recommendation:
OBO projects must publicly release official versions of their ontologies
from time to time. Each official release must have one or more files in
OWL or OBO format. Each official release must also have a unique
version IRI that conforms to the OBO ID Policy. The version IRI of each
official release is a Permanent URL (PURL), and the OBO project must
ensure that the PURL of each official release continues to resolve to
the files for that release, in perpetuity. If the files are moved, the
PURL must be updated to resolve to the new location. Consumers can then
use the version IRI to uniquely identify which official release of the
ontology they used, and to retrieve unaltered copies of the file(s).

The content of official release files MUST NOT be changed. For example,
if a bug is found in some officially released file for some ontology, the
bug MUST NOT be fixed by changing the file(s) for that official release.
Instead the bug fixes should be included in a new official release, with
new files, and consumers can switch to the new release.
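One way a project could check the "MUST NOT be changed" rule in practice is to record a checksum for each release when it is published and compare against it later. The sketch below is an assumption about how that might look, not part of the proposal: the manifest structure and function names are hypothetical, and file contents are passed in as bytes to keep the example self-contained.

```python
import hashlib


def sha256_of(content: bytes) -> str:
    """Return the hex SHA-256 digest of a release file's bytes."""
    return hashlib.sha256(content).hexdigest()


def record_release(manifest: dict, version_iri: str, content: bytes) -> None:
    """Record the checksum of a new official release; refuse to overwrite one."""
    if version_iri in manifest:
        raise ValueError(f"{version_iri} is already released and must not change")
    manifest[version_iri] = sha256_of(content)


def is_unaltered(manifest: dict, version_iri: str, content: bytes) -> bool:
    """True if the bytes served for a version IRI match what was released."""
    return manifest.get(version_iri) == sha256_of(content)


# Hypothetical version IRIs for an imaginary ontology called "example".
manifest = {}
v1 = "http://purl.obolibrary.org/obo/example/2017-05-30/example.owl"
v2 = "http://purl.obolibrary.org/obo/example/2017-05-31/example.owl"

record_release(manifest, v1, b"<rdf:RDF>release with a bug</rdf:RDF>")
# The fix goes into a new release with a new version IRI, not into v1.
record_release(manifest, v2, b"<rdf:RDF>release with the fix</rdf:RDF>")

print(is_unaltered(manifest, v1, b"<rdf:RDF>release with a bug</rdf:RDF>"))  # True
print(is_unaltered(manifest, v1, b"<rdf:RDF>release with the fix</rdf:RDF>"))  # False
```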

All OBO projects must also have a PURL that resolves to the current
official release of their ontology.

===

Again, none of this is yet finalized, but I provide it to show where the
discussion has led so far. I think one of the sticking points is that
while the date versioning method was preferred by most, it fails to take
into account the possibility that multiple versions can be released on a
given date.



Re: release versioning best practice conventions

Chris Mungall
Hi Darren,

Can we make all proposed changes such as this one pull requests on
GitHub? This has multiple advantages, including keeping a full audit
trail and history, making everything transparent for our community, etc.

Here is an example of an open pull request to change the text of one of
the existing principles:
https://github.com/OBOFoundry/OBOFoundry.github.io/pull/407


Re: release versioning best practice conventions

Marijane White
In reply to this post by Chris Mungall
I mostly lurk around here, but I have been paying attention to semantic versioning (hereafter referred to as SemVer) for several years now, so I have some thoughts on this.

I think the most important thing to recognize about SemVer is the very first item in the spec, which is that software using SemVer must declare a public API. It’s not intended for general use; it’s specifically for situations where dependencies can be declared. [1]
As such, I agree with Chris’s remarks in this thread that there may be a strong case for using it with upper ontologies, and I’m less sure it’s useful in other situations.

Also, I offer that before adopting something like SemVer, one should consider arguments both in favor of and in opposition to it. If you Google “semantic versioning criticism” you’ll find things people have written on the topic over the last few years, but I think Rich Hickey’s keynote from Clojure/conj 2016 is a particularly thoughtful discussion, despite Hickey’s description of it as a rant. You can watch it at https://youtu.be/oyLBGkS5ICk (note: it’s long. I recommend watching it at 2x because Hickey is not a hurried speaker. The part about SemVer starts 30 min in, but you’ll miss the discussion about dependencies that comes before it if you skip ahead.)

In it, he has some pretty harsh words for SemVer. One argument he makes is that since major version changes in SemVer signal breaking changes, you might as well change the name, because it’s actually a new thing. From there he argues that you just shouldn’t make breaking changes at all, using Maven as an example of this in practice since it is, as he says, “an accreting collection of immutable things”.

This strikes me as very similar to the opaque IDs discussion in the naming section of the OBO Tutorial [2], where it is argued that new terms should be created rather than existing ones replaced, both to avoid breaking legacy data and to encourage use of the new term. I’m not clear on whether it is generally held in the OBO community that changes should always be backwards compatible, but if so, I’m not sure SemVer is something we need: SemVer is really about signaling breaking changes to those that depend on you, which is something it does rather poorly, if you agree with the various criticisms of it. Also, the schema.org community seems to follow this practice; I have noticed they never delete terms, they only mark them as superseded. They also do not follow SemVer, to my knowledge.

Personally, I rather like the date-based versioning because it unambiguously identifies the most recent release. Hickey suggests chronological versioning as a possible alternative to SemVer in his keynote for this reason.
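The "unambiguously identifies the most recent release" property can be shown in a few lines of Python: zero-padded YYYY-MM-DD strings sort chronologically even when compared as plain text, which is not true of numeric versions compared as text. The function name is illustrative, and the sketch assumes at most one release per day (the same-day case raised earlier in the thread is not handled).

```python
import re
from datetime import date

ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def latest_release(versions):
    """Return the newest YYYY-MM-DD version using plain string comparison."""
    for version in versions:
        if ISO_DATE.fullmatch(version) is None:
            raise ValueError(f"not a YYYY-MM-DD version: {version!r}")
        date.fromisoformat(version)  # rejects impossible dates such as 2017-02-30
    return max(versions)


print(latest_release(["2015-03-31", "2017-05-31", "2014-12-03"]))  # 2017-05-31

# Numeric versions need parsing before they can be ordered:
# compared as text, "9.0" sorts after "44.0".
print(max(["9.0", "44.0", "10.0"]))             # 9.0
print(max(["9.0", "44.0", "10.0"], key=float))  # 44.0
```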


Marijane White, MSLIS
Ontologist Research Associate
Ontology Development Group
Oregon Health & Science University Library


[1] This is actually a counter-criticism to many arguments against SemVer that ignore this point: https://news.ycombinator.com/item?id=13379347
[2] https://github.com/jamesaoverton/obo-tutorial/blob/master/docs/names.md



On 2017/05/31, 8:40 AM, "Chris Mungall" <[hidden email]> wrote:

   
   
    On 31 May 2017, at 7:47, Hilmar Lapp wrote:
   
    > Hi all,
    >
    > I’m wondering whether the OBO community has developed commonly
    > followed conventions for versioning releases of ontologies.
    >
    > It seems that in the past one of the practices followed was to use the
    > release date (such as in the format ‘YYYY-MM-DD’) as the release
    > version, and consequently as a prefix in owl:versionIRI.
   
    Correct.
   
    Anyone who uses the ontology starter kit
    https://github.com/cmungall/ontology-starter-kit
   
    Will get a configured Makefile and a README describing how to manage
    their releases via github, see the template:
    https://github.com/cmungall/ontology-starter-kit/blob/master/template/src/ontology/README-editors.md#release-manager-notes
   
    > I haven’t broadly surveyed the OBO ontologies yet to see whether
    > that’s still broadly used or not.
   
    Not universally. For example, CHEBI uses a single incremented number
    https://github.com/OBOFoundry/OBOFoundry.github.io/issues/214
   
    BFO uses major.minor
   
    Current versionIRIs:
   
    http://sparql.hegroup.org/sparql?default-graph-uri=&query=SELECT+%3Fv+WHERE+%7B%3Fx+owl%3AversionIRI+%3Fv%7D&format=text%2Fhtml&timeout=0&debug=on
   
    Ontologies lacking a versionIRI:
   
    http://sparql.hegroup.org/sparql?default-graph-uri=&query=SELECT+%3Font+WHERE+%7B%3Font+a+owl%3AOntology+.+FILTER+NOT+EXISTS+%7B%3Font+owl%3AversionIRI+%3Fv%7D+%7D&format=text%2Fhtml&timeout=0&debug=on
   
    Oops, at least one I'm partly responsible for there...
   
    >
    > The OBO Foundry principle on the subject
    > (http://www.obofoundry.org/principles/fp-004-versioning.html 
    > <http://www.obofoundry.org/principles/fp-004-versioning.html>) is
    > unfortunately rather unhelpful (it seems to say “anything goes as
    > long as you can state it”).
    >
    > Whether or not the date-based YYYY-MM-DD is still the most commonly
    > used convention, it strikes me as a poor choice (because – not
    > without irony – it is bare of any semantics), and hence I’d like
    > to move the ontologies I’m involved with (which have been using this
    > convention) to a better one. In the software (including scientific
    > software) realm, semantic versioning (http://semver.org) is being
    > increasingly widely adopted, and as a
    > result will likely be already familiar in some way or another to many
    > people involved with ontology engineering. I’m wondering whether
    > arguments for or against adopting semantic versioning have been
    > discussed for OBO ontologies in the past.
   
    Not deeply; this is the only record I can find:
    https://github.com/OBOFoundry/OBOFoundry.github.io/issues/214#issuecomment-187888472
   
    I personally think semver makes more sense for schema-type ontologies. I
    don't really see a use case for it on most OBO ontologies (outside of
    upper ontologies, for which there seems to be a strong case for it).
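
To make the comparison concrete, here is a rough sketch of what a semver-style version string lets a consumer infer that a date cannot. The rule encoded below (same major version and not older means compatible) is semver's convention for software APIs; how it would map onto ontology changes is an open question, so `is_compatible_upgrade` is a hypothetical helper, and semver's special treatment of 0.x versions and pre-release tags is not handled.

```python
from typing import NamedTuple


class SemVer(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_semver(text: str) -> SemVer:
    """Parse a plain MAJOR.MINOR.PATCH string (no pre-release or build tags)."""
    parts = text.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    return SemVer(*(int(p) for p in parts))


def is_compatible_upgrade(used: str, available: str) -> bool:
    """True if `available` has the same major version as `used` and is not older."""
    old, new = parse_semver(used), parse_semver(available)
    # NamedTuples compare field by field, so `new >= old` orders versions correctly.
    return new.major == old.major and new >= old


if __name__ == "__main__":
    print(is_compatible_upgrade("1.2.0", "1.3.1"))
    print(is_compatible_upgrade("1.2.0", "2.0.0"))
```

With a date-based version, the same question can only be answered by inspecting the ontology or its metadata.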
   
    To me the most important thing is that versionIRIs are used, are
    consistent within an ontology, are resolvable, and follow some kind of
    coherent, well-thought-out system, and that if an alternative to
    the recommended YYYY-MM-DD is chosen there is at least some brief
    justification.
   
    Opinions amongst OBOists may differ. As seen in the thread above, Alan
    would prefer we all adhere to the same versionIRI scheme with no
    exceptions.
   
    > Obviously, OWL gives us the language to expressly assert compatibility
    > semantics between different versions of an ontology, so one might ask
    > why use or bother with a convention for release versions that leaves
    > those semantics implicit and thus much more ambiguous and up to
    > individual interpretation.
    > For me, my response to that is that it’s not an either/or –
    > version naming and expressly asserting compatibility semantics can,
    > and IMHO should be complementary. The name of a version is primarily
    > for communication between humans, not machines, and the precision in
    > human communication matters too. At present, ontology versions used in
    > publications, to the extent that I’m aware, are _at best_ cited by
    > date. Not all dated release versions of all ontologies are archived
    > for perpetuity (in fact most are not),
   
    This is indeed a problem.
   
    Where possible, I always make a snapshot of the version on Zenodo (at
    CERN). This is super-easy if your ontology releases are small enough to
    manage in GitHub. I have not had luck using the Zenodo API to release
    larger files (but haven't had the resources to persist at this).
   
    > and therefore after only a few years the ontology or ontologies used
    > in such papers become not only essentially irreproducible, but also no
    > notion can be recovered at all about the extent an archived version
    > would or not be compatible with one used in the paper. This is
    > dramatically different from software used in papers (setting aside the
    > issue that authors frequently forget to include the version of a
    > software tool they used).
    >
    > I’m curious about thoughts about this in the community, and am
    > wondering what the OBO Foundry’s position is on developing a much
    > stronger principle on release versioning.
   
    +1
   
    >
    >   -hilmar
    >
    > --
    > Hilmar Lapp -:- lappland.io
   
   
   
    ------------------------------------------------------------------------------
    Check out the vibrant tech community on one of the world's most
    engaging tech sites, Slashdot.org! http://sdm.link/slashdot
    _______________________________________________
    Obo-discuss mailing list
    [hidden email]
    https://lists.sourceforge.net/lists/listinfo/obo-discuss
   


Re: release versioning best practice conventions

Alan Ruttenberg-2
In reply to this post by Hilmar Lapp-2


On Wed, May 31, 2017 at 10:47 AM, Hilmar Lapp <[hidden email]> wrote:
Hi all,

I’m wondering whether the OBO community has developed commonly followed conventions for versioning releases of ontologies.

It seems that in the past one of the practices followed was to use the release date (such as in the format ‘YYYY-MM-DD’) as the release version, and consequently as a prefix in owl:versionIRI. I haven’t broadly surveyed the OBO ontologies yet to see whether that’s still broadly used or not. 
 
It's more than a practice. It's part of the ID Policy document  http://obofoundry.org/id-policy.html
That policy was posted on OBO-discuss years ago, comments were solicited, and it was eventually adopted. Absent new and compelling information, I don't see any reason to change the existing policy.
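
As a sketch of what the date-based scheme looks like in practice, the function below builds a version IRI from an ontology's ID space and release date. The IRI layout used here (`http://purl.obolibrary.org/obo/<idspace>/<YYYY-MM-DD>/<idspace>.owl`) is an assumption about the pattern the ID policy describes, so check the policy document before relying on it; `date_version_iri` and the `XYZ` ID space are made up for illustration.

```python
from datetime import date

# Assumed base for OBO PURLs; verify against the ID policy document.
OBO_BASE = "http://purl.obolibrary.org/obo"


def date_version_iri(idspace: str, released: date) -> str:
    """Build a date-based version IRI for an ontology release (assumed layout)."""
    name = idspace.lower()
    # date.isoformat() yields YYYY-MM-DD.
    return f"{OBO_BASE}/{name}/{released.isoformat()}/{name}.owl"


if __name__ == "__main__":
    print(date_version_iri("XYZ", date(2017, 5, 31)))
```

One property of the YYYY-MM-DD form worth noting: the strings sort chronologically under plain lexicographic ordering, so releases can be ordered without parsing them.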


The OBO Foundry principle on the subject (http://www.obofoundry.org/principles/fp-004-versioning.html) is unfortunately rather unhelpful (it seems to say “anything goes as long as you can state it”).

Whether or not the date-based YYYY-MM-DD is still the most commonly used convention,

Policy, not just convention.
 
it strikes me as a poor choice (because – not without irony – it is bare of any semantics), and hence I’d like to move the ontologies I’m involved with (which have been using this convention) to a better one. In the software (including scientific software) realm, semantic versioning (http://semver.org) is being increasingly widely adopted, and as a result will likely be already familiar in some way or another to many people involved with ontology engineering. I’m wondering whether arguments for or against adopting semantic versioning have been discussed for OBO ontologies in the past.

Semantic versioning embeds more information into the identifier than we already have there. That's generally considered bad practice, as the information is implicit rather than explicit. Particularly in the case of semantic versioning, there is a lot of room for interpretation. Better practice is to explicitly represent any information that's relevant to understanding the content of the ontology in the ontology itself or in other metadata.

I don't agree with the comments that suggest there should be a change in this policy for upper level ontologies. 

Obviously, OWL gives us the language to expressly assert compatibility semantics between different versions of an ontology, so one might ask why use or bother with a convention for release versions that leaves those semantics implicit and thus much more ambiguous and up to individual interpretation.

They don't leave them implicit, because there isn't anything in them, other than the namespace and the date, that can be interpreted. There's a difference between something being implicit and not saying anything. Also, the intention is that OBO Foundry ontologies are always forward compatible, except to the extent that terms are deprecated. Even then there is a kind of forward compatibility: the identifiers don't disappear, they have metadata associated with them, and terms are intended to refer to the same thing over successive versions, even when metadata changes. These principles are documented.
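
The forward-compatibility intent described above (identifiers never disappear; deprecated terms stay and are flagged by metadata) can be checked mechanically between two releases. A minimal sketch, assuming each release has already been reduced to a mapping from term ID to a deprecated flag. That data layout, the function names, and the `XYZ` term IDs are all hypothetical, and extracting the IDs from the OWL files is left out.

```python
from typing import Mapping, Set


def missing_terms(older: Mapping[str, bool], newer: Mapping[str, bool]) -> Set[str]:
    """Term IDs present in the older release but absent from the newer one.

    Under the stated intent this set should be empty.
    """
    return set(older) - set(newer)


def newly_deprecated(older: Mapping[str, bool], newer: Mapping[str, bool]) -> Set[str]:
    """Term IDs that were active in the older release and are deprecated in the newer."""
    return {
        term for term, deprecated in newer.items()
        if deprecated and term in older and not older[term]
    }


if __name__ == "__main__":
    # Values are the "deprecated" flag for each (made-up) term ID.
    release_a = {"XYZ:0000001": False, "XYZ:0000002": False}
    release_b = {"XYZ:0000001": False, "XYZ:0000002": True, "XYZ:0000003": False}
    print(missing_terms(release_a, release_b))
    print(newly_deprecated(release_a, release_b))
```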

The sole intentional breaking change was the move from URIs with fragment identifiers to ones without. That was a big change and was carefully implemented by rolling it out in a coordinated and predictable way.
 
For me, my response to that is that it’s not an either/or – version naming and expressly asserting compatibility semantics can, and IMHO should be complementary. The name of a version is primarily for communication between humans, not machines, and the precision in human communication matters too. At present, ontology versions used in publications, to the extent that I’m aware, are _at best_ cited by date.

It depends on the context. If I had my choice I would have citations be to the standard PURL, with the version IRI given as how it was accessed.

There is no problem embedding more version information either in the ontology, or in the ontology registry (the set of yaml files). So you are right, it's not an either/or in that one is not prevented, in any way, from including more explicit versioning information as metadata.
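
As an illustration of carrying version semantics explicitly rather than in the name: OWL 2 defines the ontology annotation properties `owl:priorVersion`, `owl:backwardCompatibleWith` and `owl:incompatibleWith` alongside `owl:versionIRI`. The sketch below renders a Turtle ontology header that uses them. The IRIs are placeholders and `ontology_header` is a hypothetical helper, not part of any OBO tooling.

```python
from typing import Optional

OWL_PREFIX = "@prefix owl: <http://www.w3.org/2002/07/owl#> ."


def ontology_header(
    ontology_iri: str,
    version_iri: str,
    prior_version_iri: Optional[str] = None,
    backward_compatible: Optional[bool] = None,
) -> str:
    """Render a Turtle ontology header with explicit version metadata.

    `backward_compatible` is only used when a prior version is given:
    True adds owl:backwardCompatibleWith, False adds owl:incompatibleWith,
    and None makes no compatibility claim.
    """
    statements = ["a owl:Ontology", f"owl:versionIRI <{version_iri}>"]
    if prior_version_iri is not None:
        statements.append(f"owl:priorVersion <{prior_version_iri}>")
        if backward_compatible is True:
            statements.append(f"owl:backwardCompatibleWith <{prior_version_iri}>")
        elif backward_compatible is False:
            statements.append(f"owl:incompatibleWith <{prior_version_iri}>")
    body = " ;\n    ".join(statements)
    return f"{OWL_PREFIX}\n\n<{ontology_iri}>\n    {body} .\n"


if __name__ == "__main__":
    print(ontology_header(
        "http://example.org/xyz.owl",
        "http://example.org/xyz/2017-05-31/xyz.owl",
        prior_version_iri="http://example.org/xyz/2017-01-15/xyz.owl",
        backward_compatible=True,
    ))
```

The version IRI stays a plain dated identifier, and the compatibility claim lives in the ontology header where tools can read it.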
 
Not all dated release versions of all ontologies are archived for perpetuity (in fact most are not), and therefore after only a few years the ontology or ontologies used in such papers become not only essentially irreproducible, but also no notion can be recovered at all about the extent an archived version would or not be compatible with one used in the paper. This is dramatically different from software used in papers (setting aside the issue that authors frequently forget to include the version of a software tool they used).

I'm sorry, I don't see how you come to the conclusion that changing how ontology versions are named will impact, in any way, appropriate diligence in keeping prior versions accessible.
 
I’m curious about thoughts about this in the community, and am wondering what the OBO Foundry’s position is on developing a much stronger principle on release versioning.

I hope not. There are much bigger fish to fry.

Alan
 

  -hilmar 

-- 
Hilmar Lapp -:- lappland.io



