ChemRxiv. Why?

In August 2016, the launch of a chemistry pre-print service, ChemRxiv, was announced. I was phoned a day or so later by a staff journalist at C&E News for my opinion. The only comment that was retained for their report was my instantaneous feeling that “the community needed a chemistry pre-print server like one needed a hole in the head”. I had been there before, you see, recollecting a pre-print server launched by the ChemWeb service around 1996 or 1997, which lasted only about two years before being withdrawn due to the low quality of the preprints. So what do I think of ChemRxiv now in 2019?

Let me set the scene first. Nowadays, many journals offer open access options, most upon payment of an APC (article processing charge). One can sometimes get a grant for this fee from institutional libraries. Mine for example has a policy that to apply for an APC, one has to deposit a “final author version” (FAV) of a manuscript in our local institutional repository (Spiral). Thus the final outcome is two versions of open access articles, one the FAV and then a version-of-record (VOR) held by the publisher. ChemRxiv can now add a third version to the process, since the expectation is that after some life as a pre-print, the manuscript can then be submitted to a peer-reviewed journal. Because the pre-print is allocated a persistent identifier (a DOI), the expectation is that the pre-print will indeed be persistent, with no expiration. Three versions of any given article are therefore now likely to be around, in effect permanently (or what passes for permanence nowadays). Importantly, there is no clear protocol for indicating how these three versions might differ, if they do. Even the FAV and the VOR may differ: for example, corrections of errors found during galley-proofing will appear in the VOR but may not be propagated to the FAV. The congruence between the pre-print and any VOR is even less obvious.

All this came to a head as a result of the pre-print I noted in my previous two posts.[cite]10.26434/chemrxiv.8009633.v1[/cite] Unlike the topic of an earlier post of mine, where the VOR article[cite]10.1038/s41586-019-1059-9[/cite] (not a preprint) allows readers to comment (see e.g. https://www.nature.com/articles/s41586-019-1059-9#article-comments), I have not been able to identify a mechanism to post any comment about pre-prints. After all, that did seem to me to be a primary reason for exposing a pre-print, which is to invite insights from the community, perchance to improve the science or make suggestions related to it. What I have spotted, however, is an altmetric index. Hover over that and you get social media metrics. For this pre-print[cite]10.26434/chemrxiv.8009633.v1[/cite], these put it in the top 5% of all outputs, so it is clearly attracting much interest. This interest includes (currently) 1955 views, 539 downloads and commentary via two blog posts (www.altmetric.com/details/59250193/blogs) and 40 tweets (www.altmetric.com/details/59250193/twitter). You would have to work quite hard to visit all the blog posts and read all the tweets to assess overall how the community was responding to any specific pre-print.

So what is the purpose of posting (or should I use the term publishing?) a ChemRxiv pre-print? Is it primarily to gather commentary via social media such as blog and Twitter posts and to use this to improve the final VOR based on such feedback? A colleague I discussed this with suggested that in some very competitive areas of science/chemistry, it might also serve to acquire a date-stamp for the research (part of the metadata associated with a DOI) and hence to claim priority, a stamp which would thus pre-date that obtained from VOR publication by a few months. This might be perceived as making all the difference in a competitive area in terms of gathering evidence of esteem, inclusion in grant proposals etc, especially for early career researchers. There may be other reasons which I have not thought of and comments here for these are most welcome.

I will end by noting the following project: en.wikiversity.org/wiki/WikiJournal_of_Science,[cite]10.15347/wjs/2018.001[/cite] which is part of Wikiversity. Here, the APC is dispensed with (no publication costs, at least to the authors), a DOI is again allocated and each article is subjected to public peer review (en.wikiversity.org/wiki/WikiJournal_of_Science/Peer_reviewers) and can also carry post-publication review comments and even direct edits in the manner of Wikipedia. The other infrastructure of the Wiki ecosystem is available, including access to Wikidata, which is high quality reference data.

So I think it is going to be an interesting debate about how the publication of primary research articles is going to evolve. Is a triad of articles (the pre-print, the FAV and the VOR) the future? Or could it be e.g. the WikiJournal of Science (extended perchance in the future to a WikiJournal of Chemistry?) showing an interesting alternative way? Or is it all just getting too fragmented and confusing?

14 Responses to “ChemRxiv. Why?”

  1. Henry Rzepa says:

    It has recently been suggested that preprint servers such as ChemRxiv might be involved in Plan S, a strategy by European funders to mandate Open Access publication of key research articles:
    https://science.sciencemag.org/content/364/6441/620

    Implementation of Plan S itself has been delayed by a year to let the research community adapt: https://www.nature.com/articles/d41586-019-01717-2

  2. Emilio says:

    So what is the purpose of posting […] a ChemRxiv pre-print? […] There may be other reasons which I have not thought of and comments here for these are most welcome.

    Dear Prof. Rzepa,
    Recently, I posted on ChemRxiv just to comment on an article.

    At first, I had prepared a letter to the editor, but it was rejected with the recommendation that I write a private communication to the authors or try my luck submitting elsewhere.

    However, I needed to publicly expose contradictions between this article and a recently published article of mine. I couldn’t submit a private communication but I didn’t see any point in publishing my commentary in another journal.

    In the midst of this frustration, ChemRxiv appeared and in less than 3 days my comment had a DOI. I invited the authors of the article to make a response to the comment and they kindly agreed. After their reply, I updated my comment (in ChemRxiv they use a suffix v1, v2…etc in the DOI to distinguish between versions) and after this exchange of points of view, we have probably reached consensus on the discrepancies.

    My comment will remain a preprint forever and will never go to any journal except where the commented article remains, but I don’t care, I was able to expose my point of view and now it can be cited.

    Best regards,
    Emilio

  3. Henry Rzepa says:

    Thanks for this really interesting insight into how modern discourse can (and apparently cannot) be conducted, including private vs open.

    I would be interested in the DOIs of your original article, the other article, and the DOI of your commentary.

    When you write “I invited the authors of the article to make a response to the comment and they kindly agreed”, was that response to you privately or in the open?

    Certainly if ChemRxiv facilitates this sort of scientific discourse, I would change my mind about it!


    Postscript

    I have now located the relevant articles:

    1. Original article, DOI: 10.1039/C8SE00358K

    2. Comment: 10.26434/chemrxiv.7295585.v1

    3. Response: 10.26434/chemrxiv.7649897.v1

    4. Revised comment: 10.26434/chemrxiv.7295585.v2

  4. Henry Rzepa says:

    The previous comment noted that a continuing commentary on ChemRxiv can take place in the form of DOIs with version numbers, ie …85.v1 and …85.v2

    Whenever I have heard versioning discussed at conferences such as PIDapalooza, there are some who strongly deprecate it. Thus I encountered this at https://help.zenodo.org/#versioning

    Q: Why don’t the DOIs have a version number suffix like “.v1”?
    A: Including semantic information such as the version number in a DOI is bad practice, because this information may change over time, while DOIs must remain persistent and should not change.

    Zenodo’s solution is to issue a new DOI for each updated item and then to link them all using what they call a Concept DOI. We have used a similar idea in our repository, calling it a Collection which has the individual items as members of the collection. But this too could get very ungainly.

    So I should note that the technical expedient that ChemRxiv have adopted to allow the kind of discourse discussed above may not turn out to be “persistent”. A work in progress I fancy.
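
    The version-suffix convention described above can be illustrated with a minimal Python sketch. It assumes, purely from the DOIs quoted in this thread, that a trailing “.v” followed by digits marks a version; the helper names split_version and group_versions are hypothetical and are not part of any ChemRxiv or Zenodo tooling. Having to parse the identifier string at all is, of course, exactly the practice the Zenodo FAQ warns against.

```python
import re
from collections import defaultdict

# Assumption: a version is encoded as a trailing ".v<digits>" on the DOI,
# as seen in the ChemRxiv DOIs quoted in this thread.
VERSION_SUFFIX = re.compile(r"^(?P<base>.+)\.v(?P<version>\d+)$")


def split_version(doi):
    """Return (base, version); version is None if no suffix is present."""
    match = VERSION_SUFFIX.match(doi)
    if match is None:
        return doi, None
    return match.group("base"), int(match.group("version"))


def group_versions(dois):
    """Map each base identifier to the sorted list of versions seen for it."""
    groups = defaultdict(list)
    for doi in dois:
        base, version = split_version(doi)
        if version is not None:
            groups[base].append(version)
    return {base: sorted(versions) for base, versions in groups.items()}


# The comment, response and revised comment listed in the postscript above.
thread = [
    "10.26434/chemrxiv.7295585.v1",
    "10.26434/chemrxiv.7649897.v1",
    "10.26434/chemrxiv.7295585.v2",
]
print(group_versions(thread))
```

    Note that the grouping works only for as long as the suffix convention holds; a Concept DOI records the same relationship in metadata rather than in the identifier itself.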

  5. Mike Turner says:

    I remember following that discussion on ChemRxiv with great interest. It is an intangible benefit, but having such material available to those who can access only a fraction, or even none, of the published literature encourages them to maintain their interest in up to date research, and in turn pass the story on to others. If open access makes a small fraction of the public pro-chemistry rather than ignorant or suspicious of chemistry, that is a useful result.

  6. Kevin Davies says:

    I worked at ACS Publications from 2013-17 and helped conceive and launch ChemRxiv. We were inspired by the #ASAPbio initiative spearheaded by Ron Vale (HHMI/UCSF) and the promising growth of bioRxiv. The principal benefits are 1) the chance to share and receive feedback on research prior to publication; 2) to provide a timestamp of research; and 3) to disseminate potentially important findings many months before formal publication, analogous to a presentation of unpublished data at a conference.

    It is to ACS’ great credit that it launched ChemRxiv without unanimous backing from its >50 journal editors — the flagship JACS was not a supporter initially, but has changed policies (presumably in response to pressure from authors and EAB members).

    The spread of multiple versions of a paper may be a distraction but is not the point; other preprint servers allow authors to post revisions of their preprint. The posting of a preprint provides authors with some peace of mind as they endure the peer-review process, and in some cases have to try their luck with multiple journals.

    The altmetrics data are a frill that might provide some interesting data but weren’t a consideration when the server was launched.

    — Kevin Davies

  7. Henry Rzepa says:

    Thanks Kevin for that useful perspective from the point of view of how a publisher sees things.

    I found your analysis that “The posting of a preprint provides authors with some peace of mind as they endure the peer-review process, and in some cases have to try their luck with multiple journals” the most salient point perhaps.

    That could at face value be taken as an indictment of the peer-review process itself rather than as necessarily a powerful case for a pre-print server. Your use of the word “luck” also implies that peer review is almost a random process, one perhaps driven by personal human motivations other than the quality of the science. In which case perhaps fixing peer review rather than using pre-print servers should be the real priority? What I think emerged from the ChemWeb preprint experiment >20 years ago was that very few of the posted items attracted any comment at all (whether positive or negative) and were in effect treated with a big yawn.

    As a devil’s advocate, I might also suggest that if authors need to try multiple journals to get into print, that could also indicate real issues with their science rather than having unlucky reviewers? Or that their science is simply too routine? I remain worried that pre-prints may drive quality down and especially noise up. There are also many that say we all publish far too much already and pre-prints are unlikely to address that aspect.

    Finally, preprints could be an opportunity to address the quality of the data associated with research, ideally in the form of FAIR data. I do not currently see ChemRxiv pre-prints as directly addressing this issue, related to quality and replicability.

  8. Henry Rzepa says:

    I asked a colleague in life science why they favoured pre-prints so strongly. He responded that the funders loved them (indeed sometimes mandated them) because it made their funding appear to produce twice as many outputs! Authors, especially perhaps early career researchers (?), probably also appreciate this.

    I also remember a few decades back that research “fragmentation” was strongly discouraged. Whilst pre-prints are not necessarily such fragmentation, since they are not peer reviewed as such, there is nothing to prevent authors from using them to improve outputs and to fragment in effect covertly.

    As alluded to in my last comment, if pre-prints were to come to be regarded as a “data rich” version of any given research output, or even just a useful pointer to FAIR data in repositories, I might support such a role. I also noted that Zenodo uses concept DOIs to link a story together and perhaps that might also be a role that pre-print servers take on?

  9. Henry Rzepa says:

    Here is a ChemRxiv article, DOI: 10.26434/chemrxiv.8267837.v1, which highlights a specific reference to data, DOI: 10.14469/hpc/5737. Such a citation need not be unique to ChemRxiv, since it can also be included in the final article!

  10. Henry Rzepa says:

    In my previous comment, I noted that a link from an article was inserted by the depositor to the data on which the article was in part based. I was curious how this link might have been exposed in the metadata about the article. This metadata can be acquired using eg

    https://data.datacite.org/application/vnd.datacite.datacite+xml/10.26434/chemrxiv.8267837.v1 and what it contains (or does not contain) is quite interesting.

    1. It lists the DOI identifier for the article
    2. It lists the authors, two of whom are identified by their ORCID.
    3. It gives a title and a description.

    But it does not include the reference https://doi.org/10.14469/hpc/5737, which means there is no easily automated procedure for mining the data starting from the article.

    So can I urge ChemRxiv to extend their metadata record to include such information?
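
    A minimal Python sketch of what such automated mining could look like, assuming the record followed the DataCite convention of listing links as relatedIdentifier elements. The sample XML below is a hypothetical, hand-written fragment (per the observation above, the real record carried no such link), and the function name related_identifiers is likewise hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix so the schema version does not matter."""
    return tag.rsplit("}", 1)[-1]


def related_identifiers(datacite_xml):
    """Return (relationType, identifierType, value) for each relatedIdentifier."""
    root = ET.fromstring(datacite_xml)
    found = []
    for element in root.iter():
        if local_name(element.tag) == "relatedIdentifier":
            found.append((
                element.get("relationType"),
                element.get("relatedIdentifierType"),
                (element.text or "").strip(),
            ))
    return found


# HYPOTHETICAL record: what the metadata might look like if the data link
# were exposed. This is not the actual content returned by DataCite.
HYPOTHETICAL_RECORD = """<resource xmlns="http://datacite.org/schema/kernel-4">
  <identifier identifierType="DOI">10.26434/chemrxiv.8267837.v1</identifier>
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="References">10.14469/hpc/5737</relatedIdentifier>
  </relatedIdentifiers>
</resource>"""

# A record with no related identifiers, mirroring the situation described above.
BARE_RECORD = """<resource xmlns="http://datacite.org/schema/kernel-4">
  <identifier identifierType="DOI">10.26434/chemrxiv.8267837.v1</identifier>
</resource>"""

print(related_identifiers(HYPOTHETICAL_RECORD))
print(related_identifiers(BARE_RECORD))
```

    With a link of this kind in the record, a harvester could move from article to data without a human having to read the article; with the bare record, the function simply returns an empty list.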

  11. lhac says:

    Personally I do not (yet) see the need for preprints for my regular articles, but I appreciate the point of Emilio who stated that: “My comment will remain preprint forever … but I don’t care, I was able to expose my point of view and now it can be cited.”

    I have experienced cases where I found mistakes in the published literature, and where communication with authors and editors (informing them of the implausibility and insufficient experimental basis of the published claims) was leading nowhere.

    In one case, we did some experiments (at non-negligible cost of chemical, time and infrastructure) to disprove the claims in a paper. We now know the reported chemistry is not working, but this alone does not provide sufficient substance to justify a regular paper.

    Having read Emilio’s comment I now seriously consider submitting a short report on the repeat experiment to ChemRxiv. I wonder, however, what the general position on this would be, if people started to pre-publish work (some of it cranky, maybe) without having the intention to go through the peer-reviewing process.

  12. Henry Rzepa says:

    Lukas,

    Re: “if people started to pre-publish work (some of it cranky, maybe) without having the intention to go through the peer-reviewing process.”

    I gather ChemRxiv do have a system to monitor whether a peer-reviewed version of the pre-print is detected at some future stage after the pre-print is posted. The DOI of the peer-reviewed version would then be added to the metadata for the pre-print (and ideally vice-versa). How reliable this procedure is has yet to be established (especially if eg either the title or authors change or increase), but it would be reasonable to say that a measure of greater confidence could be established in those pre-prints for which peer-reviewed versions are detected, and hence perhaps lesser confidence in those for which this is not detected after a given period has elapsed.

    The issue then, when a peer-reviewed version does appear, is detecting if there are any substantive differences between the two. I am not sure this is easily automated and it is unreasonable I think to have to regularly read in detail two versions of a paper to detect this.

  13. Henry Rzepa says:

    I discussed above that the introduction of a pre-print can result in three versions of any given scientific article.

    Further discussions with a journal editor have apparently revealed a fourth possibility, namely an HTML version of a publisher VoR. Why might the HTML version differ from eg the PDF version? Because some journals allow the HTML version to have comments appended. These do not however propagate to the PDF version.

    The status of such HTML comments is also uncertain. Thus are they presumed to inherit the DOI of the article? If not, can they be cited elsewhere, and if so how?

    What level of recognition, if any, would a commentator receive?

    Will comments persist for the same span of time as eg the PDF? A PDF is a purely stand-alone entity and is still intact if separated from the journal environment, whereas an HTML comment depends very much on the journal infrastructures and cannot be easily relocated elsewhere.

    How easily can the author of a comment be identified? They may provide only a pseudonym rather than for example a formal identifier such as ORCID.

    What is the level of assessment by the journal as to whether a comment is relevant, appropriate, backed up by some form of evidence etc?

    The publishing landscape continues to get more complex. It is rather unpredictable at this stage where it is all going!

  14. Henry Rzepa says:

    Here is an interesting analysis of the reasons for using a pre-print server:

    https://www.insidehighered.com/news/2019/08/26/sociologist-says-journal-rejected-her-paper-because-shes-shared-it-elsewhere

    The analysis ends with the quote: But academics as a group remain “woefully ignorant about open access, scholarly communication and the way the landscape of knowledge production is changing in the digital era.”

    Perhaps that in part is due to the increasing complexity of digital scholarly communications?
