I thought I'd give ERL a chance to redeem itself. It didn't.

    Article under review for Environmental Research Letters: Quantifying the consensus on anthropogenic global warming in the literature: A re-analysis - Professor Dr Richard S J Tol
    ID: ERL/482132/COM

    REFEREE'S REPORT

    ============================

    General Comments

    In this Comment letter, the author presents a thorough exploration of many facets of the recent Cook et al. ERL paper. Replication, re-analysis, and investigation of claims made in the literature are among the hallmarks of science, and thus I welcomed the idea of this submission. After reading it a number of times, however, I believe that the paper does not contribute enough novelty to the discussion to move the field forward, has a number of critical flaws, and thus should not be accepted, per ERL's stated guidelines.

    Rather than contribute to the discussion, the paper seems oriented toward casting doubt on the Cook paper, which is not appropriate for a peer-reviewed venue, and it has a number of important flaws. I outline the largest of these below, followed by specific comments and feedback.



    1) Claims not supported by data/results. Many of the claims in the abstract and conclusion are not supported by the author's analyses. Much of the analysis explores methodological choices made by the Cook et al. paper, and often finds differences when a different database, search term, or subset of papers is analyzed, but the larger point is missed entirely: both the Cook authors and this paper's author make assumptions about representativeness, the appropriateness of search terms, and the appropriateness of the fields included in the calculations. These are, in fact, assumptions. It is therefore impossible to claim that the Cook dataset is “unrepresentative” of any larger population, as the alternative scenarios investigated by the author rest on a merely different (not necessarily better or “more true”, and in some cases less likely to yield a good sample) set of assumptions. Regarding the later calculations of consensus, the author finds percentages largely similar to those of the Cook paper and also seems to ignore the self-rated abstract consensus rate, in fact presenting evidence that the Cook paper's main conclusions are quite robust, which is the opposite of what the author claims.



    2) A vast degree of relevant literature ignored. In order to contribute to the discussion, the paper should consider other relevant literature concerning the quantification of consensus, almost none of which is cited in this paper.

    The Cook paper is a larger version of the 2004 Oreskes Science paper, which found incredibly high consensus rates and has stood the test of time. Furthermore, if the Cook conclusions were wildly off, they would disagree with the literature on direct polling of scientists. They do not: the Cook conclusions are almost exactly in line with Doran & Kendall-Zimmerman 2009, Rosenberg et al 2010, and Anderegg et al 2010. These papers report consensus rates between 94% and 98%, completely in line with the Cook findings and even with those presented by the author.


    3) Language is overly polemical and unprofessional in some areas. At times in the introduction and conclusion, the language used is charged and combative, not appropriate for a peer-reviewed article, and reads more like a blog post.

    This does not serve the paper well and reflects poorly on the author.


    4) Other Cook paper findings ignored. This paper does not mention or discuss the author self-ratings presented in the Cook et al. paper at all. These self-ratings are in fact among the strongest data presented in the paper and almost exactly mirror the reported ratings from the Cook author team.



    Specific Comments

    P1L15-19: Almost every one of these abstract claims is an oversimplification, and often a misleading characterization, of what is reported in the analyses of this comment.



    P4L1-3: This analysis has no bearing on the Cook paper's findings. Oversampling the top 50 papers will have very little to no effect on a total population sample of >12,000 papers.
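
    A back-of-the-envelope bound (my own arithmetic, using the >12,000 figure above) illustrates the scale: even if all 50 oversampled papers changed category outright, the headline rate could shift by at most

        \[
        \Delta p \le \frac{50}{12\,000} \approx 0.004,
        \]

    that is, about 0.4 percentage points.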



    P4L7-9: Just because a database has more papers does not mean that it is a better representation of the “true” population of relevant papers. In fact, Figure S4 shows quite clearly that Scopus includes a large number of disciplines that are certainly not relevant to the attribution of climate change. One would, in fact, want to sample earth and planetary sciences quite extensively and not sample “Medicine,” “Material Science,” “Biochemistry/genetics,” or “Business Management and Accounting,” which are more heavily sampled in the Scopus database. Thus, a strong case could be made that this comparison and analysis is quite flawed, as the Web of Science search would seem to be much more relevant to the question at hand, dropping a large number of disciplines irrelevant to that question. The Web of Science sample is therefore not over- or under-sampling the true population; in fact, the analyses presented here indicate that it is a better representation of the true population of relevant papers.



    P4L10-15: This is a good point and worth making.



    P4L16-17: This must be supported with citations from the literature. Otherwise, the statement is not justified. One could easily argue that younger and more obscure journals are also more prone to bypassing or having flawed peer-review processes, as evidenced by the recent explosion of pseudo-journals publishing flawed and incorrect papers, and thus the Web of Science bar could be the more informative one.



    P4L21: Unrepresentative of what? This is a fairly broad and bold statement that is entirely unjustified by the data presented. Their sample is simply a sample, subject to the various constraints and metadata of each database, which the author has explored and pointed out. Pointing out these constraints and their potential to lead to different conclusions is valuable. Broad, normative blanket statements like this are not. In fact, it should be acknowledged that several of the figures in this section actually give one *more* confidence in the appropriateness of Web of Science, rather than less.



    P5: These are useful analyses to include. The section gives no evidence of bias in one direction or another, however, and the title is not appropriate. In large part, these figures suggest mostly that the research was done by human beings, and humans can get fatigued in such a large and daunting research project. Fatigue can lower data quality, but data are rarely perfectly transcribed and rarely meet all assumptions, especially when humans are categorizing and qualitatively assessing literature, and there is no indication that this biased the findings. In fact, all indications, such as the self-ratings by the papers' authors, suggest that the analysis was robust and the findings were not influenced by fatigue or low data quality.



    P6L9-10: A sample size of 2-3 is not enough to say anything meaningful.
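
    To put a number on this (a quick sketch of my own, not from either paper): the 95% Wilson score interval for a binomial proportion shows that even a unanimous 3-of-3 sample is consistent with a true rate anywhere from roughly 44% to 100%.

        # Wilson 95% score interval for k successes in n trials.
        from math import sqrt

        def wilson_interval(k, n, z=1.96):
            """Wilson score confidence interval for a binomial proportion."""
            p = k / n
            denom = 1 + z**2 / n
            center = (p + z**2 / (2 * n)) / denom
            half = (z / denom) * sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
            return center - half, center + half

        print(wilson_interval(3, 3))  # ~(0.44, 1.00): a 3-of-3 sample barely bounds the rate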



    P6L11-12: Again, a completely meaningless and biased sample size. The Cook authors in fact present self-ratings by the papers' authors and arrive at a 97.2% consensus by author self-ratings. Given that 1,189 authors responded, seven who disagree with their papers' ratings are essentially trivial and cherry-picked.
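
    For scale (my own arithmetic, from the response figure above):

        \[
        \frac{7}{1189} \approx 0.6\%.
        \]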



    P7L9-14: This is largely true. In fact, if one accepts this logic, it discounts much of the analysis presented earlier in this comment using Scopus, as well as several of the supplemental figures (e.g. the 25 top-cited papers, the 50 top-published authors, etc.).



    P7: These tables show quite clearly that the major conclusions of the Cook paper do in fact stand (consensus rates of 95-99%), that consensus is even higher among explicit endorsements, and that this aligns with the self-rated abstract findings presented in Cook, which are omitted from the discussion in this comment.



    P8L11-17: This is perhaps useful to point out, but not as useful as the author presents. Science works by going after new findings and exploring around the edges. Because the fingerprint of human impact on the climate is well established and the overwhelming evidence is clearly laid out in recent IPCC reports, the focus of research naturally shifts to other developing areas (such as how much it will warm, based on paleoclimate studies).



    P9L3-7: These statements are assumptions and value judgments by the author. The author disagrees with how Cook et al. performed some of the analyses, in their choice of database, search terms, etc., but the author's disagreement with these choices does not mean that the sample was unrepresentative.



    P9L12: No. In fact, the analysis in this paper shows quite clearly that all of the major points in Cook et al. do largely stand. Furthermore, this comment omits other data presented by Cook et al., such as the authors' self-rated consensus.



    P9L8-17: This section is not supported by the data presented and is neither professional nor appropriate for a peer-reviewed publication. Furthermore, aspersions of secrecy and of holding data back seem largely unjustified, as a quick Google search reveals that much of the data is available online (http://www.skepticalscience.com/tcp.php?t=home), including interactive ways to replicate the research. This is far more open and transparent than the vast majority of scientific papers. In fact, given how much of the paper's findings were replicated and checked in the analyses here, I would say the author has no grounds to cast aspersions of data-hiding and secrecy.



    Table 3: Why is the self-rated consensus not reported in this table as an additional column? Its omission is glaring.





    Article under review for Environmental Research Letters: Quantifying the consensus on anthropogenic global warming in the literature: A re-analysis - Professor Dr Richard S J Tol

    ID: ERL/482132/COM

    BOARD MEMBER'S REPORT

    ============================

    The paper is a comment on Cook et al. (2013). The author makes, in essence, the following main criticisms of the paper.



    1. Based on the (unsupported) claim that “Nowadays, climate change means global climate change”, Tol suggests the search term “climate change” would have been more appropriate for the survey, instead of “global climate change”. While there is always a choice to be made about which search term one uses in such a survey, I think either choice would have been valid, and the pros and cons of each are debatable. Had Cook et al. used “climate change”, they could have been criticised for casting too large a net, one that could have caught a host of papers dealing with local and regional issues. The key issue is that any publication documents which terms were actually used, which was the case in Cook et al. Other authors are of course free to publish their own survey findings using other search terms – which Tol does not do here, despite calling his manuscript “a reanalysis” of the consensus. He does not present his own analysis of the consensus but merely criticises how Cook et al. conducted their survey.



    2. Tol argues that using a different publication database (Scopus) instead of Web of Science might have given different results. That may or may not be true – but it is again a point that can only be addressed if another group of scientists actually performs a similar study based on another publication database. Cook et al. cannot be faulted for using Web of Science; that is a perfectly valid choice. I generally prefer Web of Science over Scopus in my own literature searches because I find that Scopus includes sources of dubious quality which do not always conform to what I would call peer-reviewed journal articles. Had Cook et al. used Scopus, they could have been criticised for that choice.



    3. Tol discusses problems that can arise in the rating process, e.g. rater fatigue. That certainly is a potential problem – human raters are not objective algorithms, and in this kind of rating exercise a certain subjective element in the decisions is as inevitable as it is obvious. Tol presents no evidence that this is a large problem that would significantly alter the results, though; to the contrary, the numbers he presents suggest it is a small problem that would not alter the conclusion of an overwhelming consensus. Thus I think this rating subjectivity is a caveat to mention when discussing and interpreting the results, but no cause for a formal Comment. It also remains unclear why Tol headed this section “data errors”; the issue of subjective classification is not one of “data errors”.



    4. Tol makes an interesting point about how the consensus rate depends on subject and argues that policy papers should have been rated as “neutral” rather than as endorsing anthropogenic warming. Cook et al. rated these as implicit endorsements of anthropogenic warming, since in its absence CO2 emissions reductions don't make sense. I would agree with Cook et al. here. A more valid question is whether the views of a paper on climate policy are interesting – perhaps not; perhaps one should only be interested in the views of natural scientists on this question. This is a matter of opinion, though, and certainly not one of “classification errors”, as Tol heads this section.



    5. The final paragraph on “Trends” makes the same point again – if one reclassified all adaptation and mitigation papers then the results would be different. But I don’t think these papers were incorrectly classified; it is merely a matter of opinion whether one would want to include the authors of these papers in the expert community that one surveys, or whether one finds their views to be irrelevant, as Tol apparently does. Having a different opinion on this point is by itself not a reason to publish a formal Comment.



    In the final paragraph Tol writes: “There is no doubt in my mind that the literature on climate change overwhelmingly supports the hypothesis that climate change is caused by humans. I have very little reason to doubt that the consensus is indeed correct.”



    Indeed Tol provides no reason to question the main conclusions of Cook et al. He merely provides his opinions on where he would have conducted this survey differently and in his view better – and he is free to do just that. But he has not identified serious methodological flaws in Cook et al. that would justify the publication of a Comment.

     
    Seventh draft. Main changes: Inclusion of paper ratings; further tests of patterns in data.

    Update on data (22 July): John Cook has now been asked (once, around July 7) by the director of the Global Change Institute, University of Queensland, and (three times, first around June 20) by the editor of Environmental Research Letters to release all of his data. I have now asked him five times (first on May 31). Cook has released only a little more data: the author ratings. The actual data confirm what is in the paper: paper ratings and abstract ratings strongly disagree with each other.

    Sixth draft

    Rejection letter by ERL:
    Article under review for Environmental Research Letters
    Comment on: "Quantifying the consensus on anthropogenic global warming in the literature" - Professor Dr Richard S J Tol
    ID: ERL/477057/COM
    BOARD MEMBER'S REPORT
    ============================
    The comment raises a number of issues with the recent study by Cook et al. It is written in a rather opinionated style, seen e.g. in the entire introductory section making political points, in off-hand remarks like labelling Skeptical Science a “polemic blog”, and in sweeping generalisations like the claim that the paper “may strengthen the belief that all is not well in climate research”.
    It reads more like a blog post than a scientific comment.

    The specification for ERL comments is:
    “A Comment in Environmental Research Letters should make a real contribution to the development of the subject, raising important issues about errors, controversial points or misleading results in work published in the journal recently.”

    I do not think this manuscript satisfies those criteria. It is in large part an opinion piece; in other parts it suggests better ways of analysing the published literature (e.g. using a larger database rather than just Web of Science). These are all valid points for the further discussion following the publication of a paper – colleagues will have different opinions on interpreting the results or on how this could have been done better, and it is perfectly valid to express these opinions and to go ahead and actually do the research better in order to advance the field.

    I do not see that the submission has identified any clear errors in the Cook et al. paper that would call its conclusions into question – in fact the author agrees that the consensus documented by Cook et al. exists. The author offers much speculation (e.g. about raters perhaps getting tired), which has no place in the scientific literature, and he offers minor corrections – e.g. that the endorsement level should not be 98% but 97.6% if only explicit endorsements are counted. He spends much time on the issue of implicit endorsements, about which one can of course have different opinions, but the issue is clearly stated in the Cook et al. paper, so this does not call for a published comment. He also offers an alternative interpretation of the trends – which is fine; it is always possible to interpret data differently.

    All these things are valid issues for the usual discourse that exists in many informal avenues like conferences or blogs, but they do not constitute material for a formal comment.

    The editor-in-chief has an interesting blog on the paper.

    As submitted to Environmental Research Letters; data

    Fourth draft, editing and references

    Third draft, with proper tests for internal consistency

    Second draft, with more validity tests and Scopus v Web of Science explained.

    First draft of comment on Cook et al. paper in ERL 2013.