Rfam, RNA Biology and Wikipedia in the news

February 20, 2009

Some of you may have noticed the attention that the unholy alliance between Rfam, RNA Biology and Wikipedia has been receiving recently. I thought it might be worthwhile posting a more detailed overview of how this happened and what we’re planning, and responding to the major criticisms.

In September 2008, Peter Stadler wrote to me to ask whether we had considered starting an RNA families journal or a dedicated track in an existing journal. He had even discussed the idea with Renee Schroeder, the editor of the journal RNA Biology. After a brief Skype conference between Peter, Renee, Alex Bateman and myself, we all decided this was a fantastic idea and, somehow, I found myself an editor for the RNA families track of RNA Biology.

Now, the chief aim of the track was to provide clear guidelines for authors, such that their latest RNA families work could easily be incorporated into existing bioinformatic databases, such as Rfam, with the added incentive that they would get a publication for their CV in return for all the hard work. This has the bonus that it saves a LOT of work for Rfam curators, without creating too much extra work for authors. As Alex and I mention in our editorial [1], we frequently resort to typing in sequences and structures by hand, simply due to a lack of standards when it comes to publishing RNA sequences and structures. The other hurdle we’ve faced is getting scientists to contribute to the Wikipedia entries that Rfam now uses for annotation of RNA families [2]. This new track should neatly help us with both problems.

In the meantime, Peter had mentioned in his email that he had been working on the SmY ncRNA family, and that this might be appropriate for an exemplar article for the track. I forwarded his email to all the members of the Rfam consortium. Sean Eddy responded enthusiastically to the idea and mentioned that he and Tom Jones had also been working on SmY. Fortunately, Peter and Sean’s groups worked together, pooling their respective resources to produce a fantastic exemplar article for the RNA Families track [3].

Media response

The Sanger press team then swung into high gear and contacted people at NatureNews. The resulting press release was indeed picked up by NatureNews and then by Wikinews, prompting loads of interesting comments and blog posts. I have linked to a few of these below.


The majority of the comments on the articles and blogs I have seen are rather positive about the new track and our unique requirements. However, there have been a few criticisms, which have largely been dealt with by the comments of others on the NatureNews site. These are the main ones:

1. Sam Hocevar asked:

“How is the Wikipedia rule about not allowing original research dealt with?”

This was the most common criticism of requiring a Wikipedia article for each RNA families article. It has been answered very well elsewhere, but I’ll try again here. Most of the articles are going to be very review-like in nature and hence will suit a Wikipedia-style article very well. Those articles that do indeed contain original research can remain unpublished but available in a user’s workspace until the corresponding RNA Biology article is published. Therefore this policy in no way conflicts with WP:NOR.

2. Finn Årup Nielsen points out that apart from the issue of “no original research” consensus policy on Wikipedia, as well as the problem of vandalism, the adding of scientific information in Wikipedia may face another problem: notability.

This is an interesting point and something that will naturally have to be dealt with on a case-by-case basis. I envisage that most of the Wikipedia articles will contain multiple references to journal articles in reliable sources, and hence will satisfy the requirement of notability.

Other families may be less notable, and will instead fall into a certain class or type of RNA. In this case we may ask for an update or review of the Wikipedia article corresponding to that type, e.g. snoRNA or miRNA.

3. Vandalism!

The vast majority of vandalism in Wikipedia seems to be of the juvenile, high-school brand of humour: insults or comments of a lewd nature, presumably followed by lots of sniggering after the save button is pressed. This form of vandalism is extremely innocuous, as the Wikipedia community and automated bots usually catch these edits within seconds and revert the article to its unadulterated form. In fact, on rainy nights it can be a lot of fun to get in touch with your inner high schooler by going to your favourite Wikipedia page and browsing the history of reverts.

4. Gabe Simon wrote:

“I personally think this is inappropriate. In my humble opinion, any encyclopedia (including Wikipedia) should aim to summarize widely accepted beliefs in a field… much like a textbook. I would argue that many if not most articles published in peer-reviewed journals end up getting ignored or, worse, refuted as time goes by.” [snip] “For what it’s worth I’ve written/edited many scientific articles and I think Wiki getting spammed with piddling results will be a MUCH bigger problem than vandalism.”

Tim Vickers had a great response to this that I don’t think I could ever improve upon:

“How close you get to the findings of current scientific research depends on which article you are talking about. You are right that the article on ‘RNA’ should be written like a good general textbook and only discuss the basics that we all agree on. ‘Non-coding RNA’ can go into more detail on this specialised topic, and might mention some current research, but still be based on material taken from reviews and textbooks. Finally, the article on ‘RNase MRP’ could summarise the state of current research on this particular non-coding RNA and largely use journal articles as references. The articles in Wikipedia therefore form a chain leading from the most general introductions to the most specific and technical – the amount of primary publications it is appropriate to reference depends on where in this chain you are contributing.”

5. The articles themselves should be free.

The articles are indeed all free to publish and open access. Please read the guidelines properly before making this comment.

6. In a BioInform article, Masanori Arita was quoted at length criticising wikis in general. His main issue seemed to be that

“wiki pages are independent of each other, so that changes made on any one page are not replicated on pages with related information.”

Now I’m no Wikipedia expert, but I imagine one could use something like Templates or simply link to the relevant page that contains the core information. For example, each page about a new RNA family should link to the ncRNA article, which gives further core information and in turn links to the RNA article. This is exactly the chain Tim describes above. To be honest, I’m not sure this criticism makes much sense in this context. If you want normalised hierarchical information then use a normalised hierarchical database. I very much doubt that encyclopedic information will ever fit neatly into this sort of schema.


Finally, these are the very early days of an interesting and fragile idea. There are any number of things that could go wrong that we haven’t foreseen. What we need for this to work is input from the RNA community. We’ve already had some interest from a few noted researchers, so hopefully we’ll see a few more articles and families like SmY RNA. If you have anything you think might be interesting, feel free to drop me an email at pg5@sanger.ac.uk.


[1] Paul P. Gardner and Alex G. Bateman “A home for RNA families at RNA Biology”. RNA Biology, January 2009

[2] J. Daub, P. P. Gardner, J. Tate, D. Ramskold, M. Manske, W. G. Scott, Z. Weinberg, S. Griffiths-Jones & A. Bateman (2008): The RNA WikiProject: Community annotation of RNA families. RNA (New York, N.Y.)

[3] Thomas A. Jones, Wolfgang Otto, Manja Marz, Sean R. Eddy, Peter F. Stadler (2009): A survey of nematode SmY RNAs. RNA Biology 6:1; 5-8

3 Responses to “Rfam, RNA Biology and Wikipedia in the news”

  1. [...] have taken data from the Rfam database [3], and put it all into regular wikipedia. This project got quite a lot of media attention back in February. In this case, the primary advantages of “letting go of data” by [...]

  2. Andrea Says:

    This is a very novel approach to advancing open science and open data in a more visible form – bravo! I’m looking forward to hearing more about how it works out.

    Promoting useful contributions can be difficult, but it seems that you’ve hit the nail on the head by engineering a system that requires very little overhead by contributors and pays off very well in the academic currency of citations. This sort of incentive-centered design, which convinces individuals to act in their own rational self-interest to the betterment of the commons, leverages the social and technical resources in a way that seems almost guaranteed to succeed.

    • Paul Gardner Says:

      Many thanks for your kind comments Andrea. The new track with RNA Biology has been going well. We are getting a steady trickle of useful and interesting contributions from all over the RNA community. I hope it continues!

