Thursday, February 19, 2009

Known Issues in STRING version 8.0

With every STRING version so far, we found the last remaining bugs only after releasing it to users. Users often email us about problems, and sometimes the cause is indeed an error on our side. This is good (we think), because each bug found is a bug fixed - albeit usually only in the next release.

So far, this is what we have found in release 8.0:

a) Some of our text-mining links do not show up in the corresponding evidence viewer. The links themselves are still correct, but for technical reasons the underlying text cannot be retrieved and displayed. This happens because we introduced a new feature that recognizes generic 'family' names for gene groups (like 'WNTs' for the various homologous Wnt proteins). Within reasonable limits, such ambiguous names are now expanded to the individual protein members. However, we forgot to update the code of the text-viewer to reflect this ... we will do so in the next version.
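
To illustrate the expansion step described above, here is a minimal Python sketch. It is not the actual STRING text-mining code: the family table, the size cut-off and the function name are all made up for this example.

```python
import re

# Hypothetical family table; the real STRING dictionary is not shown in this post.
FAMILY_MEMBERS = {
    "WNTs": ["WNT1", "WNT2", "WNT3A"],
}

# Assumed cut-off standing in for "within reasonable limits".
MAX_FAMILY_SIZE = 20


def expand_mentions(text):
    """Return (matched_name, offset, protein) for each family name found in text."""
    hits = []
    for family, members in FAMILY_MEMBERS.items():
        if len(members) > MAX_FAMILY_SIZE:
            continue  # too ambiguous to expand
        pattern = r"\b" + re.escape(family) + r"\b"
        for match in re.finditer(pattern, text):
            for protein in members:
                # Keep the original name and its position with every expanded hit.
                hits.append((match.group(0), match.start(), protein))
    return hits
```

In this sketch each expanded hit carries the matched family name and its offset, so a viewer could in principle trace an individual protein back to the sentence that mentioned the family.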

b) Unfortunately, some of the prokaryotic genomes in this release are incomplete - in 43 cases we're missing a second (or third) minor chromosome. This was caused by a misunderstanding when parsing files from the RefSeq database: RefSeq provides an overview file that lists only one chromosome for each prokaryote, and we mistook that file for the full listing. Again, this will be fixed in the next release of STRING (on which we are already working). We're also adding a new entry to our test suite that should prevent this type of error in the future - we will be checking the final gene counts of all organisms for consistency and comparing these counts against an external reference. Below is a list of affected organisms in the current release; if you're working with any of these, we recommend you continue using version 7.1 of STRING for now.

Luckily, no major model organisms are affected!!

Agrobacterium tumefaciens str. C58
Brucella abortus biovar 1 str. 9-941
Brucella melitensis 16M
Brucella melitensis biovar Abortus 2308
Brucella ovis ATCC 25840
Brucella suis 1330
Burkholderia ambifaria AMMD
Burkholderia cenocepacia AU 1054
Burkholderia cenocepacia HI2424
Burkholderia mallei ATCC 23344
Burkholderia mallei NCTC 10229
Burkholderia mallei NCTC 10247
Burkholderia mallei SAVP1
Burkholderia pseudomallei 1106a
Burkholderia pseudomallei 1710b
Burkholderia pseudomallei 668
Burkholderia pseudomallei K96243
Burkholderia sp. 383
Burkholderia thailandensis E264
Burkholderia vietnamiensis G4
Burkholderia xenovorans LB400
Deinococcus radiodurans R1
Haloarcula marismortui ATCC 43049
Leptospira borgpetersenii serovar Hardjo-bovis JB197
Leptospira borgpetersenii serovar Hardjo-bovis L550
Leptospira interrogans serovar Copenhageni str. Fiocruz L1-130
Leptospira interrogans serovar Lai str. 56601
Ochrobactrum anthropi ATCC 49188
Paracoccus denitrificans PD1222
Photobacterium profundum SS9
Pseudoalteromonas haloplanktis TAC125
Ralstonia eutropha H16
Ralstonia eutropha JMP134
Ralstonia metallidurans CH34
Rhodobacter sphaeroides 2.4.1
Rhodobacter sphaeroides ATCC 17029
Vibrio cholerae O1 biovar eltor str. N16961
Vibrio cholerae O395
Vibrio fischeri ES114
Vibrio harveyi ATCC BAA-1116
Vibrio parahaemolyticus RIMD 2210633
Vibrio vulnificus CMCP6
Vibrio vulnificus YJ016
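
The gene-count check mentioned under (b) could look roughly like the Python sketch below. This is only an illustration and not our actual test suite: the function name, the 5% tolerance and the dictionary inputs are assumptions made for the example.

```python
def find_suspect_genomes(parsed_counts, reference_counts, tolerance=0.05):
    """Return (organism, found, expected) for genomes that look incomplete.

    parsed_counts:    organism name -> number of genes we imported
    reference_counts: organism name -> number of genes in an external reference
    tolerance:        assumed allowed relative shortfall before flagging
    """
    suspects = []
    for organism, expected in sorted(reference_counts.items()):
        # An organism missing from our own data counts as zero genes.
        found = parsed_counts.get(organism, 0)
        if found < expected * (1 - tolerance):
            suspects.append((organism, found, expected))
    return suspects


# Placeholder organisms and invented numbers, for illustration only.
parsed = {"organism_A": 2700, "organism_B": 4100}
reference = {"organism_A": 3800, "organism_B": 4150}

for organism, found, expected in find_suspect_genomes(parsed, reference):
    print(f"{organism}: {found} genes parsed, {expected} expected")
```

A genome that lost an entire chromosome should fall well below the reference count, so even a loose tolerance like the one assumed here would flag it.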

That's it for known issues so far. But do keep those emails coming - the feedback is very valuable!!