Tuesday, March 16, 2010

Combining scores the right way

With STRING version 8, we began applying a prior correction to the scores. That is, we consider how close each score is to the background rate of about 6% for protein–protein interactions, and we remove this random expectation from each score before combining them. (At the end, we add the random contribution back in, so that the result is consistent with the case where only a single evidence channel contributes to the score.)

For some applications, you might want to remove certain evidence types from the STRING download files. To compute a new combined score, you can now use the same prior correction we've been using internally. We have started a new Bitbucket repository and added a script that discards channels from the full STRING download files.
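
To make the idea concrete, here is a minimal sketch of the prior correction described above. It is not the script from the repository. It assumes that scores are probabilities between 0 and 1, that the prior is 0.063 (our stand-in for "about 6%"), and that channels are combined as independent evidence. The function names and channel names are made up for illustration, and any channel-specific adjustments the real pipeline applies are left out.

```python
PRIOR = 0.063  # assumed value for the "about 6%" background rate


def remove_prior(score, prior=PRIOR):
    """Strip the random expectation from one channel score (floored at 0)."""
    return max(score - prior, 0.0) / (1.0 - prior)


def combine_scores(channel_scores, discard=(), prior=PRIOR):
    """Combine per-channel probabilities, skipping the channels in `discard`.

    `channel_scores` maps a (hypothetical) channel name to a score in [0, 1].
    """
    p_none = 1.0  # probability that no remaining channel supports the link
    for channel, score in channel_scores.items():
        if channel in discard:
            continue
        p_none *= 1.0 - remove_prior(score, prior)
    # Put the random contribution back in once, at the very end.
    return (1.0 - p_none) * (1.0 - prior) + prior


if __name__ == "__main__":
    scores = {"neighborhood": 0.5, "coexpression": 0.5, "textmining": 0.8}
    print(combine_scores(scores))
    print(combine_scores(scores, discard={"textmining"}))
```

With only one channel left, the function returns that channel's original score, which is the consistency property mentioned above. If every channel is discarded, this sketch returns the prior itself.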

5 comments:

  1. Hello,

    I observed a significant difference between the listed "combined scores" in the file 'COG.links.detailed.v8.txt' and scores that I compute myself with the formula from your publication in 2005.

    I guess this difference is also explained by this new modification?

    How can I compute "combined scores" with discarded channels for the file 'COG.links.detailed.v8.txt'? It seems that your published scripts for correction cannot be applied to this file, but only to the file containing protein links.

    Best regards,
    Sebastian

  2. The compute_scores.py module also contains a function for correcting scores for orthologous groups. I've quickly added another script for COGs to the repository.

  3. Thank you very much! Now I also found the method "compute_combined_score_orthgroup_orthgroup" with the prior correction inside which I adopted for my script.

    But the new script makes things easier for me.

    Thanks a lot!

  4. Just wanted to let you know that STRING is down today.

  5. Hi, thanks for letting us know. We restarted the server and it seems to be fine now.
