[Update 29/01/2009: The homology correction is also applied to the text-mining channel starting with STRING 8.]
To prevent gene duplications from producing spurious functional associations, homologous proteins are down-weighted in the co-occurrence and text-mining channels. You will notice this on the score summary page of a link, and in the scores themselves if you work with our SQL dumps.
Here's an example: The co-occurrence view looks fine for this pair of proteins.
However, the total score of 0.204 is less than the co-occurrence score:
The reason for this is that the proteins have some sequence similarity and are therefore down-weighted according to this formula:
effective co-occurrence score = co-occurrence score * (1 - homology score)
(The homology score is calculated from the bit score of the alignment.) In this case:
0.204 ≈ 0.478 * ( 1 - 0.572 )
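For readers who want to reproduce the correction on scores taken from the SQL dumps, the formula can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name `effective_score` and the range check are assumptions made for this example and are not part of STRING itself.

```python
def effective_score(channel_score: float, homology_score: float) -> float:
    """Down-weight a channel score (co-occurrence or text-mining)
    by the homology score of the protein pair.

    Hypothetical helper for illustration; both scores are assumed
    to be probabilities in the range [0, 1].
    """
    if not (0.0 <= channel_score <= 1.0 and 0.0 <= homology_score <= 1.0):
        raise ValueError("scores are expected to lie in [0, 1]")
    return channel_score * (1.0 - homology_score)


# The example pair from this post: co-occurrence 0.478, homology 0.572.
corrected = effective_score(0.478, 0.572)
print(f"{corrected:.4f}")
```

The unrounded product is about 0.2046, in line with the 0.204 shown on the summary page; the small difference presumably comes from rounding of the displayed values.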
Thursday, January 29, 2009
Homology correction of co-occurrence and text-mining scores (updated)
Labels:
documentation,
string