I went to the BioHackathon 2008 in Tokyo and worked on an API for STRING and STITCH. If you are thinking about using STRING or STITCH through an API and find features missing, please get in touch with us, either via the comments or by e-mail (e.g. mkuhn//embl.de).
Here's what we have to offer so far:
REST interface
The URL patterns are:
http://stitch.embl.de/api/[format]/[request]?[parameters]
http://string.embl.de/api/[format]/[request]?[parameters]
Possible formats:
- tsv: tab-separated values, with a header line
- tsv-no-header: as above, but no header
- json: JSON format either as a list of hashes/dictionaries, or as a plain list (if there is only one value to be returned per record)
- psi-mi: the interaction network is available in PSI-MI 2.5 XML format
- psi-mi-tab: there is also a tab-delimited form, modeled after the IntAct specification. This is easier to parse, but contains less information than the XML format.
- url: return the URL of the network image
Possible requests:
- abstracts: return a list of abstracts that contain the query item
- abstractsList: return a list of abstracts that contain any of the query items
- interactions: return an interaction network in PSI-MI 2.5 format (PSI-MI is currently the only format for interactions. Perhaps the PSI-MI tab-delimited form would also make sense? I don't know what a JSON form should look like.)
- interactionsList: same as above, but for a list of identifiers
- interactors: return a list of interaction partners for the query item
- interactorsList: return a list of interaction partners for any of the query items
- resolve: return the list of items that match (in name or identifier) the query item
- network / networkList: in conjunction with the "url" format, return the URL to the network
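As a rough sketch of how the pieces fit together, here is a small Python helper that assembles a request URL from a format, a request name and some parameters. The function name build_api_url is made up for this illustration and is not part of the API; the path layout is taken from the example queries in this post.

```python
from urllib.parse import quote, urlencode


def build_api_url(base, fmt, request, **params):
    """Assemble base/api/<format>/<request>?<parameters>.

    build_api_url is a hypothetical helper for illustration only.
    """
    # quote_via=quote encodes spaces as %20 rather than "+".
    query = urlencode(params, quote_via=quote)
    return f"{base}/api/{fmt}/{request}?{query}"


url = build_api_url(
    "http://string.embl.de",
    "tsv",
    "resolve",
    identifier="dopamine receptor",
    species=9606,
)
print(url)
```

Swapping the base for http://stitch.embl.de should give the corresponding STITCH query, since both services use the same specification.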
Examples
To find out which proteins match the description "dopamine receptor" in human, you can use this query:
http://stitch.embl.de/api/tsv/resolve?identifier=dopamine%20receptor&species=9606
http://string.embl.de/api/tsv/resolve?identifier=dopamine%20receptor&species=9606
This gives you a lot of additional info. If you just want to get the list of STRING identifiers, you can alter the query a bit:
http://stitch.embl.de/api/tsv-no-header/resolve?identifier=dopamine%20receptor&species=9606&format=only-ids
http://string.embl.de/api/tsv-no-header/resolve?identifier=dopamine%20receptor&species=9606&format=only-ids
Now, you'll only receive a bare list of ids that you could pipe into other STRING API functions.
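If you want to handle that bare list in a script, something like the following might do. It assumes the response body holds one identifier per line; the sample body below is invented for illustration and does not show real identifiers, and parse_id_list is a hypothetical helper name.

```python
def parse_id_list(body):
    """Split a response body into identifiers, one per non-empty line.

    Assumes the only-ids output carries a single identifier per line.
    """
    # splitlines() handles \n, \r and \r\n line endings alike.
    return [line.strip() for line in body.splitlines() if line.strip()]


# Invented sample body; a real response will contain real identifiers.
sample_body = "9606.EXAMPLE_ID_1\n9606.EXAMPLE_ID_2\n\n"
ids = parse_id_list(sample_body)
print(ids)
```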
To illustrate the difference between normal and "list" queries:
http://stitch.embl.de/api/tsv/interactors?identifier=DRD1_HUMAN
http://stitch.embl.de/api/tsv/interactorsList?identifiers=DRD1_HUMAN%0DDRD2_HUMAN
http://string.embl.de/api/tsv/interactors?identifier=DRD1_HUMAN
http://string.embl.de/api/tsv/interactorsList?identifiers=DRD1_HUMAN%0DDRD2_HUMAN
In the second case, the identifiers parameter contains a list of items separated by newline characters (URL-encoded as %0A or %0D).
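A minimal Python sketch of building such a "list" query: the identifiers are joined with a carriage return and then percent-encoded, which yields the %0D separator used in the examples. The helper name build_list_query is an assumption made for this illustration.

```python
from urllib.parse import quote


def build_list_query(base, fmt, request, identifiers):
    """Build a "list" request URL with %0D-separated identifiers.

    build_list_query is a hypothetical helper for illustration only.
    """
    # Join with a carriage return; quote() turns it into %0D.
    joined = "\r".join(identifiers)
    return f"{base}/api/{fmt}/{request}?identifiers={quote(joined, safe='')}"


list_url = build_list_query(
    "http://stitch.embl.de", "tsv", "interactorsList",
    ["DRD1_HUMAN", "DRD2_HUMAN"],
)
print(list_url)
```

Joining with "\n" instead would produce the %0A variant.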
SOAP / Taverna
In a separate post, I've described an example Taverna workflow. As for SOAP integration, I hope that the Soaplab interface works...
Obligatory beta notice
Like all good things these days, this is still in beta. Internally, everything in fact runs on our beta server; I'm just making it accessible via the normal STITCH domain to expose it to the web. Therefore, the API might change, be down, etc. until STITCH 2 / STRING 8 comes out.
Updates
03.03.2008: Added clarification – PSI-MI is currently the only interactions format.
04.03.2008: Fixed typo – it's "interactorsList"
12.03.2008: Add psi-mi-tab format
19.05.2008: Add STRING API (with same specification)
08.07.2008: Add API for generating network images
16.03.2009: Enabled interactionsList