Files in this Data Supplement:
Supporting Table 2
Table 2. A hierarchy of gene-pair characteristics. Characteristics are
followed by the number of gene pairs known to possess that
characteristic. As in the Gene Ontology, if a gene pair possesses a
characteristic, it possesses all parents of that characteristic.
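The inheritance rule in the Table 2 legend can be illustrated with a minimal sketch. It assumes the hierarchy is available as a mapping from each characteristic to its direct parents; the function names and the example characteristic labels used in the test are hypothetical and are not taken from the supporting table itself.

```python
def with_ancestors(characteristic, parents):
    """Return the characteristic together with all of its ancestors.

    `parents` maps a characteristic to a list of its direct parents
    (an assumed representation of the hierarchy).
    """
    seen = set()
    stack = [characteristic]
    while stack:
        term = stack.pop()
        if term in seen:
            continue
        seen.add(term)
        stack.extend(parents.get(term, ()))
    return seen


def count_pairs(annotations, parents):
    """Count gene pairs per characteristic after propagating to parents.

    `annotations` maps a gene pair to the characteristics directly
    assigned to it; each pair is counted once per characteristic.
    """
    counts = {}
    for terms in annotations.values():
        closure = set()
        for term in terms:
            closure |= with_ancestors(term, parents)
        for term in closure:
            counts[term] = counts.get(term, 0) + 1
    return counts
```

Under this reading, the count beside a parent characteristic is never smaller than the count beside any of its children.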
Supporting Table 3
Table 3. SSL interactions used for cross-validation and in training for
experimental validation. On the left are the SGD identification number,
systematic name, and standard name of the first gene, followed by the same
information for the second gene. On the right are the query genes and
array genes that were paired to assess SSL interaction.
Supporting Table 4
Table 4. Performance in cross-validation. Score threshold, sensitivity,
false positive rate, number passing threshold, and success rate. Total
true SSL interactions, 3,685; total non-SSL pairs, 688,433.
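The columns named in the Table 4 legend can be computed from scored gene pairs roughly as follows. This is a sketch under stated assumptions, not the authors' code: a pair is taken to pass when its score is at or above the threshold, and success rate is taken to mean the fraction of passing pairs that are true SSL interactions. Neither definition is spelled out in the legend, and the function name is hypothetical.

```python
def performance_at_threshold(scores, labels, threshold):
    """Summarize performance at one score threshold.

    `labels` holds 1 for a true SSL pair and 0 otherwise.
    Assumption: a pair passes when score >= threshold.
    Assumption: success rate = true SSL pairs among passing pairs.
    """
    passing = [label for score, label in zip(scores, labels) if score >= threshold]
    true_pos = sum(passing)
    false_pos = len(passing) - true_pos
    total_pos = sum(labels)
    total_neg = len(labels) - total_pos
    return {
        "threshold": threshold,
        "sensitivity": true_pos / total_pos if total_pos else 0.0,
        "false_positive_rate": false_pos / total_neg if total_neg else 0.0,
        "number_passing": len(passing),
        "success_rate": true_pos / len(passing) if passing else 0.0,
    }
```

With the totals quoted in the legend, the denominators for sensitivity and false positive rate would be 3,685 and 688,433, respectively.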
Supporting Figure 4
Fig. 4. (a-d) Cross-validation trees with the five top-scoring leaves
labeled by rank. Left and right arrows point to gene pairs with and
without, respectively, the characteristic that labels the node from
which the arrow points. Arrowhead size is proportional to the fraction
of gene pairs in the parent node that were assigned to each daughter
node. Nodes with higher (lower) fractions of SSL gene pairs than the
root are red (blue). Color saturation reflects the entropy of gene
pairs, with respect to SSL interaction, in a node relative to the root.
Each node is labeled with the number of its gene pairs that are (+) or
are not (-) SSL.
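The node coloring described in the Fig. 4 legend can be sketched from the (+) and (-) counts on each node. The legend does not give the exact mapping from entropy to saturation, so the code below makes an assumption: it reports the hue from the SSL fraction relative to the root and uses the absolute difference in binary entropy between node and root as a stand-in for saturation. The function names are hypothetical.

```python
import math


def binary_entropy(n_pos, n_neg):
    """Shannon entropy (in bits) of the SSL / non-SSL split in a node."""
    total = n_pos + n_neg
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in (n_pos, n_neg):
        if count:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


def node_color(n_pos, n_neg, root_pos, root_neg):
    """Return (hue, entropy difference from root) for a non-empty node.

    Hue follows the legend: red for a higher SSL fraction than the
    root, blue for a lower one. The entropy difference is an assumed
    proxy for color saturation.
    """
    node_frac = n_pos / (n_pos + n_neg)
    root_frac = root_pos / (root_pos + root_neg)
    if node_frac > root_frac:
        hue = "red"
    elif node_frac < root_frac:
        hue = "blue"
    else:
        hue = "none"
    delta = abs(binary_entropy(n_pos, n_neg) - binary_entropy(root_pos, root_neg))
    return hue, delta
```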
Supporting Figure 5
Fig. 5. Method performance when omitting various types of information.
Supporting Figure 6
Fig. 6. The network of correctly predicted interactions (red) at a
sensitivity of 25%, among experimentally verified SSL interactions (red
edges and blue edges) from eight synthetic genetic array (SGA) screens
(35,996 tested pairs). Nodes of the eight query genes are enlarged.
Supporting Table 5
Table 5. Performance on 35,996 blinded predictions. Score threshold,
sensitivity, false positive rate, number passing threshold, and success
rate.
Supporting Figure 7
Fig. 7. Tree used to generate predictions validated by experimental
results. See description of tree in Fig. 4 legend.
Supporting Figure 8
Fig. 8. SSL gene pairs from the highest-scoring leaves of the decision
tree may belong to compensating pathways. When gene 1 and gene 2 are
lost, synthetic sickness or lethality may result, because both
compensating pathways are impaired (a) or because two of three (or more)
compensating pathways are impaired (b). Blue circles represent genes.
"gene 1" and "gene 2" represent a query gene pair from the first (a) or
third (b) leaf. H indicates sequence homology; P indicates physical
interaction; S indicates SSL interaction; X indicates correlated mRNA
expression.
Supporting Table 6
Table 6. List of the top 5,000 predicted SSL gene pairs. From left to right,
prediction number, the SGD identification number of the first gene, its
systematic name, standard name, and MIPS function, followed by the same
information for the second gene, followed by the score of the gene pair,
and the rank of the leaf to which it mapped.
Supporting Table 7
Table 7. Top predicted pairs composed of genes without SSL interactions
in the training set. From left to right, SGD identification number of
the first gene, its systematic name, and its standard name, followed by
the same information for the second gene, followed by the score of the
gene pair and whether it was annotated as SSL in YPD.