Supporting Information for:

SL Wong, LV Zhang, AHY Tong, Z Li, DS Goldberg, OD King, G Lesage, M Vidal, B Andrews, H Bussey, C Boone, and FP Roth. Combining biological networks to predict genetic interactions. Proceedings of the National Academy of Sciences 101(44): 15682-15687 (2004).

Files in this Data Supplement:

Supporting Text

Supporting Table 2
Table 2. A hierarchy of gene-pair characteristics. Characteristics are followed by the number of gene pairs known to possess that characteristic. As in the Gene Ontology, if a gene pair possesses a characteristic, it possesses all parents of that characteristic.
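The inheritance rule in the Table 2 legend can be illustrated with a minimal sketch. The characteristic names and the `PARENTS` map below are hypothetical placeholders, not entries from Table 2; the only thing taken from the legend is the rule that a gene pair possessing a characteristic also possesses all of its parents.

```python
from typing import Dict, Iterable, Set

# Hypothetical child -> parents map. These names are illustrative only
# and are NOT taken from Table 2.
PARENTS: Dict[str, Set[str]] = {
    "same protein complex": {"physical interaction"},
    "physical interaction": {"any interaction"},
}


def with_ancestors(direct: Iterable[str], parents: Dict[str, Set[str]]) -> Set[str]:
    """Return the directly assigned characteristics plus every ancestor."""
    closed: Set[str] = set()
    stack = list(direct)
    while stack:
        term = stack.pop()
        if term in closed:
            continue  # already visited; also guards against accidental cycles
        closed.add(term)
        stack.extend(parents.get(term, ()))
    return closed


pair_characteristics = with_ancestors(["same protein complex"], PARENTS)
```

Under these assumptions, a pair annotated only with the most specific characteristic also counts toward each of its ancestors, which is presumably why parent characteristics carry at least as many gene pairs as their children.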

Supporting Table 3
Table 3. SSL interactions used for cross-validation and in training for experimental validation. On the left are the SGD identification number, systematic name, and standard name of the first gene, followed by the same information for the second gene. On the right are the query genes and array genes that were paired to assess SSL interaction.

Supporting Table 4
Table 4. Performance in cross-validation. Listed are the score threshold, sensitivity, false positive rate, number passing threshold, and success rate. Total true SSL interactions, 3,685; total false (non-SSL) pairs, 688,433.
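As a rough illustration of how the quantities named in the Table 4 legend relate to one another, the sketch below computes them from counts at a single score threshold. The two totals come from the legend; the function name and its arguments are hypothetical, and "success rate" is assumed here to mean the fraction of pairs passing the threshold that are true SSL interactions, which the legend does not define.

```python
import math

TOTAL_TRUE = 3685      # total true SSL interactions (Table 4 legend)
TOTAL_FALSE = 688433   # total "falses" (Table 4 legend), read here as non-SSL pairs


def threshold_metrics(true_passing: int, false_passing: int,
                      total_true: int = TOTAL_TRUE,
                      total_false: int = TOTAL_FALSE) -> dict:
    """Summarize performance at one score threshold.

    true_passing / false_passing are the numbers of true and false pairs
    whose score meets the threshold (hypothetical inputs).
    """
    passing = true_passing + false_passing
    return {
        "sensitivity": true_passing / total_true,
        "false_positive_rate": false_passing / total_false,
        "number_passing": passing,
        # Assumed definition: fraction of passing pairs that are true SSL.
        "success_rate": true_passing / passing if passing else math.nan,
    }
```

Because false pairs outnumber true ones by roughly 187 to 1 in this data set, even a small false positive rate can correspond to a modest success rate under the assumed definition.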

Supporting Figure 4
Fig. 4. (a-d) Cross-validation trees with the five top-scoring leaves labeled by rank. Left and right arrows point to gene pairs with and without, respectively, the characteristic that labels the node from which the arrow points. Arrowhead size is proportional to the fraction of gene pairs in the parent node that were assigned to each daughter node. Nodes with higher (lower) fractions of SSL gene pairs than the root are red (blue). Color saturation reflects the entropy of gene pairs, with respect to SSL interaction, in a node relative to the root. Each node is labeled with the number of its gene pairs that are (+) or are not (-) SSL.
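The node coloring described in the Fig. 4 legend can be sketched from the (+) and (-) counts on each node. This is only an illustration: it assumes the entropy is the binary Shannon entropy of the SSL versus non-SSL split, and it reports the ratio of node entropy to root entropy without guessing how that ratio is mapped to color saturation, since the legend does not specify it. The function names, and the "neutral" label for a node whose SSL fraction equals the root's, are hypothetical.

```python
import math


def binary_entropy(pos: int, neg: int) -> float:
    """Shannon entropy (bits) of the SSL (+) vs. non-SSL (-) split in a node."""
    total = pos + neg
    if total == 0:
        return 0.0
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h


def entropy_ratio_to_root(node_pos: int, node_neg: int,
                          root_pos: int, root_neg: int) -> float:
    """Node entropy divided by root entropy (assumed reading of 'relative to the root')."""
    root_h = binary_entropy(root_pos, root_neg)
    return binary_entropy(node_pos, node_neg) / root_h if root_h else math.nan


def node_hue(node_pos: int, node_neg: int, root_pos: int, root_neg: int) -> str:
    """Red if the node's SSL fraction exceeds the root's, blue if it is lower."""
    node_frac = node_pos / (node_pos + node_neg)
    root_frac = root_pos / (root_pos + root_neg)
    if node_frac > root_frac:
        return "red"
    if node_frac < root_frac:
        return "blue"
    return "neutral"  # hypothetical label; the legend does not cover ties
```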

Supporting Figure 5
Fig. 5. Method performance when omitting various types of information.

Supporting Figure 6
Fig. 6. The network of correctly predicted interactions (red) at a sensitivity of 25%, among experimentally verified SSL interactions (red edges and blue edges) from eight synthetic genetic array (SGA) screens (35,996 tested pairs). Nodes of the eight query genes are enlarged.

Supporting Table 5
Table 5. Performance on 35,996 blinded predictions. Listed are the score threshold, sensitivity, false positive rate, number passing threshold, and success rate.

Supporting Figure 7
Fig. 7. Tree used to generate predictions validated by experimental results. See description of tree in Fig. 4 legend.

Supporting Figure 8
Fig. 8. SSL gene pairs from the highest-scoring leaves of the decision tree may belong to compensating pathways. When gene 1 and gene 2 are lost, synthetic sickness or lethality may result, because both compensating pathways are impaired (a) or because two of three (or more) compensating pathways are impaired (b). Blue circles represent genes. "gene 1" and "gene 2" represent a query gene pair from the first (a) or third (b) leaf. H indicates sequence homology; P indicates physical interaction; S indicates SSL interaction; X indicates correlated mRNA expression.

Supporting Table 6
Table 6. List of top 5,000 predicted SSL gene pairs. From left to right, prediction number, the SGD identification number of the first gene, its systematic name, standard name, and MIPS function, followed by the same information for the second gene, followed by the score of the gene pair, and the rank of the leaf to which it mapped.

Supporting Table 7
Table 7. Top predicted pairs composed of genes without SSL interactions in the training set. From left to right, SGD identification number of the first gene, its systematic name, and its standard name, followed by the same information for the second gene, followed by the score of the gene pair and whether it was annotated as SSL in YPD.

All raw data files