October 25, 2005

README file for "Combining Biological Networks to Predict Genetic Interactions"
Wong et al., PNAS 2004

The gene-pair data used to predict genetic interactions are contained in 27 files, plus one file that lists the gene-pair characteristics we used.


A. GENE-PAIR CHARACTERISTICS FILE: characteristics.txt

This file contains a hierarchical list of gene-pair characteristics.

File format: ;

For each root-level characteristic, the file name enclosed in parentheses identifies the file containing the gene pairs that possess that characteristic. Abbreviated names for gene-pair characteristics are enclosed in curly braces.


B. GENE-PAIR FILES: files ending in "_pairs" or "_prs"

These files contain gene pairs possessing the characteristic described by the respective file name.

File format: \t\t

SGD = Saccharomyces Genome Database (http://www.yeastgenome.org/)


C. NOTES

1. Treat "_pairs" files the same as "_prs" files.

2. In the characteristics.txt file, ignore lines beginning with "#". Those characteristics were ignored because they were
   - redundant with another characteristic (usually the one listed immediately above), or
   - too non-specific

3. A gene pair that possesses a characteristic also possesses all corresponding parent characteristics.

4. When predicting SSL (synthetic sick or lethal) gene pairs, SSL-containing "2-hop" characteristics must be generated on the fly from the gene pairs in the training set (i.e. a gene pair (or edge) that is not in the training set cannot give rise to "2-hop" data used to train the model).

5. Some SGD identification numbers have since been deleted or merged with other SGD identification numbers. Pairs containing these retired SGD identification numbers should be removed when making future predictions.

6. Since these files were generated, the Saccharomyces Genome Database has modified its gene identification numbers. Specifically, two zeros were inserted into each identification number, such that "S0001234" became "S000001234".
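
Notes 2, 5, and 6 can be sketched in Python as follows. This is a minimal illustration and not part of the original data release. The function names (update_sgd_id, drop_retired_pairs, active_lines) are hypothetical. Gene pairs are assumed to be already parsed into tuples of SGD identification strings, the old identifier format is assumed to be "S" followed by seven digits (inferred from the example in note 6), and the set of retired identifiers is assumed to be supplied by the user.

```python
import re

# Assumption: an old-style SGD ID is "S" plus exactly seven digits,
# as in the "S0001234" example from note 6.
_OLD_SGD_ID = re.compile(r"S(\d{7})")


def update_sgd_id(sgd_id):
    """Convert an old-style ID to the newer form by inserting two zeros
    after the leading "S". IDs that do not match the assumed old format
    are returned unchanged."""
    match = _OLD_SGD_ID.fullmatch(sgd_id)
    if match is None:
        return sgd_id
    return "S00" + match.group(1)


def drop_retired_pairs(pairs, retired_ids):
    """Keep only the pairs in which neither gene has a retired ID (note 5).
    `retired_ids` is a user-supplied set; this README does not provide one."""
    return [
        (a, b) for (a, b) in pairs
        if a not in retired_ids and b not in retired_ids
    ]


def active_lines(lines):
    """Return the lines of characteristics.txt that do not begin with "#"
    (note 2). Only a "#" in the first column is treated as a marker."""
    return [line for line in lines if not line.startswith("#")]
```

For example, update_sgd_id("S0001234") returns "S000001234". Converting identifiers before filtering lets the retired-ID set be given in the current nine-digit form.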