Q: What does ClusterJudge do?
A: ClusterJudge implements the algorithm published in
Gibbons, F.D. and Roth, F.P. (2002) Judging the Quality of Gene Expression-Based Clustering Methods Using Gene Annotation. Genome Research, vol. 12, pp. 1574-1581.
This is not a new clustering algorithm. Rather, it is a way to judge the quality of a clustering that has been produced elsewhere. We do this by evaluating the mutual information between a gene's membership in a cluster and the attributes it possesses, as annotated by the Saccharomyces Genome Database (SGD).
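To give a feel for what "mutual information" measures here, the following is a minimal sketch for one binary attribute, and not the full ClusterJudge procedure. The function name mutual_information, the toy data, and the choice of base-2 logarithms are assumptions made for illustration. How the tool combines the scores of many attributes is not shown.

```python
import math
from collections import Counter

def mutual_information(cluster_ids, has_attribute):
    """Mutual information (in bits) between cluster membership and one attribute.

    cluster_ids[i] is the cluster of gene i; has_attribute[i] says whether
    gene i carries the annotation. Both names are hypothetical.
    """
    n = len(cluster_ids)
    joint = Counter(zip(cluster_ids, has_attribute))  # counts of (cluster, attribute) pairs
    by_cluster = Counter(cluster_ids)                 # counts per cluster
    by_attr = Counter(has_attribute)                  # counts per attribute value
    mi = 0.0
    for (c, a), count in joint.items():
        p_ca = count / n
        # p(c,a) / (p(c) * p(a)), written in terms of raw counts
        ratio = count * n / (by_cluster[c] * by_attr[a])
        mi += p_ca * math.log2(ratio)
    return mi

# Toy data: the attribute perfectly tracks the cluster, then is unrelated to it.
informative = mutual_information([0, 0, 1, 1], [True, True, False, False])
uninformative = mutual_information([0, 0, 1, 1], [True, False, True, False])
```

In the first toy case, knowing the cluster tells you the attribute exactly, so the score is 1 bit. In the second, the cluster says nothing about the attribute, so the score is 0.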
Q: What format should I use for my uploaded files?
A: ClusterJudge expects to get your clustering result in the ".kgg" format produced by Eisen's Cluster program running k-means clustering. Clustering results produced by other programs will have to be modified, but this is a simple task, since the format is so straightforward. The uploaded file should contain a single line per gene. Each line should contain three fields, separated by tab characters: the ORF name, some kind of identifying name (typically just the ORF name again), and an integer indicating which cluster the gene belongs to. Clusters should be numbered starting at 0. Since ClusterJudge works by scoring the degree to which genes in the same cluster share annotation, cluster membership is the only information it requires. You do not need to upload the original expression data.
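If you want to check a file before uploading it, the following is a minimal sketch of a reader for the three-field layout described above. The function name parse_clusters and the example ORF lines are illustrative assumptions, and skipping blank lines is a convenience of this sketch rather than documented ClusterJudge behavior.

```python
from collections import defaultdict

def parse_clusters(text):
    """Group ORF names by cluster index from tab-separated lines.

    Each non-blank line must hold three fields: ORF name, identifying name,
    and an integer cluster index (numbered from 0).
    """
    clusters = defaultdict(list)
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # assumption: tolerate blank lines
        fields = line.split("\t")
        if len(fields) != 3:
            raise ValueError(
                f"line {lineno}: expected 3 tab-separated fields, got {len(fields)}"
            )
        orf, _name, cluster = fields
        index = int(cluster)  # raises ValueError if the field is not an integer
        if index < 0:
            raise ValueError(f"line {lineno}: cluster index must be 0 or greater")
        clusters[index].append(orf)
    return dict(clusters)

# Illustrative input: three genes in two clusters.
example = "YAL001C\tYAL001C\t0\nYAL002W\tYAL002W\t0\nYBR001C\tYBR001C\t1\n"
clusters = parse_clusters(example)
```

A line with the wrong number of fields, or with a cluster field that is not a non-negative integer, raises a ValueError, which is usually the quickest way to spot output from another program that still needs converting.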