Tables Comparing the Known Folds in Various Genomes

Table Name	Size (kb), Format	Links	Fields (keys bold)	Description
minscop summary	54 k, tab delim.	data, head	EC, SC, SS, HI, HP, MJ, MP, MG, |, class, SF, type, count	The table summarizes the patterns of fold usage in minscop_report (which, in turn, is derived from merging descrip_did and the many minscop_occurrence tables). It is derived from an analysis of the genomes EC SC SS HI HP MJ MP MG. Throughout, * is the wildcard and matches all types. class describes the fold class; SF indicates whether or not the row applies to superfolds; type is one of the following: pattern_exist -- for all genomes; pattern_exist_unordered -- for all genomes, considering just the number of genomes; exist_in_a_genome -- whether or not a fold exists in a genome; total_in_a_genome -- accumulates the count of folds in a particular genome.
minscop report	140 k, tab delim.	data, head	obj_id_, class, Fold, EC, SC, SS, HI, HP, MJ, MP, MG, total, Fam., PDB, Rep., Struc., Name, totexist, sortidx, SF, nclass, class2, did, fids, longid, EC, SC, SS, HI, HP, MJ, MP, MG	Detailed report on the fold usage in the genomes EC SC SS HI HP MJ MP MG. This large joined table is derived from merging the following tables: minscop, descrip_did, and many minscop_occurrence. It contains the name of each fold, a best representative scop domain id (did) with its associated pdb id and residue selection, and the number of times the fold appears in scop and minscop. Some of the most important fields are described below. did = a best representative (scop domain id); Fam. = the number in minscop (number of seq. families); PDB = the number of these domains in the PDB, according to scop 1.35; Name = the name for this fold object; total = total number of a given fold in all the genomes; totexist = how many genomes a given fold exists in; sortidx = totexist + total / 1000; SF = whether or not the fold is a superfold; class = a representation for the fold's class; Fold# = scop fold number corresponding to the domain. The final columns give a representation of whether or not the fold exists in a given genome. Here are the actual db storing lines (for reference): $fold_report->store($obj_id_,$csym2{$class},$foldnum, @tuple,$totfolds,$N_minsp, $N_scop,$pdbsel{$did},$name,$totexist,$sortidx,$superfold, $class,$csym{$class},$did,$fids,$a,\@tuple_exist);
merged summary	77 k, tab delim.	data, head	EC, SC, HI, SS, HP, MJ, MP, MG, |, class, SF, type, fold_n, sfam_n, fam_n	This table summarizes the patterns of fold, superfamily, and family usage in the genomes (EC SC HI SS HP MJ MP MG). Throughout, * is the wildcard and matches all types. class describes the fold class; SF indicates whether or not the row applies to superfolds; type is one of the following: pattern_exist -- for all genomes; pattern_exist_unordered -- for all genomes, considering just the number of genomes; exist_in_a_genome -- whether or not a fold exists in a genome; total_in_a_genome -- accumulates the count of folds in a particular genome.
fold summary	52 k, tab delim.	data, head	EC, SC, HI, SS, HP, MJ, MP, MG, |, class, SF, type, count	The table summarizes the patterns of fold usage in fold_report (which, in turn, is derived from merging descrip_fold and the many fold_occurrence tables). It is derived from an analysis of the genomes EC SC HI SS HP MJ MP MG. Throughout, * is the wildcard and matches all types. class describes the fold class; SF indicates whether or not the row applies to superfolds; type is one of the following: pattern_exist -- for all genomes; pattern_exist_unordered -- for all genomes, considering just the number of genomes; exist_in_a_genome -- whether or not a fold exists in a genome; total_in_a_genome -- accumulates the count of folds in a particular genome.
fold report	48 k, tab delim.	data, head	obj_id_, class, Fold, EC, SC, HI, SS, HP, MJ, MP, MG, total, Fam., PDB, Rep., Struc., Name, totexist, sortidx, SF, nclass, class2, did, fids, longid, EC, SC, HI, SS, HP, MJ, MP, MG	Detailed report on the fold usage in the genomes EC SC HI SS HP MJ MP MG. This large joined table is derived from merging the following tables: minscop, descrip_fold, and many fold_occurrence. It contains the name of each fold, a best representative scop domain id (did) with its associated pdb id and residue selection, and the number of times the fold appears in scop and minscop. Some of the most important fields are described below. did = a best representative (scop domain id); Fam. = the number in minscop (number of seq. families); PDB = the number of these domains in the PDB, according to scop 1.35; Name = the name for this fold object; total = total number of a given fold in all the genomes; totexist = how many genomes a given fold exists in; sortidx = totexist + total / 1000; SF = whether or not the fold is a superfold; class = a representation for the fold's class; Fold# = scop fold number corresponding to the domain. The final columns give a representation of whether or not the fold exists in a given genome. Here are the actual db storing lines (for reference): $fold_report->store($obj_id_,$csym2{$class},$foldnum, @tuple,$totfolds,$N_minsp, $N_scop,$pdbsel{$did},$name,$totexist,$sortidx,$superfold, $class,$csym{$class},$did,$fids,$a,\@tuple_exist);
fold dist ratio	1 k, phylip dist. matrix	data, head	key_, val1, val2	Distance between each pair of genomes in terms of the number of shared folds. This PHYLIP-formatted matrix contains the ratio of non-shared folds to shared ones.
crosstab summary	13 k, tab delim.	data, head	9, EC, SC, HI, SS, HP, MJ, MP, MG, fold, sfam., fam., fold-A, fold-B, fold-SF, fold-AB, mA, mB, mN, mS, sA, sB, sN, sS	This table crosstabulates the fields in the merged_summary table. For each pattern of occurrences in the genomes (EC SC HI SS HP MJ MP MG), a number of different counts are given. The main ones are: fold = number of folds; sfam. = number of superfamilies; fam. = number of distinct minscop families; fold-A = number of all-alpha folds; fold-B = number of all-beta folds; fold-AB = number of mixed folds; fold-SF = number of superfolds. Patterns with 1 and _ are to be read literally. Those with + and - are unordered.
fold dist both have fold	1 k, phylip dist. matrix	data, head	key_, val1, val2	Distance between each pair of genomes in terms of the number of shared folds. This PHYLIP-formatted matrix contains the number of shared folds.
fold dist neither has fold	1 k, phylip dist. matrix	data, head	key_, val1, val2	Distance between each pair of genomes in terms of the number of shared folds. This PHYLIP-formatted matrix contains the number of folds that are in neither genome.
fold dist one has fold	1 k, phylip dist. matrix	data, head	key_, val1, val2	Distance between each pair of genomes in terms of the number of shared folds. This PHYLIP-formatted matrix contains the number of folds that are contained in one genome but not the other.
gen aa comp dist	1 k, phylip dist. matrix	data, head
minscop dist both have fold	1 k, phylip dist. matrix	data, head	key_, val1, val2	Distance between each pair of genomes in terms of the number of shared folds. This PHYLIP-formatted matrix contains the number of shared folds.
minscop dist neither has fold	1 k, phylip dist. matrix	data, head	key_, val1, val2	Distance between each pair of genomes in terms of the number of shared folds. This PHYLIP-formatted matrix contains the number of folds that are in neither genome.
minscop dist one has fold	1 k, phylip dist. matrix	data, head	key_, val1, val2	Distance between each pair of genomes in terms of the number of shared folds. This PHYLIP-formatted matrix contains the number of folds that are contained in one genome but not the other.
minscop dist ratio	1 k, phylip dist. matrix	data, head	key_, val1, val2	Distance between each pair of genomes in terms of the number of shared folds. This PHYLIP-formatted matrix contains the ratio of non-shared folds to shared ones.
sfams dist both have fold	1 k, phylip dist. matrix	data, head	key_, val1, val2	Distance between each pair of genomes in terms of the number of shared folds. This PHYLIP-formatted matrix contains the number of shared folds.
sfams dist neither has fold	1 k, phylip dist. matrix	data, head	key_, val1, val2	Distance between each pair of genomes in terms of the number of shared folds. This PHYLIP-formatted matrix contains the number of folds that are in neither genome.
sfams dist one has fold	1 k, phylip dist. matrix	data, head	key_, val1, val2	Distance between each pair of genomes in terms of the number of shared folds. This PHYLIP-formatted matrix contains the number of folds that are contained in one genome but not the other.
sfams dist ratio	1 k, phylip dist. matrix	data, head	key_, val1, val2	Distance between each pair of genomes in terms of the number of shared folds. This PHYLIP-formatted matrix contains the ratio of non-shared folds to shared ones.
sfams report	71 k, tab delim.	data, head	obj_id_, class, Fold, EC, SC, HI, SS, HP, MJ, MP, MG, total, Fam., PDB, Rep., Struc., Name, totexist, sortidx, SF, nclass, class2, did, fids, longid, EC, SC, HI, SS, HP, MJ, MP, MG	Detailed report on the fold usage in the genomes EC SC HI SS HP MJ MP MG. This large joined table is derived from merging the following tables: minscop, descrip_sfam, and many sfam_occurrence. It contains the name of each fold, a best representative scop domain id (did) with its associated pdb id and residue selection, and the number of times the fold appears in scop and minscop. Some of the most important fields are described below. did = a best representative (scop domain id); Fam. = the number in minscop (number of seq. families); PDB = the number of these domains in the PDB, according to scop 1.35; Name = the name for this fold object; total = total number of a given fold in all the genomes; totexist = how many genomes a given fold exists in; sortidx = totexist + total / 1000; SF = whether or not the fold is a superfold; class = a representation for the fold's class; Fold# = scop fold number corresponding to the domain. The final columns give a representation of whether or not the fold exists in a given genome. Here are the actual db storing lines (for reference): $fold_report->store($obj_id_,$csym2{$class},$foldnum, @tuple,$totfolds,$N_minsp, $N_scop,$pdbsel{$did},$name,$totexist,$sortidx,$superfold, $class,$csym{$class},$did,$fids,$a,\@tuple_exist);
sfams summary	54 k, tab delim.	data, head	EC, SC, HI, SS, HP, MJ, MP, MG, |, class, SF, type, count	The table summarizes the patterns of fold usage in sfams_report (which, in turn, is derived from merging descrip_sfam and the many sfam_occurrence tables). It is derived from an analysis of the genomes EC SC HI SS HP MJ MP MG. Throughout, * is the wildcard and matches all types. class describes the fold class; SF indicates whether or not the row applies to superfolds; type is one of the following: pattern_exist -- for all genomes; pattern_exist_unordered -- for all genomes, considering just the number of genomes; exist_in_a_genome -- whether or not a fold exists in a genome; total_in_a_genome -- accumulates the count of folds in a particular genome.
unsorted fold dist both have fold	1 k, phylip dist. matrix	data, head	key_, val1, val2	Distance between each pair of genomes in terms of the number of shared folds. This PHYLIP-formatted matrix contains the number of shared folds.
unsorted fold dist neither has fold	1 k, phylip dist. matrix	data, head	key_, val1, val2	Distance between each pair of genomes in terms of the number of shared folds. This PHYLIP-formatted matrix contains the number of folds that are in neither genome.
unsorted fold dist one has fold	1 k, phylip dist. matrix	data, head	key_, val1, val2	Distance between each pair of genomes in terms of the number of shared folds. This PHYLIP-formatted matrix contains the number of folds that are contained in one genome but not the other.
unsorted fold dist ratio	1 k, phylip dist. matrix	data, head	key_, val1, val2	Distance between each pair of genomes in terms of the number of shared folds. This PHYLIP-formatted matrix contains the ratio of non-shared folds to shared ones.
unsorted fold report	48 k, tab delim.	data, head	obj_id_, class, Fold, MG, MP, MJ, HP, SS, HI, SC, EC, total, Fam., PDB, Rep., Struc., Name, totexist, sortidx, SF, nclass, class2, did, fids, longid, MG, MP, MJ, HP, SS, HI, SC, EC	Detailed report on the fold usage in the genomes MG MP MJ HP SS HI SC EC. This large joined table is derived from merging the following tables: minscop, descrip_fold, and many fold_occurrence. It contains the name of each fold, a best representative scop domain id (did) with its associated pdb id and residue selection, and the number of times the fold appears in scop and minscop. Some of the most important fields are described below. did = a best representative (scop domain id); Fam. = the number in minscop (number of seq. families); PDB = the number of these domains in the PDB, according to scop 1.35; Name = the name for this fold object; total = total number of a given fold in all the genomes; totexist = how many genomes a given fold exists in; sortidx = totexist + total / 1000; SF = whether or not the fold is a superfold; class = a representation for the fold's class; Fold# = scop fold number corresponding to the domain. The final columns give a representation of whether or not the fold exists in a given genome. Here are the actual db storing lines (for reference): $fold_report->store($obj_id_,$csym2{$class},$foldnum, @tuple,$totfolds,$N_minsp, $N_scop,$pdbsel{$did},$name,$totexist,$sortidx,$superfold, $class,$csym{$class},$did,$fids,$a,\@tuple_exist);
unsorted fold summary	52 k, tab delim.	data, head	MG, MP, MJ, HP, SS, HI, SC, EC, |, class, SF, type, count	The table summarizes the patterns of fold usage in unsorted_fold_report (which, in turn, is derived from merging descrip_fold and the many fold_occurrence tables). It is derived from an analysis of the genomes MG MP MJ HP SS HI SC EC. Throughout, * is the wildcard and matches all types. class describes the fold class; SF indicates whether or not the row applies to superfolds; type is one of the following: pattern_exist -- for all genomes; pattern_exist_unordered -- for all genomes, considering just the number of genomes; exist_in_a_genome -- whether or not a fold exists in a genome; total_in_a_genome -- accumulates the count of folds in a particular genome.
unsorted minscop dist both have fold	1 k, phylip dist. matrix	data, head	key_, val1, val2	Distance between each pair of genomes in terms of the number of shared folds. This PHYLIP-formatted matrix contains the number of shared folds.
unsorted minscop dist neither has fold	1 k, phylip dist. matrix	data, head	key_, val1, val2	Distance between each pair of genomes in terms of the number of shared folds. This PHYLIP-formatted matrix contains the number of folds that are in neither genome.
unsorted minscop dist one has fold	1 k, phylip dist. matrix	data, head	key_, val1, val2	Distance between each pair of genomes in terms of the number of shared folds. This PHYLIP-formatted matrix contains the number of folds that are contained in one genome but not the other.
unsorted minscop dist ratio	1 k, phylip dist. matrix	data, head	key_, val1, val2	Distance between each pair of genomes in terms of the number of shared folds. This PHYLIP-formatted matrix contains the ratio of non-shared folds to shared ones.
unsorted minscop report	140 k, tab delim.	data, head	obj_id_, class, Fold, MG, MP, MJ, HP, SS, HI, SC, EC, total, Fam., PDB, Rep., Struc., Name, totexist, sortidx, SF, nclass, class2, did, fids, longid, MG, MP, MJ, HP, SS, HI, SC, EC	Detailed report on the fold usage in the genomes MG MP MJ HP SS HI SC EC. This large joined table is derived from merging the following tables: minscop, descrip_did, and many minscop_occurrence. It contains the name of each fold, a best representative scop domain id (did) with its associated pdb id and residue selection, and the number of times the fold appears in scop and minscop. Some of the most important fields are described below. did = a best representative (scop domain id); Fam. = the number in minscop (number of seq. families); PDB = the number of these domains in the PDB, according to scop 1.35; Name = the name for this fold object; total = total number of a given fold in all the genomes; totexist = how many genomes a given fold exists in; sortidx = totexist + total / 1000; SF = whether or not the fold is a superfold; class = a representation for the fold's class; Fold# = scop fold number corresponding to the domain. The final columns give a representation of whether or not the fold exists in a given genome. Here are the actual db storing lines (for reference): $fold_report->store($obj_id_,$csym2{$class},$foldnum, @tuple,$totfolds,$N_minsp, $N_scop,$pdbsel{$did},$name,$totexist,$sortidx,$superfold, $class,$csym{$class},$did,$fids,$a,\@tuple_exist);
unsorted minscop summary	54 k, tab delim.	data, head	MG, MP, MJ, HP, SS, HI, SC, EC, |, class, SF, type, count	The table summarizes the patterns of fold usage in unsorted_minscop_report (which, in turn, is derived from merging descrip_did and the many minscop_occurrence tables). It is derived from an analysis of the genomes MG MP MJ HP SS HI SC EC. Throughout, * is the wildcard and matches all types. class describes the fold class; SF indicates whether or not the row applies to superfolds; type is one of the following: pattern_exist -- for all genomes; pattern_exist_unordered -- for all genomes, considering just the number of genomes; exist_in_a_genome -- whether or not a fold exists in a genome; total_in_a_genome -- accumulates the count of folds in a particular genome.
unsorted sfams dist both have fold	1 k, phylip dist. matrix	data, head	key_, val1, val2	Distance between each pair of genomes in terms of the number of shared folds. This PHYLIP-formatted matrix contains the number of shared folds.
unsorted sfams dist neither has fold	1 k, phylip dist. matrix	data, head	key_, val1, val2	Distance between each pair of genomes in terms of the number of shared folds. This PHYLIP-formatted matrix contains the number of folds that are in neither genome.
unsorted sfams dist one has fold	1 k, phylip dist. matrix	data, head	key_, val1, val2	Distance between each pair of genomes in terms of the number of shared folds. This PHYLIP-formatted matrix contains the number of folds that are contained in one genome but not the other.
unsorted sfams dist ratio	1 k, phylip dist. matrix	data, head	key_, val1, val2	Distance between each pair of genomes in terms of the number of shared folds. This PHYLIP-formatted matrix contains the ratio of non-shared folds to shared ones.
unsorted sfams report	71 k, tab delim.	data, head	obj_id_, class, Fold, MG, MP, MJ, HP, SS, HI, SC, EC, total, Fam., PDB, Rep., Struc., Name, totexist, sortidx, SF, nclass, class2, did, fids, longid, MG, MP, MJ, HP, SS, HI, SC, EC	Detailed report on the fold usage in the genomes MG MP MJ HP SS HI SC EC. This large joined table is derived from merging the following tables: minscop, descrip_sfam, and many sfam_occurrence. It contains the name of each fold, a best representative scop domain id (did) with its associated pdb id and residue selection, and the number of times the fold appears in scop and minscop. Some of the most important fields are described below. did = a best representative (scop domain id); Fam. = the number in minscop (number of seq. families); PDB = the number of these domains in the PDB, according to scop 1.35; Name = the name for this fold object; total = total number of a given fold in all the genomes; totexist = how many genomes a given fold exists in; sortidx = totexist + total / 1000; SF = whether or not the fold is a superfold; class = a representation for the fold's class; Fold# = scop fold number corresponding to the domain. The final columns give a representation of whether or not the fold exists in a given genome. Here are the actual db storing lines (for reference): $fold_report->store($obj_id_,$csym2{$class},$foldnum, @tuple,$totfolds,$N_minsp, $N_scop,$pdbsel{$did},$name,$totexist,$sortidx,$superfold, $class,$csym{$class},$did,$fids,$a,\@tuple_exist);
unsorted sfams summary	54 k, tab delim.	data, head	MG, MP, MJ, HP, SS, HI, SC, EC, |, class, SF, type, count	The table summarizes the patterns of fold usage in unsorted_sfams_report (which, in turn, is derived from merging descrip_sfam and the many sfam_occurrence tables). It is derived from an analysis of the genomes MG MP MJ HP SS HI SC EC. Throughout, * is the wildcard and matches all types. class describes the fold class; SF indicates whether or not the row applies to superfolds; type is one of the following: pattern_exist -- for all genomes; pattern_exist_unordered -- for all genomes, considering just the number of genomes; exist_in_a_genome -- whether or not a fold exists in a genome; total_in_a_genome -- accumulates the count of folds in a particular genome.
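
The four kinds of distance matrix listed in the table (both have fold, one has fold, neither has fold, ratio) can be illustrated with a small sketch. This is a minimal Python example written for illustration, not the code that generated the tables. The `presence` mapping, the fold ids, and the function name `pairwise_counts` are hypothetical, and the ratio is assumed to be the number of folds found in exactly one of the two genomes divided by the number found in both.

```python
def pairwise_counts(presence, a, b):
    """Count folds shared, unshared, and absent for genomes a and b.

    presence maps a fold id to the set of genome codes containing it
    (a hypothetical in-memory form of the per-genome existence columns).
    """
    both = one = neither = 0
    for genomes in presence.values():
        in_a, in_b = a in genomes, b in genomes
        if in_a and in_b:
            both += 1
        elif in_a or in_b:
            one += 1
        else:
            neither += 1
    # Assumed definition: non-shared folds divided by shared folds.
    ratio = one / both if both else float("inf")
    return {"both": both, "one": one, "neither": neither, "ratio": ratio}


# Toy data, invented for this example only.
example = {
    "f1": {"EC", "SC"},
    "f2": {"EC"},
    "f3": {"HI"},
    "f4": {"EC", "SC", "HI"},
}
print(pairwise_counts(example, "EC", "SC"))
# {'both': 2, 'one': 1, 'neither': 1, 'ratio': 0.5}
```

Calling the function for every pair of genome codes would give the entries of one matrix per key; writing them out in PHYLIP layout is left aside here.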

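The report tables define sortidx = totexist + total / 1000, where totexist is the number of genomes a fold exists in and total is its count over all genomes. The sketch below shows how these three fields relate, assuming per-genome counts are available as a dictionary. The function name `fold_stats` and the sample counts are hypothetical and are not taken from the data files.

```python
def fold_stats(counts):
    """Derive total, totexist and sortidx from per-genome fold counts.

    counts maps a genome code (e.g. "EC") to the number of times the
    fold occurs in that genome.
    """
    total = sum(counts.values())
    totexist = sum(1 for n in counts.values() if n > 0)
    # Formula as stated in the report descriptions.
    sortidx = totexist + total / 1000
    return total, totexist, sortidx


# Invented counts for a single fold across the eight genomes.
sample = {"EC": 12, "SC": 5, "HI": 0, "SS": 0,
          "HP": 0, "MJ": 0, "MP": 0, "MG": 0}
total, totexist, sortidx = fold_stats(sample)
print(total, totexist)  # 17 2
```

Sorting on sortidx orders folds first by how many genomes contain them and then, as long as total stays below 1000, by overall count.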