Table Name

Size (kb),
Format

Links

Fields
(keys bold)

Description

minscop summary

54 k, tab delim.

data,
head

EC, SC, SS, HI, HP, MJ, MP, MG, , class, SF, type, count

The table summarizes the patterns of fold usage in minscop_report
(which, in turn, is derived from merging descrip_did and the many
minscop_occurrence).
This is derived from an analysis of the genomes EC SC SS HI HP MJ MP MG.
In all fields, * is the wildcard and matches all types.
class describes the fold class
SF indicates whether or not this applies to superfolds
type is as follows:
pattern_exist = the pattern of existence across all genomes
pattern_exist_unordered = the same for all genomes, considering only the number of genomes
exist_in_a_genome = whether or not a fold exists in a genome
total_in_a_genome = the accumulated count of folds in a particular genome
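One possible reading of the four type values can be sketched in Python. The fold names and counts are made-up illustration data, and the pattern encoding (a string of 1/0 flags in genome order) is an assumption, not necessarily how the real table stores it.

```python
from collections import Counter

GENOMES = ["EC", "SC", "SS", "HI", "HP", "MJ", "MP", "MG"]

# Hypothetical occurrence counts per fold, one number per genome in GENOMES order.
occurrence = {
    "fold_a": [3, 2, 1, 1, 1, 0, 0, 0],
    "fold_b": [1, 0, 0, 0, 0, 0, 0, 0],
    "fold_c": [0, 4, 0, 0, 0, 0, 0, 0],
}

def summarize(occurrence):
    pattern_exist = Counter()            # keyed by the full presence/absence pattern
    pattern_exist_unordered = Counter()  # keyed only by the number of genomes
    exist_in_a_genome = Counter()        # number of folds present in each genome
    total_in_a_genome = Counter()        # summed fold counts in each genome
    for counts in occurrence.values():
        present = [1 if c > 0 else 0 for c in counts]
        pattern_exist["".join(map(str, present))] += 1
        pattern_exist_unordered[sum(present)] += 1
        for genome, flag, c in zip(GENOMES, present, counts):
            exist_in_a_genome[genome] += flag
            total_in_a_genome[genome] += c
    return pattern_exist, pattern_exist_unordered, exist_in_a_genome, total_in_a_genome

pe, peu, eg, tg = summarize(occurrence)
```

In this toy data, fold_b and fold_c have different ordered patterns but fall into the same unordered bucket, since each occurs in exactly one genome.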

minscop report

140 k, tab delim.

data,
head

obj_id_, class, Fold, EC, SC, SS, HI, HP, MJ, MP, MG, total, Fam., PDB, Rep., Struc., Name, totexist, sortidx, SF, nclass, class2, did, fids, longid, EC, SC, SS, HI, HP, MJ, MP, MG

Detailed report on the fold usage in the genomes
EC SC SS HI HP MJ MP MG
This large joined table is derived from merging the
following tables: minscop,
descrip_did, and many minscop_occurrence.
It contains the name of each fold, a best representative scop domain
id (did) with its associated pdb id and residue selection, and the number of
times the fold appears in scop and minscop.
Some of the most important fields are described below.
did = a best representative (scop domain id)
Fam. = the number in minscop (number of seq. families)
PDB = the number of these domains in the PDB, according to scop 1.35
Name = the name for this fold object
total = total number of a given fold in all the genomes
totexist = how many genomes a given fold exists in
sortidx = totexist + total / 1000
SF = whether or not the fold is a superfold
class = a representation for the fold's class
Fold# = scop fold number corresponding to the domain
The final columns just give a representation of whether or not the
fold exists in a given genome.
Here are the actual db storing lines (for reference):
$fold_report->store($obj_id_,$csym2{$class},$foldnum,
@tuple,$totfolds,$N_minsp,
$N_scop,$pdbsel{$did},$name,$totexist,$sortidx,$superfold,
$class,$csym{$class},$did,$fids,$a,\@tuple_exist);
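The sortidx formula (totexist + total / 1000) can be illustrated with a small Python sketch. The rows are invented, and sorting in descending order is an assumption about how the index is meant to be used.

```python
def sortidx(totexist, total):
    # Number of genomes dominates; the total count acts as a tie-breaker
    # in the fractional part.
    return totexist + total / 1000

# Hypothetical rows: (name, totexist, total)
rows = [("fold_c", 3, 400), ("fold_b", 8, 35), ("fold_a", 8, 120)]

# Assumed usage: rank folds from most to least widespread.
ranked = sorted(rows, key=lambda r: sortidx(r[1], r[2]), reverse=True)
ranked_names = [name for name, _, _ in ranked]
```

Folds found in more genomes sort first, and among folds found in the same number of genomes the one with the larger total wins. Note that a total of 1000 or more would carry into the integer part of the index.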

merged summary

77 k, tab delim.

data,
head

EC, SC, HI, SS, HP, MJ, MP, MG, , class, SF, type, fold_n, sfam_n, fam_n

This table summarizes the patterns of fold, superfamily, and family
usage in the genomes (EC SC HI SS HP MJ MP MG).
In all fields, * is the wildcard and matches all types.
class describes the fold class
SF indicates whether or not this applies to superfolds
type is as follows:
pattern_exist = the pattern of existence across all genomes
pattern_exist_unordered = the same for all genomes, considering only the number of genomes
exist_in_a_genome = whether or not a fold exists in a genome
total_in_a_genome = the accumulated count of folds in a particular genome

fold summary

52 k, tab delim.

data,
head

EC, SC, HI, SS, HP, MJ, MP, MG, , class, SF, type, count

The table summarizes the patterns of fold usage in fold_report
(which, in turn, is derived from merging descrip_fold and the many
fold_occurrence).
This is derived from an analysis of the genomes EC SC HI SS HP MJ MP MG.
In all fields, * is the wildcard and matches all types.
class describes the fold class
SF indicates whether or not this applies to superfolds
type is as follows:
pattern_exist = the pattern of existence across all genomes
pattern_exist_unordered = the same for all genomes, considering only the number of genomes
exist_in_a_genome = whether or not a fold exists in a genome
total_in_a_genome = the accumulated count of folds in a particular genome

fold report

48 k, tab delim.

data,
head

obj_id_, class, Fold, EC, SC, HI, SS, HP, MJ, MP, MG, total, Fam., PDB, Rep., Struc., Name, totexist, sortidx, SF, nclass, class2, did, fids, longid, EC, SC, HI, SS, HP, MJ, MP, MG

Detailed report on the fold usage in the genomes
EC SC HI SS HP MJ MP MG
This large joined table is derived from merging the
following tables: minscop,
descrip_fold, and many fold_occurrence.
It contains the name of each fold, a best representative scop domain
id (did) with its associated pdb id and residue selection, and the number of
times the fold appears in scop and minscop.
Some of the most important fields are described below.
did = a best representative (scop domain id)
Fam. = the number in minscop (number of seq. families)
PDB = the number of these domains in the PDB, according to scop 1.35
Name = the name for this fold object
total = total number of a given fold in all the genomes
totexist = how many genomes a given fold exists in
sortidx = totexist + total / 1000
SF = whether or not the fold is a superfold
class = a representation for the fold's class
Fold# = scop fold number corresponding to the domain
The final columns just give a representation of whether or not the
fold exists in a given genome.
Here are the actual db storing lines (for reference):
$fold_report->store($obj_id_,$csym2{$class},$foldnum,
@tuple,$totfolds,$N_minsp,
$N_scop,$pdbsel{$did},$name,$totexist,$sortidx,$superfold,
$class,$csym{$class},$did,$fids,$a,\@tuple_exist);

fold dist ratio

1 k, phylip dist. matrix

data,
head

key_, val1, val2

Distance between each pair of genomes in terms of the number of shared folds.
This PHYLIP-formatted matrix contains the ratio of non-shared folds
to shared ones.
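For readers who want to load such a matrix, here is a minimal Python sketch of a reader for the common square PHYLIP distance layout (first line gives the number of taxa, then one row per taxon with its name followed by the distances). The function name and the example values are illustrative assumptions, and the actual files may be laid out differently.

```python
def parse_phylip_square(text):
    """Parse a square PHYLIP distance matrix into (names, rows)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n = int(lines[0].split()[0])  # first line: number of taxa
    names, rows = [], []
    for ln in lines[1:n + 1]:
        parts = ln.split()
        names.append(parts[0])                      # taxon (genome) label
        rows.append([float(x) for x in parts[1:]])  # distances to every taxon
    return names, rows

# Made-up three-genome example, not real data.
example = """\
3
EC  0.0  0.5  2.0
SC  0.5  0.0  1.5
MG  2.0  1.5  0.0
"""
names, rows = parse_phylip_square(example)
```

The sketch assumes whitespace-separated labels without embedded spaces and one matrix row per line.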

crosstab summary

13 k, tab delim.

data,
head

9, EC, SC, HI, SS, HP, MJ, MP, MG, fold, sfam., fam., foldA, foldB, foldSF, foldAB, mA*, mB*, mN*, m*S, sA*, sB*, sN*, s*S

This table crosstabulates the fields in the merged_summary
table.
For each pattern of occurrences in the genomes (EC SC HI SS HP MJ MP MG),
a number of different counts are given. Here are the main ones:
fold = number of folds
sfam. = number of superfamilies
fam. = number of distinct minscop families
foldA = number of all-alpha folds
foldB = number of all-beta folds
foldAB = number of mixed folds
foldSF = number of superfolds
Patterns with 1 and _ are to be read literally. Those with + and -
are unordered.
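The per-pattern counts can be pictured with a short Python sketch. The records, the class labels ("A", "B", "AB") and the 1/_ pattern strings are hypothetical stand-ins, so this shows the idea of the cross-tabulation rather than the procedure actually used to build the table.

```python
from collections import defaultdict

# Hypothetical fold records: occurrence pattern, class label, superfold flag.
folds = [
    {"pattern": "11111111", "cls": "A", "superfold": True},
    {"pattern": "11111111", "cls": "AB", "superfold": False},
    {"pattern": "1_______", "cls": "B", "superfold": False},
]

def crosstab(folds):
    # One row of counters per occurrence pattern.
    table = defaultdict(
        lambda: {"fold": 0, "foldA": 0, "foldB": 0, "foldAB": 0, "foldSF": 0}
    )
    for f in folds:
        row = table[f["pattern"]]
        row["fold"] += 1               # every record counts as one fold
        row["fold" + f["cls"]] += 1    # per-class count (foldA, foldB, foldAB)
        if f["superfold"]:
            row["foldSF"] += 1
    return dict(table)

result = crosstab(folds)
```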

fold dist both have fold

1 k, phylip dist. matrix

data,
head

key_, val1, val2

Distance between each pair of genomes in terms of the number of shared folds.
This PHYLIP-formatted matrix contains the number of shared folds.

fold dist neither has fold

1 k, phylip dist. matrix

data,
head

key_, val1, val2

Distance between each pair of genomes in terms of the number of shared folds.
This PHYLIP-formatted matrix contains the number of folds
that are in neither genome.

fold dist one has fold

1 k, phylip dist. matrix

data,
head

key_, val1, val2

Distance between each pair of genomes in terms of the number of shared folds.
This PHYLIP-formatted matrix contains the number of folds
that are contained in one but not the other genome.
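The four fold dist quantities (both have fold, neither has fold, one has fold, and the ratio) can be sketched for a single genome pair using Python sets. The fold labels are invented, and taking "non-shared" to mean the folds found in exactly one of the two genomes is an assumption.

```python
def pair_counts(a, b, all_folds):
    """Return (both, neither, one, ratio) for two sets of folds."""
    both = len(a & b)                   # folds present in both genomes
    one = len(a ^ b)                    # folds present in exactly one genome
    neither = len(all_folds - (a | b))  # folds present in neither genome
    # Assumed definition: non-shared / shared; infinite when nothing is shared.
    ratio = one / both if both else float("inf")
    return both, neither, one, ratio

# Made-up fold sets for two genomes.
all_folds = {"f1", "f2", "f3", "f4", "f5", "f6"}
ec = {"f1", "f2", "f3", "f4"}
mg = {"f1", "f2", "f5"}

both, neither, one, ratio = pair_counts(ec, mg, all_folds)
```

Repeating this for every genome pair would fill one matrix per quantity.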

gen aa comp dist

1 k, phylip dist. matrix

data,
head



minscop dist both have fold

1 k, phylip dist. matrix

data,
head

key_, val1, val2

Distance between each pair of genomes in terms of the number of shared folds.
This PHYLIP-formatted matrix contains the number of shared folds.

minscop dist neither has fold

1 k, phylip dist. matrix

data,
head

key_, val1, val2

Distance between each pair of genomes in terms of the number of shared folds.
This PHYLIP-formatted matrix contains the number of folds
that are in neither genome.

minscop dist one has fold

1 k, phylip dist. matrix

data,
head

key_, val1, val2

Distance between each pair of genomes in terms of the number of shared folds.
This PHYLIP-formatted matrix contains the number of folds
that are contained in one but not the other genome.

minscop dist ratio

1 k, phylip dist. matrix

data,
head

key_, val1, val2

Distance between each pair of genomes in terms of the number of shared folds.
This PHYLIP-formatted matrix contains the ratio of non-shared folds
to shared ones.

sfams dist both have fold

1 k, phylip dist. matrix

data,
head

key_, val1, val2

Distance between each pair of genomes in terms of the number of shared folds.
This PHYLIP-formatted matrix contains the number of shared folds.

sfams dist neither has fold

1 k, phylip dist. matrix

data,
head

key_, val1, val2

Distance between each pair of genomes in terms of the number of shared folds.
This PHYLIP-formatted matrix contains the number of folds
that are in neither genome.

sfams dist one has fold

1 k, phylip dist. matrix

data,
head

key_, val1, val2

Distance between each pair of genomes in terms of the number of shared folds.
This PHYLIP-formatted matrix contains the number of folds
that are contained in one but not the other genome.

sfams dist ratio

1 k, phylip dist. matrix

data,
head

key_, val1, val2

Distance between each pair of genomes in terms of the number of shared folds.
This PHYLIP-formatted matrix contains the ratio of non-shared folds
to shared ones.

sfams report

71 k, tab delim.

data,
head

obj_id_, class, Fold, EC, SC, HI, SS, HP, MJ, MP, MG, total, Fam., PDB, Rep., Struc., Name, totexist, sortidx, SF, nclass, class2, did, fids, longid, EC, SC, HI, SS, HP, MJ, MP, MG

Detailed report on the fold usage in the genomes
EC SC HI SS HP MJ MP MG
This large joined table is derived from merging the
following tables: minscop,
descrip_sfam, and many sfam_occurrence.
It contains the name of each fold, a best representative scop domain
id (did) with its associated pdb id and residue selection, and the number of
times the fold appears in scop and minscop.
Some of the most important fields are described below.
did = a best representative (scop domain id)
Fam. = the number in minscop (number of seq. families)
PDB = the number of these domains in the PDB, according to scop 1.35
Name = the name for this fold object
total = total number of a given fold in all the genomes
totexist = how many genomes a given fold exists in
sortidx = totexist + total / 1000
SF = whether or not the fold is a superfold
class = a representation for the fold's class
Fold# = scop fold number corresponding to the domain
The final columns just give a representation of whether or not the
fold exists in a given genome.
Here are the actual db storing lines (for reference):
$fold_report->store($obj_id_,$csym2{$class},$foldnum,
@tuple,$totfolds,$N_minsp,
$N_scop,$pdbsel{$did},$name,$totexist,$sortidx,$superfold,
$class,$csym{$class},$did,$fids,$a,\@tuple_exist);

sfams summary

54 k, tab delim.

data,
head

EC, SC, HI, SS, HP, MJ, MP, MG, , class, SF, type, count

The table summarizes the patterns of fold usage in sfams_report
(which, in turn, is derived from merging descrip_sfam and the many
sfam_occurrence).
This is derived from an analysis of the genomes EC SC HI SS HP MJ MP MG.
In all fields, * is the wildcard and matches all types.
class describes the fold class
SF indicates whether or not this applies to superfolds
type is as follows:
pattern_exist = the pattern of existence across all genomes
pattern_exist_unordered = the same for all genomes, considering only the number of genomes
exist_in_a_genome = whether or not a fold exists in a genome
total_in_a_genome = the accumulated count of folds in a particular genome

unsorted fold dist both have fold

1 k, phylip dist. matrix

data,
head

key_, val1, val2

Distance between each pair of genomes in terms of the number of shared folds.
This PHYLIP-formatted matrix contains the number of shared folds.

unsorted fold dist neither has fold

1 k, phylip dist. matrix

data,
head

key_, val1, val2

Distance between each pair of genomes in terms of the number of shared folds.
This PHYLIP-formatted matrix contains the number of folds
that are in neither genome.

unsorted fold dist one has fold

1 k, phylip dist. matrix

data,
head

key_, val1, val2

Distance between each pair of genomes in terms of the number of shared folds.
This PHYLIP-formatted matrix contains the number of folds
that are contained in one but not the other genome.

unsorted fold dist ratio

1 k, phylip dist. matrix

data,
head

key_, val1, val2

Distance between each pair of genomes in terms of the number of shared folds.
This PHYLIP-formatted matrix contains the ratio of non-shared folds
to shared ones.

unsorted fold report

48 k, tab delim.

data,
head

obj_id_, class, Fold, MG, MP, MJ, HP, SS, HI, SC, EC, total, Fam., PDB, Rep., Struc., Name, totexist, sortidx, SF, nclass, class2, did, fids, longid, MG, MP, MJ, HP, SS, HI, SC, EC

Detailed report on the fold usage in the genomes
MG MP MJ HP SS HI SC EC
This large joined table is derived from merging the
following tables: minscop,
descrip_fold, and many fold_occurrence.
It contains the name of each fold, a best representative scop domain
id (did) with its associated pdb id and residue selection, and the number of
times the fold appears in scop and minscop.
Some of the most important fields are described below.
did = a best representative (scop domain id)
Fam. = the number in minscop (number of seq. families)
PDB = the number of these domains in the PDB, according to scop 1.35
Name = the name for this fold object
total = total number of a given fold in all the genomes
totexist = how many genomes a given fold exists in
sortidx = totexist + total / 1000
SF = whether or not the fold is a superfold
class = a representation for the fold's class
Fold# = scop fold number corresponding to the domain
The final columns just give a representation of whether or not the
fold exists in a given genome.
Here are the actual db storing lines (for reference):
$fold_report->store($obj_id_,$csym2{$class},$foldnum,
@tuple,$totfolds,$N_minsp,
$N_scop,$pdbsel{$did},$name,$totexist,$sortidx,$superfold,
$class,$csym{$class},$did,$fids,$a,\@tuple_exist);

unsorted fold summary

52 k, tab delim.

data,
head

MG, MP, MJ, HP, SS, HI, SC, EC, , class, SF, type, count

The table summarizes the patterns of fold usage in unsorted_fold_report
(which, in turn, is derived from merging descrip_fold and the many
fold_occurrence).
This is derived from an analysis of the genomes MG MP MJ HP SS HI SC EC.
In all fields, * is the wildcard and matches all types.
class describes the fold class
SF indicates whether or not this applies to superfolds
type is as follows:
pattern_exist = the pattern of existence across all genomes
pattern_exist_unordered = the same for all genomes, considering only the number of genomes
exist_in_a_genome = whether or not a fold exists in a genome
total_in_a_genome = the accumulated count of folds in a particular genome

unsorted minscop dist both have fold

1 k, phylip dist. matrix

data,
head

key_, val1, val2

Distance between each pair of genomes in terms of the number of shared folds.
This PHYLIP-formatted matrix contains the number of shared folds.

unsorted minscop dist neither has fold

1 k, phylip dist. matrix

data,
head

key_, val1, val2

Distance between each pair of genomes in terms of the number of shared folds.
This PHYLIP-formatted matrix contains the number of folds
that are in neither genome.

unsorted minscop dist one has fold

1 k, phylip dist. matrix

data,
head

key_, val1, val2

Distance between each pair of genomes in terms of the number of shared folds.
This PHYLIP-formatted matrix contains the number of folds
that are contained in one but not the other genome.

unsorted minscop dist ratio

1 k, phylip dist. matrix

data,
head

key_, val1, val2

Distance between each pair of genomes in terms of the number of shared folds.
This PHYLIP-formatted matrix contains the ratio of non-shared folds
to shared ones.

unsorted minscop report

140 k, tab delim.

data,
head

obj_id_, class, Fold, MG, MP, MJ, HP, SS, HI, SC, EC, total, Fam., PDB, Rep., Struc., Name, totexist, sortidx, SF, nclass, class2, did, fids, longid, MG, MP, MJ, HP, SS, HI, SC, EC

Detailed report on the fold usage in the genomes
MG MP MJ HP SS HI SC EC
This large joined table is derived from merging the
following tables: minscop,
descrip_did, and many minscop_occurrence.
It contains the name of each fold, a best representative scop domain
id (did) with its associated pdb id and residue selection, and the number of
times the fold appears in scop and minscop.
Some of the most important fields are described below.
did = a best representative (scop domain id)
Fam. = the number in minscop (number of seq. families)
PDB = the number of these domains in the PDB, according to scop 1.35
Name = the name for this fold object
total = total number of a given fold in all the genomes
totexist = how many genomes a given fold exists in
sortidx = totexist + total / 1000
SF = whether or not the fold is a superfold
class = a representation for the fold's class
Fold# = scop fold number corresponding to the domain
The final columns just give a representation of whether or not the
fold exists in a given genome.
Here are the actual db storing lines (for reference):
$fold_report->store($obj_id_,$csym2{$class},$foldnum,
@tuple,$totfolds,$N_minsp,
$N_scop,$pdbsel{$did},$name,$totexist,$sortidx,$superfold,
$class,$csym{$class},$did,$fids,$a,\@tuple_exist);

unsorted minscop summary

54 k, tab delim.

data,
head

MG, MP, MJ, HP, SS, HI, SC, EC, , class, SF, type, count

The table summarizes the patterns of fold usage in unsorted_minscop_report
(which, in turn, is derived from merging descrip_did and the many
minscop_occurrence).
This is derived from an analysis of the genomes MG MP MJ HP SS HI SC EC.
In all fields, * is the wildcard and matches all types.
class describes the fold class
SF indicates whether or not this applies to superfolds
type is as follows:
pattern_exist = the pattern of existence across all genomes
pattern_exist_unordered = the same for all genomes, considering only the number of genomes
exist_in_a_genome = whether or not a fold exists in a genome
total_in_a_genome = the accumulated count of folds in a particular genome

unsorted sfams dist both have fold

1 k, phylip dist. matrix

data,
head

key_, val1, val2

Distance between each pair of genomes in terms of the number of shared folds.
This PHYLIP-formatted matrix contains the number of shared folds.

unsorted sfams dist neither has fold

1 k, phylip dist. matrix

data,
head

key_, val1, val2

Distance between each pair of genomes in terms of the number of shared folds.
This PHYLIP-formatted matrix contains the number of folds
that are in neither genome.

unsorted sfams dist one has fold

1 k, phylip dist. matrix

data,
head

key_, val1, val2

Distance between each pair of genomes in terms of the number of shared folds.
This PHYLIP-formatted matrix contains the number of folds
that are contained in one but not the other genome.

unsorted sfams dist ratio

1 k, phylip dist. matrix

data,
head

key_, val1, val2

Distance between each pair of genomes in terms of the number of shared folds.
This PHYLIP-formatted matrix contains the ratio of non-shared folds
to shared ones.

unsorted sfams report

71 k, tab delim.

data,
head

obj_id_, class, Fold, MG, MP, MJ, HP, SS, HI, SC, EC, total, Fam., PDB, Rep., Struc., Name, totexist, sortidx, SF, nclass, class2, did, fids, longid, MG, MP, MJ, HP, SS, HI, SC, EC

Detailed report on the fold usage in the genomes
MG MP MJ HP SS HI SC EC
This large joined table is derived from merging the
following tables: minscop,
descrip_sfam, and many sfam_occurrence.
It contains the name of each fold, a best representative scop domain
id (did) with its associated pdb id and residue selection, and the number of
times the fold appears in scop and minscop.
Some of the most important fields are described below.
did = a best representative (scop domain id)
Fam. = the number in minscop (number of seq. families)
PDB = the number of these domains in the PDB, according to scop 1.35
Name = the name for this fold object
total = total number of a given fold in all the genomes
totexist = how many genomes a given fold exists in
sortidx = totexist + total / 1000
SF = whether or not the fold is a superfold
class = a representation for the fold's class
Fold# = scop fold number corresponding to the domain
The final columns just give a representation of whether or not the
fold exists in a given genome.
Here are the actual db storing lines (for reference):
$fold_report->store($obj_id_,$csym2{$class},$foldnum,
@tuple,$totfolds,$N_minsp,
$N_scop,$pdbsel{$did},$name,$totexist,$sortidx,$superfold,
$class,$csym{$class},$did,$fids,$a,\@tuple_exist);

unsorted sfams summary

54 k, tab delim.

data,
head

MG, MP, MJ, HP, SS, HI, SC, EC, , class, SF, type, count

The table summarizes the patterns of fold usage in unsorted_sfams_report
(which, in turn, is derived from merging descrip_sfam and the many
sfam_occurrence).
This is derived from an analysis of the genomes MG MP MJ HP SS HI SC EC.
In all fields, * is the wildcard and matches all types.
class describes the fold class
SF indicates whether or not this applies to superfolds
type is as follows:
pattern_exist = the pattern of existence across all genomes
pattern_exist_unordered = the same for all genomes, considering only the number of genomes
exist_in_a_genome = whether or not a fold exists in a genome
total_in_a_genome = the accumulated count of folds in a particular genome
