For Genome `MJ`, Tables with Specific Analysis

Table Name	Size (kb), Format	Links	Fields (keys bold)	Description
tm segs	31 k, tab delim.	data, head	id_, start_I, stop_n, energy_f	Transmembrane segments. (version 2, revised 971113)
tm histo	1 k, tab delim.	data, head	ntm_I, prots_n	Histogram of frequency of transmembrane segments.
signal segs	3 k, tab delim.	data, head	id_, start_I, stop_n	Signal sequences.
seq MBY pdb MBY lcl MBY tms MBY lnk STAT	1 k, tab delim.	data, head	stat_, value	These are the statistics from masking the fasta file seq_MBY_pdb_MBY_lcl_MBY_tms with the mask linkers to generate the masked fasta file seq_MBY_pdb_MBY_lcl_MBY_tms_MBY_lnk. MASKED_CHARS = number of characters masked with the application of this mask. Masked_Seqs = number of sequences masked with the application of this mask. Masking_Segs = number of segments used in the application of the mask.
seq MBY pdb MBY lcl MBY tms MBY lnk COMP	1 k, tab delim.	data, head	aa_, count_n	This is the aa composition of the masked file from masking the fasta file seq_MBY_pdb_MBY_lcl_MBY_tms with the mask linkers to generate the masked fasta file seq_MBY_pdb_MBY_lcl_MBY_tms_MBY_lnk.
seq MBY pdb MBY lcl MBY tms MBY lnk	509 k, fasta	data, head	gid_, masked_seq	This fasta file is the result of masking seq_MBY_pdb_MBY_lcl_MBY_tms with the mask linkers
minscop soluble matches no overlap	16 k, tab delim.	data, head	gid_, TargStart_I, TargStop_n, did, fids, QryStart_n, QryStop_n, ev_f, swsc_n, swid_f	These are the good matches at an e-value cutoff of 0.01 for just the soluble proteins (scop classes 1-5, 7). This table is the result of filtering out the matches from minscop_soluble_matches that hit the same sequence on the genome.
id ntm	19 k, tab delim.	data, head	id_, signalp, ntm_n	This table contains the number of transmembrane segments for each ORF, counted after filtering. It also has signal sequence data, based on simple criteria.
fold occurrence	4 k, tab delim.	data, head	fold_, count	Number of times each fold (represented by two scop fid numbers) occurs in genome MJ. This table should be sorted into a standard order.
all masks	249 k, tab delim.	data, head	gid_, start_I, stop_n, tool_, score	This file concatenates the results of creating all the masks for genome MJ.
aafreq histo	1 k, tab delim.	data, head	aa_, freq_n	Histogram of frequency of the various amino acids
alla segs	5 k, tab delim.	data, head	id_, start_I, stop_n	all-a segments
allb segs	2 k, tab delim.	data, head	id_, start_I, stop_n	all-b segments
characterized domains	28 k, tab delim.	data, head	id_, start_I, stop_n	Already characterized domains (the borders between linker regions).
fid12 count	3 k, tab delim.	data, head	fid12_, count_n, foldname	top-10 counts
full len segs	22 k, tab delim.	data, head	id_, start_I, stop_n	Full length segments.
genome v minscop	456 k, tab delim.	data, head	did_, gid_, TargStart_I, TargStop_n, QryStart_n, QryStop_n, ev_f, swsc_n, swid_f	Result of running genome MJ against Ted's minscop (scop 1.35)
genome v pdb40 132	7 k, tab delim.	data, head	did_, gid_, TargStart_i, TargStop_i, QryStart_i, QryStop_i, ev_r, swsc_f	Result of running genome MJ against pdb40d-1.32, using an e-value cutoff of 0.001. The swsc_f column is really ident.
genome v pdb40d135	560 k, tab delim.	data, head	did_, gid_, TargStart_I, TargStop_n, QryStart_n, QryStop_n, ev_f, swsc_n, swid_f	Result of running genome MJ against pdb40d-1.35
gorss	509 k, fasta	data, head
gorss MBY nul	509 k, fasta	data, head	gid_, masked_seq	This fasta file is the result of masking gorss with the mask full_len_segs
gorss MBY nul COMP	1 k, tab delim.	data, head	aa_, count_n	This is the aa composition of the masked file from masking the fasta file gorss with the mask full_len_segs to generate the masked fasta file gorss_MBY_nul.
gorss MBY nul STAT	1 k, tab delim.	data, head	stat_, value	These are the statistics from masking the fasta file gorss with the mask full_len_segs to generate the masked fasta file gorss_MBY_nul. MASKED_CHARS = number of characters masked with the application of this mask. Masked_Seqs = number of sequences masked with the application of this mask. Masking_Segs = number of segments used in the application of the mask.
gorss MBY ucd	509 k, fasta	data, head	gid_, masked_seq	This fasta file is the result of masking gorss with the mask unchar_domains
gorss MBY ucd COMP	1 k, tab delim.	data, head	aa_, count_n	This is the aa composition of the masked file from masking the fasta file gorss with the mask unchar_domains to generate the masked fasta file gorss_MBY_ucd.
gorss MBY ucd STAT	1 k, tab delim.	data, head	stat_, value	These are the statistics from masking the fasta file gorss with the mask unchar_domains to generate the masked fasta file gorss_MBY_ucd. MASKED_CHARS = number of characters masked with the application of this mask. Masked_Seqs = number of sequences masked with the application of this mask. Masking_Segs = number of segments used in the application of the mask.
id ntm nofilt	19 k, tab delim.	data, head	id_, signalp, ntm_n	This table contains data on whether there is a signal sequence and the number of transmembrane segments. (version 2, revised 971113). (Renamed table on 980101: id_ntm --> id_ntm_nofilt)
linkers	24 k, tab delim.	data, head	id_, start_I, stop_n	Linker regions, less than 50 in length, between two other defined segments.
low complexity long	28 k, tab delim.	data, head	id_, start_I, stop_n, cplxity_f	Low complexity regions generated with the following seg command: seg/seg tmp.fa 45 3.4 3.75 -l
low complexity short	44 k, tab delim.	data, head	id_, start_I, stop_n, cplxity_f	Low complexity regions generated with the following seg command: seg/seg tmp.fa 25 3.0 3.3 -l
minscop occurrence	10 k, tab delim.	data, head	did_, count	Number of times each minscop domain id (did) occurs in genome MJ. This table should be sorted into a standard order and contain 990 entries.
minscop soluble matches	17 k, tab delim.	data, head	gid_, TargStart_I, TargStop_n, did, fids, QryStart_n, QryStop_n, ev_f, swsc_n, swid_f	These are the good matches at an e-value cutoff of 0.01 for just the soluble proteins (scop classes 1-5, 7). This is with a year cutoff of 97.
minscop soluble matches overlap	2 k, tab delim.	data, head	gid_, TargStart_I, TargStop_n, did, fids, QryStart_n, QryStop_n, ev_f, swsc_n, swid_f	These are the good matches at an e-value cutoff of 0.01 for just the soluble proteins (scop classes 1-5, 7). This table contains the matches from minscop_soluble_matches that hit the same sequence on the genome. That is, it contains duplicate matches that should not be used.
null mask	1 k, tab delim.	data, head
pdb40d135 soluble matches	35 k, tab delim.	data, head	gid_, TargStart_I, TargStop_n, did, fids, QryStart_n, QryStop_n, ev_f, swsc_n, swid_f	These are the good matches at an e-value cutoff of 0.01 for just the soluble proteins (scop classes 1-5, 7).
seq	541 k, Hidden	data, head	-	-
seq MBY cdo	509 k, fasta	data, head	gid_, masked_seq	This fasta file is the result of masking seq with the mask characterized_domains
seq MBY cdo COMP	1 k, tab delim.	data, head	aa_, count_n	This is the aa composition of the masked file from masking the fasta file seq with the mask characterized_domains to generate the masked fasta file seq_MBY_cdo.
seq MBY cdo STAT	1 k, tab delim.	data, head	stat_, value	These are the statistics from masking the fasta file seq with the mask characterized_domains to generate the masked fasta file seq_MBY_cdo. MASKED_CHARS = number of characters masked with the application of this mask. Masked_Seqs = number of sequences masked with the application of this mask. Masking_Segs = number of segments used in the application of the mask.
seq MBY lcl	509 k, fasta	data, head	gid_, masked_seq	This fasta file is the result of masking seq with the mask low_complexity_long
seq MBY lcl COMP	1 k, tab delim.	data, head	aa_, count_n	This is the aa composition of the masked file from masking the fasta file seq with the mask low_complexity_long to generate the masked fasta file seq_MBY_lcl.
seq MBY lcl STAT	1 k, tab delim.	data, head	stat_, value	These are the statistics from masking the fasta file seq with the mask low_complexity_long to generate the masked fasta file seq_MBY_lcl. MASKED_CHARS = number of characters masked with the application of this mask. Masked_Seqs = number of sequences masked with the application of this mask. Masking_Segs = number of segments used in the application of the mask.
seq MBY lcs	4 k, Bad!	data, head	gid_, masked_seq	This fasta file is the result of masking seq with the mask low_complexity_short
seq MBY lcs COMP	1 k, tab delim.	data, head	aa_, count_n	This is the aa composition of the masked file from masking the fasta file seq with the mask low_complexity_short to generate the masked fasta file seq_MBY_lcs.
seq MBY lcs STAT	1 k, tab delim.	data, head	stat_, value	These are the statistics from masking the fasta file seq with the mask low_complexity_short to generate the masked fasta file seq_MBY_lcs.
seq MBY lnk	509 k, fasta	data, head	gid_, masked_seq	This fasta file is the result of masking seq with the mask linkers
seq MBY lnk COMP	1 k, tab delim.	data, head	aa_, count_n	This is the aa composition of the masked file from masking the fasta file seq with the mask linkers to generate the masked fasta file seq_MBY_lnk.
seq MBY lnk STAT	1 k, tab delim.	data, head	stat_, value	These are the statistics from masking the fasta file seq with the mask linkers to generate the masked fasta file seq_MBY_lnk. MASKED_CHARS = number of characters masked with the application of this mask. Masked_Seqs = number of sequences masked with the application of this mask. Masking_Segs = number of segments used in the application of the mask.
seq MBY nul	509 k, fasta	data, head	gid_, masked_seq	This fasta file is the result of masking seq with the mask full_len_segs
seq MBY nul COMP	1 k, tab delim.	data, head	aa_, count_n	This is the aa composition of the masked file from masking the fasta file seq with the mask full_len_segs to generate the masked fasta file seq_MBY_nul.
seq MBY nul STAT	1 k, tab delim.	data, head	stat_, value	These are the statistics from masking the fasta file seq with the mask full_len_segs to generate the masked fasta file seq_MBY_nul. MASKED_CHARS = number of characters masked with the application of this mask. Masked_Seqs = number of sequences masked with the application of this mask. Masking_Segs = number of segments used in the application of the mask.
seq MBY pdb	509 k, fasta	data, head	gid_, masked_seq	This fasta file is the result of masking seq with the mask minscop_soluble_matches
seq MBY pdb COMP	1 k, tab delim.	data, head	aa_, count_n	This is the aa composition of the masked file from masking the fasta file seq with the mask minscop_soluble_matches to generate the masked fasta file seq_MBY_pdb.
seq MBY pdb MBY lcl	509 k, fasta	data, head	gid_, masked_seq	This fasta file is the result of masking seq_MBY_pdb with the mask low_complexity_long
seq MBY pdb MBY lcl COMP	1 k, tab delim.	data, head	aa_, count_n	This is the aa composition of the masked file from masking the fasta file seq_MBY_pdb with the mask low_complexity_long to generate the masked fasta file seq_MBY_pdb_MBY_lcl.
seq MBY pdb MBY lcl MBY lcs	6 k, Bad!	data, head	gid_, masked_seq	This fasta file is the result of masking seq_MBY_pdb_MBY_lcl with the mask low_complexity_long
seq MBY pdb MBY lcl MBY lcs COMP	1 k, tab delim.	data, head	aa_, count_n	This is the aa composition of the masked file from masking the fasta file seq_MBY_pdb_MBY_lcl with the mask low_complexity_long to generate the masked fasta file seq_MBY_pdb_MBY_lcl_MBY_lcs.
seq MBY pdb MBY lcl MBY lcs MBY tms	4 k, Bad!	data, head	gid_, masked_seq	This fasta file is the result of masking seq_MBY_pdb_MBY_lcl_MBY_lcs with the mask tm_segs
seq MBY pdb MBY lcl MBY lcs MBY tms COMP	1 k, tab delim.	data, head	aa_, count_n	This is the aa composition of the masked file from masking the fasta file seq_MBY_pdb_MBY_lcl_MBY_lcs with the mask tm_segs to generate the masked fasta file seq_MBY_pdb_MBY_lcl_MBY_lcs_MBY_tms.
seq MBY pdb MBY lcl MBY lcs MBY tms STAT	1 k, tab delim.	data, head	stat_, value	These are the statistics from masking the fasta file seq_MBY_pdb_MBY_lcl_MBY_lcs with the mask tm_segs to generate the masked fasta file seq_MBY_pdb_MBY_lcl_MBY_lcs_MBY_tms.
seq MBY pdb MBY lcl MBY lcs STAT	1 k, tab delim.	data, head	stat_, value	These are the statistics from masking the fasta file seq_MBY_pdb_MBY_lcl with the mask low_complexity_long to generate the masked fasta file seq_MBY_pdb_MBY_lcl_MBY_lcs.
seq MBY pdb MBY lcl MBY tms	509 k, fasta	data, head	gid_, masked_seq	This fasta file is the result of masking seq_MBY_pdb_MBY_lcl with the mask tm_segs
seq MBY pdb MBY lcl MBY tms COMP	1 k, tab delim.	data, head	aa_, count_n	This is the aa composition of the masked file from masking the fasta file seq_MBY_pdb_MBY_lcl with the mask tm_segs to generate the masked fasta file seq_MBY_pdb_MBY_lcl_MBY_tms.
seq MBY pdb MBY lcl MBY tms MBY lnk MBY alp	509 k, fasta	data, head	gid_, masked_seq	This fasta file is the result of masking seq_MBY_pdb_MBY_lcl_MBY_tms_MBY_lnk with the mask alla_segs
seq MBY pdb MBY lcl MBY tms MBY lnk MBY alp COMP	1 k, tab delim.	data, head	aa_, count_n	This is the aa composition of the masked file from masking the fasta file seq_MBY_pdb_MBY_lcl_MBY_tms_MBY_lnk with the mask alla_segs to generate the masked fasta file seq_MBY_pdb_MBY_lcl_MBY_tms_MBY_lnk_MBY_alp.
seq MBY pdb MBY lcl MBY tms MBY lnk MBY alp MBY bet	509 k, fasta	data, head	gid_, masked_seq	This fasta file is the result of masking seq_MBY_pdb_MBY_lcl_MBY_tms_MBY_lnk_MBY_alp with the mask allb_segs
seq MBY pdb MBY lcl MBY tms MBY lnk MBY alp MBY bet COMP	1 k, tab delim.	data, head	aa_, count_n	This is the aa composition of the masked file from masking the fasta file seq_MBY_pdb_MBY_lcl_MBY_tms_MBY_lnk_MBY_alp with the mask allb_segs to generate the masked fasta file seq_MBY_pdb_MBY_lcl_MBY_tms_MBY_lnk_MBY_alp_MBY_bet.
seq MBY pdb MBY lcl MBY tms MBY lnk MBY alp MBY bet STAT	1 k, tab delim.	data, head	stat_, value	These are the statistics from masking the fasta file seq_MBY_pdb_MBY_lcl_MBY_tms_MBY_lnk_MBY_alp with the mask allb_segs to generate the masked fasta file seq_MBY_pdb_MBY_lcl_MBY_tms_MBY_lnk_MBY_alp_MBY_bet. MASKED_CHARS = number of characters masked with the application of this mask. Masked_Seqs = number of sequences masked with the application of this mask. Masking_Segs = number of segments used in the application of the mask.
seq MBY pdb MBY lcl MBY tms MBY lnk MBY alp STAT	1 k, tab delim.	data, head	stat_, value	These are the statistics from masking the fasta file seq_MBY_pdb_MBY_lcl_MBY_tms_MBY_lnk with the mask alla_segs to generate the masked fasta file seq_MBY_pdb_MBY_lcl_MBY_tms_MBY_lnk_MBY_alp. MASKED_CHARS = number of characters masked with the application of this mask. Masked_Seqs = number of sequences masked with the application of this mask. Masking_Segs = number of segments used in the application of the mask.
seq MBY pdb MBY lcl MBY tms STAT	1 k, tab delim.	data, head	stat_, value	These are the statistics from masking the fasta file seq_MBY_pdb_MBY_lcl with the mask tm_segs to generate the masked fasta file seq_MBY_pdb_MBY_lcl_MBY_tms. MASKED_CHARS = number of characters masked with the application of this mask. Masked_Seqs = number of sequences masked with the application of this mask. Masking_Segs = number of segments used in the application of the mask.
seq MBY pdb MBY lcl STAT	1 k, tab delim.	data, head	stat_, value	These are the statistics from masking the fasta file seq_MBY_pdb with the mask low_complexity_long to generate the masked fasta file seq_MBY_pdb_MBY_lcl. MASKED_CHARS = number of characters masked with the application of this mask. Masked_Seqs = number of sequences masked with the application of this mask. Masking_Segs = number of segments used in the application of the mask.
seq MBY pdb MBY tms	1 k, Bad!	data, head	gid_, masked_seq	This fasta file is the result of masking seq_MBY_pdb with the mask tm_segs
seq MBY pdb MBY tms COMP	1 k, tab delim.	data, head	aa_, count_n	This is the aa composition of the masked file from masking the fasta file seq_MBY_pdb with the mask tm_segs to generate the masked fasta file seq_MBY_pdb_MBY_tms.
seq MBY pdb MBY tms STAT	1 k, tab delim.	data, head	stat_, value	These are the statistics from masking the fasta file seq_MBY_pdb with the mask tm_segs to generate the masked fasta file seq_MBY_pdb_MBY_tms.
seq MBY pdb STAT	1 k, tab delim.	data, head	stat_, value	These are the statistics from masking the fasta file seq with the mask minscop_soluble_matches to generate the masked fasta file seq_MBY_pdb. MASKED_CHARS = number of characters masked with the application of this mask. Masked_Seqs = number of sequences masked with the application of this mask. Masking_Segs = number of segments used in the application of the mask.
seq MBY tms	509 k, fasta	data, head	gid_, masked_seq	This fasta file is the result of masking seq with the mask tm_segs
seq MBY tms COMP	1 k, tab delim.	data, head	aa_, count_n	This is the aa composition of the masked file from masking the fasta file seq with the mask tm_segs to generate the masked fasta file seq_MBY_tms.
seq MBY tms STAT	1 k, tab delim.	data, head	stat_, value	These are the statistics from masking the fasta file seq with the mask tm_segs to generate the masked fasta file seq_MBY_tms. MASKED_CHARS = number of characters masked with the application of this mask. Masked_Seqs = number of sequences masked with the application of this mask. Masking_Segs = number of segments used in the application of the mask.
seq MBY ucd	509 k, fasta	data, head	gid_, masked_seq	This fasta file is the result of masking seq with the mask unchar_domains
seq MBY ucd COMP	1 k, tab delim.	data, head	aa_, count_n	This is the aa composition of the masked file from masking the fasta file seq with the mask unchar_domains to generate the masked fasta file seq_MBY_ucd.
seq MBY ucd STAT	1 k, tab delim.	data, head	stat_, value	These are the statistics from masking the fasta file seq with the mask unchar_domains to generate the masked fasta file seq_MBY_ucd. MASKED_CHARS = number of characters masked with the application of this mask. Masked_Seqs = number of sequences masked with the application of this mask. Masking_Segs = number of segments used in the application of the mask.
seq lengths	19 k, tab delim.	data, head	gid_, length_n	Length of each sequence in genome.
sfam occurrence	8 k, tab delim.	data, head	sfam_, count	Number of times each sfam (represented by three scop fid numbers) occurs in genome MJ. This table should be sorted into a standard order.
tm segs filtered	11 k, tab delim.	data, head	id_, start_I, stop_n, energy_f	Transmembrane segment definitions after removing pdb matches and (most importantly) low-complexity regions. The tm_segs table is just the raw data. This is based on looking at the masked file seq_MBY_pdb_MBY_lcl_MBY_tms_MBY_lnk for the TM segments (annotated with a 3).
tmp genome v pdb40 132	4 k, tab delim.	data, head	did_, gid_, fid1, fid2	temp table
unchar domains	21 k, tab delim.	data, head	id_, start_I, stop_n	Linker regions between two other defined segments, which are greater in length than 50. That is, these are uncharacterized protein domains.
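
The masked fasta files and their STAT companions listed above all follow the same pattern: a segment table (id, start, stop) is applied to a fasta file, and three counters are reported. The sketch below illustrates that pattern only; it is not the code that produced these tables. It assumes 1-based inclusive coordinates, a single replacement character (`mask_char`), and that already-masked characters are not counted twice. The function name `apply_mask` and the sequence ids in the usage note are hypothetical.

```python
from typing import Dict, Iterable, Tuple

# (sequence id, start, stop); coordinates ASSUMED to be 1-based and inclusive
Segment = Tuple[str, int, int]


def apply_mask(
    seqs: Dict[str, str],
    segments: Iterable[Segment],
    mask_char: str = "x",  # assumed replacement character, not taken from the source
) -> Tuple[Dict[str, str], Dict[str, int]]:
    """Mask every segment in `segments` and return (masked sequences, statistics)."""
    working = {sid: list(seq) for sid, seq in seqs.items()}
    touched = set()       # ids of sequences that had at least one segment applied
    masked_chars = 0      # characters newly replaced by mask_char
    used_segments = 0     # segments that hit a known sequence

    for sid, start, stop in segments:
        chars = working.get(sid)
        if chars is None:
            continue  # segment refers to a sequence that is not in the fasta file
        lo = max(start, 1) - 1          # convert to a 0-based half-open range
        hi = min(stop, len(chars))
        if lo >= hi:
            continue  # empty or out-of-range segment
        used_segments += 1
        touched.add(sid)
        for i in range(lo, hi):
            if chars[i] != mask_char:
                chars[i] = mask_char
                masked_chars += 1

    masked = {sid: "".join(chars) for sid, chars in working.items()}
    stats = {
        "MASKED_CHARS": masked_chars,
        "Masked_Seqs": len(touched),
        "Masking_Segs": used_segments,
    }
    return masked, stats
```

For example, `apply_mask({"MJ0001": "MKLVAAGT"}, [("MJ0001", 2, 4)])` masks positions 2 through 4 and reports one masked sequence and one masking segment. Chained tables such as seq_MBY_pdb_MBY_lcl would correspond to calling the function again on the previous output with the next mask.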

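The linkers and unchar domains tables are both described as regions between two other defined segments, separated by a length of 50. A minimal sketch of that split for one sequence follows. It assumes 1-based inclusive coordinates and considers only gaps between consecutive segments, not the sequence ends. The source does not say how a gap of exactly 50 is handled, so this sketch leaves such gaps unclassified. The name `classify_gaps` is hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, stop), ASSUMED 1-based inclusive


def classify_gaps(
    defined: List[Span], threshold: int = 50
) -> Tuple[List[Span], List[Span]]:
    """Split the gaps between defined segments into (linkers, unchar_domains)."""
    linkers: List[Span] = []
    unchar_domains: List[Span] = []
    prev_stop = None

    for start, stop in sorted(defined):
        if prev_stop is not None and start > prev_stop + 1:
            gap = (prev_stop + 1, start - 1)
            length = gap[1] - gap[0] + 1
            if length < threshold:
                linkers.append(gap)
            elif length > threshold:
                unchar_domains.append(gap)
            # length == threshold: not specified by the source, left unclassified
        # track the furthest stop seen so overlapping segments do not create gaps
        prev_stop = stop if prev_stop is None else max(prev_stop, stop)

    return linkers, unchar_domains
```

With defined segments `[(1, 100), (121, 200), (301, 350)]`, the gap 101-120 (length 20) would be reported as a linker and the gap 201-300 (length 100) as an uncharacterized domain.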
