Excel files containing comparisons of samples, in a format compatible with DNA chip analysis programs.
In columns: the various samples.
In lines: the "gene", species, genus, ... phylum, class to which tags are assigned.
Contents: expression level, in two different sub-tables:
One with the number of tag occurrences in each sample, for each assignment.
One with the number of tag dereplicates in each sample, for each assignment (I have some doubts about the usefulness of these numbers).
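The difference between the two sub-tables can be illustrated with a minimal Python sketch. It assumes that "dereplicates" means the number of distinct tag sequences, while "occurrences" counts every tag read. The input records, sample names, and the function name count_tables are hypothetical and are not taken from the program itself.

```python
from collections import defaultdict

# Hypothetical input: one (sample, tag sequence, assignment) record per tag read.
reads = [
    ("sampleA", "ACGT", "Bacillus"),
    ("sampleA", "ACGT", "Bacillus"),
    ("sampleA", "ACGA", "Bacillus"),
    ("sampleB", "ACGT", "Bacillus"),
    ("sampleB", "TTGA", "-"),
]

def count_tables(reads):
    """Return (occurrences, dereplicates), both keyed by (assignment, sample)."""
    occurrences = defaultdict(int)   # every read counts once
    distinct = defaultdict(set)      # identical sequences collapse into one entry
    for sample, sequence, assignment in reads:
        key = (assignment, sample)
        occurrences[key] += 1
        distinct[key].add(sequence)
    # Assumption: a "dereplicate" is one distinct tag sequence.
    dereplicates = {key: len(seqs) for key, seqs in distinct.items()}
    return dict(occurrences), dereplicates

occurrences, dereplicates = count_tables(reads)
```

In this toy data, sampleA has three Bacillus reads but only two distinct sequences, so the two sub-tables would differ for that cell.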
The file is named according to the kind of assignment performed.
There are different files for assignments at different % of similarity.
"-" ==> Could not be assigned to a public sequence at that level of similarity.
"unassigned" ==> Could not be assigned to a public sequence with good taxonomy,
but was assigned to some kind of clone sequence.
Note that for species and genus, "unassigned" will be absent.
For species, assignment is to an accession number instead of "unassigned"
(tags in different samples can be assigned the same accession number,
but obviously with some loss of generality).
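As a rough illustration of how the "-" and "unassigned" lines can be separated from properly assigned lines, here is a minimal Python sketch. The table layout (tab-separated text, a first column named assignment, one column per sample) and all names in it are assumptions made for the example. The real files are Excel files and would first need to be exported or read with a spreadsheet library.

```python
import csv
import io

# Hypothetical tab-separated export of one count table.
TABLE = """assignment\tsampleA\tsampleB
Bacillus\t3\t1
unassigned\t2\t0
-\t1\t4
"""

def split_totals(text):
    """Sum counts per sample for assigned, "unassigned" and "-" lines."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    samples = [col for col in reader.fieldnames if col != "assignment"]
    totals = {
        bucket: dict.fromkeys(samples, 0)
        for bucket in ("assigned", "unassigned", "-")
    }
    for row in reader:
        label = row["assignment"]
        # Any label other than the two special markers is a real assignment.
        bucket = label if label in ("unassigned", "-") else "assigned"
        for sample in samples:
            totals[bucket][sample] += int(row[sample])
    return totals

totals = split_totals(TABLE)
```

Comparing the three totals per sample gives a quick idea of how much of each sample could be assigned at a given % of similarity.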
New use. Open the program. Select a working directory and a project name.
Create a project.
It will contain a list of subdirectories you can use with the present version of the program.
It contains the following directories:
Bio: was supposed to contain data for the biologists.
blast_out: the various results of the blasts. Not included in this distribution.
fasta: contains all fasta files.
: contains every file parsed at various % of similarity. Please use the
data contained in the folders 60,65,...85. Any files outside these folders
are outdated.
results: contains some xls files.
Copy the contents of the unzipped folders into the respective folders created.
I do not provide all results, since they presently amount to more than 40 GB (large blast outputs).