Files for analyses of SSU sequences of eukaryotic origins


IMPORTANT NOTICE: the .xls files are NOT ALL real Excel files: they are tab-delimited text files meant to be opened in programs such as Excel. Make sure you configure your program to use tabs (\t) as field delimiters.
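
If you prefer to read these files with a script rather than with a spreadsheet program, the following is a minimal Python sketch using only the standard csv module with the tab as delimiter. The function name read_tabulated and the sample content (taxon, sample1, sample2, Alveolata) are made up for illustration and are not taken from the actual files.

```python
import csv
import io


def read_tabulated(handle):
    """Return the rows of a tab-delimited file as lists of strings."""
    return [row for row in csv.reader(handle, delimiter="\t")]


# Hypothetical content standing in for one of the tab-delimited .xls files.
sample = io.StringIO("taxon\tsample1\tsample2\nAlveolata\t12\t3\n")
rows = read_tabulated(sample)
print(rows)
```

For a real file, pass the result of open(path, newline="") instead of the in-memory sample; the file encoding is not stated here, so you may need to set it yourself.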

Compressed folder

percent
85.zip  
results
results.zip  

Unzip these folders.
NOTE: Below I have separated out the most useful files for biologists. They take up less space on your hard disk without omitting any important information.

fasta and bio_xls files

Zipped files of tags (metazoa removed).
Format of fasta:
>FSGHFSFJSD_n
TGCTAGCTAGCTAGCTAGCTAGCAT

FSGHFSFJSD: tag identifier
n : number of times tag occurred in sample
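
The header format above can be split into its two parts with a few lines of Python. This is only a sketch: it assumes that the count is the integer after the last underscore of the header, and the names parse_tag_header and read_tags, as well as the count 12 in the example, are hypothetical.

```python
def parse_tag_header(header):
    """Split a header such as '>FSGHFSFJSD_12' into (tag identifier, count)."""
    name = header.lstrip(">").strip()
    # Assumption: the count is whatever follows the LAST underscore.
    tag_id, _, count = name.rpartition("_")
    return tag_id, int(count)


def read_tags(lines):
    """Yield (tag_id, count, sequence) for each record in FASTA-formatted lines."""
    header, seq = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield (*parse_tag_header(header), "".join(seq))
            header, seq = line, []
        else:
            seq.append(line)
    if header is not None:
        yield (*parse_tag_header(header), "".join(seq))


# Hypothetical record: the tag from the format description, seen 12 times.
example = [">FSGHFSFJSD_12", "TGCTAGCTAGCTAGCTAGCTAGCAT"]
records = list(read_tags(example))
print(records)
```

To use it on an unzipped fasta file, pass an open file object to read_tags in place of the example list.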


bio_xls files
_bio.xls files
85
90
92
94
96
98
99

Comparisons of samples

Excel files containing comparisons of samples in a format compatible with DNA chip analysis programs.
In columns : the various samples
In rows : the "gene", i.e. the species, genus, ... phylum or class to which tags are assigned
Contents : expression level, two different sub tables,
Each version of the file is named according to the kind of assignment done. There are different files for assignments at different % of similarity.
"-" ==> Could not be assigned to a public sequence at that level of similarity
"unassigned" ==> Could not be assigned to a public sequence with good taxonomy, but was assigned to some kind of clone sequence.
Note that
DNA-chip-like "expression" files
85
90
92
94
96
98
99
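
When processing the comparison files with a script, the two special labels described above ("-" and "unassigned") can be separated from real taxonomic assignments. The sketch below only illustrates that distinction; the function name assignment_status, the status strings and the example labels are hypothetical, and the exact layout of the tables is not assumed.

```python
from collections import Counter


def assignment_status(label):
    """Classify a row label from a comparison file."""
    label = label.strip()
    if label == "-":
        # No public sequence at this level of similarity.
        return "no public match"
    if label.lower() == "unassigned":
        # Matched a clone sequence without good taxonomy.
        return "clone only"
    return "assigned"


# Hypothetical row labels, for illustration only.
labels = ["Alveolata", "-", "unassigned", "Fungi", "-"]
summary = Counter(assignment_status(label) for label in labels)
print(summary)
```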

The Python program

Tutorial
Windows exe
Linux and Mac source code.
Complete documentation.


First use: open the program, then select a working directory and a project name.
Create a project.
It will contain a list of subdirectories that you can use with the present version of the program.
It contains the following directories
Copy the contents of the unzipped folders into the corresponding folders that were created.
I do not provide all the results, since they currently total more than 40 GB (large BLAST outputs).

Some trees

sample1
circular