Bastian Schiffthaler, Nicolas Delhomme
Example command:
Step 1: Estimate priors
#!/usr/bin/env/bash
mkdir ~/evaluation
rsem-eval-estimate-transcript-length-distribution \
~/raw_data/trinity_illumina_only/Trinity.fasta ~/evaluation/detonate.params
Step 2: calculate score
#!/usr/bin/env/bash
rsem-eval-calculate-score \
-p 2 \
--bowtie2 \
--transcript-length-parameters ~/evaluation/detonate.params \
--paired-end \
~/raw_data/simulated/ze_1_1.fastq.gz \
~/raw_data/simulated/ze_1_2.fastq.gz \
~/raw_data/trinity_illumina_only/Trinity.fasta \
~/evaluation/detonate \
250 |& tee ~/evaluation/detonate.log
Detonate scores are always negative, but higher values (less negative) are better.
Score -11585832.86
BIC_penalty -3115.42
Prior_score_on_contig_lengths_(f_function_canceled) -2952.68
Prior_score_on_contig_sequences -489758.39
Data_likelihood_in_log_space_without_correction -11090459.05
Correction_term_(f_function_canceled) -452.68
Number_of_contigs 488
Expected_number_of_aligned_reads_given_the_data 333817.00
Number_of_contigs_smaller_than_expected_read/fragment_length 31
Number_of_contigs_with_no_read_aligned_to 5
Maximum_data_likelihood_in_log_space -11006679.43
Number_of_alignable_reads 333817
Number_of_alignments_in_total 373089
Example command:
#!/usr/bin/env bash
gzip -d -c ~/raw_data/simulated/ze_1_1.fastq.gz > ~/evaluation/ze_1_1.fq
gzip -d -c ~/raw_data/simulated/ze_1_2.fastq.gz > ~/evaluation/ze_1_2.fq
transrate \
--assembly=~/raw_data/trinity_illumina_only/Trinity.fasta \
--left=~/evaluation/ze_1_1.fq \
--right=~/evaluation/ze_1_2.fq |& tee ~/evaluation/transrate.log
Transrate output
[ INFO] 2020-12-09 02:56:51 : Contig metrics:
[ INFO] 2020-12-09 02:56:51 : -----------------------------------
[ INFO] 2020-12-09 02:56:51 : n seqs 488
[ INFO] 2020-12-09 02:56:51 : smallest 203
[ INFO] 2020-12-09 02:56:51 : largest 3393
[ INFO] 2020-12-09 02:56:52 : n bases 353286
[ INFO] 2020-12-09 02:56:52 : mean len 723.95
[ INFO] 2020-12-09 02:56:52 : n under 200 0
[ INFO] 2020-12-09 02:56:52 : n over 1k 101
[ INFO] 2020-12-09 02:56:52 : n over 10k 0
[ INFO] 2020-12-09 02:56:52 : n with orf 289
[ INFO] 2020-12-09 02:56:52 : mean orf percent 98.16
[ INFO] 2020-12-09 02:56:52 : n90 357
[ INFO] 2020-12-09 02:56:52 : n70 587
[ INFO] 2020-12-09 02:56:52 : n50 894
[ INFO] 2020-12-09 02:56:52 : n30 1473
[ INFO] 2020-12-09 02:56:52 : n10 2380
[ INFO] 2020-12-09 02:56:52 : gc 0.45
[ INFO] 2020-12-09 02:56:52 : bases n 0
[ INFO] 2020-12-09 02:56:52 : proportion n 0.0
[ INFO] 2020-12-09 02:56:52 : Contig metrics done in 0 seconds
[ INFO] 2020-12-09 02:56:52 : Calculating read diagnostics...
Transrate output
[ INFO] 2020-12-09 02:57:17 : Read mapping metrics:
[ INFO] 2020-12-09 02:57:17 : -----------------------------------
[ INFO] 2020-12-09 02:57:17 : fragments 341812
[ INFO] 2020-12-09 02:57:17 : fragments mapped 335790
[ INFO] 2020-12-09 02:57:17 : p fragments mapped 0.98
[ INFO] 2020-12-09 02:57:17 : good mappings 333470
[ INFO] 2020-12-09 02:57:17 : p good mapping 0.98
[ INFO] 2020-12-09 02:57:17 : bad mappings 2320
[ INFO] 2020-12-09 02:57:17 : potential bridges 7
[ INFO] 2020-12-09 02:57:17 : bases uncovered 5229
[ INFO] 2020-12-09 02:57:17 : p bases uncovered 0.01
[ INFO] 2020-12-09 02:57:17 : contigs uncovbase 31
[ INFO] 2020-12-09 02:57:17 : p contigs uncovbase 0.06
[ INFO] 2020-12-09 02:57:17 : contigs uncovered 4
[ INFO] 2020-12-09 02:57:17 : p contigs uncovered 0.01
[ INFO] 2020-12-09 02:57:17 : contigs lowcovered 5
[ INFO] 2020-12-09 02:57:17 : p contigs lowcovered 0.01
[ INFO] 2020-12-09 02:57:17 : contigs segmented 6
[ INFO] 2020-12-09 02:57:17 : p contigs segmented 0.01
[ INFO] 2020-12-09 02:57:17 : TRANSRATE ASSEMBLY SCORE 0.7492
[ INFO] 2020-12-09 02:57:17 : -----------------------------------
[ INFO] 2020-12-09 02:57:17 : TRANSRATE OPTIMAL SCORE 0.7881
[ INFO] 2020-12-09 02:57:17 : TRANSRATE OPTIMAL CUTOFF 0.4913
[ INFO] 2020-12-09 02:57:17 : good contigs 474
[ INFO] 2020-12-09 02:57:17 : p good contigs 0.97
Example command
#!/usr/bin/env bash
cd ~/evaluation
busco \
-i ~/raw_data/trinity_illumina_only/Trinity.fasta \
-c 2 \
-o busco_trinity \
-m tran \
--long \
-l /db/busco/embryophyta_odb10 |& tee ~/evaluation/busco.log
--------------------------------------------------
|Results from dataset embryophyta_odb10 |
--------------------------------------------------
|C:0.4%[S:0.3%,D:0.1%],F:0.4%,M:99.2%,n:1614 |
|6 Complete BUSCOs (C) |
|5 Complete and single-copy BUSCOs (S) |
|1 Complete and duplicated BUSCOs (D) |
|7 Fragmented BUSCOs (F) |
|1601 Missing BUSCOs (M) |
|1614 Total BUSCO groups searched |
--------------------------------------------------