7.1.4. Mikado pick

This is the final stage of the pipeline, in which Mikado identifies gene loci and selects the best transcripts.

7.1.4.1. Input files

mikado pick requires the following input files:

  1. A sorted GTF file with unique transcript names, derived through the prepare stage.
  2. A database containing all the external data to be integrated with the transcript structure, derived through the serialisation stage.
  3. A scoring file specifying the minimum requirements for transcripts and the relevant metrics for scoring. See the section on scoring files for details.
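
The first requirement (a sorted GTF with unique transcript names) can be sanity-checked before running the pick stage. The following is a minimal sketch, not part of Mikado itself: the function name check_prepared_gtf is hypothetical, it assumes that each transcript has its own "transcript" row, and it takes "sorted" to mean non-decreasing start coordinates within each chromosome, which may be looser or stricter than the ordering Mikado actually expects.

```python
import re
from collections import Counter

# Matches the transcript_id attribute of a GTF attribute column.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def check_prepared_gtf(lines):
    """Return (duplicated_ids, unsorted_ids) for the transcript rows of a GTF.

    Hypothetical helper: "sorted" is assumed to mean that transcript start
    coordinates never decrease within a chromosome.
    """
    seen = Counter()
    unsorted_ids = []
    last_start = {}  # chromosome -> start of the previous transcript row
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] != "transcript":
            continue
        match = TRANSCRIPT_ID.search(fields[8])
        if match is None:
            continue
        tid = match.group(1)
        chrom, start = fields[0], int(fields[3])
        seen[tid] += 1
        if start < last_start.get(chrom, 0):
            unsorted_ids.append(tid)
        last_start[chrom] = start
    duplicated = sorted(t for t, n in seen.items() if n > 1)
    return duplicated, unsorted_ids


# Made-up rows for illustration: "t1" appears twice and "t2" is out of order.
example = [
    'Chr5\tdemo\ttranscript\t100\t500\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
    'Chr5\tdemo\texon\t100\t500\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
    'Chr5\tdemo\ttranscript\t50\t80\t.\t+\t.\tgene_id "g2"; transcript_id "t2";\n',
    'Chr5\tdemo\ttranscript\t600\t900\t.\t+\t.\tgene_id "g3"; transcript_id "t1";\n',
]
duplicated, unsorted_ids = check_prepared_gtf(example)
print(duplicated, unsorted_ids)
```

A file produced by the prepare stage should yield two empty lists; anything else suggests the file was edited or concatenated after preparation.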

7.1.4.2. Output files

mikado pick produces three kinds of output files: a GFF3 file, a metrics file, and a scores file. This triad is produced at the loci level and, optionally, also at the subloci and monoloci levels.

7.1.4.2.1. GFF3 files

This output file is a standards-compliant GFF3 file, with added superlocus features that indicate the original spans. An example with two superloci, one on the positive and one on the negative strand, follows:

##gff-version 3
##sequence-region Chr5 1 26975502
Chr5    Mikado_loci     superlocus      26584796        26601707        .       +       .       ID=Mikado_superlocus:Chr5+:26584796-26601707;Name=superlocus:Chr5+:26584796-26601707
Chr5    Mikado_loci     gene    26584796        26587912        23      +       .       ID=mikado.Chr5G1;Name=mikado.Chr5G1;multiexonic=True;superlocus=Mikado_superlocus:Chr5+:26584796-26601707
Chr5    Mikado_loci     mRNA    26584796        26587912        24      +       .       ID=mikado.Chr5G1.2;Parent=mikado.Chr5G1;Name=mikado.Chr5G1.2;alias=st_Stringtie_STAR.21710.1;canonical_junctions=1,2,3,4,5,6,7,8,9,10;canonical_number=10;canonical_proportion=1.0;ccode=j;cov=25.165945;primary=False
Chr5    Mikado_loci     exon    26584796        26584879        .       +       .       ID=mikado.Chr5G1.2.exon1;Parent=mikado.Chr5G1.2
Chr5    Mikado_loci     five_prime_UTR  26584796        26584879        .       +       .       ID=mikado.Chr5G1.2.five_prime_UTR1;Parent=mikado.Chr5G1.2
Chr5    Mikado_loci     exon    26585220        26585273        .       +       .       ID=mikado.Chr5G1.2.exon2;Parent=mikado.Chr5G1.2
Chr5    Mikado_loci     five_prime_UTR  26585220        26585222        .       +       .       ID=mikado.Chr5G1.2.five_prime_UTR2;Parent=mikado.Chr5G1.2
Chr5    Mikado_loci     CDS     26585223        26585273        .       +       0       ID=mikado.Chr5G1.2.CDS1;Parent=mikado.Chr5G1.2
Chr5    Mikado_loci     CDS     26585345        26585889        .       +       0       ID=mikado.Chr5G1.2.CDS2;Parent=mikado.Chr5G1.2
Chr5    Mikado_loci     exon    26585345        26585889        .       +       .       ID=mikado.Chr5G1.2.exon3;Parent=mikado.Chr5G1.2
Chr5    Mikado_loci     CDS     26585982        26586102        .       +       1       ID=mikado.Chr5G1.2.CDS3;Parent=mikado.Chr5G1.2
Chr5    Mikado_loci     exon    26585982        26586102        .       +       .       ID=mikado.Chr5G1.2.exon4;Parent=mikado.Chr5G1.2
Chr5    Mikado_loci     CDS     26586217        26586294        .       +       0       ID=mikado.Chr5G1.2.CDS4;Parent=mikado.Chr5G1.2
Chr5    Mikado_loci     exon    26586217        26586294        .       +       .       ID=mikado.Chr5G1.2.exon5;Parent=mikado.Chr5G1.2
Chr5    Mikado_loci     CDS     26586420        26586524        .       +       0       ID=mikado.Chr5G1.2.CDS5;Parent=mikado.Chr5G1.2
Chr5    Mikado_loci     exon    26586420        26586524        .       +       .       ID=mikado.Chr5G1.2.exon6;Parent=mikado.Chr5G1.2
Chr5    Mikado_loci     CDS     26586638        26586850        .       +       0       ID=mikado.Chr5G1.2.CDS6;Parent=mikado.Chr5G1.2
Chr5    Mikado_loci     exon    26586638        26586850        .       +       .       ID=mikado.Chr5G1.2.exon7;Parent=mikado.Chr5G1.2
Chr5    Mikado_loci     CDS     26586934        26586996        .       +       0       ID=mikado.Chr5G1.2.CDS7;Parent=mikado.Chr5G1.2
Chr5    Mikado_loci     exon    26586934        26586996        .       +       .       ID=mikado.Chr5G1.2.exon8;Parent=mikado.Chr5G1.2
Chr5    Mikado_loci     CDS     26587084        26587202        .       +       0       ID=mikado.Chr5G1.2.CDS8;Parent=mikado.Chr5G1.2
Chr5    Mikado_loci     exon    26587084        26587202        .       +       .       ID=mikado.Chr5G1.2.exon9;Parent=mikado.Chr5G1.2
Chr5    Mikado_loci     CDS     26587287        26587345        .       +       1       ID=mikado.Chr5G1.2.CDS9;Parent=mikado.Chr5G1.2
Chr5    Mikado_loci     exon    26587287        26587345        .       +       .       ID=mikado.Chr5G1.2.exon10;Parent=mikado.Chr5G1.2
Chr5    Mikado_loci     CDS     26587427        26587755        .       +       2       ID=mikado.Chr5G1.2.CDS10;Parent=mikado.Chr5G1.2
Chr5    Mikado_loci     exon    26587427        26587912        .       +       .       ID=mikado.Chr5G1.2.exon11;Parent=mikado.Chr5G1.2
Chr5    Mikado_loci     three_prime_UTR 26587756        26587912        .       +       .       ID=mikado.Chr5G1.2.three_prime_UTR1;Parent=mikado.Chr5G1.2
Chr5    Mikado_loci     mRNA    26584930        26587912        23      +       .       ID=mikado.Chr5G1.1;Parent=mikado.Chr5G1;Name=mikado.Chr5G1.1;alias=st_Stringtie_STAR.21710.3;canonical_junctions=1,2,3,4,5,6,7,8,9,10;canonical_number=10;canonical_proportion=1.0;cov=2.207630;primary=True
Chr5    Mikado_loci     exon    26584930        26585023        .       +       .       ID=mikado.Chr5G1.1.exon1;Parent=mikado.Chr5G1.1
Chr5    Mikado_loci     five_prime_UTR  26584930        26585023        .       +       .       ID=mikado.Chr5G1.1.five_prime_UTR1;Parent=mikado.Chr5G1.1
Chr5    Mikado_loci     exon    26585220        26585273        .       +       .       ID=mikado.Chr5G1.1.exon2;Parent=mikado.Chr5G1.1
Chr5    Mikado_loci     five_prime_UTR  26585220        26585222        .       +       .       ID=mikado.Chr5G1.1.five_prime_UTR2;Parent=mikado.Chr5G1.1
Chr5    Mikado_loci     CDS     26585223        26585273        .       +       0       ID=mikado.Chr5G1.1.CDS1;Parent=mikado.Chr5G1.1
Chr5    Mikado_loci     CDS     26585345        26585889        .       +       0       ID=mikado.Chr5G1.1.CDS2;Parent=mikado.Chr5G1.1
Chr5    Mikado_loci     exon    26585345        26585889        .       +       .       ID=mikado.Chr5G1.1.exon3;Parent=mikado.Chr5G1.1
Chr5    Mikado_loci     CDS     26585982        26586102        .       +       1       ID=mikado.Chr5G1.1.CDS3;Parent=mikado.Chr5G1.1
Chr5    Mikado_loci     exon    26585982        26586102        .       +       .       ID=mikado.Chr5G1.1.exon4;Parent=mikado.Chr5G1.1
Chr5    Mikado_loci     CDS     26586217        26586294        .       +       0       ID=mikado.Chr5G1.1.CDS4;Parent=mikado.Chr5G1.1
Chr5    Mikado_loci     exon    26586217        26586294        .       +       .       ID=mikado.Chr5G1.1.exon5;Parent=mikado.Chr5G1.1
Chr5    Mikado_loci     CDS     26586420        26586524        .       +       0       ID=mikado.Chr5G1.1.CDS5;Parent=mikado.Chr5G1.1
Chr5    Mikado_loci     exon    26586420        26586524        .       +       .       ID=mikado.Chr5G1.1.exon6;Parent=mikado.Chr5G1.1
Chr5    Mikado_loci     CDS     26586638        26586850        .       +       0       ID=mikado.Chr5G1.1.CDS6;Parent=mikado.Chr5G1.1
Chr5    Mikado_loci     exon    26586638        26586850        .       +       .       ID=mikado.Chr5G1.1.exon7;Parent=mikado.Chr5G1.1
Chr5    Mikado_loci     CDS     26586934        26586996        .       +       0       ID=mikado.Chr5G1.1.CDS7;Parent=mikado.Chr5G1.1
Chr5    Mikado_loci     exon    26586934        26586996        .       +       .       ID=mikado.Chr5G1.1.exon8;Parent=mikado.Chr5G1.1
Chr5    Mikado_loci     CDS     26587084        26587202        .       +       0       ID=mikado.Chr5G1.1.CDS8;Parent=mikado.Chr5G1.1
Chr5    Mikado_loci     exon    26587084        26587202        .       +       .       ID=mikado.Chr5G1.1.exon9;Parent=mikado.Chr5G1.1
Chr5    Mikado_loci     CDS     26587287        26587345        .       +       1       ID=mikado.Chr5G1.1.CDS9;Parent=mikado.Chr5G1.1
Chr5    Mikado_loci     exon    26587287        26587345        .       +       .       ID=mikado.Chr5G1.1.exon10;Parent=mikado.Chr5G1.1
Chr5    Mikado_loci     CDS     26587427        26587755        .       +       2       ID=mikado.Chr5G1.1.CDS10;Parent=mikado.Chr5G1.1
Chr5    Mikado_loci     exon    26587427        26587912        .       +       .       ID=mikado.Chr5G1.1.exon11;Parent=mikado.Chr5G1.1
Chr5    Mikado_loci     three_prime_UTR 26587756        26587912        .       +       .       ID=mikado.Chr5G1.1.three_prime_UTR1;Parent=mikado.Chr5G1.1
Chr5    Mikado_loci     gene    26588402        26592561        20      +       .       ID=mikado.Chr5G2;Name=mikado.Chr5G2;multiexonic=True;superlocus=Mikado_superlocus:Chr5+:26584796-26601707
Chr5    Mikado_loci     mRNA    26588402        26592561        24      +       .       ID=mikado.Chr5G2.2;Parent=mikado.Chr5G2;Name=mikado.Chr5G2.2;alias=st_Stringtie_STAR.21710.9.split1;canonical_junctions=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21;canonical_number=21;canonical_proportion=1.0;ccode=j;cov=0.000000;primary=False
Chr5    Mikado_loci     exon    26588402        26588625        .       +       .       ID=mikado.Chr5G2.2.exon1;Parent=mikado.Chr5G2.2
Chr5    Mikado_loci     five_prime_UTR  26588402        26588625        .       +       .       ID=mikado.Chr5G2.2.five_prime_UTR1;Parent=mikado.Chr5G2.2
Chr5    Mikado_loci     exon    26589203        26589279        .       +       .       ID=mikado.Chr5G2.2.exon2;Parent=mikado.Chr5G2.2
Chr5    Mikado_loci     five_prime_UTR  26589203        26589237        .       +       .       ID=mikado.Chr5G2.2.five_prime_UTR2;Parent=mikado.Chr5G2.2
Chr5    Mikado_loci     CDS     26589238        26589279        .       +       0       ID=mikado.Chr5G2.2.CDS1;Parent=mikado.Chr5G2.2
Chr5    Mikado_loci     CDS     26589386        26590167        .       +       0       ID=mikado.Chr5G2.2.CDS2;Parent=mikado.Chr5G2.2
Chr5    Mikado_loci     exon    26589386        26590167        .       +       .       ID=mikado.Chr5G2.2.exon3;Parent=mikado.Chr5G2.2
Chr5    Mikado_loci     CDS     26590261        26590393        .       +       1       ID=mikado.Chr5G2.2.CDS3;Parent=mikado.Chr5G2.2
Chr5    Mikado_loci     exon    26590261        26590393        .       +       .       ID=mikado.Chr5G2.2.exon4;Parent=mikado.Chr5G2.2
Chr5    Mikado_loci     CDS     26590495        26590566        .       +       0       ID=mikado.Chr5G2.2.CDS4;Parent=mikado.Chr5G2.2
Chr5    Mikado_loci     exon    26590495        26590566        .       +       .       ID=mikado.Chr5G2.2.exon5;Parent=mikado.Chr5G2.2
Chr5    Mikado_loci     CDS     26590641        26590739        .       +       0       ID=mikado.Chr5G2.2.CDS5;Parent=mikado.Chr5G2.2
Chr5    Mikado_loci     exon    26590641        26590739        .       +       .       ID=mikado.Chr5G2.2.exon6;Parent=mikado.Chr5G2.2
Chr5    Mikado_loci     CDS     26590880        26591092        .       +       0       ID=mikado.Chr5G2.2.CDS6;Parent=mikado.Chr5G2.2
Chr5    Mikado_loci     exon    26590880        26591092        .       +       .       ID=mikado.Chr5G2.2.exon7;Parent=mikado.Chr5G2.2
Chr5    Mikado_loci     CDS     26591174        26591236        .       +       0       ID=mikado.Chr5G2.2.CDS7;Parent=mikado.Chr5G2.2
Chr5    Mikado_loci     exon    26591174        26591236        .       +       .       ID=mikado.Chr5G2.2.exon8;Parent=mikado.Chr5G2.2
Chr5    Mikado_loci     CDS     26591324        26591442        .       +       0       ID=mikado.Chr5G2.2.CDS8;Parent=mikado.Chr5G2.2
Chr5    Mikado_loci     exon    26591324        26591442        .       +       .       ID=mikado.Chr5G2.2.exon9;Parent=mikado.Chr5G2.2
Chr5    Mikado_loci     CDS     26591520        26591578        .       +       1       ID=mikado.Chr5G2.2.CDS9;Parent=mikado.Chr5G2.2
Chr5    Mikado_loci     exon    26591520        26591578        .       +       .       ID=mikado.Chr5G2.2.exon10;Parent=mikado.Chr5G2.2
Chr5    Mikado_loci     CDS     26591681        26592002        .       +       2       ID=mikado.Chr5G2.2.CDS10;Parent=mikado.Chr5G2.2
Chr5    Mikado_loci     exon    26591681        26592002        .       +       .       ID=mikado.Chr5G2.2.exon11;Parent=mikado.Chr5G2.2
Chr5    Mikado_loci     CDS     26592528        26592561        .       +       1       ID=mikado.Chr5G2.2.CDS11;Parent=mikado.Chr5G2.2
Chr5    Mikado_loci     exon    26592528        26592561        .       +       .       ID=mikado.Chr5G2.2.exon12;Parent=mikado.Chr5G2.2
Chr5    Mikado_loci     mRNA    26588402        26592561        20      +       .       ID=mikado.Chr5G2.1;Parent=mikado.Chr5G2;Name=mikado.Chr5G2.1;alias=st_Stringtie_STAR.21710.6.split3;canonical_junctions=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21;canonical_number=21;canonical_proportion=1.0;cov=0.000000;primary=True
Chr5    Mikado_loci     exon    26588402        26588625        .       +       .       ID=mikado.Chr5G2.1.exon1;Parent=mikado.Chr5G2.1
Chr5    Mikado_loci     five_prime_UTR  26588402        26588625        .       +       .       ID=mikado.Chr5G2.1.five_prime_UTR1;Parent=mikado.Chr5G2.1
Chr5    Mikado_loci     exon    26589196        26589279        .       +       .       ID=mikado.Chr5G2.1.exon2;Parent=mikado.Chr5G2.1
Chr5    Mikado_loci     five_prime_UTR  26589196        26589237        .       +       .       ID=mikado.Chr5G2.1.five_prime_UTR2;Parent=mikado.Chr5G2.1
Chr5    Mikado_loci     CDS     26589238        26589279        .       +       0       ID=mikado.Chr5G2.1.CDS1;Parent=mikado.Chr5G2.1
Chr5    Mikado_loci     CDS     26589386        26590167        .       +       0       ID=mikado.Chr5G2.1.CDS2;Parent=mikado.Chr5G2.1
Chr5    Mikado_loci     exon    26589386        26590167        .       +       .       ID=mikado.Chr5G2.1.exon3;Parent=mikado.Chr5G2.1
Chr5    Mikado_loci     CDS     26590261        26590393        .       +       1       ID=mikado.Chr5G2.1.CDS3;Parent=mikado.Chr5G2.1
Chr5    Mikado_loci     exon    26590261        26590393        .       +       .       ID=mikado.Chr5G2.1.exon4;Parent=mikado.Chr5G2.1
Chr5    Mikado_loci     CDS     26590495        26590566        .       +       0       ID=mikado.Chr5G2.1.CDS4;Parent=mikado.Chr5G2.1
Chr5    Mikado_loci     exon    26590495        26590566        .       +       .       ID=mikado.Chr5G2.1.exon5;Parent=mikado.Chr5G2.1
Chr5    Mikado_loci     CDS     26590641        26590739        .       +       0       ID=mikado.Chr5G2.1.CDS5;Parent=mikado.Chr5G2.1
Chr5    Mikado_loci     exon    26590641        26590739        .       +       .       ID=mikado.Chr5G2.1.exon6;Parent=mikado.Chr5G2.1
Chr5    Mikado_loci     CDS     26590880        26591092        .       +       0       ID=mikado.Chr5G2.1.CDS6;Parent=mikado.Chr5G2.1
Chr5    Mikado_loci     exon    26590880        26591092        .       +       .       ID=mikado.Chr5G2.1.exon7;Parent=mikado.Chr5G2.1
Chr5    Mikado_loci     CDS     26591174        26591236        .       +       0       ID=mikado.Chr5G2.1.CDS7;Parent=mikado.Chr5G2.1
Chr5    Mikado_loci     exon    26591174        26591236        .       +       .       ID=mikado.Chr5G2.1.exon8;Parent=mikado.Chr5G2.1
Chr5    Mikado_loci     CDS     26591324        26591442        .       +       0       ID=mikado.Chr5G2.1.CDS8;Parent=mikado.Chr5G2.1
Chr5    Mikado_loci     exon    26591324        26591442        .       +       .       ID=mikado.Chr5G2.1.exon9;Parent=mikado.Chr5G2.1
Chr5    Mikado_loci     CDS     26591520        26591578        .       +       1       ID=mikado.Chr5G2.1.CDS9;Parent=mikado.Chr5G2.1
Chr5    Mikado_loci     exon    26591520        26591578        .       +       .       ID=mikado.Chr5G2.1.exon10;Parent=mikado.Chr5G2.1
Chr5    Mikado_loci     CDS     26591681        26592002        .       +       2       ID=mikado.Chr5G2.1.CDS10;Parent=mikado.Chr5G2.1
Chr5    Mikado_loci     exon    26591681        26592002        .       +       .       ID=mikado.Chr5G2.1.exon11;Parent=mikado.Chr5G2.1
Chr5    Mikado_loci     CDS     26592528        26592561        .       +       1       ID=mikado.Chr5G2.1.CDS11;Parent=mikado.Chr5G2.1
Chr5    Mikado_loci     exon    26592528        26592561        .       +       .       ID=mikado.Chr5G2.1.exon12;Parent=mikado.Chr5G2.1
Chr5    Mikado_loci     gene    26592649        26595691        19      +       .       ID=mikado.Chr5G3;Name=mikado.Chr5G3;multiexonic=True;superlocus=Mikado_superlocus:Chr5+:26584796-26601707
Chr5    Mikado_loci     mRNA    26592720        26595691        19      +       .       ID=mikado.Chr5G3.1;Parent=mikado.Chr5G3;Name=mikado.Chr5G3.1;alias=st_Stringtie_STAR.21710.7.split2;canonical_junctions=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20;canonical_number=20;canonical_proportion=1.0;cov=0.000000;primary=True
Chr5    Mikado_loci     CDS     26592720        26593365        .       +       0       ID=mikado.Chr5G3.1.CDS1;Parent=mikado.Chr5G3.1
Chr5    Mikado_loci     exon    26592720        26593365        .       +       .       ID=mikado.Chr5G3.1.exon1;Parent=mikado.Chr5G3.1
Chr5    Mikado_loci     CDS     26593449        26593836        .       +       2       ID=mikado.Chr5G3.1.CDS2;Parent=mikado.Chr5G3.1
Chr5    Mikado_loci     exon    26593449        26593836        .       +       .       ID=mikado.Chr5G3.1.exon2;Parent=mikado.Chr5G3.1
Chr5    Mikado_loci     CDS     26593930        26594062        .       +       1       ID=mikado.Chr5G3.1.CDS3;Parent=mikado.Chr5G3.1
Chr5    Mikado_loci     exon    26593930        26594062        .       +       .       ID=mikado.Chr5G3.1.exon3;Parent=mikado.Chr5G3.1
Chr5    Mikado_loci     CDS     26594172        26594243        .       +       0       ID=mikado.Chr5G3.1.CDS4;Parent=mikado.Chr5G3.1
Chr5    Mikado_loci     exon    26594172        26594243        .       +       .       ID=mikado.Chr5G3.1.exon4;Parent=mikado.Chr5G3.1
Chr5    Mikado_loci     CDS     26594318        26594416        .       +       0       ID=mikado.Chr5G3.1.CDS5;Parent=mikado.Chr5G3.1
Chr5    Mikado_loci     exon    26594318        26594416        .       +       .       ID=mikado.Chr5G3.1.exon5;Parent=mikado.Chr5G3.1
Chr5    Mikado_loci     CDS     26594569        26594772        .       +       0       ID=mikado.Chr5G3.1.CDS6;Parent=mikado.Chr5G3.1
Chr5    Mikado_loci     exon    26594569        26594772        .       +       .       ID=mikado.Chr5G3.1.exon6;Parent=mikado.Chr5G3.1
Chr5    Mikado_loci     CDS     26594860        26594922        .       +       0       ID=mikado.Chr5G3.1.CDS7;Parent=mikado.Chr5G3.1
Chr5    Mikado_loci     exon    26594860        26594922        .       +       .       ID=mikado.Chr5G3.1.exon7;Parent=mikado.Chr5G3.1
Chr5    Mikado_loci     CDS     26595003        26595121        .       +       0       ID=mikado.Chr5G3.1.CDS8;Parent=mikado.Chr5G3.1
Chr5    Mikado_loci     exon    26595003        26595121        .       +       .       ID=mikado.Chr5G3.1.exon8;Parent=mikado.Chr5G3.1
Chr5    Mikado_loci     CDS     26595210        26595268        .       +       1       ID=mikado.Chr5G3.1.CDS9;Parent=mikado.Chr5G3.1
Chr5    Mikado_loci     exon    26595210        26595268        .       +       .       ID=mikado.Chr5G3.1.exon9;Parent=mikado.Chr5G3.1
Chr5    Mikado_loci     CDS     26595366        26595691        .       +       2       ID=mikado.Chr5G3.1.CDS10;Parent=mikado.Chr5G3.1
Chr5    Mikado_loci     exon    26595366        26595691        .       +       .       ID=mikado.Chr5G3.1.exon10;Parent=mikado.Chr5G3.1
Chr5    Mikado_loci     mRNA    26592649        26595268        21      +       .       ID=mikado.Chr5G3.2;Parent=mikado.Chr5G3;Name=mikado.Chr5G3.2;abundance=2.390309;alias=cl_Chr5.6283;canonical_junctions=1,2,3,4,5,6,7,8;canonical_number=8;canonical_proportion=1.0;ccode=j;primary=False
Chr5    Mikado_loci     exon    26592649        26593365        .       +       .       ID=mikado.Chr5G3.2.exon1;Parent=mikado.Chr5G3.2
Chr5    Mikado_loci     five_prime_UTR  26592649        26592719        .       +       .       ID=mikado.Chr5G3.2.five_prime_UTR1;Parent=mikado.Chr5G3.2
Chr5    Mikado_loci     CDS     26592720        26593365        .       +       0       ID=mikado.Chr5G3.2.CDS1;Parent=mikado.Chr5G3.2
Chr5    Mikado_loci     CDS     26593449        26593836        .       +       2       ID=mikado.Chr5G3.2.CDS2;Parent=mikado.Chr5G3.2
Chr5    Mikado_loci     exon    26593449        26593836        .       +       .       ID=mikado.Chr5G3.2.exon2;Parent=mikado.Chr5G3.2
Chr5    Mikado_loci     CDS     26593930        26594095        .       +       1       ID=mikado.Chr5G3.2.CDS3;Parent=mikado.Chr5G3.2
Chr5    Mikado_loci     exon    26593930        26594095        .       +       .       ID=mikado.Chr5G3.2.exon3;Parent=mikado.Chr5G3.2
Chr5    Mikado_loci     CDS     26594172        26594243        .       +       0       ID=mikado.Chr5G3.2.CDS4;Parent=mikado.Chr5G3.2
Chr5    Mikado_loci     exon    26594172        26594243        .       +       .       ID=mikado.Chr5G3.2.exon4;Parent=mikado.Chr5G3.2
Chr5    Mikado_loci     CDS     26594318        26594416        .       +       0       ID=mikado.Chr5G3.2.CDS5;Parent=mikado.Chr5G3.2
Chr5    Mikado_loci     exon    26594318        26594416        .       +       .       ID=mikado.Chr5G3.2.exon5;Parent=mikado.Chr5G3.2
Chr5    Mikado_loci     CDS     26594569        26594772        .       +       0       ID=mikado.Chr5G3.2.CDS6;Parent=mikado.Chr5G3.2
Chr5    Mikado_loci     exon    26594569        26594772        .       +       .       ID=mikado.Chr5G3.2.exon6;Parent=mikado.Chr5G3.2
Chr5    Mikado_loci     CDS     26594860        26594922        .       +       0       ID=mikado.Chr5G3.2.CDS7;Parent=mikado.Chr5G3.2
Chr5    Mikado_loci     exon    26594860        26594922        .       +       .       ID=mikado.Chr5G3.2.exon7;Parent=mikado.Chr5G3.2
Chr5    Mikado_loci     CDS     26595003        26595121        .       +       0       ID=mikado.Chr5G3.2.CDS8;Parent=mikado.Chr5G3.2
Chr5    Mikado_loci     exon    26595003        26595121        .       +       .       ID=mikado.Chr5G3.2.exon8;Parent=mikado.Chr5G3.2
Chr5    Mikado_loci     CDS     26595210        26595268        .       +       1       ID=mikado.Chr5G3.2.CDS9;Parent=mikado.Chr5G3.2
Chr5    Mikado_loci     exon    26595210        26595268        .       +       .       ID=mikado.Chr5G3.2.exon9;Parent=mikado.Chr5G3.2
Chr5    Mikado_loci     gene    26596207        26598231        20      +       .       ID=mikado.Chr5G4;Name=mikado.Chr5G4;multiexonic=False;superlocus=Mikado_superlocus:Chr5+:26584796-26601707
Chr5    Mikado_loci     mRNA    26596207        26598231        20      +       .       ID=mikado.Chr5G4.1;Parent=mikado.Chr5G4;Name=mikado.Chr5G4.1;alias=st_Stringtie_STAR.21710.6.split3;canonical_junctions=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21;canonical_number=21;canonical_proportion=1.0;cov=0.000000;primary=True
Chr5    Mikado_loci     CDS     26596207        26598192        .       +       0       ID=mikado.Chr5G4.1.CDS1;Parent=mikado.Chr5G4.1
Chr5    Mikado_loci     exon    26596207        26598231        .       +       .       ID=mikado.Chr5G4.1.exon1;Parent=mikado.Chr5G4.1
Chr5    Mikado_loci     three_prime_UTR 26598193        26598231        .       +       .       ID=mikado.Chr5G4.1.three_prime_UTR1;Parent=mikado.Chr5G4.1
Chr5    Mikado_loci     gene    26599417        26601137        20      +       .       ID=mikado.Chr5G5;Name=mikado.Chr5G5;multiexonic=True;superlocus=Mikado_superlocus:Chr5+:26584796-26601707
Chr5    Mikado_loci     mRNA    26599417        26601137        20      +       .       ID=mikado.Chr5G5.1;Parent=mikado.Chr5G5;Name=mikado.Chr5G5.1;abundance=0.371780;alias=cl_Chr5.6286;canonical_junctions=1,2,3,4,5,6;canonical_number=6;canonical_proportion=1.0;primary=True
Chr5    Mikado_loci     exon    26599417        26599654        .       +       .       ID=mikado.Chr5G5.1.exon1;Parent=mikado.Chr5G5.1
Chr5    Mikado_loci     five_prime_UTR  26599417        26599612        .       +       .       ID=mikado.Chr5G5.1.five_prime_UTR1;Parent=mikado.Chr5G5.1
Chr5    Mikado_loci     CDS     26599613        26599654        .       +       0       ID=mikado.Chr5G5.1.CDS1;Parent=mikado.Chr5G5.1
Chr5    Mikado_loci     CDS     26599767        26600053        .       +       0       ID=mikado.Chr5G5.1.CDS2;Parent=mikado.Chr5G5.1
Chr5    Mikado_loci     exon    26599767        26600053        .       +       .       ID=mikado.Chr5G5.1.exon2;Parent=mikado.Chr5G5.1
Chr5    Mikado_loci     CDS     26600151        26600244        .       +       1       ID=mikado.Chr5G5.1.CDS3;Parent=mikado.Chr5G5.1
Chr5    Mikado_loci     exon    26600151        26600244        .       +       .       ID=mikado.Chr5G5.1.exon3;Parent=mikado.Chr5G5.1
Chr5    Mikado_loci     CDS     26600314        26600394        .       +       0       ID=mikado.Chr5G5.1.CDS4;Parent=mikado.Chr5G5.1
Chr5    Mikado_loci     exon    26600314        26600394        .       +       .       ID=mikado.Chr5G5.1.exon4;Parent=mikado.Chr5G5.1
Chr5    Mikado_loci     CDS     26600497        26600616        .       +       0       ID=mikado.Chr5G5.1.CDS5;Parent=mikado.Chr5G5.1
Chr5    Mikado_loci     exon    26600497        26600616        .       +       .       ID=mikado.Chr5G5.1.exon5;Parent=mikado.Chr5G5.1
Chr5    Mikado_loci     CDS     26600696        26600908        .       +       0       ID=mikado.Chr5G5.1.CDS6;Parent=mikado.Chr5G5.1
Chr5    Mikado_loci     exon    26600696        26600908        .       +       .       ID=mikado.Chr5G5.1.exon6;Parent=mikado.Chr5G5.1
Chr5    Mikado_loci     CDS     26600987        26601085        .       +       0       ID=mikado.Chr5G5.1.CDS7;Parent=mikado.Chr5G5.1
Chr5    Mikado_loci     exon    26600987        26601137        .       +       .       ID=mikado.Chr5G5.1.exon7;Parent=mikado.Chr5G5.1
Chr5    Mikado_loci     three_prime_UTR 26601086        26601137        .       +       .       ID=mikado.Chr5G5.1.three_prime_UTR1;Parent=mikado.Chr5G5.1
###
Chr5    Mikado_loci     superlocus      26575364        26579730        .       -       .       ID=Mikado_superlocus:Chr5-:26575364-26579730;Name=superlocus:Chr5-:26575364-26579730
Chr5    Mikado_loci     ncRNA_gene      26575364        26579730        18      -       .       ID=mikado.Chr5G6;Name=mikado.Chr5G6;multiexonic=True;superlocus=Mikado_superlocus:Chr5-:26575364-26579730
Chr5    Mikado_loci     ncRNA   26575711        26579730        18      -       .       ID=mikado.Chr5G6.1;Parent=mikado.Chr5G6;Name=cl_Chr5.6271;abundance=1.141582;alias=cl_Chr5.6271;canonical_junctions=1,2,3,4,5,6,7,8,9,10;canonical_number=10;canonical_proportion=1.0;primary=True
Chr5    Mikado_loci     exon    26575711        26575797        .       -       .       ID=mikado.Chr5G6.1.exon1;Parent=mikado.Chr5G6.1
Chr5    Mikado_loci     exon    26575885        26575944        .       -       .       ID=mikado.Chr5G6.1.exon2;Parent=mikado.Chr5G6.1
Chr5    Mikado_loci     exon    26576035        26576134        .       -       .       ID=mikado.Chr5G6.1.exon3;Parent=mikado.Chr5G6.1
Chr5    Mikado_loci     exon    26576261        26577069        .       -       .       ID=mikado.Chr5G6.1.exon4;Parent=mikado.Chr5G6.1
Chr5    Mikado_loci     exon    26577163        26577288        .       -       .       ID=mikado.Chr5G6.1.exon5;Parent=mikado.Chr5G6.1
Chr5    Mikado_loci     exon    26577378        26577449        .       -       .       ID=mikado.Chr5G6.1.exon6;Parent=mikado.Chr5G6.1
Chr5    Mikado_loci     exon    26577856        26577937        .       -       .       ID=mikado.Chr5G6.1.exon7;Parent=mikado.Chr5G6.1
Chr5    Mikado_loci     exon    26578239        26578792        .       -       .       ID=mikado.Chr5G6.1.exon8;Parent=mikado.Chr5G6.1
Chr5    Mikado_loci     exon    26579079        26579161        .       -       .       ID=mikado.Chr5G6.1.exon9;Parent=mikado.Chr5G6.1
Chr5    Mikado_loci     exon    26579301        26579395        .       -       .       ID=mikado.Chr5G6.1.exon10;Parent=mikado.Chr5G6.1
Chr5    Mikado_loci     exon    26579602        26579730        .       -       .       ID=mikado.Chr5G6.1.exon11;Parent=mikado.Chr5G6.1
Chr5    Mikado_loci     ncRNA   26578496        26579563        13      -       .       ID=mikado.Chr5G6.3;Parent=mikado.Chr5G6;Name=tr_c73_g1_i1.mrna1.160;alias=tr_c73_g1_i1.mrna1.160;canonical_junctions=1;canonical_number=1;canonical_proportion=1.0;ccode=j;gene_name=c73_g1_i1;primary=False
Chr5    Mikado_loci     exon    26578496        26578518        .       -       .       ID=mikado.Chr5G6.3.exon1;Parent=mikado.Chr5G6.3
Chr5    Mikado_loci     exon    26579301        26579563        .       -       .       ID=mikado.Chr5G6.3.exon2;Parent=mikado.Chr5G6.3
Chr5    Mikado_loci     ncRNA   26575364        26578163        16      -       .       ID=mikado.Chr5G6.2;Parent=mikado.Chr5G6;Name=cuff_cufflinks_star_at.23553.1;alias=cuff_cufflinks_star_at.23553.1;fpkm=2.9700103727;canonical_junctions=1,2,3,4,5,6,7,8;canonical_number=8;canonical_proportion=1.0;ccode=j;conf_hi=3.260618;conf_lo=2.679403;cov=81.895309;frac=0.732092;primary=False
Chr5    Mikado_loci     exon    26575364        26575410        .       -       .       ID=mikado.Chr5G6.2.exon1;Parent=mikado.Chr5G6.2
Chr5    Mikado_loci     exon    26575495        26575620        .       -       .       ID=mikado.Chr5G6.2.exon2;Parent=mikado.Chr5G6.2
Chr5    Mikado_loci     exon    26575711        26575797        .       -       .       ID=mikado.Chr5G6.2.exon3;Parent=mikado.Chr5G6.2
Chr5    Mikado_loci     exon    26575885        26575944        .       -       .       ID=mikado.Chr5G6.2.exon4;Parent=mikado.Chr5G6.2
Chr5    Mikado_loci     exon    26576035        26576134        .       -       .       ID=mikado.Chr5G6.2.exon5;Parent=mikado.Chr5G6.2
Chr5    Mikado_loci     exon    26576261        26577069        .       -       .       ID=mikado.Chr5G6.2.exon6;Parent=mikado.Chr5G6.2
Chr5    Mikado_loci     exon    26577163        26577288        .       -       .       ID=mikado.Chr5G6.2.exon7;Parent=mikado.Chr5G6.2
Chr5    Mikado_loci     exon    26577378        26577449        .       -       .       ID=mikado.Chr5G6.2.exon8;Parent=mikado.Chr5G6.2
Chr5    Mikado_loci     exon    26577856        26578163        .       -       .       ID=mikado.Chr5G6.2.exon9;Parent=mikado.Chr5G6.2
###

Things to note:
  • Multiple RNAs for the same gene are identified by progressive enumeration after a “.” (e.g. mikado.Chr5G1.1, mikado.Chr5G1.2, etc.).
  • All RNAs retain their old name under the attribute “alias”. If a transcript was split due to the presence of multiple ORFs, its alias will end with “.split<progressive ID>”.
  • RNAs have the boolean attribute “primary”, which marks each one as either the primary transcript of the gene (True) or an alternative splicing isoform (False).
  • Non-primary RNAs have the additional “ccode” field, which gives the class code assigned to them when they were compared to the primary transcript.
  • Multiexonic RNAs have the attributes “canonical_junctions”, “canonical_number”, and “canonical_proportion” assigned to them. These properties are calculated by Mikado during the prepare stage.
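
The attributes described in the notes above can be read back with a few lines of Python. This is a minimal sketch rather than a Mikado API: the function names parse_attributes and summarise_transcripts are hypothetical, the input is assumed to be tab-delimited as in a real GFF3 file, and the progressive ID after “.split” is assumed to be an integer, as it is in the example.

```python
import re

# Assumption: the progressive ID after ".split" is an integer, as in the example.
SPLIT_SUFFIX = re.compile(r"\.split\d+$")


def parse_attributes(column9):
    """Turn 'key=value;key=value' into a dictionary."""
    return dict(item.split("=", 1) for item in column9.strip().split(";") if item)


def summarise_transcripts(gff3_lines):
    """Group the mRNA and ncRNA rows of a Mikado GFF3 by their parent gene."""
    genes = {}
    for line in gff3_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] not in ("mRNA", "ncRNA"):
            continue
        attrs = parse_attributes(fields[8])
        alias = attrs.get("alias", "")
        record = {
            "id": attrs["ID"],
            "alias": alias,
            "primary": attrs.get("primary") == "True",
            "ccode": attrs.get("ccode"),  # None for primary transcripts
            "was_split": SPLIT_SUFFIX.search(alias) is not None,
        }
        genes.setdefault(attrs["Parent"], []).append(record)
    return genes


# Two rows taken from the example, with some attributes omitted for brevity.
example = [
    "Chr5\tMikado_loci\tmRNA\t26588402\t26592561\t24\t+\t.\t"
    "ID=mikado.Chr5G2.2;Parent=mikado.Chr5G2;"
    "alias=st_Stringtie_STAR.21710.9.split1;ccode=j;primary=False",
    "Chr5\tMikado_loci\tmRNA\t26588402\t26592561\t20\t+\t.\t"
    "ID=mikado.Chr5G2.1;Parent=mikado.Chr5G2;"
    "alias=st_Stringtie_STAR.21710.6.split3;primary=True",
]
genes = summarise_transcripts(example)
primary = [t["id"] for t in genes["mikado.Chr5G2"] if t["primary"]]
print(primary)
```

Keying the result on the Parent attribute keeps every isoform of a gene together, so the primary transcript and the class codes of its alternatives can be inspected side by side.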

7.1.4.2.2. Metrics files

These are tabular files that list the raw value of every metric for each transcript. This is the section of the metrics file corresponding to the GFF3 file above:

tid parent  score   best_bits       blast_score     canonical_intron_proportion     cdna_length     cds_not_maximal cds_not_maximal_fraction        combined_cds_fraction   combined_cds_intron_fraction    combined_cds_length     combined_cds_num        combined_cds_num_fraction       combined_utr_fraction   combined_utr_length     end_distance_from_junction      end_distance_from_tes   exon_fraction   exon_num        five_utr_length five_utr_num    five_utr_num_complete   has_start_codon has_stop_codon  highest_cds_exon_number highest_cds_exons_num   intron_fraction is_complete     max_intron_length       min_intron_length       non_verified_introns_num        num_introns_greater_than_max    num_introns_smaller_than_min    number_internal_orfs    proportion_verified_introns     proportion_verified_introns_inlocus     retained_fraction       retained_intron_num     selected_cds_exons_fraction     selected_cds_fraction   selected_cds_intron_fraction    selected_cds_length     selected_cds_num        selected_cds_number_fraction    selected_end_distance_from_junction     selected_end_distance_from_tes  selected_start_distance_from_tss        snowy_blast_score       source_score    start_distance_from_tss three_utr_length        three_utr_num   three_utr_num_complete  utr_fraction    utr_length      utr_num utr_num_complete        verified_introns_num
mikado.Chr5G1.2     mikado.Chr5G1   19.0    1086.25 1086.25 1.0     1927    0       0.0     0.87    1.0     1683    10      0.91    0.13    244     0       157     0.92    11      87      2       1       True    True    10      10      0.91    True    340     71      10      0       0       1       0.0     0       0.0     0       0.91    0.87    1.0     1683    10      0.91    0       157     87      13.78   0       87      157     1       0       0.13    244     3       1       0
mikado.Chr5G1.1     mikado.Chr5G1   21.89   1086.63 1086.63 1.0     1937    0       0.0     0.87    1.0     1683    10      0.91    0.13    254     0       157     0.92    11      97      2       1       True    True    10      10      0.91    True    196     71      10      0       0       1       0.0     0       0.0     0       0.91    0.87    1.0     1683    10      0.91    0       157     97      13.78   0       97      157     1       0       0.13    254     3       1       0
mikado.Chr5G2.2     mikado.Chr5G2   19.04   1140.95 1140.95 1.0     2197    0       0.0     0.88    1.0     1938    11      0.92    0.12    259     0       0       0.92    12      259     2       1       True    True    11      11      0.92    True    577     74      11      0       0       1       0.0     0       0.0     0       0.92    0.88    1.0     1938    11      0.92    0       0       259     16.66   0       259     0       0       0       0.12    259     2       1       0
mikado.Chr5G2.1     mikado.Chr5G2   20.06   1140.95 1140.95 1.0     2204    0       0.0     0.88    1.0     1938    11      0.92    0.12    266     0       0       0.92    12      266     2       1       True    True    11      11      0.92    True    570     74      11      0       0       1       0.0     0       0.0     0       0.92    0.88    1.0     1938    11      0.92    0       0       266     16.66   0       266     0       0       0       0.12    266     2       1       0
mikado.Chr5G3.2     mikado.Chr5G3   8.59    1193.72 1193.72 1.0     1887    0       0.0     0.96    0.8     1816    9       1.0     0.04    71      0       0       0.75    9       71      1       0       True    False   9       9       0.8     False   152     74      8       0       0       1       0.0     0       0.0     0       1.0     0.96    0.8     1816    9       1.0     0       0       71      14.16   0       71      0       0       0       0.04    71      1       0       0
mikado.Chr5G3.1     mikado.Chr5G3   19.0    1353.19 1353.19 1.0     2109    0       0.0     1.0     0.9     2109    10      1.0     0.0     0       0       0       0.83    10      0       0       0       True    True    10      10      0.9     True    152     74      9       0       0       1       0.0     0       0.0     0       1.0     1.0     0.9     2109    10      1.0     0       0       0       16.66   0       0       0       0       0       0.0     0       0       0       0
mikado.Chr5G4.1     mikado.Chr5G4   20.0    1258.43 1258.43 1.0     2025    0       0.0     0.98    0       1986    1       1.0     0.02    39      0       39      1.0     1       0       0       0       True    True    1       1       0       True    0       0       0       0       0       1       0       0       0.0     0       1.0     0.98    0       1986    1       1.0     0       39      0       16.66   0       0       39      1       0       0.02    39      1       0       0
mikado.Chr5G5.1     mikado.Chr5G5   20.0    565.46  565.46  1.0     1184    0       0.0     0.79    1.0     936     7       1.0     0.21    248     0       52      1.0     7       196     1       0       True    True    7       7       1.0     True    112     69      6       0       0       1       0.0     0       0.0     0       1.0     0.79    1.0     936     7       1.0     0       52      196     13.67   0       196     52      1       0       0.21    248     2       0       0
mikado.Chr5G6.2     mikado.Chr5G6   17.1    0       0       1.0     1735    0       0       0.0     0       0       0       0.0     1.0     0       0       0       0.56    9       0       0       0       False   False   0       0       0.62    False   406     84      0       0       0       0       1.0     0.89    0.0     0       0.0     0.0     0       0       0       0.0     0       0       0       0       0       0       0       0       0       1.0     0       0       0       8
mikado.Chr5G6.1     mikado.Chr5G6   17.9    0       0       1.0     2197    0       0       0.0     0       0       0       0.0     1.0     0       0       0       0.69    11      0       0       0       False   False   0       0       0.77    False   406     87      3       0       0       0       0.9     1.0     0.0     0       0.0     0.0     0       0       0       0.0     0       0       0       0       0       0       0       0       0       1.0     0       0       0       9
mikado.Chr5G6.3     mikado.Chr5G6   13.0    0       0       1.0     286     0       0       0.0     0       0       0       0.0     1.0     0       0       0       0.12    2       0       0       0       False   False   0       0       0.08    False   782     782     1       0       0       0       0.0     0.0     0.0     0       0.0     0.0     0       0       0       0.0     0       0       0       0       0       0       0       0       0       1.0     0       0       0       0

As the example shows, metrics can assume values in a very wide range. We direct you to the metrics section of the documentation for further details.
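
A minimal sketch of loading such a file with the Python standard library follows. It assumes the metrics file is tab-delimited with a single header row; the excerpt is held in memory and keeps only a handful of the columns shown above, and the function name read_metrics is hypothetical.

```python
import csv
import io

# Excerpt with a subset of the columns shown above; a real metrics file
# is assumed to be tab-delimited with one header row.
excerpt = (
    "tid\tparent\tscore\tcdna_length\texon_num\tis_complete\n"
    "mikado.Chr5G1.2\tmikado.Chr5G1\t19.0\t1927\t11\tTrue\n"
    "mikado.Chr5G1.1\tmikado.Chr5G1\t21.89\t1937\t11\tTrue\n"
    "mikado.Chr5G3.2\tmikado.Chr5G3\t8.59\t1887\t9\tFalse\n"
)


def read_metrics(handle):
    """Yield one dict per transcript, keyed by metric name."""
    yield from csv.DictReader(handle, delimiter="\t")


rows = list(read_metrics(io.StringIO(excerpt)))
# All values arrive as strings and must be converted before comparison
incomplete = [row["tid"] for row in rows if row["is_complete"] != "True"]
longest = max(rows, key=lambda row: int(row["cdna_length"]))
print(incomplete, longest["tid"])
```

For a file on disk, the same function can be given an open file handle instead of the io.StringIO object.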

7.1.4.2.3. Scoring files

This file contains the scores assigned to each metric for each transcript. Only metrics which have been used for the scoring will be present. This is the section of the scores file corresponding to the above GFF3 file:

tid parent  score   blast_score     cdna_length     cds_not_maximal cds_not_maximal_fraction        combined_cds_fraction   combined_cds_intron_fraction    combined_cds_length     combined_cds_num        end_distance_from_junction      exon_fraction   exon_num        five_utr_length five_utr_num    highest_cds_exon_number intron_fraction number_internal_orfs    proportion_verified_introns     retained_fraction       retained_intron_num     selected_cds_fraction   selected_cds_intron_fraction    selected_cds_length     selected_cds_num        source_score    three_utr_length        three_utr_num
mikado.Chr5G1.2     mikado.Chr5G1   19.0    0.0     0.0     1       1       0.0     1       1       1       1       1       1       0.0     1.0     1       1       1.0     1       1       1       0.0     1       1       1       0       0.0     1.0
mikado.Chr5G1.1     mikado.Chr5G1   21.89   1.0     1.0     1       1       0.06    1       1       1       1       1       1       0.77    1.0     1       1       1.0     1       1       1       0.06    1       1       1       0       0.0     1.0
mikado.Chr5G2.2     mikado.Chr5G2   19.04   1       0.0     1       1       0.0     1       1       1       1       1       1       0.04    1.0     1       1       1.0     1       1       1       0.0     1       1       1       0       0.0     0.0
mikado.Chr5G2.1     mikado.Chr5G2   20.06   1       1.0     1       1       0.03    1       1       1       1       1       1       0.0     1.0     1       1       1.0     1       1       1       0.03    1       1       1       0       0.0     0.0
mikado.Chr5G3.1     mikado.Chr5G3   19.0    1.0     1.0     1       1       0.0     1.0     1.0     1.0     1       1.0     1.0     0.0     0.0     1.0     1.0     1.0     1       1       1       0.0     1.0     1.0     1.0     0       0.0     0.0
mikado.Chr5G3.2     mikado.Chr5G3   8.59    0.0     0.0     1       1       0.19    0.0     0.0     0.0     1       0.0     0.0     0.71    0.5     0.0     0.0     1.0     1       1       1       0.19    0.0     0.0     0.0     0       0.0     0.0
mikado.Chr5G4.1     mikado.Chr5G4   20.0    1       1       1       1       0.0     1       1       1       1       1       1       0.0     0.0     1       1       1.0     1       1       1       0.0     1       1       1       0       0.0     1.0
mikado.Chr5G5.1     mikado.Chr5G5   20.0    1       1       1       1       0.0     1       1       1       1       1       1       0.0     0.0     1       1       1.0     1       1       1       0.0     1       1       1       0       0.0     1.0
mikado.Chr5G6.3     mikado.Chr5G6   13.0    1       0.0     1       1       0.0     1       1       1       1       0.0     0.0     0.0     0.0     1       0.0     0.0     0.0     1       1       0.0     1       1       1       0       0.0     0.0
mikado.Chr5G6.2     mikado.Chr5G6   17.1    1       0.76    1       1       0.0     1       1       1       1       0.78    0.78    0.0     0.0     1       0.78    0.0     1.0     1       1       0.0     1       1       1       0       0.0     0.0
mikado.Chr5G6.1     mikado.Chr5G6   17.9    1       1.0     1       1       0.0     1       1       1       1       1.0     1.0     0.0     0.0     1       1.0     0.0     0.9     1       1       0.0     1       1       1       0       0.0     0.0

The final score value is obtained by summing the scores of all the individual metrics.
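
This summation can be checked by hand. The sketch below takes the per-metric scores of mikado.Chr5G1.1 from the table above, in column order from blast_score to three_utr_num, and adds them up. The values are transcribed manually, and rounding to two decimals is an assumption based on how the file is displayed.

```python
# Per-metric scores of mikado.Chr5G1.1, in column order
# (blast_score ... three_utr_num), transcribed from the table above
metric_scores = [
    1.0, 1.0, 1, 1, 0.06, 1, 1, 1, 1, 1, 1, 0.77, 1.0,
    1, 1, 1.0, 1, 1, 1, 0.06, 1, 1, 1, 0, 0.0, 1.0,
]

# Round to two decimals to match the precision shown in the scores file
total = round(sum(metric_scores), 2)
print(total)
```

The result matches the 21.89 reported in the score column for this transcript.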

Important

If you compare the scores assigned to transcripts at the loci level with those assigned at the subloci level, you will notice that the scores differ, and that even some of the raw metric values do. The former is because the Mikado scoring system is relative rather than absolute; the latter is because some metrics are locus-dependent, i.e. their values change with the presence or absence of other transcripts. A typical example is given by the “retained_intron” metrics: retained introns are identified by looking for non-coding regions of a transcript which fall inside the intron of another transcript. Changing the transcripts in the locus changes whether non-coding sections are classified as “retained introns”, and therefore changes the value of this metric and the score associated with both the metric and the transcript.
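
To illustrate what a relative score implies, the sketch below rescales a metric between the minimum and maximum observed among the transcripts being compared. Min-max rescaling is an assumption made for illustration: it reproduces the cdna_length scores shown for mikado.Chr5G6 in the tables above, but the scoring section remains the reference for the actual rules. The function name rescale_max is hypothetical.

```python
def rescale_max(values):
    """Min-max rescale so the largest value scores 1 and the smallest 0."""
    low, high = min(values), max(values)
    if high == low:
        # Assumption: when all values are equal, every transcript gets 1
        return [1.0 for _ in values]
    return [round((v - low) / (high - low), 2) for v in values]


# cdna_length of mikado.Chr5G6.2, .1 and .3 from the metrics table
with_fragment = rescale_max([1735, 2197, 286])
print(with_fragment)

# Dropping the shortest transcript changes the scores of the others,
# although their raw cdna_length values are unchanged
without_fragment = rescale_max([1735, 2197])
print(without_fragment)
```

The same transcript (1735 bp) scores 0.76 in the first comparison and 0.0 in the second, which is why scores at the subloci and loci levels need not agree.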

7.1.4.2.3.1. Transcript padding

After calculating the final loci, Mikado can try to make the ends of the transcripts in a locus uniform, by extending the shorter ones so that their ends coincide with those of longer transcripts in the locus. The procedure is explained in more detail in the dedicated section in the Algorithms page. The approach was inspired by the consolidation approach taken by the Araport annotation for Arabidopsis thaliana [AraPort].
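
The idea can be sketched as follows. This is a deliberately simplified illustration, not Mikado's implementation: it treats each transcript as a single interval, ignores exon structure, splice sites and coding sequence, and only applies a per-side distance limit. The function name pad_transcripts and the coordinates are hypothetical.

```python
def pad_transcripts(transcripts, max_distance=300):
    """Extend transcript ends towards the locus extremes.

    transcripts: dict of name -> (start, end). Each side is extended only
    if the required extension is within max_distance base pairs.
    """
    locus_start = min(start for start, _ in transcripts.values())
    locus_end = max(end for _, end in transcripts.values())
    padded = {}
    for name, (start, end) in transcripts.items():
        new_start = locus_start if start - locus_start <= max_distance else start
        new_end = locus_end if locus_end - end <= max_distance else end
        padded[name] = (new_start, new_end)
    return padded


# Hypothetical coordinates, not taken from the example files
locus = {"t1": (1000, 5000), "t2": (1100, 4900), "t3": (2000, 5000)}
padded = pad_transcripts(locus)
print(padded)
```

Here t2 is extended on both sides because each extension is 100 bp, while the start of t3 is left alone because reaching the locus start would require 1000 bp.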

7.1.4.3. Usage

mikado pick allows you to modify some of the run parameters at runtime. However, some sections, such as most of the settings regarding alternative splicing, are left untouched by the utility and are best modified by editing the configuration file itself. The available parameters are as follows:

  • json-conf: required. This is the configuration file created in the first step of the pipeline.

  • gff: optionally, it is possible to point Mikado to the GTF it should use here on the command line. This file should be the output of the prepare step. Please note that this file should be in GTF format, sorted by chromosome and position; if that is not the case, Mikado will fail.

  • db: optionally, it is possible to specify the database to Mikado on the command line, rather than in the configuration file. Currently, this option supports SQLite databases only.

  • Options related to how Mikado will treat the data:

    • intron_range: this option expects a pair of positive integers, in ascending order, indicating the 98% CI within which most intron lengths should fall. Gene models with introns whose lengths fall outside of this range might be penalized, depending on the scoring system used. If uncertain, it is possible to use the included stats utility on the gene annotation of a closely related species.
    • no-purge: flag. If set, Mikado will not exclude putative fragments from the output, but will report them (appropriately flagged).
    • flank: for the purposes of identifying fragments, it is useful to consider together loci which do not necessarily overlap but which lie relatively near each other on the genome sequence. This parameter (a positive integer) specifies the maximum distance within which Mikado will gather loci together for this purpose.
    • mode: how Mikado will treat BLAST and ORF data in the presence of putative chimeras. Please refer to the algorithms section for details.
  • Options regarding the output files:

    • output-dir: Output directory. By default, Mikado will write all files and the log in the current directory.
    • loci_out: required. This is the main output file, in GFF format.
    • prefix: this ID will be prefixed to all gene and transcript models. In general, IDs will be of the form “<prefix>.<chromosome><progressive ID>”. Default: Mikado.
    • source: source field prefix for the output files. Useful for e.g. loading Mikado runs into WebApollo [Apollo].
    • no_cds: if present, this flag will indicate to Mikado not to print out the CDS of selected models but only their transcript structures.
    • subloci_out: If requested, Mikado can output the data regarding the first intermediate step, i.e. the subloci. See the introduction for details.
    • monoloci_out: If requested, Mikado can output the data regarding the second intermediate step, i.e. the monosubloci. See the introduction for details.
  • Options regarding the resources to be used:

    • procs: number of processors to use.
    • start-method: multiprocessing start method. See the explanation on Python multiprocessing.
    • single: flag. If present, multiprocessing will be disabled.
  • Options regarding logging:

    • log: name of the log file. By default, “pick.log”
    • verbose: sets the log level to DEBUG. Please be advised that the debug mode is extremely verbose and is best invoked only for real, targeted debugging sessions.
    • noverbose: sets the log level to ERROR. If set, in most cases, the log file will be practically empty.
    • log-level: this flag directly sets the log level. Available values: DEBUG, INFO, WARNING, ERROR.
  • Options related to padding:

    • pad: if set, this option will enforce transcript padding. The default is inferred from the configuration (on by default).
    • no-pad: if set, this option will disable transcript padding. The default is inferred from the configuration (on by default).
    • pad-max-splices: maximum number of splice sites that an expanded exon can cross. Default is inferred from the configuration file (currently default is 1).
    • pad-max-distance: maximum number of base pairs that transcripts can be padded with (per side). Default is inferred from the configuration file (default 300 bps).
    • fasta: genome FASTA file. Required if the padding is switched on. Default: inferred from the configuration file.
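
For the intron_range option, a pair of bounds can be estimated from the intron lengths of a related annotation. The sketch below is one hedged way to do so: reading the 98% interval as the 1st and 99th percentiles is an assumption, as is the nearest-rank percentile method, and the function is hypothetical and unrelated to the output of the stats utility.

```python
def intron_range(lengths, lower_pct=1, upper_pct=99):
    """Return nearest-rank percentiles of a collection of intron lengths."""
    ordered = sorted(lengths)
    n = len(ordered)

    def nearest_rank(pct):
        # ceil(pct * n / 100) computed with integer arithmetic (1-based rank)
        rank = -(-pct * n // 100)
        return ordered[max(rank, 1) - 1]

    return nearest_rank(lower_pct), nearest_rank(upper_pct)


# Synthetic lengths used only to exercise the function
bounds = intron_range(range(1, 1001))
print(bounds)
```

The two returned integers are in ascending order, as the option expects.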

Usage:

$ mikado pick --help
usage: Mikado pick [-h] [--fasta GENOME]
                   [--start-method {fork,spawn,forkserver}] [--shm | --no-shm]
                   [-p PROCS] --configuration CONFIGURATION
                   [--scoring-file SCORING_FILE]
                   [-i INTRON_RANGE INTRON_RANGE]
                   [--no-pad | --pad | --codon-table CODON_TABLE]
                   [--pad-max-splices PAD_MAX_SPLICES]
                   [--pad-max-distance PAD_MAX_DISTANCE] [-r REGIONS]
                   [-od OUTPUT_DIR] [--subloci-out SUBLOCI_OUT]
                   [--monoloci-out MONOLOCI_OUT] [--loci-out LOCI_OUT]
                   [--prefix PREFIX] [--source SOURCE]
                   [--report-all-external-metrics] [--no_cds] [--flank FLANK]
                   [--max-intron-length MAX_INTRON_LENGTH] [--no-purge]
                   [--cds-only] [--as-cds-only] [--reference-update]
                   [--report-all-orfs] [--only-reference-update] [-eri] [-kdc]
                   [-mco MIN_CLUSTERING_CDNA_OVERLAP]
                   [-mcso MIN_CLUSTERING_CDS_OVERLAP] [--check-references]
                   [-db SQLITE_DB] [--single] [-l LOG]
                   [--verbose | --quiet | -lv {DEBUG,INFO,WARNING,ERROR,CRITICAL}]
                   [--mode {nosplit,stringent,lenient,permissive,split}]
                   [--seed SEED | --random-seed]
                   [gff]

Launcher of the Mikado pipeline.

positional arguments:
  gff

optional arguments:
  -h, --help            show this help message and exit
  --fasta GENOME, --genome GENOME
                        Genome FASTA file. Required for transcript padding.
  --start-method {fork,spawn,forkserver}
                        Multiprocessing start method.
  --shm                 Flag. If switched, Mikado pick will copy the database
                        to RAM (ie SHM) for faster access during the run.
  --no-shm              Flag. If switched, Mikado will force using the
                        database on location instead of copying it to /dev/shm
                        for faster access.
  -p PROCS, --procs PROCS
                        Number of processors to use. Default: look in the
                        configuration file (1 if undefined)
  --configuration CONFIGURATION, --json-conf CONFIGURATION
                        Configuration file for Mikado.
  --scoring-file SCORING_FILE
                        Optional scoring file for the run. It will override
                        the value set in the configuration.
  -i INTRON_RANGE INTRON_RANGE, --intron-range INTRON_RANGE INTRON_RANGE
                        Range into which intron lengths should fall, as a
                        couple of integers. Transcripts with intron lengths
                        outside of this range will be penalised. Default: (60,
                        900)
  --no-pad              Disable transcript padding.
  --pad                 Whether to pad transcripts in loci.
  --codon-table CODON_TABLE
                        Codon table to use. Default: 0 (ie Standard, NCBI #1,
                        but only ATG is considered a valid start codon.
  --pad-max-splices PAD_MAX_SPLICES
                        Maximum splice sites that can be crossed during
                        transcript padding.
  --pad-max-distance PAD_MAX_DISTANCE
                        Maximum amount of bps that transcripts can be padded
                        with (per side).
  -r REGIONS, --regions REGIONS
                        Either a single region on the CLI or a file listing a
                        series of target regions. Mikado pick will only
                        consider regions included in this string/file. Regions
                        should be provided in a WebApollo-like format:
                        <chrom>:<start>..<end>
  --no_cds              Flag. If set, not CDS information will be printed out
                        in the GFF output files.
  --flank FLANK         Flanking distance (in bps) to group non-overlapping
                        transcripts into a single superlocus. Default:
                        determined by the configuration file.
  --max-intron-length MAX_INTRON_LENGTH
                        Maximum intron length for a transcript. Default:
                        inferred from the configuration file (default value
                        there is 1,000,000 bps).
  --no-purge            Flag. If set, the pipeline will NOT suppress any loci
                        whose transcripts do not pass the requirements set in
                        the JSON file.
  --cds-only            "Flag. If set, Mikado will only look for overlap in
                        the coding features when clustering transcripts
                        (unless one transcript is non-coding, in which case
                        the whole transcript will be considered). Please note
                        that Mikado will only consider the **best** ORF for
                        this. Default: False, Mikado will consider transcripts
                        in their entirety.
  --as-cds-only         Flag. If set, Mikado will only consider the CDS to
                        determine whether a transcript is a valid alternative
                        splicing event in a locus.
  --reference-update    Flag. If switched on, Mikado will prioritise
                        transcripts marked as reference and will consider any
                        other transcipt within loci only in reference to these
                        reference transcripts. Novel loci will still be
                        reported.
  --report-all-orfs     Boolean switch. If set to true, all ORFs will be
                        reported, not just the primary.
  --only-reference-update
                        Flag. If switched on, Mikado will only keep loci where
                        at least one of the transcripts is marked as
                        "reference". CAUTION: if no transcript has been marked
                        as reference, the output will be completely empty!
  -eri, --exclude-retained-introns
                        Exclude all retained intron alternative splicing
                        events from the final output. Default: False. Retained
                        intron events that do not dirsupt the CDS are kept by
                        Mikado in the final output.
  -kdc, --keep-disrupted-cds
                        Keep in the final output transcripts whose CDS is most
                        probably disrupted by a retained intron event.
                        Default: False. Mikado will try to detect these
                        instances and exclude them from the final output.
  -mco MIN_CLUSTERING_CDNA_OVERLAP, --min-clustering-cdna-overlap MIN_CLUSTERING_CDNA_OVERLAP
                        Minimum cDNA overlap between two transcripts for them
                        to be considered part of the same locus during the
                        late picking stages. NOTE: if --min-cds-overlap is not
                        specified, it will be set to this value! Default: 20%.
  -mcso MIN_CLUSTERING_CDS_OVERLAP, --min-clustering-cds-overlap MIN_CLUSTERING_CDS_OVERLAP
                        Minimum CDS overlap between two transcripts for them
                        to be considered part of the same locus during the
                        late picking stages. NOTE: if not specified, and
                        --min-cdna-overlap is specified on the command line,
                        min-cds-overlap will be set to this value! Default:
                        20%.
  --check-references    Flag. If switched on, Mikado will also check reference
                        models against the general transcript requirements,
                        and will also consider them as potential fragments.
                        This is useful in the context of e.g. updating an *ab-
                        initio* results with data from RNASeq, protein
                        alignments, etc.
  -db SQLITE_DB, --sqlite-db SQLITE_DB
                        Location of an SQLite database to overwrite what is
                        specified in the configuration file.
  --single              Flag. If set, Creator will be launched with a single
                        process, without involving the multithreading
                        apparatus. Useful for debugging purposes only.
  --mode {nosplit,stringent,lenient,permissive,split}
                        Mode in which Mikado will treat transcripts with
                        multiple ORFs. - nosplit: keep the transcripts whole.
                        - stringent: split multi-orf transcripts if two
                        consecutive ORFs have both BLAST hits and none of
                        those hits is against the same target. - lenient:
                        split multi-orf transcripts as in stringent, and
                        additionally, also when either of the ORFs lacks a
                        BLAST hit (but not both). - permissive: like lenient,
                        but also split when both ORFs lack BLAST hits - split:
                        split multi-orf transcripts regardless of what BLAST
                        data is available.
  --seed SEED           Random seed number. Default: 0.
  --random-seed         Generate a new random seed number (instead of the
                        default of 0)

Options related to the output files.:
  -od OUTPUT_DIR, --output-dir OUTPUT_DIR
                        Output directory. Default: current working directory
  --subloci-out SUBLOCI_OUT
  --monoloci-out MONOLOCI_OUT
  --loci-out LOCI_OUT   This output file is mandatory. If it is not specified
                        in the configuration file, it must be provided here.
  --prefix PREFIX       Prefix for the genes. Default: Mikado
  --source SOURCE       Source field to use for the output files.
  --report-all-external-metrics
                        Boolean switch. If activated, Mikado will report all
                        available external metrics, not just those requested
                        for in the scoring configuration. This might affect
                        speed in Minos analyses.

Log options:
  -l LOG, --log LOG     File to write the log to. Default: decided by the
                        configuration file.
  --verbose
  --quiet
  -lv {DEBUG,INFO,WARNING,ERROR,CRITICAL}, --log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}
                        Logging level. Default: retrieved by the configuration
                        file.
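
As a hedged example of putting these options together, the following Python snippet assembles a command line using only flags listed in the help text above. All file and directory names are placeholders, and the command is printed rather than executed.

```python
import shlex

# File and directory names are placeholders; the flags are taken from
# the help text above.
command = [
    "mikado", "pick",
    "--configuration", "configuration.yaml",
    "--loci-out", "mikado.loci.gff3",
    "--subloci-out", "mikado.subloci.gff3",
    "-od", "pick_output",
    "-p", "4",
    "mikado_prepared.gtf",
]

command_line = shlex.join(command)
print(command_line)
# To actually run it: subprocess.run(command, check=True)
```

Keeping the command as a list avoids quoting problems if any of the paths contain spaces.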

7.1.4.4. Technical details

mikado pick uses a divide-et-impera algorithm to find and analyse loci separately. As the data to be integrated with the transcripts is stored in the database rather than being calculated on the fly, rerunning pick with different options takes little time and few resources. To keep the data sorted, Mikado will write out temporary files during the operation and merge them at the end of the run (see function merge_loci_gff in the picking module).