実際の解析を行うにあたりまずはアノテーションファイル(alternative splicingのカタログのようなファイル)を用意する。
いちおうMISOのサイトにもアップされているが、GRCh38バージョンのヒトゲノム情報とかはないので、自前で作製する方法を覚え書き。
rnaseqlibのインストール
$ git clone https://github.com/yarden/rnaseqlib.git Cloning into 'rnaseqlib'... remote: Counting objects: 3433, done. remote: Total 3433 (delta 0), reused 0 (delta 0), pack-reused 3433 Receiving objects: 100% (3433/3433), 48.70 MiB | 4.55 MiB/s, done. Resolving deltas: 100% (2292/2292), done.
setup.pyを実行するのだがバグがあるらしく
https://stackoverflow.com/questions/12767023/python-packaging
setup.pyのモジュールを読み込む前のあたり(コメントの下辺り)に
from setuptools import setup
と1行追加してから
$ python setup.py install
めちゃめちゃいっぱいwarningは出るもののエラーは出ずに終了。
hg19の場合以下のファイルをダウンロード
$ wget http://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/knownGene.txt.gz $ wget http://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/refGene.txt.gz $ wget http://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/ensGene.txt.gz
hg38ではensGene.txt.gzに相当するファイルがない。
$ wget http://hgdownload.soe.ucsc.edu/goldenPath/hg38/database/knownGene.txt.gz $ wget http://hgdownload.soe.ucsc.edu/goldenPath/hg38/database/refGene.txt.gz
$ gunzip *.gz
で解凍し
$ gff_make_annotation ./ ./gff --flanking-rule commonshortest --genome-label hg19_Gencode_v19 WARNING: Cannot import gffutils. GFF-related MISO helper scripts will need gffutils to be installed. Traceback (most recent call last): File "/Users/hoge/MISO/bin/gff_make_annotation", line 11, in <module> load_entry_point('rnaseqlib==0.1', 'console_scripts', 'gff_make_annotation')() File "/Users/hoge/MISO/lib/python2.7/site-packages/pkg_resources/__init__.py", line 561, in load_entry_point return get_distribution(dist).load_entry_point(group, name) File "/Users/hoge/MISO/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2649, in load_entry_point return ep.load() File "/Users/hoge/MISO/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2303, in load return self.resolve() File "/Users/hoge/MISO/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2309, in resolve module = __import__(self.module_name, fromlist=['__name__'], level=0) File "/Users/hoge/MISO/lib/python2.7/site-packages/rnaseqlib-0.1-py2.7.egg/rnaseqlib/gff/gff_make_annotation.py", line 12, in <module> import rnaseqlib.events.defineEvents as def_events File "/Users/hoge/MISO/lib/python2.7/site-packages/rnaseqlib-0.1-py2.7.egg/rnaseqlib/events/defineEvents.py", line 9, in <module> import rnaseqlib.events.SpliceGraph as splicegraph File "/Users/hoge/MISO/lib/python2.7/site-packages/rnaseqlib-0.1-py2.7.egg/rnaseqlib/events/SpliceGraph.py", line 12, in <module> import gffutils ImportError: No module named gffutils
とやろうとしたらエラーが出た。gffutilsモジュールがないと。
仕方がないのでまたインストール
$ pip install gffutils Collecting gffutils Downloading gffutils-0.8.7.1.tar.gz (1.5MB) 100% |=============================================| 1.5MB 939kB/s Collecting pyfaidx (from gffutils) Downloading pyfaidx-0.4.9.2.tar.gz Requirement already satisfied: six in /Users/hoge/MISO/lib/python2.7/site-packages (from gffutils) Collecting argh (from gffutils) Downloading argh-0.26.2-py2.py3-none-any.whl Collecting argcomplete (from gffutils) Downloading argcomplete-1.8.2-py2.py3-none-any.whl Collecting simplejson (from gffutils) Downloading simplejson-3.11.1.tar.gz (78kB) 100% |=============================================| 81kB 9.7MB/s Requirement already satisfied: setuptools>=0.7 in /Users/hoge/MISO/lib/python2.7/site-packages (from pyfaidx->gffutils) Building wheels for collected packages: gffutils, pyfaidx, simplejson Running setup.py bdist_wheel for gffutils ... done Stored in directory: /Users/hoge/Library/Caches/pip/wheels/33/16/d0/a3913a8e32c19608a5a386d64ffcfc388c85b707baee573cc5 Running setup.py bdist_wheel for pyfaidx ... done Stored in directory: /Users/hoge/Library/Caches/pip/wheels/23/d6/d6/d9129518cb5fd0595823c5a7a63764dbf97244016777da2d5e Running setup.py bdist_wheel for simplejson ... done Stored in directory: /Users/hoge/Library/Caches/pip/wheels/bf/0a/27/5d5e337ed16a175fd483a8c1486b4343ea2632be7ac57bad5d Successfully built gffutils pyfaidx simplejson Installing collected packages: pyfaidx, argh, argcomplete, simplejson, gffutils Successfully installed argcomplete-1.8.2 argh-0.26.2 gffutils-0.8.7.1 pyfaidx-0.4.9.2 simplejson-3.11.1
そのうえで改めて
$ gff_make_annotation ./ ./gff --flanking-rule commonshortest --genome-label hg19_Gencode_v19 Making GFF alternative events annotation... - UCSC tables read from: /Users/hogw/MISO/hg19 - Output dir: /Users/hoge/MISO/hg19/gff Loaded 3 UCSC tables. Loading tables... Populating graph... Adding splice edges from table knownGene.txt Adding splice edges from table ensGene.txt Adding splice edges from table refGene.txt Populating graph took 28.43 seconds Reading table /Users/hoge/MISO/hg19/ensGene.txt Reading table /Users/hoge/MISO/hg19/knownGene.txt Reading table /Users/hoge/MISO/hg19/refGene.txt Generating skipped exons (SE) Generating mutually exclusive exons (MXE) Generating alternative 3' splice sites (A3SS) Generating alternative 5' splice sites (A5SS) Outputting retained introns... Defining retained introns (RI) Took 1.25 minutes to make the annotation.
と無事GFFファイルは生成
次にそれぞれのスプライシングパターンに対するindexを生成
$ index_gff --index A3SS.hg19_Gencode_v19.gff3 indexed_A3SS_events/ Indexing GFF... - GFF: /Users/hoge/MISO/hg19/A3SS.hg19_Gencode_v19.gff3 - Outputting to: /Users/hoge/MISO/hg19/indexed_A3SS_events Traceback (most recent call last): File "/Users/hoge/MISO/bin/index_gff", line 11, in <module> load_entry_point('misopy==0.5.3', 'console_scripts', 'index_gff')() File "/Users/hoge/MISO/lib/python2.7/site-packages/misopy/index_gff.py", line 191, in main compress_id=options.compress_id) File "/Users/hoge/MISO/lib/python2.7/site-packages/misopy/index_gff.py", line 153, in index_gff gff_genes = gene_utils.load_genes_from_gff(gff_filename) File "/Users/hoge/MISO/lib/python2.7/site-packages/misopy/Gene.py", line 878, in load_genes_from_gff reverse_recs=reverse_recs) File "/Users/hoge/MISO/lib/python2.7/site-packages/misopy/gff_utils.py", line 191, in __init__ include_introns=include_introns) File "/Users/hoge/MISO/lib/python2.7/site-packages/misopy/gff_utils.py", line 201, in from_file FILE = open(filename, "r") IOError: [Errno 2] No such file or directory: '/Users/hoge/MISO/hg19/A3SS.hg19_Gencode_v19.gff3'
またエラーだよ・・・なになにそんなファイルもディレクトリもない・・・なるほど。
$ index_gff --index gff/commonshortest/A3SS.hg19_Gencode_v19.gff3 indexed_A3SS_events/ Indexing GFF... - GFF: /Users/hoge/MISO/hg19/gff/commonshortest/A3SS.hg19_Gencode_v19.gff3 - Outputting to: /Users/hoge/MISO/hg19/indexed_A3SS_events Through 0 genes... Through 5000 genes... Through 10000 genes... Through 15000 genes... Loaded 16392 genes - Loading of genes from GFF took 5.15 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr6_apd_hap1 - Chromosome serialization took 0.08 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chrY - Chromosome serialization took 0.02 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chrX - Chromosome serialization took 0.16 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr13 - Chromosome serialization took 0.07 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr12 - Chromosome serialization took 0.34 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr11 - Chromosome serialization took 0.40 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr10 - Chromosome serialization took 0.17 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr17 - Chromosome serialization took 0.62 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr16 - Chromosome serialization took 0.33 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr15 - Chromosome serialization took 0.21 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr14 - Chromosome serialization took 0.20 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr19 - Chromosome serialization took 0.60 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr18 - Chromosome serialization took 0.08 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr6_mann_hap4 - Chromosome serialization took 0.07 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr6_qbl_hap6 - Chromosome serialization took 0.08 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr7_gl000195_random - Chromosome serialization took 0.00 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr1_gl000191_random - Chromosome serialization took 0.00 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr4_ctg9_hap1 - Chromosome serialization took 0.00 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr17_ctg5_hap1 - Chromosome serialization took 0.00 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr19_gl000209_random - Chromosome serialization took 0.00 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr6_dbb_hap3 - Chromosome serialization took 0.09 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr6_ssto_hap7 - Chromosome serialization took 0.07 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr20 - Chromosome serialization took 0.16 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr21 - Chromosome serialization took 0.05 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr6_mcf_hap5 - Chromosome serialization took 0.08 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chrUn_gl000228 - Chromosome serialization took 0.00 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr6_cox_hap2 - Chromosome serialization took 0.09 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr7 - Chromosome serialization took 0.36 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr6 - Chromosome serialization took 0.29 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr5 - Chromosome serialization took 0.22 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr4 - Chromosome serialization took 0.16 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr3 - Chromosome serialization took 0.31 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr2 - Chromosome serialization took 0.43 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr1 - Chromosome serialization took 0.68 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chrUn_gl000223 - Chromosome serialization took 0.00 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr9 - Chromosome serialization took 0.20 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr8 - Chromosome serialization took 0.16 seconds Making directory: /Users/hoge/MISO/hg19/indexed_A3SS_events/chr22 - Chromosome serialization took 0.24 seconds Outputting gene records in GFF format... - Output file: /Users/hoge/MISO/hg19/indexed_A3SS_events/genes.gff - Serialization of genes from GFF took 8.77 seconds Indexing of GFF took 13.92 seconds. (MISO) Guilty-iMac:hg19 hoge$ index_gff --index gff/commonshortest/A5SS.hg19_Gencode_v19.gff3 indexed_A5SS_events/ Indexing GFF... - GFF: /Users/hoge/MISO/hg19/gff/commonshortest/A5SS.hg19_Gencode_v19.gff3 - Outputting to: /Users/hoge/MISO/hg19/indexed_A5SS_events Through 0 genes... Through 5000 genes... 中略 Indexing of GFF took 9.55 seconds. (MISO) Guilty-iMac:hg19 hoge$ index_gff --index gff/commonshortest/MXE.hg19_Gencode_v19.gff3 indexed_MXE_events/ Indexing GFF... - GFF: /Users/hoge/MISO/hg19/gff/commonshortest/MXE.hg19_Gencode_v19.gff3 - Outputting to: /Users/hoge/MISO/hg19/indexed_MXE_events Through 0 genes... Through 5000 genes... Loaded 9660 genes - Loading of genes from GFF took 3.86 seconds 中略 Outputting gene records in GFF format... - Output file: /Users/hoge/MISO/hg19/indexed_MXE_events/genes.gff - Serialization of genes from GFF took 5.43 seconds Indexing of GFF took 9.29 seconds. (MISO) Guilty-iMac:hg19 hoge$ index_gff --index gff/commonshortest/SE.hg19_Gencode_v19.gff3 indexed_SE_events/ Indexing GFF... - GFF: /Users/hoge/MISO/hg19/gff/commonshortest/SE.hg19_Gencode_v19.gff3 - Outputting to: /Users/hoge/MISO/hg19/indexed_SE_events Through 0 genes... Through 5000 genes... Through 10000 genes... Through 15000 genes... Through 20000 genes... Through 25000 genes... Through 30000 genes... Through 35000 genes... Through 40000 genes... Through 45000 genes... Loaded 48591 genes - Loading of genes from GFF took 18.83 seconds 中略 Outputting gene records in GFF format... - Output file: /Users/hoge/MISO/hg19/indexed_SE_events/genes.gff - Serialization of genes from GFF took 25.41 seconds Indexing of GFF took 44.23 seconds. (MISO) Guilty-iMac:hg19 hoge$ index_gff --index gff/commonshortest/RI.hg19_Gencode_v19.gff3 indexed_RI_events/ Indexing GFF... - GFF: /Users/hoge/MISO/hg19/gff/commonshortest/RI.hg19_Gencode_v19.gff3 - Outputting to: /Users/hoge/MISO/hg19/indexed_RI_events Through 0 genes... Through 5000 genes... Loaded 8356 genes 中略 Outputting gene records in GFF format... - Output file: /Users/hoge/MISO/hg19/indexed_RI_events/genes.gff - Serialization of genes from GFF took 3.84 seconds Indexing of GFF took 6.07 seconds.
さあできた。
MISOのセッティングファイルを編集
$ nano misopy/settings/miso_settings.txt [data] filter_results = True min_event_reads = 20 [cluster] cluster_command = qsub [sampler] burn_in = 500 lag = 10 num_iters = 5000 num_chains = 6 num_processors = 4
まあこんな感じで。ちなみにバイナリ版のほうにデフォルトで生成されたファイルをコピーしてきた。pipでインストールすると雛形も生成されなかった。
さて、いよいよ解析だな。
スクリプトを準備。
#!/bin/sh source activate MISO exp=SRR1234567 gff=~/MISO/hg19 read_length="101" for type in A3SS A5SS MXE RI SE do gff_index="${gff}/indexed_${type}_events" outputDir="MISO_output_${exp}/${type}_events/" miso --run ${gff_index} ./${exp}_th.bam --output-dir ${outputDir} \ --read-len 101 --settings-filename ./miso_settings.txt summarize_miso --summarize-samples ${outputDir} ${outputDir} done
こんな感じでやってみると・・・
$ miso_test.sh MISO (Mixture of Isoforms model) Probabilistic analysis of RNA-Seq data for detecting differential isoforms Use --help argument to view options. Using MISO settings file: /Users/hoge/MISO/miso_settings.txt Computing Psi values... - GFF index: /Users/hoge/MISO/hg19/indexed_A3SS_events - BAM: /Users/hoge/MISO/SRR1234567_th.bam - Read length: 101 - Output directory: /Users/hoge/MISO/MISO_output_SRR1234567/A3SS_events Checking your GFF annotation and BAM for mismatches... 07/07/2017 12:13:27 PM - miso_main - WARNING - Expected BAM index file /Users/hoge/MISO/SRR1234567_th.bam.bai not found. 07/07/2017 12:13:27 PM - miso_main - WARNING - Are you sure your BAM file is indexed? Checking if BAM has mixed read lengths... 07/07/2017 12:13:27 PM - miso_main - WARNING - Found mixed length reads in your BAM file: /Users/hoge/MISO/SRR1234567_th.bam MISO does not support mixed read lengths. Please trim your reads to a uniform length and re-run or run MISO separately on each read length. Proceeding anyway, though this may result in errors or inability to match reads to your annotation. Read lengths were: 30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101 Traceback (most recent call last): File "/Users/hoge/MISO/bin/miso", line 11, in <module> load_entry_point('misopy==0.5.3', 'console_scripts', 'miso')() File "/Users/hoge/MISO/lib/python2.7/site-packages/misopy/miso.py", line 591, in main wait_on_jobs=wait_on_jobs) File "/Users/hoge/MISO/lib/python2.7/site-packages/misopy/miso.py", line 372, in compute_all_genes_psi given_read_len=read_len) File "/Users/hoge/MISO/lib/python2.7/site-packages/misopy/run_events_analysis.py", line 171, in check_gff_and_bam miso_logger.warning("It looks like your GFF annotation file and your BAM " \ NameError: global name 'miso_logger' is not defined MISO (Mixture of Isoforms model) Summarize MISO output to get Psi values and confidence intervals. Use --help argument to view options. Loading events from: /Users/hoge/MISO/MISO_output_SRR1234567/A3SS_events Writing summary to: /Users/hoge/MISO/MISO_output_SRR1234567/A3SS_events/summary/A3SS_events.miso_summary - Summarized a total of 0 events. 以下略
なんと長さがバラバラのトリミングをされたBAMはだめなんだそうな。そんなバカな・・・
当然ながら続く比較解析も走りはするが解析できない。
#!/bin/sh for type in A3SS A5SS MXE RI SE do outputDir="./MISO_comparison/${type}_events" MISOFile1="./MISO_output_SRR1234567/${type}_events" MISOFile2="./MISO_output_SRR7654321/${type}_events" compare_miso --compare-samples ${MISOFile1} ${MISOFile2} ${outputDir} done
$ miso_compare.sh /Users/hoge/MISO/bin/miso_compare.sh: line 3: $: command not found /Users/hoge/MISO/bin/miso_compare.sh: line 4: syntax error near unexpected token `do' /Users/hoge/MISO/bin/miso_compare.sh: line 4: ` do' (MISO) Guilty-iMac:MISO hoge$ nano bin/miso_compare.sh (MISO) Guilty-iMac:MISO hoge$ miso_compare.sh Making comparisons directory: /Users/hoge/MISO/MISO_comparison/A3SS_events Given output dir: /Users/hoge/MISO/MISO_comparison/A3SS_events Retrieving MISO files in sample directories... Computing sample comparison between /Users/hoge/MISO/MISO_output_SRR1234567/A3SS_events and /Users/hoge/MISO/MISO_output_SRR7654321/A3SS_events... - No. of events in /Users/hoge/MISO/MISO_output_SRR1234567/A3SS_events: 0 - No. of events in /Users/hoge/MISO/MISO_output_SRR7654321/A3SS_events: 0 Creating comparisons parent directory: /Users/hoge/MISO/MISO_comparison/A3SS_events/A3SS_events_vs_A3SS_events Compared a total of 0 events. Making comparisons directory: /Users/hoge/MISO/MISO_comparison/A5SS_events Given output dir: /Users/hoge/MISO/MISO_comparison/A5SS_events Retrieving MISO files in sample directories... Computing sample comparison between /Users/hoge/MISO/MISO_output_SRR1234567/A5SS_events and /Users/hoge/MISO/MISO_output_SRR7654321/A5SS_events... - No. of events in /Users/hoge/MISO/MISO_output_SRR1234567/A5SS_events: 0 - No. of events in /Users/hoge/MISO/MISO_output_SRR7654321/A5SS_events: 0 Creating comparisons parent directory: /Users/hoge/MISO/MISO_comparison/A5SS_events/A5SS_events_vs_A5SS_events Compared a total of 0 events. Making comparisons directory: /Users/hoge/MISO/MISO_comparison/MXE_events Given output dir: /Users/hoge/MISO/MISO_comparison/MXE_events Retrieving MISO files in sample directories... Computing sample comparison between /Users/hoge/MISO/MISO_output_SRR1234567/MXE_events and /Users/hoge/MISO/MISO_output_SRR7654321/MXE_events... - No. of events in /Users/hoge/MISO/MISO_output_SRR1234567/MXE_events: 0 - No. of events in /Users/hoge/MISO/MISO_output_SRR7654321/MXE_events: 0 Creating comparisons parent directory: /Users/hoge/MISO/MISO_comparison/MXE_events/MXE_events_vs_MXE_events Compared a total of 0 events. Making comparisons directory: /Users/hoge/MISO/MISO_comparison/RI_events Given output dir: /Users/hoge/MISO/MISO_comparison/RI_events Retrieving MISO files in sample directories... Computing sample comparison between /Users/hoge/MISO/MISO_output_SRR1234567/RI_events and /Users/hoge/MISO/MISO_output_SRR7654321/RI_events... - No. of events in /Users/hoge/MISO/MISO_output_SRR1234567/RI_events: 0 - No. of events in /Users/hoge/MISO/MISO_output_SRR7654321/RI_events: 0 Creating comparisons parent directory: /Users/hoge/MISO/MISO_comparison/RI_events/RI_events_vs_RI_events Compared a total of 0 events. Making comparisons directory: /Users/hoge/MISO/MISO_comparison/SE_events Given output dir: /Users/hoge/MISO/MISO_comparison/SE_events Retrieving MISO files in sample directories... Computing sample comparison between /Users/hoge/MISO/MISO_output_SRR1234567/SE_events and /Users/hoge/MISO/MISO_output_SRR7654321/SE_events... - No. of events in /Users/hoge/MISO/MISO_output_SRR1234567/SE_events: 0 - No. of events in /Users/hoge/MISO/MISO_output_SRR7654321/SE_events: 0 Creating comparisons parent directory: /Users/hoge/MISO/MISO_comparison/SE_events/SE_events_vs_SE_events Compared a total of 0 events.
https://github.com/yarden/MISO/compare/dev#diff-d6eca91d09c3b215347394b6ab767f75
ここを参考にmiso_loggerをmain_loggerと書き換えて再実行。
結果、まだ解析はできていない・・・やはりトリミングの問題なのか?
$ miso_test.sh MISO (Mixture of Isoforms model) Probabilistic analysis of RNA-Seq data for detecting differential isoforms Use --help argument to view options. Using MISO settings file: /Users/hoge/MISO/miso_settings.txt Computing Psi values... - GFF index: /Users/hoge/MISO/hg19/indexed_A3SS_events - BAM: /Users/hoge/MISO/SRR1234567_th.bam - Read length: 101 - Output directory: /Users/hoge/MISO/MISO_output_SRR1234567/A3SS_events Checking your GFF annotation and BAM for mismatches... Checking if BAM has mixed read lengths... 07/07/2017 02:41:28 PM - miso_main - WARNING - Found mixed length reads in your BAM file: /Users/hoge/MISO/SRR1234567_th.bam MISO does not support mixed read lengths. Please trim your reads to a uniform length and re-run or run MISO separately on each read length. Proceeding anyway, though this may result in errors or inability to match reads to your annotation. Read lengths were: 30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101 07/07/2017 02:41:33 PM - miso_main - WARNING - It looks like your GFF annotation file and your BAM file might not have matching headers (chromosome names.) If this is the case, your run will fail as no reads from the BAM could be matched up with your annotation. 07/07/2017 02:41:33 PM - miso_main - WARNING - Please see: http://genes.mit.edu/burgelab/miso/docs/ for more information. 07/07/2017 02:41:33 PM - miso_main - WARNING - It looks like your GFF annotation (/Users/hoge/MISO/hg19/indexed_A3SS_events) contains 'chr' chromosomes (UCSC-style) while your BAM file (/Users/hoge/MISO/SRR1234567_th.bam) does not. 07/07/2017 02:41:33 PM - miso_main - WARNING - The first few BAM chromosomes were: 1 BAM references: ('1', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '2', '20', '21', '22', '3', '4', '5', '6', '7', '8', '9', 'MT', 'X', 'Y') 07/07/2017 02:41:33 PM - miso_main - WARNING - The first few GFF chromosomes were: chr6_apd_hap1,chrY,chrX,chr13,chr12,chr11,chr10,chr17,chr16,chr15,chr14,chr19,chr18,chr6_mann_hap4,chr6_qbl_hap6,chr4_ctg9_hap1,chr19_gl000209_random,chr6_dbb_hap3,chr22,chr20,chr21,chr6_mcf_hap5,chrUn_gl000228,chr6_cox_hap2,chr7,chr6,chr5,chr4,chr3,chr2,chr1,chr9,chr8,chr6_ssto_hap7 07/07/2017 02:41:33 PM - miso_main - WARNING - Run is likely to fail or produce empty output. Proceeding anyway... Mapping genes to their indexed GFF representation, using /Users/hoge/MISO/hg19/indexed_A3SS_events Searching for /Users/hoge/MISO/hg19/indexed_A3SS_events/genes_to_filenames.shelve.. - Found shelved file. Preparing to run 4 batches of jobs... Running batch of 4098 genes.. - Executing: python /Users/hoge/MISO/lib/python2.7/site-packages/misopy/run_miso.py --compute-genes-from-file "/Users/hoge/MISO/MISO_output_SRR1234567/A3SS_events/batch-genes/batch-0_genes.txt" /Users/hoge/MISO/SRR1234567_th.bam /Users/hoge/MISO/MISO_output_SRR1234567/A3SS_events --read-len 101 --overhang-len 1 --settings-filename /Users/hoge/MISO/miso_settings.txt - Submitted thread batch-0 Running batch of 4098 genes.. - Executing: python /Users/hoge/MISO/lib/python2.7/site-packages/misopy/run_miso.py --compute-genes-from-file "/Users/hoge/MISO/MISO_output_SRR1234567/A3SS_events/batch-genes/batch-1_genes.txt" /Users/hoge/MISO/SRR1234567_th.bam /Users/hoge/MISO/MISO_output_SRR1234567/A3SS_events --read-len 101 --overhang-len 1 --settings-filename /Users/hoge/MISO/miso_settings.txt - Submitted thread batch-1 Running batch of 4098 genes.. - Executing: python /Users/hoge/MISO/lib/python2.7/site-packages/misopy/run_miso.py --compute-genes-from-file "/Users/hoge/MISO/MISO_output_SRR1234567/A3SS_events/batch-genes/batch-2_genes.txt" /Users/hoge/MISO/SRR1234567_th.bam /Users/hoge/MISO/MISO_output_SRR1234567/A3SS_events --read-len 101 --overhang-len 1 --settings-filename /Users/hoge/MISO/miso_settings.txt - Submitted thread batch-2 Running batch of 4098 genes.. - Executing: python /Users/hoge/MISO/lib/python2.7/site-packages/misopy/run_miso.py --compute-genes-from-file "/Users/hoge/MISO/MISO_output_SRR1234567/A3SS_events/batch-genes/batch-3_genes.txt" /Users/hoge/MISO/SRR1234567_th.bam /Users/hoge/MISO/MISO_output_SRR1234567/A3SS_events --read-len 101 --overhang-len 1 --settings-filename /Users/hoge/MISO/miso_settings.txt - Submitted thread batch-3 Waiting on 4 threads... python(39991,0x7fff7215e000) malloc: *** error for object 0x108208560: pointer being freed was not allocated *** set a breakpoint in malloc_error_break to debug /bin/sh: line 1: 39991 Abort trap: 6 python /Users/hoge/MISO/lib/python2.7/site-packages/misopy/run_miso.py --compute-genes-from-file "/Users/hoge/MISO/MISO_output_SRR1234567/A3SS_events/batch-genes/batch-0_genes.txt" /Users/hoge/MISO/SRR1234567_th.bam /Users/hoge/MISO/MISO_output_SRR1234567/A3SS_events --read-len 101 --overhang-len 1 --settings-filename /Users/hoge/MISO/miso_settings.txt >> "/Users/hoge/MISO/MISO_output_SRR1234567/A3SS_events/batch-logs/batch-0-07-07-17_14:41:49.log" 07/07/2017 02:46:58 PM - miso_main - WARNING - Thread batch-0 might have failed... MISO (Mixture of Isoforms model) Summarize MISO output to get Psi values and confidence intervals. Use --help argument to view options. Loading events from: /Users/hoge/MISO/MISO_output_SRR1234567/A3SS_events Writing summary to: /Users/hoge/MISO/MISO_output_SRR1234567/A3SS_events/summary/A3SS_events.miso_summary Traceback (most recent call last): File "/Users/hoge/MISO/bin/summarize_miso", line 11, in <module> load_entry_point('misopy==0.5.3', 'console_scripts', 'summarize_miso')() File "/Users/hoge/MISO/lib/python2.7/site-packages/misopy/summarize_miso.py", line 93, in main use_compressed=use_compressed) File "/Users/hoge/MISO/lib/python2.7/site-packages/misopy/samples_utils.py", line 309, in summarize_sampler_results output_fields = format_credible_intervals(event_name, samples) File "/Users/hoge/MISO/lib/python2.7/site-packages/misopy/credible_intervals.py", line 22, in format_credible_intervals cred_interval = compute_credible_intervals(samples) File "/Users/hoge/MISO/lib/python2.7/site-packages/misopy/credible_intervals.py", line 54, in compute_credible_intervals cred_interval = [samples[lower_bound_indx], samples[upper_bound_indx]] IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices 中略 Loading events from: /Users/hoge/MISO/MISO_output_SRR1234567/SE_events Writing summary to: /Users/hoge/MISO/MISO_output_SRR1234567/SE_events/summary/SE_events.miso_summary Traceback (most recent call last): File "/Users/hoge/MISO/bin/summarize_miso", line 11, in <module> load_entry_point('misopy==0.5.3', 'console_scripts', 'summarize_miso')() File "/Users/hoge/MISO/lib/python2.7/site-packages/misopy/summarize_miso.py", line 93, in main use_compressed=use_compressed) File "/Users/hoge/MISO/lib/python2.7/site-packages/misopy/samples_utils.py", line 309, in summarize_sampler_results output_fields = format_credible_intervals(event_name, samples) File "/Users/hoge/MISO/lib/python2.7/site-packages/misopy/credible_intervals.py", line 22, in format_credible_intervals cred_interval = compute_credible_intervals(samples) File "/Users/hoge/MISO/lib/python2.7/site-packages/misopy/credible_intervals.py", line 54, in compute_credible_intervals cred_interval = [samples[lower_bound_indx], samples[upper_bound_indx]] IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
これについては~/MISO/lib/python2.7/site-packages/misopy/credible_intervals.pyの46行目辺りに修正
# compute the lower bound of the interval # the lower bound is the (alpha/2)*n-th smallest sample, where n is the # number of samples lower_bound_indx = int(round((alpha/2)*num_samples) - 1) # the upper bound is the (1-alpha/2)*n nth smallest sample, where n is # the number of samples upper_bound_indx = int(round((1-alpha/2)*num_samples) - 1)
で回避されるはず。
- miso_main - WARNING - Found mixed length reads in your BAM file: /Users/hoge/MISO/SRR1234567_th.bam MISO does not support mixed read lengths. Please trim your reads to a uniform length and re-run or run MISO separately on each read length. Proceeding anyway, though this may result in errors or inability to match reads to your annotation. Read lengths were: 30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101 07/07/2017 02:41:33 PM - miso_main - WARNING - It looks like your GFF annotation file and your BAM file might not have matching headers (chromosome names.) If this is the case, your run will fail as no reads from the BAM could be matched up with your annotation. 07/07/2017 02:41:33 PM - miso_main - WARNING - Please see: http://genes.mit.edu/burgelab/miso/docs/ for more information. 07/07/2017 02:41:33 PM - miso_main - WARNING - It looks like your GFF annotation (/Users/hoge/MISO/hg19/indexed_A3SS_events) contains 'chr' chromosomes (UCSC-style) while your BAM file (/Users/hoge/MISO/SRR1234567_th.bam) does not. 07/07/2017 02:41:33 PM - miso_main - WARNING - The first few BAM chromosomes were: 1 BAM references: ('1', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '2', '20', '21', '22', '3', '4', '5', '6', '7', '8', '9', 'MT', 'X', 'Y') 07/07/2017 02:41:33 PM - miso_main - WARNING - The first few GFF chromosomes were: chr6_apd_hap1,chrY,chrX,chr13,chr12,chr11,chr10,chr17,chr16,chr15,chr14,chr19,chr18,chr6_mann_hap4,chr6_qbl_hap6,chr4_ctg9_hap1,chr19_gl000209_random,chr6_dbb_hap3,chr22,chr20,chr21,chr6_mcf_hap5,chrUn_gl000228,chr6_cox_hap2,chr7,chr6,chr5,chr4,chr3,chr2,chr1,chr9,chr8,chr6_ssto_hap7 07/07/2017 02:41:33 PM - miso_main - WARNING - Run is likely to fail or produce empty output. Proceeding anyway...
相変わらずこの辺のwarningは出るも、なんか完走したっぽい。
segmentation fault: 11が出るのは何なんだ?かといって完全に停止するわけでもないし。
settingをいじればエラーでなくなるんだろうか?
trimmingの件もあるので、このソフトウェアについては一旦pendingとする。