kuro's notes

96's personal notes

Analysis with rMATS (setting up the environment on the HGC supercomputer)

rMATS
http://rnaseq-mats.sourceforge.net/user_guide.htm#output
I'll try it out following this guide.
The sample data are an SRSF10 knockdown and its control.
rMATS needs a STAR index, so I downloaded the pre-built one:
http://rmaps.cecsresearch.org/STAR/STARindex.tgz
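It is a huge file of more than 70 GB that contains pre-built index files for hg19, hg38, and mm10.
This time I only use hg19.
From this round on, the execution environment is the HGC supercomputer.
I tried downloading STARindex from a compute node, but the compute nodes apparently do not have global addresses,
so downloading files from an outside network that way seems to be slower, not faster.
I probably should have just run wget on the login node.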
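Since only hg19 is needed, it should also be possible to unpack just that part of the archive. A minimal sketch, assuming the archive's top-level directory is called STARindex (I have not checked this; list the archive with tar -tzf first). The function name extract_subtree is my own.

```shell
#!/bin/bash
# Extract a single subtree (e.g. only the hg19 index) from a large .tgz.
# Usage: extract_subtree ARCHIVE MEMBER DEST_DIR
extract_subtree() {
    local archive=$1 member=$2 dest=$3
    mkdir -p "$dest"
    # GNU tar extracts a named directory member recursively.
    tar -xzf "$archive" -C "$dest" "$member"
}

# On the login node (left as comments so that nothing is downloaded by accident):
# wget -c http://rmaps.cecsresearch.org/STAR/STARindex.tgz   # -c resumes a partial download
# tar -tzf STARindex.tgz | head                               # check the real member names
# extract_subtree STARindex.tgz STARindex/hg19 ~/genome_reference   # member name is an assumption
```

This still has to download the whole 70 GB file, but it avoids filling the disk with the hg38 and mm10 indexes.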

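This is my first time running jobs on the supercomputer, so some preparation comes first.
To begin with, I put the PATH settings into .bashrc.
Quite a lot of tools are installed under /usr/local, but some of them are on the PATH and some are not, so: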

source /etc/bashrc

#general PATH
export PATH=/usr/local:${PATH}

#python2.7
export PYTHONHOME=/usr/local/package/python2.7/current
export PYTHONPATH=~/local/lib/python2.7/site-packages
export PATH=${PYTHONHOME}/bin:${PATH}
export LD_LIBRARY_PATH=${PYTHONHOME}/lib:${LD_LIBRARY_PATH}

#for R
export PATH=/usr/local/package/r/current3/bin:${PATH}
export LD_LIBRARY_PATH=/usr/local/package/r/current3/lib64/R/lib:${LD_LIBRARY_PATH}
export R_LIBS=~/.R


#for fastq
export PATH=/usr/local/package/sra/2.1.7/current:${PATH}
export PATH=/usr/local/package/trimmomatic/0.36:${PATH}

#for expression analysis
export PATH=/usr/local/package/tophat/current:${PATH}
export PATH=/usr/local/package/hisat2/current:${PATH}
export PATH=/usr/local/package/cufflinks/current:${PATH}

That is about what I put in. For some reason samtools was already on the PATH.
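To see quickly which tools are actually reachable after editing .bashrc, something like the following can be run on a login node or at the top of a job script. This is only a sketch; check_tools is a helper name I made up, and the tool list is just the names that come up in this post.

```shell
#!/bin/bash
# Report, for each tool name given, whether the shell can find it on PATH.
# Returns non-zero if at least one tool is missing.
check_tools() {
    local missing=0 tool path
    for tool in "$@"; do
        if path=$(command -v "$tool"); then
            printf 'ok       %s -> %s\n' "$tool" "$path"
        else
            printf 'MISSING  %s\n' "$tool"
            missing=1
        fi
    done
    return "$missing"
}

# Tool names taken from this post; adjust to what the job actually calls.
if ! check_tools python R samtools hisat2 STAR; then
    echo "some tools are not on PATH; fix .bashrc before submitting" >&2
fi
```

Running the same check inside a submitted job is worthwhile too, because the environment of a batch job is not guaranteed to match the login shell.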

The Python and R settings are needed so that modules and packages can be installed under my home directory.

$ pip list
DEPRECATION: The default format will switch to columns in the future. You can use --format=(legacy|columns) (or define a format=(legacy|columns) in your pip.conf under the [list] section) to disable this warning.
backports.ssl-match-hostname (3.4.0.2)
bcbio-gff (0.6.2)
biopython (1.65)
brewer2mpl (1.4.1)
certifi (2015.4.28)
clint2 (0.3.2)
configobj (5.0.6)
Cython (0.22.1)
DendroPy (4.0.3)
EEL (150708)
egenix-mx-base (3.2.8)
enum34 (1.0.4)
futures (3.0.3)
h5py (2.5.0)
HTSeq (0.6.1)
ipython (3.2.0)
leveldb (0.193)
MACS (1.4.3)
MACS2 (2.1.1.20160309)
matplotlib (1.4.3)
mdtraj (1.4.2)
mercurial (3.4.2)
mock (1.0.1)
more-itertools (2.2)
MySQL-python (1.2.5)
natsort (4.0.3)
nose (1.3.7)
numexpr (2.4.3)
numpy (1.12.0)
pandas (0.16.2)
parse (1.4.1)
Pillow (2.9.0)
pip (9.0.1)
PyClone (0.13.0)
PyDP (0.2.3)
pygist (0.211)
pygraphviz (1.2)
pyparsing (2.0.3)
pysam (0.8.3)
pysql (0.16)
pysqlite (2.6.3)
python-dateutil (2.4.2)
python-memcached (1.54)
pytz (2015.4)
PyVCF (0.6.7)
PyYAML (3.11)
pyzmq (14.7.0)
reportlab (3.2.0)
requests (2.7.0)
rpy2 (2.8.5)
scikit-learn (0.17.1)
scipy (0.15.1)
SCons (2.3.0)
scripttest (1.3)
seaborn (0.7.1)
setuptools (18.0.1)
shove (0.5.6)
singledispatch (3.4.0.3)
six (1.10.0)
SQLAlchemy (1.0.6)
stuf (0.9.4)
tables (3.2.0)
termcolor (1.1.0)
Theano (0.7.0)
tornado (4.2)
wheel (0.29.0)

I think most of what I need is already installed.

First, a test of the downloaded software and index.
~/bin/rmats.sh

#!/bin/sh

#$ -cwd
#$ -S /bin/bash

./testRun.sh /home/hoge/genome_reference/STARindex/hg19

I prepared a script like this, and running

$ cd local/bin/rMATS.3.2.5/
$ qsub  ./rmats.sh

produces two directories:
bam_test
out_test
output

Testing rMATS with BAM input files
 
Testing MATS with FASTQ input files..
This step involves mapping to GTF and could take up to an hour..
 
Testing MATS finished..
$ ./testRun.sh ~/genome_reference/STARindex/hg19
Testing rMATS with BAM input files
========== SE ========
Junction Counts Only: There are 25 AS events. Of these, 1 events are statistically significant
1 significant events have higher inclusion level for SAMPLE_1 and 0 events for SAMPLE_2
Junction Counts and Reads on target Exon Counts: There are 30 AS events. Of these, 1 events are statistically significant
1 significant events have higher inclusion level for SAMPLE_1 and 0 events for SAMPLE_2
========== MXE ========
Junction Counts Only: There are 13 AS events. Of these, 0 events are statistically significant
0 significant events have higher inclusion level for SAMPLE_1 and 0 events for SAMPLE_2
Junction Counts and Reads on target Exon Counts: There are 18 AS events. Of these, 4 events are statistically significant
4 significant events have higher inclusion level for SAMPLE_1 and 0 events for SAMPLE_2
========== A5SS ========
Junction Counts Only: There are 2 AS events. Of these, 0 events are statistically significant
0 significant events have higher inclusion level for SAMPLE_1 and 0 events for SAMPLE_2
Junction Counts and Reads on target Exon Counts: There are 4 AS events. Of these, 0 events are statistically significant
0 significant events have higher inclusion level for SAMPLE_1 and 0 events for SAMPLE_2
========== A3SS ========
Junction Counts Only: There are 6 AS events. Of these, 0 events are statistically significant
0 significant events have higher inclusion level for SAMPLE_1 and 0 events for SAMPLE_2
Junction Counts and Reads on target Exon Counts: There are 9 AS events. Of these, 0 events are statistically significant
0 significant events have higher inclusion level for SAMPLE_1 and 0 events for SAMPLE_2
========== RI ========
Junction Counts Only: There are 10 AS events. Of these, 0 events are statistically significant
0 significant events have higher inclusion level for SAMPLE_1 and 0 events for SAMPLE_2
Junction Counts and Reads on target Exon Counts: There are 25 AS events. Of these, 0 events are statistically significant
0 significant events have higher inclusion level for SAMPLE_1 and 0 events for SAMPLE_2
 
Testing MATS with FASTQ input files..
This step involves mapping to GTF and could take up to an hour..
 
Testing MATS finished..

No errors came up, so I judged the installation to be fine and moved on.

python ./RNASeq-MATS.py -b1 /home/hoge/SRSF10/hisat_results/control/control.sort.bam \
        -b2 /home/hoge/SRSF10/hisat_results/knockdown/knockdown.sort.bam \
        -gtf /home/hoge/genome_reference/Homo_sapiens/Ensembl/GRCh37/Annotation/Genes/genes.gtf \
        -o /home/hoge/SRSF10/hisat_results/rMATS -t paired -len 90 -bi /home/hoge/genome_reference/STARindex/hg19

Like this (the trailing \ is a line continuation).
It turns out rMATS ships its own gtf file, so it may be better to use that one.

However, the files I was after did not appear in the resulting output folder. Why?

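Here is what I learned from re-reading the manual.
When starting from BAM files, STARindex is not needed. The -bi option is only for fastq input.
The other options look fine at their defaults for now. How rMATS behaves when -bi is given together with BAM input is unclear.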

It might be better to process from fastq rather than starting from the BAM files mapped with HISAT2, although that will probably take quite a long time.

The log from testRun looks like this:

2017-07-20 17:27:52,379 rMATS version: 3.2.5
2017-07-20 17:27:52,380 Start the program with [RNASeq-MATS.py -b1 testData/231ESRP.25K.rep-1.bam,testData/231ESRP.25K.rep-2.bam -b2 testData/231EV.25K.rep-1.bam,testData/231EV.25K.rep-2.bam -gtf testData/test.gtf -o bam_test -t paired -len 50 -a 8 -c 0.0001 -analysis U -novelSS 1 -keepTemp ]

2017-07-20 17:27:52,407 ################### folder names and associated input files #############
2017-07-20 17:27:52,407 SAMPLE_1\REP_1  testData/231ESRP.25K.rep-1.bam
2017-07-20 17:27:52,407 SAMPLE_1\REP_2  testData/231ESRP.25K.rep-2.bam
2017-07-20 17:27:52,407 SAMPLE_2\REP_1  testData/231EV.25K.rep-1.bam
2017-07-20 17:27:52,407 SAMPLE_2\REP_2  testData/231EV.25K.rep-2.bam
2017-07-20 17:27:52,407 #########################################################################

2017-07-20 17:27:52,408 start mapping..
2017-07-20 17:27:52,408 bam files are provided. skip mapping..
2017-07-20 17:27:52,408 done mapping..
2017-07-20 17:27:52,408 indexing bam files to use pysam
2017-07-20 17:27:52,408 getting unique SAM function..
2017-07-20 17:27:56,899 done indexing bam files..
2017-07-20 17:27:56,903 start getting AS events from GTF and BAM files
2017-07-20 17:27:56,904 getting AS events function..
2017-07-20 17:28:00,245 getting AS events is done with status 0
2017-07-20 17:28:00,246 
2017-07-20 17:28:00,246 done getting AS events..
2017-07-20 17:28:00,260 Setting proper string
2017-07-20 17:28:00,501 start making MATS input files from AS events and SAM files
2017-07-20 17:28:00,501 making MATS input function..
2017-07-20 17:28:03,429 making MATS input is done with status 0
2017-07-20 17:28:03,429 
2017-07-20 17:28:03,429 done making MATS input..
2017-07-20 17:28:03,430 start running MATS for each AS event
2017-07-20 17:28:03,430 running MATS for SE. Using Junction Counts only
2017-07-20 17:28:06,543 running MATS for SE using JC is done with status 0
2017-07-20 17:28:06,543 
2017-07-20 17:28:06,543 running MATS for SE. Using Junction Counts and Reads on target Exon Counts
2017-07-20 17:28:08,944 running MATS for SE using JCEC is done with status 0
2017-07-20 17:28:08,945 
2017-07-20 17:28:08,945 running MATS for MXE. Using Junction Counts only
2017-07-20 17:28:10,500 running MATS for MXE using JC is done with status 0
2017-07-20 17:28:10,501 
2017-07-20 17:28:10,501 running MATS for MXE. Using Junction Counts and Reads on target Exon Counts
2017-07-20 17:28:12,085 running MATS for MXE using JCEC is done with status 0
2017-07-20 17:28:12,085 
2017-07-20 17:28:12,085 running MATS for A5SS. Using Junction Counts only
2017-07-20 17:28:12,696 running MATS for A5SS using JC is done with status 0
2017-07-20 17:28:12,696 
2017-07-20 17:28:12,697 running MATS for A5SS. Using Junction Counts and Reads on target Exon Counts
2017-07-20 17:28:13,277 running MATS for A5SS using JCEC is done with status 0
2017-07-20 17:28:13,277 
2017-07-20 17:28:13,277 running MATS for A3SS. Using Junction Counts only
2017-07-20 17:28:14,106 running MATS for A3SS using JC is done with status 0
2017-07-20 17:28:14,107 
2017-07-20 17:28:14,107 running MATS for A3SS. Using Junction Counts and Reads on target Exon Counts
2017-07-20 17:28:14,941 running MATS for A3SS using JCEC is done with status 0
2017-07-20 17:28:14,942 
2017-07-20 17:28:14,942 running MATS for RI. Using Junction Counts only
2017-07-20 17:28:16,705 running MATS for RI using JC is done with status 0
2017-07-20 17:28:16,705 
2017-07-20 17:28:16,705 running MATS for RI. Using Junction Counts and Reads on target Exon Counts
2017-07-20 17:28:19,265 running MATS for RI using JCEC is done with status 0
2017-07-20 17:28:19,265 
2017-07-20 17:28:19,265 done running MATS for all AS event types..
2017-07-20 17:28:19,265 start joining MATS results for each AS event
2017-07-20 17:28:19,265 joining MATS for SE. Using Junction Counts only
2017-07-20 17:28:19,341 joining MATS for SE using JC is done with status 0
2017-07-20 17:28:19,341 
2017-07-20 17:28:19,341 joining MATS for SE. Using Junction Counts and Reads on target Exon Counts
2017-07-20 17:28:19,400 joining MATS for SE using JCEC is done with status 0
2017-07-20 17:28:19,400 
2017-07-20 17:28:19,400 joining MATS for MXE. Using Junction Counts only
2017-07-20 17:28:19,445 joining MATS for MXE using JC is done with status 0
2017-07-20 17:28:19,446 
2017-07-20 17:28:19,446 joining MATS for MXE. Using Junction Counts and Reads on target Exon Counts
2017-07-20 17:28:19,534 joining MATS for MXE using JCEC is done with status 0
2017-07-20 17:28:19,534 
2017-07-20 17:28:19,534 joining MATS for A5SS. Using Junction Counts only
2017-07-20 17:28:19,577 joining MATS for A5SS using JC is done with status 0
2017-07-20 17:28:19,577 
2017-07-20 17:28:19,577 joining MATS for A5SS. Using Junction Counts and Reads on target Exon Counts
2017-07-20 17:28:19,623 joining MATS for A5SS using JCEC is done with status 0
2017-07-20 17:28:19,624 
2017-07-20 17:28:19,624 joining MATS for A3SS. Using Junction Counts only
2017-07-20 17:28:19,662 joining MATS for A3SS using JC is done with status 0
2017-07-20 17:28:19,662 
2017-07-20 17:28:19,662 joining MATS for A3SS. Using Junction Counts and Reads on target Exon Counts
2017-07-20 17:28:19,706 joining MATS for A3SS using JCEC is done with status 0
2017-07-20 17:28:19,707 
2017-07-20 17:28:19,707 joining MATS for RI. Using Junction Counts only
2017-07-20 17:28:19,744 joining MATS for RI using JC is done with status 0
2017-07-20 17:28:19,744 
2017-07-20 17:28:19,744 joining MATS for RI. Using Junction Counts and Reads on target Exon Counts
2017-07-20 17:28:19,771 joining MATS for RI using JCEC is done with status 0
2017-07-20 17:28:19,771 
2017-07-20 17:28:19,771 done joining MATS results..
2017-07-20 17:28:19,772 ======================= Final Report =============
2017-07-20 17:28:19,772 getting stats for SE. Using Junction Counts only.
2017-07-20 17:28:19,784 getting stats for SE using JC is done with status 0,0 and 0
2017-07-20 17:28:19,785 numLines: 26 /yshare1/home/hoge/local/bin/rMATS.3.2.5/bam_test/MATS_output/SE.MATS.JunctionCountOnly.txt
2017-07-20 17:28:19,785 Upregulated: 1
2017-07-20 17:28:19,785 Downregulated: 0
2017-07-20 17:28:19,785 ========== SE ========
2017-07-20 17:28:19,785 Junction Counts Only: There are 25 AS events. Of these, 1 events are statistically significant
2017-07-20 17:28:19,785 1 significant events have higher inclusion level for SAMPLE_1 and 0 events for SAMPLE_2
2017-07-20 17:28:19,785 getting stats for SE. Using Junction Counts and Reads on target Exon Counts
2017-07-20 17:28:19,797 getting stats for SE using JCEC is done with status 0,0 and 0
2017-07-20 17:28:19,797 numLines: 31 /yshare1/home/hoge/local/bin/rMATS.3.2.5/bam_test/MATS_output/SE.MATS.ReadsOnTargetAndJunctionCounts.txt
2017-07-20 17:28:19,798 Upregulated: 1
2017-07-20 17:28:19,798 Downregulated: 0
2017-07-20 17:28:19,798 Junction Counts and Reads on target Exon Counts: There are 30 AS events. Of these, 1 events are statistically significant
2017-07-20 17:28:19,798 1 significant events have higher inclusion level for SAMPLE_1 and 0 events for SAMPLE_2
2017-07-20 17:28:19,798 getting stats for MXE. Using Junction Counts only.
2017-07-20 17:28:19,810 getting stats for MXE using JC is done with status 0,0 and 0
2017-07-20 17:28:19,810 numLines: 14 /yshare1/home/hoge/local/bin/rMATS.3.2.5/bam_test/MATS_output/MXE.MATS.JunctionCountOnly.txt
2017-07-20 17:28:19,810 Upregulated: 0
2017-07-20 17:28:19,810 Downregulated: 0
2017-07-20 17:28:19,810 ========== MXE ========
2017-07-20 17:28:19,810 Junction Counts Only: There are 13 AS events. Of these, 0 events are statistically significant
2017-07-20 17:28:19,811 0 significant events have higher inclusion level for SAMPLE_1 and 0 events for SAMPLE_2
2017-07-20 17:28:19,811 getting stats for MXE. Using Junction Counts and Reads on target Exon Counts
2017-07-20 17:28:19,823 getting stats for MXE using JCEC is done with status 0,0 and 0
2017-07-20 17:28:19,823 numLines: 19 /yshare1/home/hoge/local/bin/rMATS.3.2.5/bam_test/MATS_output/MXE.MATS.ReadsOnTargetAndJunctionCounts.txt
2017-07-20 17:28:19,823 Upregulated: 4
2017-07-20 17:28:19,823 Downregulated: 0
2017-07-20 17:28:19,824 Junction Counts and Reads on target Exon Counts: There are 18 AS events. Of these, 4 events are statistically significant
2017-07-20 17:28:19,824 4 significant events have higher inclusion level for SAMPLE_1 and 0 events for SAMPLE_2
2017-07-20 17:28:19,824 getting stats for A5SS. Using Junction Counts only.
2017-07-20 17:28:19,836 getting stats for A5SS using JC is done with status 0,0 and 0
2017-07-20 17:28:19,836 numLines: 3 /yshare1/home/hoge/local/bin/rMATS.3.2.5/bam_test/MATS_output/A5SS.MATS.JunctionCountOnly.txt
2017-07-20 17:28:19,836 Upregulated: 0
2017-07-20 17:28:19,836 Downregulated: 0
2017-07-20 17:28:19,836 ========== A5SS ========
2017-07-20 17:28:19,837 Junction Counts Only: There are 2 AS events. Of these, 0 events are statistically significant
2017-07-20 17:28:19,837 0 significant events have higher inclusion level for SAMPLE_1 and 0 events for SAMPLE_2
2017-07-20 17:28:19,837 getting stats for A5SS. Using Junction Counts and Reads on target Exon Counts
2017-07-20 17:28:19,848 getting stats for A5SS using JCEC is done with status 0,0 and 0
2017-07-20 17:28:19,849 numLines: 5 /yshare1/home/hoge/local/bin/rMATS.3.2.5/bam_test/MATS_output/A5SS.MATS.ReadsOnTargetAndJunctionCounts.txt
2017-07-20 17:28:19,849 Upregulated: 0
2017-07-20 17:28:19,849 Downregulated: 0
2017-07-20 17:28:19,849 Junction Counts and Reads on target Exon Counts: There are 4 AS events. Of these, 0 events are statistically significant
2017-07-20 17:28:19,849 0 significant events have higher inclusion level for SAMPLE_1 and 0 events for SAMPLE_2
2017-07-20 17:28:19,849 getting stats for A3SS. Using Junction Counts only.
2017-07-20 17:28:19,861 getting stats for A3SS using JC is done with status 0,0 and 0
2017-07-20 17:28:19,861 numLines: 7 /yshare1/home/hoge/local/bin/rMATS.3.2.5/bam_test/MATS_output/A3SS.MATS.JunctionCountOnly.txt
2017-07-20 17:28:19,862 Upregulated: 0
2017-07-20 17:28:19,862 Downregulated: 0
2017-07-20 17:28:19,862 ========== A3SS ========
2017-07-20 17:28:19,862 Junction Counts Only: There are 6 AS events. Of these, 0 events are statistically significant
2017-07-20 17:28:19,862 0 significant events have higher inclusion level for SAMPLE_1 and 0 events for SAMPLE_2
2017-07-20 17:28:19,862 getting stats for A3SS. Using Junction Counts and Reads on target Exon Counts
2017-07-20 17:28:19,874 getting stats for A3SS using JCEC is done with status 0,0 and 0
2017-07-20 17:28:19,874 numLines: 10 /yshare1/home/hoge/local/bin/rMATS.3.2.5/bam_test/MATS_output/A3SS.MATS.ReadsOnTargetAndJunctionCounts.txt
2017-07-20 17:28:19,874 Upregulated: 0
2017-07-20 17:28:19,875 Downregulated: 0
2017-07-20 17:28:19,875 Junction Counts and Reads on target Exon Counts: There are 9 AS events. Of these, 0 events are statistically significant
2017-07-20 17:28:19,875 0 significant events have higher inclusion level for SAMPLE_1 and 0 events for SAMPLE_2
2017-07-20 17:28:19,875 getting stats for RI. Using Junction Counts only.
2017-07-20 17:28:19,887 getting stats for RI using JC is done with status 0,0 and 0
2017-07-20 17:28:19,887 numLines: 11 /yshare1/home/hoge/local/bin/rMATS.3.2.5/bam_test/MATS_output/RI.MATS.JunctionCountOnly.txt
2017-07-20 17:28:19,887 Upregulated: 0
2017-07-20 17:28:19,887 Downregulated: 0
2017-07-20 17:28:19,887 ========== RI ========
2017-07-20 17:28:19,887 Junction Counts Only: There are 10 AS events. Of these, 0 events are statistically significant
2017-07-20 17:28:19,887 0 significant events have higher inclusion level for SAMPLE_1 and 0 events for SAMPLE_2
2017-07-20 17:28:19,888 getting stats for RI. Using Junction Counts and Reads on target Exon Counts
2017-07-20 17:28:19,899 getting stats for RI using JCEC is done with status 0,0 and 0
2017-07-20 17:28:19,900 numLines: 26 /yshare1/home/hoge/local/bin/rMATS.3.2.5/bam_test/MATS_output/RI.MATS.ReadsOnTargetAndJunctionCounts.txt
2017-07-20 17:28:19,900 Upregulated: 0
2017-07-20 17:28:19,900 Downregulated: 0
2017-07-20 17:28:19,900 Junction Counts and Reads on target Exon Counts: There are 25 AS events. Of these, 0 events are statistically significant
2017-07-20 17:28:19,900 0 significant events have higher inclusion level for SAMPLE_1 and 0 events for SAMPLE_2
2017-07-20 17:28:19,900 done printing out stats..
2017-07-20 17:28:19,901 Program ended
2017-07-20 17:28:19,901 Program ran 00:00:27

With my own samples, however, it goes like this:

2017-07-21 10:42:36,165 rMATS version: 3.2.5
2017-07-21 10:42:36,166 Start the program with [./RNASeq-MATS.py -b1 /home/hoge/SRSF10/hisat_results/control/control.sort.bam -b2 /home/hoge/SRSF10/hisat_results/knockdown/knockdown.sort.bam -gtf /home/hoge/local/bin/rMATS.3.2.5/gtf/Homo_sapiens.Ensembl.GRCh37.72.gtf -o /home/hoge/SRSF10/hisat_results/rMATS -t paired -len 90 -c 0.0001 -analysis U -novelSS 1 -keepTemp ]

2017-07-21 10:42:36,183 ################### folder names and associated input files #############
2017-07-21 10:42:36,183 SAMPLE_1\REP_1  /home/hoge/SRSF10/hisat_results/control/control.sort.bam
2017-07-21 10:42:36,183 SAMPLE_2\REP_1  /home/hoge/SRSF10/hisat_results/knockdown/knockdown.sort.bam
2017-07-21 10:42:36,183 #########################################################################

2017-07-21 10:42:36,183 start mapping..
2017-07-21 10:42:36,184 bam files are provided. skip mapping..
2017-07-21 10:42:36,184 done mapping..
2017-07-21 10:42:36,184 indexing bam files to use pysam
2017-07-21 10:42:36,184 getting unique SAM function..
2017-07-21 12:04:27,736 done indexing bam files..
2017-07-21 12:04:27,818 start getting AS events from GTF and BAM files
2017-07-21 12:04:27,819 getting AS events function..
2017-07-21 12:27:47,590 getting AS events is done with status 0
2017-07-21 12:27:47,591 
2017-07-21 12:27:47,591 done getting AS events..
2017-07-21 12:27:47,640 Setting proper string
2017-07-21 12:27:48,105 start making MATS input files from AS events and SAM files
2017-07-21 12:27:48,105 making MATS input function..
2017-07-21 12:52:47,710 making MATS input is done with status 0
2017-07-21 12:52:47,712 
2017-07-21 12:52:47,712 done making MATS input..
2017-07-21 12:52:47,712 start running MATS for each AS event
2017-07-21 12:52:47,712 running MATS for SE. Using Junction Counts only

The log just ends here. I could not pin down the cause.
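One thing the log does tell us is how long each stage ran before things went quiet. A small sketch for turning two log timestamps into elapsed seconds; it assumes GNU date (for -d), and the function names are my own.

```shell
#!/bin/bash
# Convert an rMATS log timestamp ("YYYY-MM-DD hh:mm:ss,mmm") to epoch seconds.
# Needs GNU date (-d); the milliseconds after the comma are dropped.
ts_to_epoch() {
    date -u -d "${1%,*}" +%s
}

# Print the number of whole seconds between two log timestamps.
elapsed_seconds() {
    echo $(( $(ts_to_epoch "$2") - $(ts_to_epoch "$1") ))
}

# BAM indexing stage in the log above:
elapsed_seconds "2017-07-21 10:42:36,184" "2017-07-21 12:04:27,736"   # 4911
# From program start to the last line that was written:
elapsed_seconds "2017-07-21 10:42:36,165" "2017-07-21 12:52:47,712"   # 7811
```

So the job had been running for a little over two hours when the log stops, most of it spent indexing the BAM files. That does not explain the failure by itself, but it is a number to compare against whatever limits the queue imposes.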

So, I decided to run it from fastq instead.

sample_dir=/home/hoge/SRSF10/original_fastq/

python ./RNASeq-MATS.py \
        -s1 ${sample_dir}/SRR1271845_1.fastq:${sample_dir}/SRR1271845_2.fastq \
        -s2 ${sample_dir}/SRR1271846_1.fastq:${sample_dir}/SRR1271846_2.fastq \
        -gtf /home/hoge/local/bin/rMATS.3.2.5/gtf/Homo_sapiens.Ensembl.GRCh37.72.gtf \
        -bi /home/hoge/genome_reference/STARindex/hg19 \
        -o /home/hoge/SRSF10/rMATSfq \
        -t paired -len 90 -c 0.0001 -keepTemp

I submitted a job like this, and got:

2017-07-21 12:38:40,405 rMATS version: 3.2.5
2017-07-21 12:38:40,406 Start the program with [./RNASeq-MATS.py -s1 /home/hoge/SRSF10/original_fastq//SRR1271845_1.fastq:/home/hoge/SRSF10/original_fastq//SRR1271845_2.fastq -s2 /home/hoge/SRSF10/original_fastq//SRR1271846_1.fastq:/home/hoge/SRSF10/original_fastq//SRR1271846_2.fastq -gtf /home/hoge/local/bin/rMATS.3.2.5/gtf/Homo_sapiens.Ensembl.GRCh37.72.gtf -bi /home/hoge/genome_reference/STARindex/hg19 -o /home/hoge/SRSF10/rMATSfq -t paired -len 90 -c 0.0001 -keepTemp ]

2017-07-21 12:38:40,426 ################### folder names and associated input files #############
2017-07-21 12:38:40,426 SAMPLE_1\REP_1  /home/hoge/SRSF10/original_fastq//SRR1271845_1.fastq:/home/hoge/SRSF10/original_fastq//SRR1271845_2.fastq
2017-07-21 12:38:40,426 SAMPLE_2\REP_1  /home/hoge/SRSF10/original_fastq//SRR1271846_1.fastq:/home/hoge/SRSF10/original_fastq//SRR1271846_2.fastq
2017-07-21 12:38:40,426 #########################################################################

2017-07-21 12:38:40,426 start mapping..
2017-07-21 12:38:40,427 mapping the first sample
2017-07-21 12:38:40,431 mapping sample_1, rep_1 is done with status 32512
2017-07-21 12:38:40,431 error in mapping sample_1, rep_1: 32512
2017-07-21 12:38:40,431 error detail: sh: STAR: command not found
2017-07-21 12:38:40,431 There is an exception in mapping
2017-07-21 12:38:40,431 Exception: <type 'exceptions.Exception'>
2017-07-21 12:38:40,431 Detail: 

A curt error telling me that STAR is not installed.
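The odd-looking status 32512 in the log can be decoded. A sketch, assuming the number rMATS prints is the raw wait status that Python hands back for a shell command (exit code in the high byte, signal number in the low 7 bits); decode_wait_status is a name I made up.

```shell
#!/bin/bash
# Split a raw wait status into the exit code and the terminating signal.
decode_wait_status() {
    local status=$1
    echo "exit=$(( status >> 8 )) signal=$(( status & 127 ))"
}

decode_wait_status 32512   # exit=127 signal=0
```

Exit code 127 is what the shell returns when a command cannot be found, which matches the error detail line.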
Fine. I downloaded STAR, added it to the PATH, and submitted the job once more:

2017-07-21 12:49:12,388 rMATS version: 3.2.5
2017-07-21 12:49:12,389 Start the program with [./RNASeq-MATS.py -s1 /home/hoge/SRSF10/original_fastq//SRR1271845_1.fastq:/home/hoge/SRSF10/original_fastq//SRR1271845_2.fastq -s2 /home/hoge/SRSF10/original_fastq//SRR1271846_1.fastq:/home/hoge/SRSF10/original_fastq//SRR1271846_2.fastq -gtf /home/hoge/local/bin/rMATS.3.2.5/gtf/Homo_sapiens.Ensembl.GRCh37.72.gtf -bi /home/hoge/genome_reference/STARindex/hg19 -o /home/hoge/SRSF10/rMATSfq -t paired -len 90 -c 0.0001 -keepTemp ]

2017-07-21 12:49:12,408 ################### folder names and associated input files #############
2017-07-21 12:49:12,409 SAMPLE_1\REP_1  /home/hoge/SRSF10/original_fastq//SRR1271845_1.fastq:/home/hoge/SRSF10/original_fastq//SRR1271845_2.fastq
2017-07-21 12:49:12,409 SAMPLE_2\REP_1  /home/hoge/SRSF10/original_fastq//SRR1271846_1.fastq:/home/hoge/SRSF10/original_fastq//SRR1271846_2.fastq
2017-07-21 12:49:12,409 #########################################################################

2017-07-21 12:49:12,409 start mapping..
2017-07-21 12:49:12,409 mapping the first sample
2017-07-21 12:49:12,539 mapping sample_1, rep_1 is done with status 27648
2017-07-21 12:49:12,539 error in mapping sample_1, rep_1: 27648
2017-07-21 12:49:12,539 error detail: Jul 21 12:49:12 ..... started STAR run
Jul 21 12:49:12 ..... loading genome

EXITING: fatal error trying to allocate genome arrays, exception thrown: std::bad_alloc
Possible cause 1: not enough RAM. Check if you have enough RAM 30620442518 bytes
Possible cause 2: not enough virtual memory allowed with ulimit. SOLUTION: run ulimit -v 30620442518

Jul 21 12:49:12 ...... FATAL ERROR, exiting
2017-07-21 12:49:12,539 There is an exception in mapping
2017-07-21 12:49:12,540 Exception: <type 'exceptions.Exception'>
2017-07-21 12:49:12,540 Detail: 

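Not enough memory... on a supercomputer, seriously? STAR asks for about 30.6 GB (30620442518 bytes) just to load the hg19 genome, so the limit it hit is presumably the memory allowed for the job rather than what the machine physically has.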
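If the job's memory limit is the problem, the next thing to try would be to request more memory at submission. A sketch that rounds STAR's byte count up to whole GiB; the qsub resource names and the script name rmats_fq.sh in the comment are assumptions about this site's Grid Engine setup, not something I have verified, so check the HGC documentation.

```shell
#!/bin/bash
# Round a byte count up to whole GiB (1 GiB = 1073741824 bytes).
bytes_to_gib_ceil() {
    echo $(( ($1 + 1073741823) / 1073741824 ))
}

need=$(bytes_to_gib_ceil 30620442518)   # value from STAR's error message
echo "STAR needs at least ${need} GiB to load the genome"   # 29

# Hypothetical submission with some headroom (resource names are an assumption):
# qsub -l s_vmem=32G,mem_req=32G ./rmats_fq.sh
```

The 29 GiB covers the genome arrays only; mapping itself needs more on top of that, hence the headroom.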

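I asked someone who knows this area, and apparently this kind of thing happens all the time. Their take: in the end it may be quicker to understand the algorithm and write the program yourself.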