How to obtain InsertSize from a BioMap object?

9 vues (au cours des 30 derniers jours)
Razvan
Razvan le 26 Jan 2015
Modifié(e) : Razvan le 7 Fév 2015
Is there a way of obtaining the InsertSize field (available using bamread) when opening a bam file as a BioMap object? Opening the bam files using a BioMap object is much faster compared to the bamread function, but I can't find any way of obtaining the InsertSize (TLEN from the corresponding sam file)...

Réponses (2)

Luuk van Oosten
Luuk van Oosten le 26 Jan 2015
If you use something like:
BAMStruct = bamread(File,RefSeq,Range)
You get a struct. from this you can obtain the InsertSize by using something like:
x = BAMStruct.InsertSize
In the examples of the help doc (which are here) you will find something similar; they show an example how they extract instead of the 'InsertSize' the 'Position' by using the following:
data_one = bamread('ex1.bam','seq1', [100 300]);
data_one(27).Position
Good luck!
  3 commentaires
Luuk van Oosten
Luuk van Oosten le 28 Jan 2015
Can you provide a .bam file for me and you code? I'll give it a try!
Razvan
Razvan le 31 Jan 2015
Modifié(e) : Razvan le 7 Fév 2015
Thanks for spending your time on this! Here is a BAM file and some Matlab code that illustrates my problem: https://app.box.com/s/e3sjphe802z616gzwzelse0o89py6vlm
If you run my script from that folder, you'll see the 2 ways that I know to open a bam file and to compute the fragment lengths. Using bamread, the TLEN field from the SAM/BAM file is easily accessible. Using BioMap objects, the TLEN/InsertSize information is not available anymore and one needs to match the headers of the reads in order to compute to total fragment length. The problem is that bamread is very slow in comparison with BioMap (3s vs. 0.2s), and sorting the headers of millions of reads is also pretty slow. It would be much better if BioMap would simply load the TLEN field of the SAM/BAM files. This is available information stored in the SAM/BAM file which is simply disregarded by BioMap (as far as I know). Any chance of getting this TLEN info using the faster BioMap loading function?

Connectez-vous pour commenter.


Razvan
Razvan le 5 Fév 2015
Any other suggestions?

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by