A stitch in time: Efficient computation of genomic DNA melting bubbles

Author: A Wada, AT Sumner, BY Tong, C Benham, C Flamm, CH Choi, CJ Benham, CR Calladine, D Poland, DJ Wales, DL Stein, E Carlon, E Tøstesen, E Yeramian, Eivind Tøstesen, F Liu, G Altan-Bonnet, GI Jerstad, GJ King, H Wang, J Stelling, KA Dill, KA Marx, KH Hoffmann, M Fixman, MT Wolfinger, P Ak, R Blossey, RA Dimitrov, RD Blake, T Ambjörnsson, TS van Erp
Publisher: Springer Science and Business Media LLC

ABOUT BOOK

Background: It is of biological interest to make genome-wide predictions of the locations of DNA melting bubbles using statistical mechanics models. Computationally, this is challenging because a generic search through all combinations of bubble starts and ends takes time quadratic in the sequence length N.

Results: An efficient algorithm is described, which shows that the time complexity of the task is O(N log N) rather than quadratic. The algorithm exploits the fact that bubble lengths may be limited, but it does not assume a maximal bubble length in advance. No approximations, such as windowing, have been introduced to reduce the time complexity. Beyond finding the bubbles, the algorithm produces a stitch profile, which is a probabilistic graphical model of bubbles and helical regions. The algorithm applies a probability peak finding method based on a hierarchical analysis of the energy barriers in the Poland-Scheraga model.

Conclusions: Exact and fast computation of genomic stitch profiles is thus feasible. Sequences of several megabases have been computed, limited only by computer memory. Possible applications are genome-wide comparisons of bubbles with promoters, transcription start sites (TSS), viral integration sites, and other melting-related regions.

Comment: 16 pages, 10 figures
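To give a sense of scale for the complexity claim, the following minimal Python sketch counts the candidate (start, end) pairs that a generic search would visit and compares that count with N log N. It is only an illustration of the arithmetic, not the algorithm described in the abstract. The function names are hypothetical, and two choices are assumptions made here: a bubble candidate is taken to be any pair with start <= end, and the logarithm is taken base 2 (the base only changes a constant factor).

```python
import math


def brute_force_pairs(n: int) -> int:
    """Count (start, end) pairs with 1 <= start <= end <= n by explicit enumeration.

    This mirrors a generic quadratic search; it is an illustration only.
    """
    count = 0
    for start in range(1, n + 1):
        for end in range(start, n + 1):
            count += 1
    return count


def pair_count(n: int) -> int:
    """Closed form for the number of (start, end) pairs: n(n + 1) / 2."""
    return n * (n + 1) // 2


def n_log_n(n: int) -> float:
    """N * log2(N); the base-2 logarithm is an assumption of this sketch."""
    return n * math.log2(n)


if __name__ == "__main__":
    # Check the closed form against explicit enumeration on a small input.
    assert brute_force_pairs(100) == pair_count(100)

    # Compare the two growth rates for a kilobase and a megabase sequence.
    for n in (10**3, 10**6):
        print(f"N={n}: pairs={pair_count(n)}, N*log2(N)={n_log_n(n):.0f}")
```

For a one-megabase sequence the pair count is about 5 x 10^11, while N log2 N is about 2 x 10^7, which is why a quadratic search becomes impractical at the genomic scale mentioned in the conclusions.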
