
Co-localization analyses of genomic features: Methods and potential pitfalls

Chakravarthi Kanduri (skanduri@ifi.uio.no), Christoph Bock (cbock@cemm.oeaw.ac.at), Sveinung Gundersen (sveinungu@ifi.uio.no), Eivind Hovig (ehovig@ifi.uio.no), Geir Kjetil Sandve (geirksa@ifi.uio.no)

(This page provides examples of pitfalls in co-localization analysis of genomic features. All examples are provided through the Genomic HyperBrowser, which is tightly connected to the Galaxy framework. Users who are not familiar with Galaxy can get started with a short introductory tutorial here (https://galaxyproject.org/tutorials/g101/).)

Example-5: Be aware of whether the test statistic is asymmetric

Some test statistics used in co-localization analyses are asymmetric, meaning that the direction of the analysis influences the conclusion. In other words, the answer to "is A closer to B?" is not necessarily the same as the answer to "is B closer to A?". It is therefore important to be aware of this potential asymmetry, and the direction in which the research question is formulated should be guided by biological knowledge or intuition.
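The asymmetry of an "is A closer to B?" question can be illustrated with a minimal Python sketch. It assumes that each track is reduced to a list of point positions and that the statistic is the mean distance from each query point to its nearest reference point. The function name `mean_nearest_distance` and the toy coordinates are hypothetical, and this is not necessarily how the Genomic HyperBrowser computes its distance-based statistic.

```python
from bisect import bisect_left

def mean_nearest_distance(query, reference):
    """Mean distance from each query point to its nearest reference point.

    Illustrative statistic only; assumes both lists are non-empty.
    """
    ref = sorted(reference)
    total = 0
    for q in query:
        i = bisect_left(ref, q)
        # The nearest reference point is either just left or just right of q.
        candidates = ref[max(0, i - 1): i + 1]
        total += min(abs(q - r) for r in candidates)
    return total / len(query)

# Toy positions (hypothetical): every A point has a B point nearby,
# but B also has points far away from any A point.
track_a = [100, 200, 300]
track_b = [101, 199, 305, 5000, 9000]

a_to_b = mean_nearest_distance(track_a, track_b)
b_to_a = mean_nearest_distance(track_b, track_a)

print(f"A -> B: {a_to_b:.2f}")  # A -> B: 2.33
print(f"B -> A: {b_to_a:.2f}")  # B -> A: 2681.40
```

In this toy case, A is close to B while B is, on average, far from A, so the conclusion depends on which track is treated as the query.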

As an example, we evaluate the co-localization of the binding sites of two transcription factors, p300 and IRF3. p300 is a known co-activator that is recruited by several transcription factors, and it therefore has several binding sites across the DNA sequence. When we asked whether IRF3 peaks fall inside p300 peaks more often than expected by chance, the observed test statistic was 4154 (history element 3). When we asked the question in the reverse direction, the observed test statistic was 2590 (history element 4). Irrespective of their statistical significance, these test statistics show the direction of the stronger effect: IRF3 peaks are more likely to co-localize with p300 peaks than p300 peaks are with IRF3 peaks. A similar question about proximity, asked in both directions, shows that IRF3 peaks are closer to p300 peaks than p300 peaks are to IRF3 peaks (history elements 6 and 7). For certain question formulations, however, it is sufficient to ask about the sequence shared by the two tracks (overlap and coverage). In such cases, the test statistic is the same in both directions (history element 5).
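The contrast between a directional "falls inside" count and a symmetric overlap measure can be sketched as follows. This is a minimal Python illustration on made-up intervals, not the p300 and IRF3 data and not the Genomic HyperBrowser implementation. The helper names `count_inside` and `overlap_bp` are hypothetical, intervals are assumed to be half-open `(start, end)` pairs on a single chromosome, and intervals within a track are assumed not to overlap each other.

```python
def count_inside(query, reference):
    """Number of query intervals lying entirely inside some reference interval."""
    return sum(
        any(r_start <= q_start and q_end <= r_end for r_start, r_end in reference)
        for q_start, q_end in query
    )

def overlap_bp(track_1, track_2):
    """Total base pairs covered by both tracks.

    Assumes intervals within each track do not overlap one another.
    """
    return sum(
        max(0, min(end_1, end_2) - max(start_1, start_2))
        for start_1, end_1 in track_1
        for start_2, end_2 in track_2
    )

# Toy tracks (hypothetical): one with narrow peaks, one with broad peaks.
narrow = [(10, 20), (30, 40), (105, 115), (300, 310)]
broad = [(0, 50), (100, 200)]

# Directional statistic: the value depends on which track is the query.
print(count_inside(narrow, broad))  # 3
print(count_inside(broad, narrow))  # 0

# Symmetric statistic: the value is the same in both directions.
print(overlap_bp(narrow, broad))  # 30
print(overlap_bp(broad, narrow))  # 30
```

The pairwise loops are quadratic in the number of intervals, which is acceptable for a toy example; a sorted sweep or an interval tree would be the usual choice for genome-scale tracks.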