Benchmarking

This section contains quality-control plots generated by the unit tests.

Position sampling

The following section looks at the position sampling algorithms.

Segment sampling algorithms

The following plots benchmark the segment sampling algorithms implemented in GAT.

Statistics

For segments of size 1 (i.e. SNPs), the statistics can be checked against a hypergeometric distribution (sampling without replacement). All the tests below use a single contiguous workspace of 1000 nucleotides seeded with a varying number of SNPs.
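The hypergeometric reference values can be computed directly. The following is a minimal sketch, not the code used by the unit tests; it assumes each SNP occupies a distinct position in the workspace, and the function names (`hypergeom_pmf`, `upper_tail`, `expected_overlap`) are illustrative only.

```python
from math import comb


def hypergeom_pmf(k, workspace_size, annotation_size, n_snps):
    """Probability that exactly k of n_snps SNPs fall inside the annotation."""
    if k < 0 or k > n_snps:
        return 0.0
    # comb() returns 0 when k exceeds annotation_size, so no extra guard is needed.
    return (
        comb(annotation_size, k)
        * comb(workspace_size - annotation_size, n_snps - k)
        / comb(workspace_size, n_snps)
    )


def upper_tail(k, workspace_size, annotation_size, n_snps):
    """P(overlap >= k): a one-sided enrichment p-value."""
    return sum(
        hypergeom_pmf(i, workspace_size, annotation_size, n_snps)
        for i in range(k, n_snps + 1)
    )


def expected_overlap(workspace_size, annotation_size, n_snps):
    """Mean of the hypergeometric distribution."""
    return n_snps * annotation_size / workspace_size


if __name__ == "__main__":
    # Single SNP, annotation of size 99 in a workspace of 1000 nucleotides.
    print(upper_tail(1, 1000, 99, 1))
    # Ten SNPs, all inside an annotation of size 10.
    print(upper_tail(10, 1000, 10, 10))
```

Only the standard library is used here, so the sketch runs without further dependencies; the empirical p-values in the plots would be compared against values such as those returned by `upper_tail`.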

../test/test_testSingleSNP.TestStatsSNPSampling.png

Test with a single SNP. Here, there are no issues with multiple hits. The workspace contains a single annotation of increasing size (1, 3, 5, ..., 99).

../test/test_testMultipleSNPsFullOverlap.TestStatsSNPSampling.png

In this test, 10 SNPs are in the segment list. The workspace contains a single annotation of size (10, 15, ..., 105). All SNPs overlap the annotated part of the workspace and hence all results are highly significant.

../test/test_testMultipleSNPsPartialOverlap.TestStatsSNPSampling.png

In this test, 10 SNPs are in the segment list. The workspace contains a single annotation. Annotations are all of size 10, but the overlap of SNPs with annotations varies from 0 to 10.

Statistics

GAT

SNPs

For segments of size 1 (i.e. SNPs), the statistics can be checked against a hypergeometric distribution (sampling without replacement). All the tests below use a single contiguous workspace of 1000 nucleotides seeded with a varying number of SNPs.

../test/testSingleSNP.TestStatsGat.png

Test with a single SNP. Here, there are no issues with multiple hits. The workspace contains a single annotation of increasing size (1, 3, 5, ..., 99).

../test/testMultipleSNPsFullOverlap.TestStatsGat.png

In this test, 10 SNPs are in the segment list. The workspace contains a single annotation of size (10, 15, ..., 105). All SNPs overlap the annotated part of the workspace and hence all results are highly significant.

../test/testMultipleSNPsPartialOverlap.TestStatsGat.png

In this test, 10 SNPs are in the segment list. The workspace contains a single annotation. Annotations are all of size 10, but the overlap of SNPs with annotations varies from 0 to 10.

../test/testWorkspaces.TestStatsGat.png

workspace = 500 segments of size 1000, separated by a gap of 1000

annotations = 500 segments of size 1000, separated by a gap of 1000, shifted up 100 bases

segments = a SNP every 100 bp
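The layout described above can be reproduced with a few lines of interval arithmetic. This is a sketch under stated assumptions, not the test fixture itself: coordinates are taken to be 0-based and half-open, the first workspace segment is assumed to start at position 0, SNPs are assumed to sit at positions 0, 100, 200, ... across the whole span, and all function names are hypothetical.

```python
def tiled_intervals(count, size, gap, offset=0):
    """Return `count` half-open (start, end) intervals of `size`, `gap` apart."""
    step = size + gap
    return [(offset + i * step, offset + i * step + size) for i in range(count)]


def intersect(a, b):
    """Intersect two sorted, non-overlapping lists of half-open intervals."""
    result, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            result.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return result


def count_positions(positions, intervals):
    """Count positions that fall inside any of the intervals."""
    covered = set()
    for start, end in intervals:
        covered.update(range(start, end))
    return sum(1 for p in positions if p in covered)


if __name__ == "__main__":
    workspace = tiled_intervals(500, 1000, 1000)
    annotations = tiled_intervals(500, 1000, 1000, offset=100)
    snps = range(0, 1_000_000, 100)  # assumed span: one SNP every 100 bp

    annotated_workspace = intersect(workspace, annotations)
    in_workspace = count_positions(snps, workspace)
    in_annotation = count_positions(snps, annotated_workspace)
    print(in_workspace, in_annotation, in_annotation / in_workspace)
```

Under these assumptions, the fraction of workspace SNPs that overlap an annotation can be compared with the fraction of the workspace covered by annotations, which is the quantity a sampling-based test should recover on average.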

Intervals

../test/testIntervalsPartialOverlap.TestStatsGat.png

In this test, there is one segment of size 100. Annotations are of size 100 with decreasing overlap.
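The decreasing overlap in this setup can be illustrated by sliding an annotation of size 100 away from a fixed segment of size 100. The sketch below assumes half-open coordinates and a shift in steps of 10 bases; both the step size and the helper name `overlap` are hypothetical and not taken from the test code.

```python
def overlap(a, b):
    """Number of bases shared by two half-open (start, end) intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


if __name__ == "__main__":
    segment = (0, 100)
    # Shift the annotation in steps of 10 bases (assumed step size).
    for shift in range(0, 101, 10):
        annotation = (shift, shift + 100)
        print(shift, overlap(segment, annotation))
```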

Annotator

SNPs

For segments of size 1 (i.e. SNPs), the statistics can be checked against a hypergeometric distribution (sampling without replacement). All the tests below use a single contiguous workspace of 1000 nucleotides seeded with a varying number of SNPs.

../test/testSingleSNP.TestStatsTheAnnotator.png

Test with a single SNP. Here, there are no issues with multiple hits. The workspace contains a single annotation of increasing size (1, 3, 5, ..., 99).

../test/testMultipleSNPsFullOverlap.TestStatsTheAnnotator.png

In this test, 10 SNPs are in the segment list. The workspace contains a single annotation of size (10, 15, ..., 105). All SNPs overlap the annotated part of the workspace and hence all results are highly significant.

../test/testMultipleSNPsPartialOverlap.TestStatsTheAnnotator.png

In this test, 10 SNPs are in the segment list. The workspace contains a single annotation. Annotations are all of size 10, but the overlap of SNPs with annotations varies from 0 to 10.

../test/testWorkspaces.TestStatsTheAnnotator.png

workspace = 500 segments of size 1000, separated by a gap of 1000

annotations = 500 segments of size 1000, separated by a gap of 1000, shifted up 100 bases

segments = a SNP every 100 bp

Intervals

../test/testIntervalsPartialOverlap.TestStatsTheAnnotator.png

In this test, there is one segment of size 100. Annotations are of size 100 with decreasing overlap.