Haplogroup R and Subclades

Haplogroup R is defined by the SNP rs2032658, also known as M207. The group is believed to have arisen about 19,000 to 34,000 years ago in Central Asia. Today its descendants are common in Europe, South Asia, and Central Asia.

This site's emphasis is on collecting the original BAM raw data where possible to construct a phylogenetic tree against the GRCh38 human genome reference. To contribute, please use the Submission Tool.

Supporting data from publicly available Haplogroup R repositories is integrated as a service to the community. The Kits page contains a cross-reference list for tracking sample donors between labs. This allows testers to be placed on their closest branch in the Experimental Tree. For help converting the coordinates of variants placed in the tree, consult the Variant Index.

YSEQ customers are encouraged to join Group 223: haplogroup-r.org Public Results. This group is the primary location monitored to collect new sequencing results.



Addressed an issue with group assignment reported for R-FGC5494 samples. The problem was caused by recurrent SNPs in branches related within 3000 years, combined with sparse data loads. The fix introduces a new control value for the best-fit algorithm, which may require additional training. Please report any issues you notice.

Adjusted the algorithm for the "Private Variants" report. Diploid positions (calls having more than one possible allele value) with low read depths have been removed.
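The filter described above might look roughly like the following sketch. The `Call` record, the `filter_private_variants` function, and the `MIN_DEPTH` cutoff are all hypothetical names and values chosen for illustration; the depth threshold the site actually uses is not stated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    position: int    # coordinate on the Y chromosome (GRCh38)
    alleles: tuple   # allele values observed at this position
    depth: int       # number of reads covering the position


# Hypothetical cutoff; the real threshold is not documented here.
MIN_DEPTH = 10


def filter_private_variants(calls, min_depth=MIN_DEPTH):
    """Drop calls that are both diploid (more than one allele) and shallow."""
    kept = []
    for call in calls:
        is_diploid = len(set(call.alleles)) > 1
        if is_diploid and call.depth < min_depth:
            continue  # ambiguous call with too little read support
        kept.append(call)
    return kept


example_calls = [
    Call(position=1000, alleles=("A",), depth=3),       # single allele: kept
    Call(position=2000, alleles=("A", "G"), depth=3),   # diploid and shallow: dropped
    Call(position=3000, alleles=("A", "G"), depth=30),  # diploid but deep: kept
]
kept_positions = [c.position for c in filter_private_variants(example_calls)]
```

Note that in this sketch a single-allele call survives even at low depth, since only the combination of ambiguity and low depth is treated as unreliable.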

"Known SNPs" report enhanced to show the immediate descendants of the kit's terminal branch. This is intended to further support the placement, and can also serve as a guide to whether any shared downstream variants remain to be tested.

Removed all terminal tree branch leaves with fewer than two supporting kits. Future updates will remove interior branches without supporting splits.
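As a rough illustration of this pruning step, the sketch below removes leaves with fewer than two kits in a single pass. The dictionary layout, the function name `prune_sparse_leaves`, and the placeholder branch names are assumptions, as is the choice to move a removed leaf's kits up to its parent; the site's own data model is not described here.

```python
def prune_sparse_leaves(children, kits, min_kits=2):
    """Remove leaves supported by fewer than min_kits kits (single pass).

    children: dict mapping every branch name to a list of its child branches
    kits:     dict mapping a branch name to the kit ids placed on it
    Kits on a removed leaf are reassigned to its parent (an assumption).
    """
    leaves = {b for b, kids in children.items() if not kids}
    drop = {b for b in leaves if len(kits.get(b, [])) < min_kits}

    new_children = {
        b: [k for k in kids if k not in drop]
        for b, kids in children.items()
        if b not in drop
    }
    new_kits = {b: list(v) for b, v in kits.items() if b not in drop}

    # Move kits from each removed leaf up to its parent branch.
    for parent, kids in children.items():
        for kid in kids:
            if kid in drop:
                new_kits.setdefault(parent, []).extend(kits.get(kid, []))
    return new_children, new_kits


# Placeholder branch names, not real subclades.
tree = {"top": ["X", "Y"], "X": ["X1", "X2"], "Y": [], "X1": [], "X2": []}
placed = {"Y": ["kit1"], "X1": ["kit2", "kit3"], "X2": ["kit4"]}
pruned_tree, pruned_kits = prune_sparse_leaves(tree, placed)
```

Interior branches are left alone even if pruning turns them into leaves, which matches the note that interior branches are handled in a later update.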


Matrix report generation is on hold while the software is being enhanced. Reports are expected to resume in late August or early September.


The kit buttons in the Experimental Tree are now active. The popup report lists private mutations and coverage statistics from any NGS/WGS testing. There is also an STR Information tab that displays any known testing. The STR results do not include values inferred from BAMs at this time.


The original haplogroup classifier written in 2015 has been replaced with a new algorithm. The new approach fixes all known edge-case problems with recurrent mutations as long as sufficient SNPs are tested. This will enable phase 1 of a new automated tree builder for NGS testers. The current tree will be archived, and a fresh version developed from the 2017 Y Tree skeleton is coming, using only BAM submissions to establish branching. This reset will shed branches not supported by the Call Matrix.
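The site's classifier is not published here, so the following is only a generic sketch of one way a best-fit placement can tolerate a recurrent mutation: score each candidate branch over its whole path to the root, so that a lone derived call on an unrelated branch is outweighed by ancestral calls along that path. The function name, data layout, and scoring weights are all assumptions.

```python
def best_fit_branch(parent, defining_snps, calls):
    """Pick the branch whose root path best agrees with the tested SNPs.

    parent:        dict branch -> parent branch (the root maps to None)
    defining_snps: dict branch -> set of SNP names defining that branch
    calls:         dict SNP name -> True (derived) or False (ancestral);
                   untested SNPs are simply absent
    """
    def path_score(branch):
        score = 0
        while branch is not None:
            for snp in defining_snps.get(branch, ()):
                if snp in calls:
                    score += 1 if calls[snp] else -1
            branch = parent[branch]
        return score

    # Only branches with at least one derived call of their own are candidates.
    candidates = sorted(
        b for b in parent
        if any(calls.get(s) for s in defining_snps.get(b, ()))
    )
    if not candidates:
        return None
    # Ties fall to the first name in sorted order, purely for determinism.
    return max(candidates, key=path_score)


# Placeholder tree: SNP "s3" recurs on both A1 and B.
parent = {"root": None, "A": "root", "A1": "A", "B": "root"}
defining = {"root": {"s0"}, "A": {"s1", "s2"}, "A1": {"s3"}, "B": {"s3", "s4"}}
calls = {"s0": True, "s1": True, "s2": True, "s3": True, "s4": False}
placement = best_fit_branch(parent, defining, calls)
```

In this example the recurrent SNP alone would fit either A1 or B, but the additional calls on s1, s2, and s4 make A1 the better fit. With fewer SNPs tested the two branches could tie, which is consistent with the caveat that sufficient SNPs must be tested.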