
The Russian Journal of Genetic Genealogy (Russian version): Vol. 5, No. 1, 2013

ISSN: 1920-2997 http://ru.rjgg.org


RJGG

84

__________________________________________________________

Received: December 14, 2013; accepted: December 16, 2013; published: January 8, 2014
Correspondence: gurianov.vm@gmail.com, napobo3@gmail.com, acgt@yfull.com




Phylogenetic Structure of Q-M378 Subclade Based On Full Y-Chromosome Sequencing







Vladimir Gurianov (1), Leon Kull (2), Roman Sychev (3), Vladimir Tagankin (3), Vadim Urasin (3)

(1) The Q-L275 Research Project, Russia
(2) Full Genomes Corporation, USA
(3) YFull research group, Russia


Abstract

The Q-M378 subclade, which is downstream of haplogroup Q-L275, is characterized by a wide area of distribution and a low frequency in modern populations of Eurasia. The previously known phylogenetic structure of the subclade did not make it possible to match Y-chromosome SNPs to specific populations or to reconstruct the likely directions of their past migrations.
The research conducted enabled us to build a consistent phylogenetic structure of the Q-M378 subclade, validated by analysis of SNP and STR markers and based on full Y-chromosome sequencing data obtained with next-generation sequencers. As part of the research, the new phylogenetic levels Q-Y2250 (downstream of Q-M378 and including Q-L301), Q-Y2220 (downstream of Q-L245) and Q-Y2200 (downstream of Q-Y2220) were defined.
SNPs that may in the future mark certain European and Asian subclusters of Q-Y2220 (including the Armenian subcluster), as well as separate branches of the Jewish cluster Q-Y2200, were also identified.
The research also confirmed the connection between the distribution of the Q-M378 subclade and the migration of speakers of Indo-European languages from Central Asia via Afghanistan and Iran to the West.

Introduction

The Q-M378 subclade [1], downstream of haplogroup Q-L275, is present in a number of populations in Europe, Southwest (Western) [2] and Southern Asia [3], and also in Central Asia all the way to North-West China [4].

[1] yDNA Haplogroup Q and its Subclades 2013 - http://www.isogg.org/tree/ISOGG_HapgrpQ.html. Hereinafter subclades are referenced in line with ISOGG (International Society of Genetic Genealogy) notation, specifying the single nucleotide polymorphism (SNP) typical of the respective subclade.
[2] Cinnioglu et al., Excavating Y-chromosome haplotype strata in Anatolia, 2003. Haplotypes 337-339, according to the predictor by Urasin (http://predictor.ydna.ru/), are positive for SNP M378. All samples belong to the Central Anatolian and East Anatolian regions of Turkey.
[3] Sanghamitra Sengupta et al., Polarity and Temporality of High-Resolution Y-Chromosome Distributions in India Identify Both Indigenous and Exogenous Expansions and Reveal Minor Genetic Influence of Central Asian Pastoralists, Am J Hum Genet. 2006 February; 78(2): 202-221. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1380230/ (among the tested inhabitants of Pakistan, 2 out of 176, or 1.14%, were positive for SNP M378; SNP M378 was not identified among sample groups in India and Eastern Asia).
[4] Zhong et al., Extended Y-chromosome investigation suggests post-Glacial migrations of modern humans into East Asia via the northern route // Molecular Biology and Evolution, first published online September 13, 2010, doi: 10.1093/molbev/msq247 (among four populations of Uyghurs from Xinjiang, one such person was found in each of two populations: 1 out of 71 and 1 out of 18).

One of the peculiar features of the Q-M378 subclade is its relatively wide area of distribution (connected with migrations of ancestral populations of the Indo-European language family) combined with an extremely low frequency in almost all populations (modern ethnic groups) where it has been reported so far. The exception is the Jewish Diaspora (primarily Ashkenazi Jews), where the share of the Q-M378 subclade reaches 5.2 to 7 percent (Behar 2004 [5], Hammer 2009 [6]). For this reason, Q-M378 is often associated with the Middle East. However, a more comprehensive analysis of research data and publicly available commercial test data leads to the conclusion that the relationships among carriers of this Y-chromosome mutation over the last millennium are more complex and rather non-obvious.

[5] Behar DM, Garrigan D, Kaplan ME, Mobasher Z, Rosengarten D, Karafet TM, Quintana-Murci L, Ostrer H, Skorecki K, Hammer MF. (2004). "Contrasting patterns of Y chromosome variation in Ashkenazi Jewish and host non-Jewish European populations". Hum Genet 114 (4): 354-365. doi:10.1007/s00439-003-1073-7. PMID 14740294.
[6] Hammer MF, Behar DM, Karafet TM, et al. (November 2009). "Extended Y chromosome haplotypes resolve multiple and unique lineages of the Jewish priesthood". Human Genetics 126 (5): 707-717. doi:10.1007/s00439-009-0727-5. PMC 2771134. PMID 19669163.

The aim of this article is to refine the phylogenetic structure of the Q-M378 subclade, based on data available from open sources and on the data of the research conducted, and to provide a classification of its major clusters (groups of haplotypes combined according to the following criteria: belonging to the lineage defined by a single SNP (single nucleotide polymorphism), phylogenetic similarity, and geographical distribution).



Source data and methodology

Data sets for comparison

Data from the Personal Genome Project [7] and the 1000 Genomes Project [8] were used in this research. Samples taken from these projects (Table 1) have the prefixes PGP and HG respectively.

[7] http://www.personalgenomes.org/ See also: Ball, M.P., et al., A public resource facilitating clinical use of genomes. Proceedings of the National Academy of Sciences, 2012. 109(30): p. 11920-11927.
[8] http://www.1000genomes.org/ See also: 1000 Genomes Project Consortium. An integrated map of genetic variation from 1,092 human genomes. Nature, 2012. 491(7422): p. 56-65.

Table 1. Information based on the data from The Personal Genome Project and 1000 Genomes Project.

Sample code   Population      Verified origin
HG03914       Bengali (BEB)   Bangladesh
HG03652       Punjabi (PJL)   Pakistan (Lahore)
HG03864       Telugu (ITU)    India
PGP130        N/A             Northern Africa (Morocco)

Samples HG03914, HG03652 and HG03864, which do not belong to the Q-M378 subclade, were used for comparison.

Additionally, data from targeted Y-chromosome sequencing of five individuals tested at Full Genomes Corporation (FGC) [9] were analyzed.

[9] https://www.fullgenomes.com/


Table 2. Information based on test participants' data at Full Genomes Corporation.

Sample code   Population       Verified origin
AJ1           Ashkenazi Jews   Eastern Europe
AJ2           Ashkenazi Jews   Eastern Europe
Ar1           Armenians        Eastern Turkey
Ir1           Iranians         Iran, Khuzestan province
Kz1           Kazakhs          Kazakhstan, kozha lineage




Genotyping

Data sets in BAM format (BAM/SAM Specification [10]) and, in the case of PGP130, TSV [11] format were used for the research.

Next-generation sequencing [12], performed by Full Genomes Corporation at the Beijing Genomics Institute on an Illumina HiSeq 2000 sequencer, had the following parameters: 50x coverage at a read length of 100 base pairs, with paired-end reads. Mapped coverage was obtained for about 23 million base pairs out of the approximately 59 million base pairs present in a human Y chromosome.
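
To put the sequencing parameters above into perspective, the following is a minimal sketch of the arithmetic behind them. It uses the standard relation depth = reads x read length / region size. It assumes that the 50x figure is the average depth over the mapped region, which the text does not state explicitly; the function names are illustrative only.

```python
def covered_fraction(covered_bp: int, total_bp: int) -> float:
    """Share of the chromosome for which mapped coverage was obtained."""
    return covered_bp / total_bp


def reads_for_depth(depth: float, region_bp: int, read_length: int) -> int:
    """Approximate read count N such that N * read_length / region_bp equals depth."""
    return round(depth * region_bp / read_length)


# Figures quoted in the text (approximate values).
MAPPED_BP = 23_000_000   # mapped coverage, base pairs
CHR_Y_BP = 59_000_000    # approximate length of the human Y chromosome
READ_LENGTH = 100        # base pairs per read
DEPTH = 50               # assumed to be the average depth over the mapped region

fraction = covered_fraction(MAPPED_BP, CHR_Y_BP)
reads = reads_for_depth(DEPTH, MAPPED_BP, READ_LENGTH)

print(f"Mapped share of the Y chromosome: {fraction:.1%}")
print(f"Approximate number of 100 bp reads at 50x: {reads:,}")
```

Under these assumptions roughly 39% of the Y chromosome is covered, and on the order of 11.5 million reads of 100 base pairs correspond to a depth of 50x over that region.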


Data processing and analysis

Clustering of Q-M378 subclade haplotypes (including haplotypes that belong to the upstream Q-L275 level and to downstream levels) was carried out by processing 222 haplotypes (67 STR markers [13]) obtained from public sources [14]. MURKA software [15] was used to construct the phylogenetic tree.

Full Y-chromosome sequencing data were processed and analyzed using FGC software, along with software developed by the YFull research group [16].

Samples belonging to the Q-L275 subclade but lacking the M378 mutation were used as a reference, along with samples of upstream and parallel subclades on a case-by-case basis. Each sample was genotyped both for SNPs discovered during the research and for SNPs included in the ISOGG list under the Q-L275 subclade and its downstream subclades.

The criteria for the discovery of a new SNP were the presence of the mutation in more than two samples, as well as consistency of the data between the new SNPs themselves and with the previously known information on the phylogenetic structure of the respective subclade.

[10] An up-to-date version of the specification can be found at https://github.com/samtools/hts-specs
[11] TSV (Tab Separated Values): a text format for storing and viewing tabular data.
[12] Behjati & Tarpey, What is next generation sequencing?, Arch Dis Child Educ Pract Ed 2013;98:236-238 doi:10.1136/archdischild-2013-304340 http://ep.bmj.com/content/98/6/236.full
[13] STR markers (short tandem repeats).
[14] Public project data from the Family Tree DNA website: http://www.familytreedna.com/projects.aspx. Hereinafter haplotypes from this source are marked as follows: FTDNA kit and haplotype number.
[15] MURKA by Valery Zaporozhchenko (Research Center of Medical Genetics of the Russian Academy of Medical Sciences, Moscow, Russia). http://sourceforge.net/projects/phylomurka/
[16] http://www.yfull.com/
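
The first part of the discovery criterion (presence of a mutation in more than two samples) can be illustrated with a short sketch. This is not the software used by the authors; the variant identifiers, sample names and the function `candidate_new_snps` are hypothetical, and the second part of the criterion (consistency with the known phylogeny) is not modeled here.

```python
from typing import Dict, Set


def candidate_new_snps(
    derived_calls: Dict[str, Set[str]],
    known_snps: Set[str],
    min_samples: int = 3,
) -> Dict[str, Set[str]]:
    """Return variants that are not yet known and are derived in at least
    `min_samples` samples ("more than two" corresponds to min_samples=3)."""
    return {
        variant: samples
        for variant, samples in derived_calls.items()
        if variant not in known_snps and len(samples) >= min_samples
    }


# Hypothetical input: variant -> set of samples carrying the derived allele.
derived_calls = {
    "var_A": {"S1", "S2", "S3"},        # new, three carriers: kept
    "var_B": {"S1"},                    # single carrier: dropped
    "var_C": {"S1", "S2", "S3", "S4"},  # already known: dropped
    "var_D": {"S2", "S3"},              # only two carriers: dropped
}
known_snps = {"var_C"}

candidates = candidate_new_snps(derived_calls, known_snps)
print(sorted(candidates))
```

Variants that pass this filter would still need to be checked against the rest of the tree before being accepted.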


Results

Clustering of Q-M378 subclade
based on SNP and STR-markers analysis

Since SNPs assign haplotypes to clusters more specifically than STR markers do, primary clustering was based on the known SNPs that define the sublevels of the Q-M378 subclade.

Three downstream subclades are currently known [17]: Q-L245, Q-L301 and Q-L327. The SNPs with an L prefix that define these subclades were identified at the Family Tree DNA lab led by Dr. Thomas Krahn.

The geography of the Q-L245 distribution essentially repeats that of M378 (except for Central and Southern Asia).

The Q-L301 subclade is localized exclusively in Iran [18]. The simultaneous presence of the two subclades Q-L301 and Q-L245 among the autochthonous population of Iran and Iraq indicates that carriers of the M378 mutation have long resided among the peoples living in this region [19] [20].

L327 is a private SNP, represented by a single haplotype of a Portuguese individual from the Azores [21].

Another private SNP [22] is P306, found in one Indian. It was not found among the tested representatives of Q-M378 subclades (including Q-L301) [23].

Until recently only two SNPs were acknowledged as downstream of L245 [24]: L272.1, detected in Europe (Sicily), and L315 (discovered in an East European Ashkenazi). The SNP L619.2, discovered in two representatives of the Armenian Diaspora [25], is also located below L245. The fact that this SNP emerged relatively recently is confirmed by the existence of Armenian Diaspora representatives who show no sign of this polymorphism [26].

[17] Y-DNA Haplogroup Q and its Subclades 2013 - http://www.isogg.org/tree/ISOGG_HapgrpQ.html
[18] FTDNA kit 178026, M7540, M7949.
[19] Nadia Al-Zahery et al., In search of the genetic footprints of Sumerians: a survey of Y-chromosome and mtDNA variation in the Marsh Arabs of Iraq (2011). http://www.biomedcentral.com/1471-2148/11/288 This work has some data on Q haplotypes present in the Marsh Arabs (n=143) and Iraqis (n=154). Q-M378 has a frequency of 2.1% in the first case and 1.9% in the second.
[20] Grugni et al., Ancient Migratory Events in the Middle East: New Clues from the Y-Chromosome Variation of Modern Iranians (2012). DOI: 10.1371/journal.pone.0041252. Among those positive for SNP M378, the following ethnic groups are notable: Khorasan Persians - 3 out of 59 (5.1%), Esfahan Persians - 1 out of 11 (9.1%), Lurs - 2 out of 50 (3.9%), Assyrians - 1 out of 39 (2.6%), Azerbaijani - 1 out of 63 (1.6%).
[21] FTDNA kit 13254.
[22] FTDNA kit N78873.
[23] FTDNA kit 178026, M7540, 193005, 95307 respectively.
[24] Both are private SNPs, i.e. each has so far been found in a single carrier: L315 (FTDNA kit 51) and L272.1 (FTDNA kit 95307). L315 may not be stable, as it was positive in sample HG02291.

Consequently, until very recently the Q-L245 subclade could not be divided into clusters using SNPs. Phylogenetic definitions and analysis of STR markers were therefore used for clustering. The low-variability DYF395S1 segment of the Y chromosome [27] was used for this purpose (the approach was initially proposed by the Q yDNA Project [28] administrator Rebekah A. Canada), which allowed the formation of stable clusters with corresponding geographical and ethnic associations.

For example, the following clusters were identified using this approach.

DYF395S1=14-17

It includes four haplotypes: two Dagestanis (identifiers according to the cited publication [29]: Avar Dag 511 and Kaitag Dag06 894), a Turk [30] and an Arab from Iraq [31]. The latter belongs to the legendary tribe of Quraysh (Adnan-Modar tribal self-definition).

This cluster is located closer to the root of the L245 tree than any other and is apparently the nearest to the ancestral haplotype.

DYF395S1=15-17

It includes a large group of haplotypes of people of various origins. The following subclusters can be distinguished within the cluster:

- Central European (most ancestral lineages are localized in Switzerland [32]; some of them are linked to a Mennonite community);

- North European (most ancestral lineages are localized in the Netherlands [33]);

- Italian (including haplotypes with partial SNP L272.1);

- Armenian;

- Southwest Asian.

It should be noted that, by the DYF395S1=15-17 attribute, a number of haplotypes without the L245 mutation would also fall into the cluster, in particular haplotypes of the level described below as Q-Y2250, as well as haplotypes of levels Q-L327 and Q-P306. However, in view of the principle adopted here that SNPs take priority in clustering, we do not include them. This also implies that the clusters DYF395S1=14-17 and/or 15-17 had already formed at the Q-M378 level. This hypothesis, however, can be made more specific only as the number of tested representatives of the cluster grows.

DYF395S1=15-18

DYF395S1=15-19

These two clusters are represented exclusively by people of Jewish origin.

Individual haplotypes with RecLOH (recombinational loss of heterozygosity) in this part of the Y chromosome were not considered in this clustering.

SNPs corresponding to each of the above-mentioned STR-based clusters are expected to be identified in further research.

[25] FTDNA kit E5340, 191379.
[26] FTDNA kit 173902, 178717.
[27] Vladislav Ryzhkov, Calculating time to the most recent common ancestor by separate panels of Y-STR markers, sorted by increasing mutation rate constants, The Russian Journal of Genetic Genealogy (Russian version): Vol. 3, No. 2, 2011, ISSN: 1920-2997 http://ru.rjgg.org
[28] Q yDNA Project http://www.familytreedna.com/public/yDNA_Q/
[29] Balanovsky et al., Parallel Evolution of Genes and Languages in the Caucasus Region. Molecular Biology and Evolution, 13 May 2011.
[30] FTDNA kit 303617.
[31] FTDNA kit 197506.
[32] The SCHACKE surname appeared in Germany at least as early as the 1600s and perhaps earlier. The JAGGI surname in Switzerland goes back much further. With this DNA Project we hope to learn more about our early ancestors and where our ancestors originated. Johann Christoffel SCHACKE, the paternal ancestor of most who carry the SHOCKEY surname, was born in Kirchheimbolanden, Pfalz, Germany in 1720 to Swiss parents. He arrived in Philadelphia PA in 1737. The Anglicized version of his name became John Christopher Shockey. He and his wife Barbara had nine children between 1739 and 1756, six sons and three daughters. After Barbara died John Christopher married Anna Marie COMPTON. John Christopher and Anna Marie had one son born in 1774 or 1775. This project hopes to help identify the descendants of the seven sons of John Christopher SHOCKEY as well as learn more about his Swiss ancestors and their related families from Germany and/or Switzerland. http://www.familytreedna.com/public/shockey-schacke/default.aspx
[33] Huff/Hough Surname Project - http://www.familytreedna.com/public/HOUGH/default.aspx A Dutchman named Derrick Pauluszen Hoff (1649-1730), who arrived in New Amsterdam (New York) no later than 1660, is considered to be the common ancestor of the family.
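
The STR-based grouping by DYF395S1 value described in this section can be sketched as follows. This is only an illustration under stated assumptions: the haplotype identifiers and allele values are hypothetical, the function `group_by_dyf395s1` is not part of any tool named in the article, and neither the exclusion of RecLOH haplotypes nor the priority of SNPs over STR values is modeled.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_by_dyf395s1(haplotypes: Dict[str, Tuple[int, int]]) -> Dict[str, List[str]]:
    """Group haplotype identifiers by their two-copy DYF395S1 value.

    Each haplotype maps to a pair of repeat counts; the pair is sorted so that
    (17, 15) and (15, 17) fall into the same cluster, labeled "DYF395S1=15-17".
    """
    clusters: Dict[str, List[str]] = defaultdict(list)
    for haplotype_id, alleles in haplotypes.items():
        low, high = sorted(alleles)
        clusters[f"DYF395S1={low}-{high}"].append(haplotype_id)
    return {label: sorted(ids) for label, ids in clusters.items()}


# Hypothetical haplotypes (identifiers and values are made up for illustration).
haplotypes = {
    "h1": (14, 17),
    "h2": (15, 17),
    "h3": (17, 15),
    "h4": (15, 18),
    "h5": (15, 19),
}

clusters = group_by_dyf395s1(haplotypes)
for label in sorted(clusters):
    print(label, clusters[label])
```

In practice such STR-based clusters would be cross-checked against SNP results, since, as noted above, haplotypes without the L245 mutation can share the same DYF395S1 value.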


New phylogenetic structure of Q-M378 subclade, upstream and parallel subclades

As a result of processing and analyzing the full Y-chromosome sequencing data, a number of new single nucleotide polymorphisms were discovered, and their positions on the Y chromosome (according to the human genome reference sequence hg19 [34]) and their phylogenetic placement on the SNP tree were determined.

The data on the new SNPs are summarized in Tables 3-5 and Diagram 1, with SNP names given in the Y notation [35] and the Full Genomes Corporation notation [36].

[34] hg19 reference sequence, also known as GRCh37. See also: Human Genome Overview. http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/human/
[35] Y: SNP prefix assigned by YFull.
[36] FGC: SNP prefix assigned by Full Genomes Corporation.


Diagram 1. Phylogenetic tree of Q-M378 subclade, upstream and parallel subclades.



__________________________
Notes:

* SNPs included in ISOGG SNP tree (2013).
* SNPs, included by ISOGG in the list of "SNPs under Investigation" or mentioned in public sources.
* SNPs explored by the YFull team and/or the Full Genomes team.
* SNPs, mentioned in public sources, are marked in green.


As can be seen from the above, the following previously undescribed levels were discovered below the L275 SNP level:

1) Q-Y1150 level, which is downstream of Q-L275 and parallel to Q-M378. SNPs of this level were discovered in only three natives of Hindustan (HG03914, HG03652, HG03864) [37].

2) Q-Y2250 level, downstream of Q-M378 and parallel to Q-L245. SNPs of this level (Table 3) were found in the Ir1 and Kz1 samples. Since the Ir1 sample is positive for SNP L301 and Kz1 is negative for it, it is evident that the Q-L301 level is downstream of Q-Y2250. Private SNPs of the Kz1 sample are listed in Appendix 3. Private SNPs of the Ir1 sample are listed in Appendix 7.

3) Q-Y2220 level, downstream of Q-L245. This level combines the haplotypes of the Jewish and Armenian clusters of Q-L245. All tested representatives of these clusters (AJ1, AJ2, Ar1) were positive for the SNPs of this level (see Table 4), whereas the PGP130 sample (of Moroccan origin) was not.

4) There is also the level Q-Y2220 (xQ-Y2200), parallel to Q-Y2200, which contains SNPs defining the Armenian segment of the DYF395S1=15-17 cluster. Because these SNPs were found in only one sample (Ar1), they have the status of private SNPs. Nevertheless, the following can be assumed with high probability:

- some of these SNPs will prove to be characteristic of a rather wide range of haplotypes of the DYF395S1=15-17 cluster;

- the Q-L619.2 level will be downstream of Q-Y2220 (xQ-Y2200), since only some of the Armenians who are positive for SNP L245 belong to it. The Ar1 sample tested by us showed no sign of the L619.2 mutation.

5) Q-Y2200 level, downstream of Q-Y2220. SNPs of this level define the Jewish cluster of Q-L245 (see Table 5). Private SNPs of samples AJ1 and AJ2 are listed in Appendices 5 and 6. In addition, neither tested sample had the L315 mutation.

[37] G.R. Magoon, R.H. Banks, C. Rottensteiner, B.E. Schrack, V.O. Tilroe, T. Robb, A.J. Grierson, Generation of high-resolution a priori Y-chromosome phylogenies using next-generation sequencing data, 2013, doi:10.1101/00802 (in preparation, preprint on bioRxiv.org).
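
The reasoning used above to place one level below another (for example, Q-L301 below Q-Y2250 because Ir1 is positive for both while Kz1 is positive only for Y2250) can be expressed as a subset test on the sets of positive samples. The sketch below is an illustration of that logic, not the authors' software. It uses only the sample assignments stated in the text for the five FGC samples; the function name `downstream_pairs` is hypothetical.

```python
from typing import Dict, List, Set, Tuple


def downstream_pairs(positives: Dict[str, Set[str]]) -> List[Tuple[str, str]]:
    """Return (child, parent) pairs where every sample positive for the child
    SNP is also positive for the parent SNP, and the parent has extra samples."""
    pairs = []
    for child, child_samples in positives.items():
        for parent, parent_samples in positives.items():
            if child_samples < parent_samples:  # proper subset
                pairs.append((child, parent))
    return sorted(pairs)


# Positive samples per SNP, as described in the text for the tested samples.
positives = {
    "Y2250": {"Ir1", "Kz1"},
    "L301": {"Ir1"},
    "Y2220": {"AJ1", "AJ2", "Ar1"},
    "Y2200": {"AJ1", "AJ2"},
}

for child, parent in downstream_pairs(positives):
    print(f"{child} is downstream of {parent}")
```

With so few samples the test only shows that the data are consistent with a given nesting; SNPs with identical sample sets cannot be ordered relative to each other, which is why they appear at the same level.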











Table 3. Q-Y2250 level. New SNPs, downstream of positive SNP M378.

Position (hg19)  Ancestral value  Derived value  SNP name (Y)  SNP name (FGC) or synonym
7115834 C T Y2244 FGC4626
6894323 C T Y2245 PR683
3544336 C G Y2246 FGC4613
2765038 T G Y2247 FGC4607
4070598 G A Y2248 FGC4618
4242831 A G Y2249 FGC4619
4852955 G A Y2250 FGC4620
6537988 A G Y2251 FGC4624
6724553 C T Y2252
8671530 A G Y2255 FGC4631
10077457 T C Y2256 FGC4635
15766997 A C Y2263 FGC4646
18169503 A C Y2264 FGC4656
18803364 C T Y2265 FGC4657
18990293 A G Y2266 FGC4659
22525954 AT A Y2268
23956540 A T Y2269 FGC4675
24452225 G C Y2270 FGC4676
15684681 A T CTS4507
13643442 T C FGC4638
___________________________
Note: Y2268 deletion.

Table 4. Q-Y2220 level. New SNPs, downstream of positive SNP L245.

Position (hg19)  Ancestral value  Derived value  SNP name (Y)  SNP name (FGC)
9408770 G T Y2220 FGC1904
18051798 A C Y2209 FGC1917
22017904 G T Y2202 FGC1925
4914530 A G Y2229




Table 5. Q-Y2200 level. New SNPs, downstream of positive SNP L245.

Position (hg19)  Ancestral value  Derived value  SNP name (Y)  SNP name (FGC)
23646920 C T Y2196 FGC1934
22953894 A G Y2197 FGC1933
22825080 A G Y2198 FGC1932
22588598 C T Y2200 FGC1929
22471554 A T Y2201 FGC1928
21277083 G A Y2203 FGC1923
19425984 G A Y2206
19053060 C T Y2207 FGC1919
18207170 A G Y2208 FGC1918
18046486 T C Y2210 FGC1916
18043999 G A Y2211 FGC1915
16994660 T A Y2212 FGC1914
15834557 G A Y2213 FGC1912
14385853 T G Y2215 FGC1911
14353022 A C Y2216 FGC1910
14184253 C A Y2218 FGC1909
9892635 C T Y2219 FGC1906
9401947 C A Y2221 FGC1903
8662585 C A Y2224 FGC1899
6949449 C T Y2225 FGC1897
4606181 C T Y2231 FGC1890
3995524 G A Y2232 FGC1888
3148720 A G Y2233 FGC1886




The placement of SNPs listed by ISOGG as "SNPs under Investigation" was refined within the scope of this work: F108, F803, F815, F1082, F1126, F1169, F1213, F1337, F1349, F1528, F1537, F1594, F1734, F1780, F1836, F1839, F1858, F1875, F1974, F2023, F2145, F2230, F2313, F2343, F2440, F2628, F2657, F2777, F2851, F2877, F2894, F2934, F3084, F3121, F3193, F3207, F3389, F3621, F3680. As of May 8, 2013, all of the above SNPs were classified by ISOGG as belonging to level L245 or below. The analysis showed that this scheme needs to be modified. All of these SNPs, apart from F1213, F1349, F1594, F1734, F1780, F1836, F1839, F2230 and F2877, belong to level Q-L275, as they are positive in samples HG03914, HG03652, HG03864, AJ1, AJ2, Ar1 and Ir1. The remaining SNPs, in turn, are positive in all samples in the research that are positive for M378 and L245. Consequently, these two groups of SNPs are at the same level as Q-L275 and Q-M378 respectively [38].


In addition, a considerable number of new SNPs were discovered at the same level as L275, M378 and L245.

For example, the following SNPs belong to level Q-L275: Y1014-Y1022, Y1024-Y1057, Y1059-Y1069, Y1071-Y1137, Y1139, Y1142, Y1153, Y1160, Y1164, Y1166, Y1167, Y1169, Y1195, Y1220, Y1240, Y1978-Y1983, Y1985-Y1989, Y1991-Y1993, Y1995, Y1996-Y1997, Y2003, Y2005-Y2007, Y2009, Y2239, Y2243;

to level Q-M378: Y2012, Y2013, Y2016-Y2082, Y2084-Y2095, Y2097, Y2098, Y2113-Y2115, Y2226, Y2361 (Appendix 1, Table 6);

to level Q-L245: Y2116-Y2149, Y2195, Y2199, Y2204, Y2217, Y2222, Y2223, Y2235, Y2237 (Appendix 2, Table 7).
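
The SNP lists above use a compact range notation such as Y1014-Y1022. When working with these lists programmatically, the ranges can be expanded into individual SNP names. The following is a minimal sketch for that purpose; the function `expand_snp_ranges` is hypothetical, assumes names of the form "letters followed by digits", and also accepts a range whose right-hand side omits the prefix.

```python
import re
from typing import List

# A single name ("Y2195") or a range ("Y2016-Y2082", also "Y2116-2149").
_TOKEN = re.compile(r"^([A-Za-z]+)(\d+)(?:-([A-Za-z]*)(\d+))?$")


def expand_snp_ranges(text: str) -> List[str]:
    """Expand a comma-separated list of SNP names and ranges into single names."""
    names: List[str] = []
    for raw in text.split(","):
        token = raw.strip()
        if not token:
            continue
        match = _TOKEN.match(token)
        if match is None:
            raise ValueError(f"Unrecognized SNP token: {token!r}")
        prefix, start, end_prefix, end = match.groups()
        if end is None:
            names.append(f"{prefix}{start}")
            continue
        if end_prefix and end_prefix != prefix:
            raise ValueError(f"Mixed prefixes in range: {token!r}")
        if int(end) < int(start):
            raise ValueError(f"Descending range: {token!r}")
        names.extend(f"{prefix}{n}" for n in range(int(start), int(end) + 1))
    return names


m378_level = expand_snp_ranges("Y2012, Y2013, Y2016-Y2018")
print(m378_level)
print(len(expand_snp_ranges("Y2116-Y2149")))
```

Names that do not follow the "letters followed by digits" pattern, for example L272.1, would need separate handling.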

These SNPs do not at the moment have any phylogenetic meaning, but one may be assigned to them later, after full sequencing of samples that belong to these levels but lack the SNP mutations defining the downstream levels.

[38] It should be noted that the FTDNA research team led by Dr. Thomas Krahn, with the participation of Q yDNA Project administrator Rebekah A. Canada, came to a similar conclusion earlier. The respective data can be found on the draft SNP tree page of the Family Tree DNA website: http://ytree.ftdna.com/index.php?name=Draft&parent=31182976 No justification of these conclusions has been published but, presumably, samples tested under the National Geographic Geno 2.0 project were used for the analysis.


Summary

The research demonstrated the high efficiency of full Y-chromosome sequencing for defining phylogenetic structure and made it possible to build a consistent phylogenetic structure of the Q-M378 subclade, confirmed by analysis of SNP and STR markers.

As part of the research, the new phylogenetic levels Q-Y2250 (downstream of Q-M378 and including Q-L301), Q-Y2220 (downstream of Q-L245) and Q-Y2200 (downstream of Q-Y2220) were defined. SNPs that may in the future mark certain European and Asian subclusters of Q-Y2220 (including the Armenian subcluster), as well as separate branches of the Jewish cluster Q-Y2200, were also identified.

The research confirmed the connection between the distribution of the Q-M378 subclade and the migration of speakers of Indo-European languages from Central Asia via Afghanistan and Iran to the West. That said, the material currently at the researchers' disposal is not sufficient to form a complete picture of these migration processes. This task can be resolved in the near future, as statistically significant data accumulate.


Acknowledgements

The authors wish to thank the following people, who assisted in conducting the research and preparing the article:

Mikhail Edelstein (Russia)
Askar Abdullin (Kazakhstan)
Igor Bukharov (Russia)
Nazaret Chitilian (Lebanon)
Justin Allen Loe (United States)
Gregory Magoon (United States)











Appendix 1.
Table 6. SNPs at the same level with M378.

Position (hg19)  Ancestral value  Derived value  SNP name (Y) or synonym  SNP name (FGC)
2806676 A G Y2012 FGC1770
3111159 G C Y2013 FGC1758
3815203 G C Y2016 FGC1774
3929337 C A Y2017 FGC1988
4234101 A G Y2018 FGC1775
4332151 G A Y2019 FGC1776
4634427 C A Y2020 FGC1777
4775787 T C Y2021 FGC1779
4778576 A G Y2022 FGC1780
4783438 T C Y2023
4961249 C A Y2024 FGC1781
5011266 A G Y2025
5266522 A G Y2026 FGC1782
5496739 A C Y2027 FGC1783
5687522 T A Y2028 FGC1784
5751055 T G Y2029 FGC1785
5872168 C T Y2226
5963558 G A Y2030
6085717 C A Y2031 FGC1788
6430659 T G Y2032 FGC1789
6617825 T C Y2033 FGC1790
6618215 T C Y2034 FGC1791
6746675 T C Y2035 FGC1792
6774328 T C Y2036 FGC1793
6986250 T C Y2037 FGC1794
7045044 C T Y2038 FGC1795
7071796 C G Y2039 FGC1796
7094691 A G Y2040 FGC1797
7159039 C G Y2041 FGC1798
7160439 G A Y2042 FGC1799
7339849 G T Y2043 FGC1801
7431253 C T Y2044 FGC1803
7437821 C G Y2045 FGC1804
7550568 G C Y2046 FGC1805
7652630 G A Y2047
7778164 G A Y2048 FGC1807
7856334 A G Y2049 FGC1808
7952263 C T Y2050 FGC1809
8067818 C G Y2051 FGC1810
8681004 T C Y2052 FGC1812

8682184 C T Y2053 FGC1813
8821295 A G Y2054 FGC1814
9074666 C T Y2055 FGC1815
9170505 G T Y2056 FGC1817
13127815 A G Y2057 FGC1818
13928638 G C Y2058 FGC1820
14017272 A G Y2059 FGC1825
14193680 G A Y2060 FGC1827
14293849 T A Y2061 FGC1830
14435779 A G Y2062 FGC1833
14540558 C T Y2063 FGC1834
14674385 C T Y2064 FGC1835
14733633 C A Y2065 FGC1836
15498011 C A Y2066
15521110 T C Y2067 FGC1838
15699493 C T Y2068 FGC1841
16217389 A AT Y2069
16654310 C G Y2070 FGC1842
16678163 C T Y2071 FGC1843
17230548 G A Y2072 FGC1844
17447489 C T Y2073 FGC1845
17959860 A G Y2074 FGC1850
18243302 C T Y2075 FGC1852
18714407 C A Y2076 FGC1854
18768735 G T Y2077
18768736 C A Y2078
18769454 A G Y2079 FGC1767
18803642 T G Y2080 FGC1855
18856911 G C Y2081 FGC1856
19373808 A T Y2082 FGC1858
21365952 G A Y2084 FGC1861
21479863 G A Y2085 FGC1862
21647670 G C Y2086 FGC1863
21832029 C A Y2087 FGC1864
22022365 A G Y2088 FGC1865
22101157 C T Y2089 FGC1866
22440644 G A Y2361
22624047 G A Y2090 FGC1768
22931328 T A Y2091 FGC1869
23053626 A G Y2092 FGC1872
23078557 G T Y2093 FGC1873
23166596 T C Y2094 FGC1874
23279919 G T Y2095 FGC1875
23566714 C T Y2097 FGC1877

23615574 AT A Y2098
28516009 A T Y2113
28593688 T C Y2114
28687807 A G Y2115

________________________

*Note: Y2098 deletion, Y2069 insertion.
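
The distinction between substitutions, deletions and insertions noted above follows from the lengths of the ancestral and derived values in each table row. The sketch below illustrates this with three rows copied from Table 6; the function names `parse_row` and `variant_type` are hypothetical and are not part of any tool named in the article.

```python
from typing import List, Tuple


def parse_row(row: str) -> Tuple[int, str, str, List[str]]:
    """Split a table row into position (hg19), ancestral value, derived value
    and the list of SNP names that follow (one or two names)."""
    position, ancestral, derived, *names = row.split()
    return int(position), ancestral, derived, names


def variant_type(ancestral: str, derived: str) -> str:
    """Classify a variant by comparing the lengths of the two alleles."""
    if len(ancestral) == 1 and len(derived) == 1:
        return "substitution"
    if len(derived) < len(ancestral):
        return "deletion"
    if len(derived) > len(ancestral):
        return "insertion"
    return "other"


rows = [
    "2806676 A G Y2012 FGC1770",
    "16217389 A AT Y2069",
    "23615574 AT A Y2098",
]

for row in rows:
    position, ancestral, derived, names = parse_row(row)
    print(names[0], position, variant_type(ancestral, derived))
```

For these rows the sketch reports Y2012 as a substitution, Y2069 as an insertion and Y2098 as a deletion, in agreement with the note to the table.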


Appendix 2.
Table 7. SNPs at the same level with L245.

Position (hg19)  Ancestral value  Derived value  SNP name (Y)  SNP name (FGC)
2794289 C G Y2116 FGC1987
3127708 T C Y2117 FGC1771
3709585 A C Y2118 FGC1773
4502969 T C Y2119 FGC1759
4671322 C A Y2120 FGC1778
7219594 T C Y2121 FGC1800
7408851 C A Y2122 FGC1802
7590793 C T Y2123 FGC1806
8614513 C G Y2124 FGC1811
9144039 A T Y2223 FGC1901
9382621 G T Y2222 FGC1902
9798919 G A Y2125 FGC1816
13956388 G A Y2126 FGC1821
13982835 C T Y2127 FGC1823
14012662 G A Y2128 FGC1824
14045736 T C Y2129 FGC1826
14202870 A G Y2130 FGC1828
14285880 C G Y2131 FGC1829
14296099 C A Y2217 FGC1831
14402304 G A Y2132 FGC1832
15569048 C T Y2133 FGC1839
15614105 C G Y2134 FGC1840
16519324 A G Y2135
16757414 G GA Y2237
17686482 T C Y2136 FGC1846
17686883 A G Y2137 FGC1847
17763793 T A Y2138 FGC1848
17860015 G T Y2139 FGC1849
18134822 T C Y2140 FGC1851
18575106 G A Y2141 FGC1853
19300050 C T Y2142 FGC1857
21118566 T C Y2143 FGC1859
22015887 C A Y2144 FGC1989

22934317 ATC A Y2235
23010582 C T Y2145 FGC1870
23042385 C A Y2146 FGC1871
23648959 T G Y2147 FGC1878
23733052 A G Y2148 FGC1879
28520821 A G Y2149
28646637 C G Y2195 FGC1883
22767464 G A Y2199 FGC1868
21235857 A G Y2204 FGC1860

________________________

*Note: Y2235 deletion, Y2237 insertion.


Appendix 3.
Table 8. Private SNPs for Kz1 sample.

Position (hg19)  Ancestral value  Derived value  SNP name (Y)  SNP name (FGC)
2980949 T C YFS026208
3027441 C A YFS026210 FGC4858
3751684 G A YFS026242 FGC4859
4164029 A G YFS026250 FGC4860
4515848 G A YFS026257 FGC4862
4714529 G T YFS026264 FGC4864
5394870 T C YFS026279 FGC4865
5398133 A T YFS026280 FGC4866
6088200 T C YFS026301 FGC4867
6675390 A G YFS026321 FGC4868
7058898 G A YFS026329 FGC4869
7208802 C T YFS026339 FGC4870
7278041 G A YFS026340 FGC4871
7704050 C T YFS026351 FGC4856
7929100 A C YFS026356 FGC4872
8268654 G A YFS026361 FGC4873
8684090 G A YFS026366 FGC4874
8714870 C T YFS026367 FGC4875
9154952 G A YFS026372 FGC4876
9990725 C G FGC4878
13230336 G A FGC4879
13313894 G C FGC4880
13637299 G A FGC4881
14599760 G A YFS026426 FGC4882
15353330 C T YFS026439 FGC4883
15540398 G A YFS026445 FGC4884
15617600 G A YFS026447 FGC4885
15656595 A C YFS026448

15881099 G A YFS026457 FGC4886
17344441 A G YFS026496 FGC4887
17455705 C G YFS026499 FGC4888
17619239 A C YFS026502 FGC4889
18132430 T A YFS026506 FGC4890
18205189 C A YFS026508 FGC4891
18235952 C A YFS026509 FGC4892
18427622 C T YFS026514 FGC4893
18699065 G A YFS026522 FGC4894
19119009 G A YFS026534 FGC4895
21794826 T C YFS026585 FGC4896
21824228 C T YFS026586 FGC4897
22216997 C A YFS026594 FGC4898
22263424 G T FGC4899
22464918 G A YFS029304
22470401 G T YFS029305 FGC4901
22476862 T A FGC4902
22779292 G A YFS026598 FGC4904
22845858 T A YFS026600 FGC4905
22980932 G A YFS026603 FGC4906
23097922 G T YFS026606 FGC4907
23188736 C T YFS026608 FGC4908
23574588 G T YFS026618 FGC4909
28577678 T G FGC4857
28556325 T G YFS026709


Appendix 4.
Table 9. Private SNPs for Ar1 sample.

Position (hg19)  Ancestral value  Derived value  SNP name (Y)  SNP name (FGC)
2837084 G A YFS030295
4687602 C T YFS030307
3264534 G T YFS030298
3692600 G A YFS030300
6849037 A G YFS030309
7389018 T C YFS030314
7809088 C T YFS030318 FGC2000
8227956 C T YFS030321 FGC2001
8310172 G A YFS030322 FGC2002
8891034 A G YFS030324 FGC2003
9455617 G C YFS030326 FGC2004
9507128 G A YFS030327 FGC2005
13207417 C T FGC2006
13862984 G A YFS030335 FGC2007
14037704 A G YFS030339 FGC2008
14266100 G A YFS030343
14271743 G T YFS030344 FGC2009
14645998 A T YFS030350
15487465 T C YFS030354 FGC2010
15532493 G C YFS030355 FGC2011
15562737 G A YFS030356 FGC2012
15649426 C G YFS030357
15949197 C T YFS030358 FGC2013
16033272 G A YFS030359 FGC2014
16914913 A T YFS030368
17143642 G A YFS030370 FGC2015
17264341 C T YFS030371 FGC2016
17350212 G T YFS030372 FGC2017
17468836 G A YFS030374 FGC2018
17522056 C A YFS030375 FGC2019
17547056 C T YFS030376 FGC1986
17969724 T C YFS030377 FGC2020
18005360 G A YFS030378 FGC2021
18082500 T C YFS030379 FGC2022
18143358 C T YFS030380
18269281 T C YFS030381 FGC2023
19295864 G A YFS030386 FGC2024
19305808 C G YFS030387 FGC2025
21920836 G T YFS030396 FGC2026
22195671 T G YFS030398 FGC2027
22546195 T C YFS030431 FGC2029
23036871 A C YFS030432 FGC2030
23193319 C G YFS030433 FGC2031
23633830 T C YFS030434 FGC2032
23749442 C G YFS030435 FGC2033
23952561 G A YFS030438 FGC2034
28546577 A G YFS030460 FGC2035
28697215 C T YFS030463 FGC2036
28728861 A G YFS030465 FGC2037
28773229 G A YFS030466 FGC2038


Appendix 5.
Table 10. Private SNPs for AJ1 sample.

Position (hg19) Ancestral value Derived value SNP name (Y) SNP name (FGC)
3014878 G C YFS028077
3279492 T C YFS028084
4705139 G A YFS028121
4734829 G T YFS028122
5007712 T C YFS028135
6028097 T C YFS028158 FGC4835
6671453 T A YFS028174
6985833 G C YFS028180 FGC4836
7116693 C G YFS028187 FGC4837
13225084 C A FGC4839
13227006 C T FGC4840
14174284 C T YFS028277 FGC4841
14683323 G A YFS028303
15749472 C G YFS028328 FGC4842
15911171 T A YFS028333 FGC4843
17216758 C G YFS028365 FGC4844
17842405 G A YFS028379 FGC4845
18697269 A G YFS028399 FGC4846
22541678 G A YFS028484
22545510 G T YFS028485 FGC4850
22809218 A T YFS028490 FGC4851
22816094 C T YFS028491 FGC4852
22989959 T C YFS028498 FGC4853
23338485 T C YFS028509 FGC4854


Appendix 6.
Table 11. Private SNPs for AJ2 sample.

Position (hg19) Ancestral value Derived value SNP name (Y) SNP name (FGC)
3085515 C A YFS030088 FGC1885
4157714 C T YFS030093 FGC1889
7357489 C T YFS030117 FGC1898
8757232 C A YFS030130 FGC1900
9761433 C T YFS030140 FGC1924
16933881 C T YFS030164 FGC1913
19228285 T C YFS030189 FGC1920
21322098 A G YFS030210 FGC1924
22128896 C T YFS030218 FGC1926
22612418 A T YFS030247 FGC1930
22720359 C T YFS030248 FGC1931


Appendix 7.
Table 12. Private SNPs for Ir1 sample.

Position (hg19) Ancestral value Derived value SNP name (Y)
2808294 G A YFS030486
2848925 C T YFS030487
3241019 G A YFS030493
3331565 C T YFS030495
3617298 G A YFS030498
3905106 T C YFS030501
3983695 G A YFS030503
4048861 C G YFS030505
4976524 T C YFS030521
4976526 T C YFS030522
5021496 G C YFS030523
5219277 T A YFS030526
5844571 C T YFS030529
6531744 G A YFS030531
7398730 T C YFS030543
7685828 G T YFS030547
7997281 G C YFS030548
8350958 G A YFS030550
8482074 C G YFS030551
8874735 C A YFS030553
9459692 A G YFS030555
9832592 A G YFS030556
14022660 C A YFS030564
14273656 A G YFS030573
14401614 C T YFS030575
14532575 G T YFS030582
14916116 G A YFS030585
14996654 G A YFS030588
15012864 C A YFS030589
15240341 G C YFS030591
15799031 G C YFS030596
15933501 T A YFS030599
16253494 C T YFS030602
16280147 C T YFS030603
16304710 T C YFS030604
16875622 C T YFS030608
17529042 G A YFS030616
18106050 C T YFS030618
18903761 A C YFS030626
19157289 G A YFS030633
19198307 A T YFS030634
19526472 A C YFS030637
21359025 C G YFS030656
21567329 G A YFS030657
22564450 C T YFS030684
22621906 G T YFS030685
22687343 A T YFS030686
22910874 G A YFS030688
23018638 T C YFS030689
23054174 T G YFS030690
23198785 A T YFS030691
23435852 A C YFS030694
24484883 T C YFS030706
28759876 C T YFS030732
17188634 T C YFS030609
19001468 C T YFS030630
20534862 T C YFS030645
21599239 A G YFS030658
21836635 A T YFS030661
