The Kay Family Association UK

Project 50

The Y chromosome

The X and Y chromosomes determine our gender. A man has an X chromosome (inherited from his mother) and a Y chromosome (inherited from his father); a woman has two X chromosomes (one inherited from her mother and one from her father). It is the presence or absence of the Y chromosome that determines whether we are male or female. This is what makes the Y chromosome such a powerful tool in family research, as it is passed down from father to son through the generations, sometimes changing, but then only by a small factor.

Our chromosomes are made up of chains of four chemicals, grouped together, and it is possible to measure these groupings to get the ‘signature’ of the chromosome, represented as a series of ten or more numbers. When two men have signatures that are identical or very close, we can be sure they share a common paternal ancestor.

How it started

Project 50 started as a result of Project Marcus. Marcus Key of Virginia, who has traced his roots back to the West Riding of Yorkshire, suggested to Frank Kaye of Arkansas, who had similar origins, that they organised a DNA test to prove common parental ancestry. The results being identical, Marcus then applied to the Kay Family Association (UK) to see if we could supply members who could also trace their ancestry back to the West Riding, to prove the common ancestry. In the event, five more samples were supplied – three from Yorkshire Kayes, and two from Lancashire Kays.

The results were remarkable. Of the seven samples provided, six of the Y-chromosome signatures were identical, and one differed by just one point. This shows a common ancestry for Kays from Yorkshire, Lancashire, Virginia and Arkansas. Hence Project 50. We wanted to pick up as many Kays as we could from all parts of the world to see if the same results still applied.

This came at an opportune time for us as we had for some time been considering the possibility of mounting a DNA project. Of the seven members from the UK and USA who took part in Project Marcus, name spellings were Key, Kay and Kaye. DNA testing was carried out by the UK firm of Oxford Ancestors using the following ten Y-chromosome markers, in the order shown in the Table of Results:

DYS19–DYS388–DYS390–DYS391–DYS392–DYS393–DYS389i–DYS389ii–DYS425–DYS426

It is possible, by utilising a mathematical procedure advised by Oxford Ancestors, to calculate the era of a Common Parental Ancestor (CPA). The result obtained for the small group of seven members in the Marcus group yielded such a wide span of years that no firm conclusions could be made. A larger sample was needed in order to improve the accuracy and to determine if all Kays belonged to this group.

The logistics of the project were formidable. To find a larger number of male Kays willing to take part in such an exercise was not easy, nor was the matter of funding. Peter Cameron organised the project, and for a while his fridge was taken over with the storage of samples before their dispatch in bulk to Oxford Ancestors. Finally, we had to analyse the findings as they came back and try to make sense of the data.

In such a project, some sensitivity is needed as well, and the outcome was not always felicitous. There is the possibility that the results could throw up an issue of illegitimacy somewhere in a family tree; this did actually happen in our project, but only proved what was already suspected. However, there was also upset among participants who had thought they would belong to the main group, but didn’t.

The Results

The second project was started early in 2003, aiming at a sample of about 50 participants. It was an ambitious target as, at that time, we did not have 50 male Kay members in our Association! All our male members, and one or two interested non-members, were invited to participate and 30 joined in with enthusiasm.

Our American cousins in the Kay Family Association USA (www.RobertKayFamily.org) were invited to participate. The KFA-USA has over 800 members and is largely composed of descendants of Robert Kay, who was born in Virginia c1725 and died in South Carolina in 1801. It was realised that this new project could provide an ideal opportunity to confirm a genetic link with the Kays of Lancashire. The KFA-USA decided to fund 10 participants for this Y-Line study, selecting two widely separated men from descendants of each of the five sons of Robert Kay. Included also were several others who wanted to participate and paid their own fee. Needless to say, the results were awaited by our American cousins with much apprehension.

The KFA-UK contribution was made up of 26 members and 4 non-members, located throughout the UK and as far afield as New Zealand, Australia, Belgium and several in the USA. Two non-Kays were included for comparison, one who had originally thought he was a Kay but whose family history research had suggested that there was a paternity problem and it was very likely that genetically he was not a Kay. The other was our chairman, David Kay Phillips. The Kay in his name is a legacy from his maternal line and therefore his Y-Line signature should be different to the Kays. He was interested to confirm this.

The results from the original seven-man ‘Marcus’ project were merged as well, bringing the overall total for assessment to 52. Surname spellings include KEY (3), KAYE (12) and KAY (36), plus one non-Kay.

We really had no idea what to expect. The findings of the original ‘Marcus’ project had been remarkable. The results of this project were equally remarkable and held some surprises. Watching the results as they came in, it was clear that, instead of all being identical, as in the Marcus project, we now appeared to have a number of groups. Tabulating the results, we had:

Signature                                        Number  Origin
15 . 13 . 22 . 10 . 11 . 14 . 09 . 17 . 14 . 11  15      W Yorkshire (3), USA (1), Virginia (11)
15 . 13 . 22 . 10 . 11 . 15 . 09 . 17 . 14 . 11  11      Lancashire (5), E Yorkshire (1), Yorkshire (4), Virginia (1)
15 . 13 . 22 . 10 . 11 . 15 . 09 . 18 . 14 . 11  2       Lancashire (2)
15 . 13 . 23 . 10 . 11 . 15 . 09 . 17 . 14 . 11  1       Lancashire (1)
14 . 13 . 23 . 10 . 11 . 15 . 09 . 17 . 14 . 11  1       Lancashire (1)
15 . 13 . 23 . 10 . 11 . 14 . 09 . 17 . 14 . 11  1       W Yorkshire (1)
15 . 13 . 22 . 10 . 11 . 15 . 09 . 16 . 13 . 11  1       Yorkshire (1)
15 . 12 . 22 . 10 . 11 . 14 . 09 . 17 . 12 . 11  1       Virginia (1)
15 . 13 . 23 . 10 . 12 . 14 . 10 . 18 . 12 . 11  1       Isle of Man (1)
15 . 15 . 23 . 10 . 11 . 13 . 10 . 16 . 12 . 11  1       Lancashire (1)
15 . 12 . 23 . 09 . 13 . 13 . 10 . 16 . 12 . 12  1       W Yorkshire (1)
14 . 12 . 24 . 11 . 13 . 12 . 10 . 16 . 12 . 12  1       Brabant, Belgium (1)
14 . 12 . 24 . 11 . 13 . 13 . 10 . 16 . 12 . 12  1       Derbyshire (1)
14 . 12 . 24 . 11 . 13 . 13 . 11 . 16 . 12 . 12  1       Lanarkshire (1)
14 . 12 . 24 . 10 . 13 . 13 . 09 . 17 . 14 . 11  1       USA (1)
14 . 12 . 23 . 11 . 13 . 13 . 08 . 16 . 12 . 13  1       Palenge, Belgium (1)
14 . 12 . 24 . 11 . 16 . 14 . 10 . 16 . 12 . 12  1       USA (1)
14 . 12 . 23 . 11 . 11 . 13 . 09 . 16 . 12 . 12  1       Cheshire (1)
14 . 12 . 23 . 11 . 13 . 13 . 09 . 15 . 12 . 12  1       Yorkshire (1)
14 . 12 . 23 . 11 . 13 . 13 . 10 . 16 . 12 . 12  1       Lancashire (1)
14 . 12 . 23 . 10 . 11 . 12 . 10 . 18 . 12 . 12  1       Lancashire (1)
15 . 13 . 22 . 10 . 11 . 15 . 09 . 17 . 12 . 11  1       Lancashire (1)
15 . 13 . 22 . 10 . 11 . 15 . 09 . 19 . 14 . 11  1       Lancashire (1)
14 . 12 . 23 . 10 . 13 . 13 . 10 . 16 . 12 . 12  1       Norfolk (1)
14 . 12 . 24 . 10 . 13 . 13 . 11 . 16 . 12 . 12  1       Lancashire (1)
14 . 12 . 24 . 11 . 13 . 13 . 10 . 17 . 12 . 12  1       N Yorkshire (1)
14 . 12 . 24 . 10 . 13 . 13 . 10 . 16 . 12 . 12  1       Lancashire (1)

The first thing that jumps out at you from these figures is that exactly half of the samples fall into just two groups, the first two on the list. A good job too, otherwise we’d have wasted an awful lot of time and money to no avail. Equally clearly, these figures confirmed the results of Project Marcus, that there is a common ancestry for the Kays of Lancashire and Yorkshire, and that the descendants of Robert Kay of South Carolina come from the same stock.

The task now was to see what conclusions could be drawn from this. At the time of our first analysis, Oxford Ancestors suggested as a rule of thumb that there was a 98% chance of the signature staying unchanged as it’s passed from father to son, and a 2% chance of it changing. This carries on through the generations, with a gradually increasing probability that one of the markers in a signature will change. But it’s a very slow process; even after 33 generations, there’s still a 50% chance that the Y-chromosome signature won’t have changed. If we assume an average of 25 years per generation, that means that there is still a 50% chance that the signature won’t have changed after 825 years.
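The arithmetic behind that rule of thumb can be checked with a few lines of Python. This is only a sketch of our own, assuming the 98% figure applies independently in every generation; it is not Oxford Ancestors' actual procedure, and the function names are made up for illustration.

```python
import math

P_UNCHANGED = 0.98         # rule-of-thumb chance per generation, as quoted above
YEARS_PER_GENERATION = 25  # average generation length assumed in the text


def chance_unchanged(generations, p=P_UNCHANGED):
    """Chance that a signature survives this many father-to-son steps unchanged."""
    return p ** generations


def generations_until(target, p=P_UNCHANGED):
    """Number of generations after which the chance of no change falls to target."""
    return math.log(target) / math.log(p)


for g in (1, 10, 20, 33):
    years = g * YEARS_PER_GENERATION
    print(f"{g:2d} generations ({years:3d} years): {chance_unchanged(g):.2f}")

print(f"Chance falls to 50% after about {generations_until(0.5):.0f} generations")
```

On these assumptions the chance of no change after 33 generations comes out a little above one half, which agrees with the rounded 50% figure quoted above.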

In Oxford Ancestors’ terminology, a signature has changed by one point if just one of the numbers has changed value by just one; it has changed by two points if two of the numbers have changed by just one, or one number has changed by two, and so on. In our sample the two main groups were just one point apart, because just one number has changed. Looking at our sample in this way, and taking the largest group as our starting point, we got the following (changes are shown in red):

Signature                                        Number  Origin
15 . 13 . 22 . 10 . 11 . 14 . 09 . 17 . 14 . 11  15      W Yorkshire (3), USA (1), Virginia (11)

One point different
15 . 13 . 22 . 10 . 11 . 15 . 09 . 17 . 14 . 11  11      Lancashire (5), E Yorkshire (1), Yorkshire (4), Virginia (1)
15 . 13 . 23 . 10 . 11 . 14 . 09 . 17 . 14 . 11  1       W Yorkshire (1)

Two points different
15 . 13 . 22 . 10 . 11 . 15 . 09 . 18 . 14 . 11  2       Lancashire (2)
15 . 13 . 23 . 10 . 11 . 15 . 09 . 17 . 14 . 11  1       Lancashire (1)

Three points different
14 . 13 . 23 . 10 . 11 . 15 . 09 . 17 . 14 . 11  1       Lancashire (1)
15 . 13 . 22 . 10 . 11 . 15 . 09 . 16 . 13 . 11  1       Yorkshire (1)
15 . 12 . 22 . 10 . 11 . 14 . 09 . 17 . 12 . 11  1       Virginia (1)
15 . 13 . 22 . 10 . 11 . 15 . 09 . 17 . 12 . 11  1       Lancashire (1)
15 . 13 . 22 . 10 . 11 . 15 . 09 . 19 . 14 . 11  1       Lancashire (1)

More than three points different
15 . 13 . 23 . 10 . 12 . 14 . 10 . 18 . 12 . 11  1       Isle of Man (1)
15 . 15 . 23 . 10 . 11 . 13 . 10 . 16 . 12 . 11  1       Lancashire (1)
15 . 12 . 23 . 09 . 13 . 13 . 10 . 16 . 12 . 12  1       W Yorkshire (1)
14 . 12 . 24 . 11 . 13 . 12 . 10 . 16 . 12 . 12  1       Brabant, Belgium (1)
14 . 12 . 24 . 11 . 13 . 13 . 10 . 16 . 12 . 12  1       Derbyshire (1)
14 . 12 . 24 . 11 . 13 . 13 . 11 . 16 . 12 . 12  1       Lanarkshire (1)
14 . 12 . 24 . 10 . 13 . 13 . 09 . 17 . 14 . 11  1       USA (1)
14 . 12 . 23 . 11 . 13 . 13 . 08 . 16 . 12 . 13  1       Palenge, Belgium (1)
14 . 12 . 24 . 11 . 16 . 14 . 10 . 16 . 12 . 12  1       USA (1)
14 . 12 . 23 . 11 . 11 . 13 . 09 . 16 . 12 . 12  1       Cheshire (1)
14 . 12 . 23 . 11 . 13 . 13 . 09 . 15 . 12 . 12  1       Yorkshire (1)
14 . 12 . 23 . 11 . 13 . 13 . 10 . 16 . 12 . 12  1       Lancashire (1)
14 . 12 . 23 . 10 . 11 . 12 . 10 . 18 . 12 . 12  1       Lancashire (1)
14 . 12 . 23 . 10 . 13 . 13 . 10 . 16 . 12 . 12  1       Norfolk (1)
14 . 12 . 24 . 10 . 13 . 13 . 11 . 16 . 12 . 12  1       Lancashire (1)
14 . 12 . 24 . 11 . 13 . 13 . 10 . 17 . 12 . 12  1       N Yorkshire (1)
14 . 12 . 24 . 10 . 13 . 13 . 10 . 16 . 12 . 12  1       Lancashire (1)
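For anyone who wants to check the point counts in the table above, the counting rule can be sketched in Python as the sum, marker by marker, of how far each number has moved. This is our own reading of the rule as described in the text, and the function names are hypothetical.

```python
def parse_signature(text):
    """Turn a signature such as '15 . 13 . 22' into a list of integers."""
    return [int(part) for part in text.split(".")]


def point_difference(first, second):
    """Total points between two signatures: each marker adds the size of its change."""
    a, b = parse_signature(first), parse_signature(second)
    if len(a) != len(b):
        raise ValueError("signatures must cover the same markers")
    return sum(abs(x - y) for x, y in zip(a, b))


BASE = "15 . 13 . 22 . 10 . 11 . 14 . 09 . 17 . 14 . 11"  # the largest group

# A few rows taken from the table above
examples = {
    "second main group": "15 . 13 . 22 . 10 . 11 . 15 . 09 . 17 . 14 . 11",
    "Lancashire (2)": "15 . 13 . 22 . 10 . 11 . 15 . 09 . 18 . 14 . 11",
    "Virginia (1)": "15 . 12 . 22 . 10 . 11 . 14 . 09 . 17 . 12 . 11",
    "Isle of Man (1)": "15 . 13 . 23 . 10 . 12 . 14 . 10 . 18 . 12 . 11",
}

for label, signature in examples.items():
    print(label, point_difference(BASE, signature))
```

Note that a single marker moving by two (as in the Virginia row, where the ninth number goes from 14 to 12) counts as two points, not one.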

Given the closeness of their places of origin, and the major differences between their signatures and those below them in the table, it seems clear to us that the first 10 groups (those with three or fewer differences from the base group) can all be classed together as having descended from the same originating stock. The remaining members are massively different, the nearest being six points different. Oxford Ancestors’ words are that the chance of a CPA when there are more than three points different is “vanishingly small”. It can be said that the inclusion of a significant number of contributions from known descendants of Robert Kay of South Carolina has skewed the results, and it is interesting to see what happens if we look at those top 10 groups again, but use the second group (those with a predominantly English origin) as the base group:

Signature                                        Number  Origin
15 . 13 . 22 . 10 . 11 . 15 . 09 . 17 . 14 . 11  11      Lancashire (5), E Yorkshire (1), Yorkshire (4), Virginia (1)

One point different
15 . 13 . 22 . 10 . 11 . 14 . 09 . 17 . 14 . 11  15      W Yorkshire (3), USA (1), Virginia (11)
15 . 13 . 22 . 10 . 11 . 15 . 09 . 18 . 14 . 11  2       Lancashire (2)
15 . 13 . 23 . 10 . 11 . 15 . 09 . 17 . 14 . 11  1       Lancashire (1)

Two points different
15 . 13 . 23 . 10 . 11 . 14 . 09 . 17 . 14 . 11  1       W Yorkshire (1)
14 . 13 . 23 . 10 . 11 . 15 . 09 . 17 . 14 . 11  1       Lancashire (1)
15 . 13 . 22 . 10 . 11 . 15 . 09 . 16 . 13 . 11  1       Yorkshire (1)
15 . 13 . 22 . 10 . 11 . 15 . 09 . 17 . 12 . 11  1       Lancashire (1)
15 . 13 . 22 . 10 . 11 . 15 . 09 . 19 . 14 . 11  1       Lancashire (1)

Four points different
15 . 12 . 22 . 10 . 11 . 14 . 09 . 17 . 12 . 11  1       Virginia (1)

Although one of the groups has now become four points different, the rest have moved closer and we have a better match. It looks like a more viable model. Whichever way we look at it, we have thirty-five of our original sample of fifty-two who share the same CPA.
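One way to put a number on "a better match" is to tally, for each choice of base, how many points the other groups sit away from it. The sketch below does this for the ten related groups in the two tables above. It is only an illustration of ours: counting every group once, whatever its size, is an assumption, and the names are hypothetical.

```python
from collections import Counter

# (signature, number of men) for the ten closely related groups in the tables above
GROUPS = [
    ((15, 13, 22, 10, 11, 14, 9, 17, 14, 11), 15),
    ((15, 13, 22, 10, 11, 15, 9, 17, 14, 11), 11),
    ((15, 13, 22, 10, 11, 15, 9, 18, 14, 11), 2),
    ((15, 13, 23, 10, 11, 15, 9, 17, 14, 11), 1),
    ((14, 13, 23, 10, 11, 15, 9, 17, 14, 11), 1),
    ((15, 13, 23, 10, 11, 14, 9, 17, 14, 11), 1),
    ((15, 13, 22, 10, 11, 15, 9, 16, 13, 11), 1),
    ((15, 12, 22, 10, 11, 14, 9, 17, 12, 11), 1),
    ((15, 13, 22, 10, 11, 15, 9, 17, 12, 11), 1),
    ((15, 13, 22, 10, 11, 15, 9, 19, 14, 11), 1),
]


def points(a, b):
    """Point difference between two signatures held as tuples of numbers."""
    return sum(abs(x - y) for x, y in zip(a, b))


def tally(base):
    """How many groups sit at each point distance from the chosen base."""
    return Counter(points(base, signature) for signature, _ in GROUPS)


def total_points(base):
    """Sum of point distances from the base to every group, each group counted once."""
    return sum(points(base, signature) for signature, _ in GROUPS)


largest_group = GROUPS[0][0]   # base used in the first grouped table
english_group = GROUPS[1][0]   # base used in the second grouped table

for name, base in (("largest group", largest_group), ("English group", english_group)):
    print(name, dict(sorted(tally(base).items())), "total:", total_points(base))
```

Counted this way, the second base gives the smaller total, even though one group moves out to four points. Weighting each group by its number of men would be an equally reasonable choice and gives a much closer result, so the comparison should be read with some caution.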

What we have quite conclusively shown from this group of thirty-five is:

Origins

At the time Project 50 was taking place a group of scientists were getting together to look at clarifying and unifying the approach to the classification of the results of DNA testing. As a result of their work they identified that the Y-DNA of men tends to cluster into a small number of groups (about 20 in total), which they identified using the letters of the alphabet A to T. The scientific term for these groupings is haplogroups. As more is learnt and DNA testing becomes more refined, so these groupings are being further subdivided into smaller groupings identified by numbers and lower-case letters. It has been possible to take the results of Project 50 and run them through a conversion program to identify the groups into which our participants fall. The results are as follows:

Haplogroup Number
G 35
R1b 13
I 1
Not possible to identify 3

Haplogroup G: The 35 men identified as the main group in Project 50 all register as group G. This is really interesting as Group G has an overall low frequency in most populations but is widely distributed throughout Europe, northern and western Asia, northern Africa, the Middle East, India, Sri Lanka and Malaysia. The group had its origins in the Middle East 10,000 to 20,000 years ago. All 35 participants fall into the G2a sub-group to which most Europeans in haplogroup G belong. This sub-group originated in the Caucasus region and probably arrived in Europe during the Neolithic or the Bronze Age.

Haplogroup R1b: This is the most frequently occurring haplogroup in Western Europe and it is the dominant group in Britain with about 60% of the male population being R1b. It is usually associated with the Celts who arrived in Britain following the end of the last ice age. 13 participants from Project 50 fall within this group. Looking at the results of DNA testing of other Kays that are publicly available on the internet, it is clear that virtually all Kays with Scottish roots are testing as R1b. It may well be that the participants from our DNA study who registered as R1b also have Scottish origins.

Haplogroup I: Nearly one-fifth of the population of Europe fall within this haplogroup. Today it is found most frequently within Viking/Scandinavian populations in northwest Europe. The one representative we have in this group traced his origins back to the Isle of Man which is known to have strong Viking links. Looking at results from the Key Family DNA Study it appears that many men with the surname Key/Kee also fall within this group.

About five years after Project 50, three of the original participants who could trace their ancestors back to Lancashire or Yorkshire in the 1600s were invited to undertake a further DNA test at the request of one of our American members. This was a more comprehensive test than that used by Project 50. It confirmed the men as belonging to haplogroup G2a. It also identified one of the participants who could trace his line back to Kirkheaton, Yorkshire as having an identical DNA match with a member of the USA Kay Association who could trace his line back to James Kay born about 1600 in Bury, thus confirming the close link between Lancashire Kays and Yorkshire Kayes.

Quite how the G2a group came to be in Lancashire and Yorkshire so far from its “home” in the Caucasus has been the subject of much speculation. There is now a possible explanation and, like many things, it may be thanks to the Romans. Dr Christian Capelli, an expert in genetics at Oxford University, has identified an unusually high occurrence of Haplogroup G in northern England and the Borders region of Scotland. It is quite possible that G2a arrived here with Roman cavalry who were stationed along Hadrian’s Wall. These cavalry were part of a nomadic tribe of Sarmatians who had been defeated by Marcus Aurelius in 170 AD. The tribe had moved out of the Caucasus, along the shore of the Black Sea and into the Ukraine. For over 300 years they raided across the Danube and were a thorn in the side of the Roman Empire until eventually being defeated by Marcus Aurelius. He was about to have them massacred but offered them the choice of enlisting on the side of Rome, which they did, and they were sent to Britain. After serving on Hadrian’s Wall for several years, the Sarmatians were moved to Ribchester in the early third century AD; some were also stationed at Chester. The force originally consisted of about 5,500 cavalrymen, presumably accompanied by their families. It is possible that they continued to recruit from the Danube area. It is believed that many never went home and that their descendants continued to live in the area and to farm land in the Ribble Valley. Is it possible that Kay origins lie with these Sarmatians? After all, one suggestion for the origin of the surname Kay is that it comes from the Latin name Caius.

The common parental ancestor (CPA)

When a group of men with identical or nearly identical signatures is identified, it is possible to apply statistical analysis to find the possible date of that ancestor. The larger the group and, paradoxically, the more variations there are in that group, the better the result. However, that result can only be expressed in generations (usually taken to be 25 years) and as a range of possibilities. In the case of our main group, the likely date was set at somewhere between 1260 and 1460. Interestingly, this result is consistent with the suggested date of the move from Lancashire to Yorkshire proposed by George Redmonds from his documentary research (see article).

The other results

Unfortunately, the remaining group in our sample is too small and has too many differences to allow us to clearly identify other possible family groups. The fact that none of them is in haplogroup G makes it clear they don’t share the same ancestry as the main group, but can we find any others? If we look at these men and compare their signatures with all of the others in the group, we get:

Origin A B C D E F G H I J K L M N O P Q
A Isle of Man 0 6 7 10 9 10 9 11 11 9 10 8 6 7 9 8 8
B Lancashire 6 0 7 10 9 10 11 11 13 7 10 8 8 7 9 10 8
C W Yorkshire 7 7 0 5 4 5 8 6 8 6 5 3 7 2 4 5 3
D Brabant 10 10 5 0 1 2 7 5 5 5 4 2 6 3 3 2 2
E Derbyshire 9 9 4 1 0 1 6 4 4 4 3 1 7 2 2 1 1
F Lanarkshire 10 10 5 2 1 0 7 5 5 5 4 2 8 3 1 2 2
G USA 9 11 8 7 6 7 0 8 10 8 7 7 9 6 6 5 5
H Palenge 11 11 6 5 4 5 8 0 8 4 3 3 9 4 6 5 5
I USA 11 13 8 5 4 5 10 8 0 8 7 5 11 6 6 5 5
J Cheshire 9 7 6 5 4 5 8 4 8 0 3 3 5 4 6 5 5
K Yorkshire 10 10 5 4 3 4 7 3 7 3 0 2 8 3 5 4 4
L Lancashire 8 8 3 2 1 2 7 3 5 3 2 0 6 1 3 2 2
M Lancashire 6 8 7 6 7 8 9 9 11 5 8 6 0 5 7 6 6
N Norfolk 7 7 2 3 2 3 6 4 6 4 3 1 5 0 2 3 1
O Lancashire 9 9 4 3 2 1 6 6 6 6 5 3 7 2 0 3 1
P N Yorkshire 8 10 5 2 1 2 5 5 5 5 4 2 6 3 3 0 2
Q Lancashire 8 8 3 2 1 2 5 5 5 5 4 2 6 1 1 2 0

There do seem to be some groups here, shown by the different colours. More work is needed!
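The comparison table can be rebuilt, and candidate groups picked out, with a short Python sketch. The signatures are those of the 17 men listed under "More than three points different", lettered A to Q in the same order as the table. The grouping rule used here (chain together any two men who are within three points of each other) is purely our assumption, borrowed from the three-point rule of thumb mentioned earlier; it is not an established method and the function names are hypothetical.

```python
SIGNATURES = {
    "A": (15, 13, 23, 10, 12, 14, 10, 18, 12, 11),  # Isle of Man
    "B": (15, 15, 23, 10, 11, 13, 10, 16, 12, 11),  # Lancashire
    "C": (15, 12, 23, 9, 13, 13, 10, 16, 12, 12),   # W Yorkshire
    "D": (14, 12, 24, 11, 13, 12, 10, 16, 12, 12),  # Brabant
    "E": (14, 12, 24, 11, 13, 13, 10, 16, 12, 12),  # Derbyshire
    "F": (14, 12, 24, 11, 13, 13, 11, 16, 12, 12),  # Lanarkshire
    "G": (14, 12, 24, 10, 13, 13, 9, 17, 14, 11),   # USA
    "H": (14, 12, 23, 11, 13, 13, 8, 16, 12, 13),   # Palenge
    "I": (14, 12, 24, 11, 16, 14, 10, 16, 12, 12),  # USA
    "J": (14, 12, 23, 11, 11, 13, 9, 16, 12, 12),   # Cheshire
    "K": (14, 12, 23, 11, 13, 13, 9, 15, 12, 12),   # Yorkshire
    "L": (14, 12, 23, 11, 13, 13, 10, 16, 12, 12),  # Lancashire
    "M": (14, 12, 23, 10, 11, 12, 10, 18, 12, 12),  # Lancashire
    "N": (14, 12, 23, 10, 13, 13, 10, 16, 12, 12),  # Norfolk
    "O": (14, 12, 24, 10, 13, 13, 11, 16, 12, 12),  # Lancashire
    "P": (14, 12, 24, 11, 13, 13, 10, 17, 12, 12),  # N Yorkshire
    "Q": (14, 12, 24, 10, 13, 13, 10, 16, 12, 12),  # Lancashire
}


def points(a, b):
    """Point difference between two signatures held as tuples of numbers."""
    return sum(abs(x - y) for x, y in zip(a, b))


def distance_table(signatures):
    """Point difference between every pair of men, as a nested dictionary."""
    return {i: {j: points(signatures[i], signatures[j]) for j in signatures}
            for i in signatures}


def chained_groups(signatures, threshold=3):
    """Chain together men who are within `threshold` points of someone in the group."""
    remaining = set(signatures)
    groups = []
    while remaining:
        start = min(remaining)
        remaining.remove(start)
        group, to_visit = {start}, [start]
        while to_visit:
            current = to_visit.pop()
            near = {k for k in remaining
                    if points(signatures[current], signatures[k]) <= threshold}
            remaining -= near
            group |= near
            to_visit.extend(near)
        groups.append(sorted(group))
    return groups


table = distance_table(SIGNATURES)
for group in chained_groups(SIGNATURES):
    print(group)
```

Because men are chained together, two members of the same group can be more than three points apart from each other, so a tighter threshold may be more appropriate. Any groups found this way are only a starting point for documentary research.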

And finally

We had originally expected that the project might take two or three months but it actually took seven months. This was mainly due to the size of the group and the involvement of members in five countries. Like any group of walkers on a long hike, it can only proceed at the pace of the slowest. Communication by email was fast but many participants were not on email and, by comparison, communication by air mail took an inordinate time. Since 2003, when the project finished, we’ve been regularly revisiting the figures to see if we can refine estimates for that elusive CPA. The whole science of Y chromosome analysis is continually advancing, which means that those estimates, and the suggestions for racial origins, are continually changing. This project hasn’t finished yet! And, who knows, we may do it again one day.