-
In humans, the MN blood group locus consists of 2 alleles: the M
allele and the N allele. These alleles encode antigens expressed on the
surface of red blood cells; the two alleles are codominant so that a
heterozygote (MN) has both the M-antigen and the N-antigen. In a study
in Poland, 3100 people were genotyped with the following results:
1126 people were MM
1446 people were MN
528 people were NN
Calculate each of the following:
-
the observed genotype frequencies
-
the allele frequencies
-
the expected number of individuals of each genotype (assuming random
mating)
-
carry out a Chi-square test for goodness of fit to random-mating
proportions (with 1 degree of freedom). Is this population in
Hardy-Weinberg Proportions for this locus? Explain.
To get the observed genotype
frequencies, we just divide the number of individuals of that genotype
by the total number of individuals. Here’s the calcultion for
MM:
obs freq MM = 1126 / 3100 =
0.36
To get the expected
frequency of a genotype, we have to figure out the allele frequencies
and use those to calculate the HW expected fraction. To get allele
frequencies, we can use our observed genotype frequencies as
follows:
For the M allele, 100% of
the alleles in MM individuals have the M allele and half of the alleles
in MN individuals are M alleles so we get:
p = freq M = 0.36*1 +
0.47*0.5 = 0.6
Then we can calculate the
frequency of the N allele by doing 1 - p OR we can do it the same way we
did with M:
q = freq N = 0.17*1 +
0.47*0.5 = 0.4
Now, we use our allele
frequencies to find out how many MM, MN, and NN individuals we’d expect,
given how many of each type of allele we have. We expect to get MM as
often as you expect to draw two M alleles by chance, or p x p =
p2 = 0.6 x 0.6 = 0.36.
In this example, our
expected frequency and our observed frequency turn out to be quite
similar! For MN you use 2pq and for MM you use q2.
Once we have expected
frequencies, we multiply them by the total number of individuals sampled
to see how many individuals we’d expect to have found with each
genotype. For MM, 0.36 x 3100 = 1116 individuals expected.
Our last calculation to do
here is to say how different are our observed and expected numbers of
individuals. This is where we do the calculation \(\frac{(O-E)^2}{E}\). We subtract the
expected number from the observed, square the value, and divide by the
expected number. Here’s the calculation for MM:
(1126 -
1116)2/1116 = 102/1116 = 0.089
We perform the calculation
for each genotype and then sum them up, as in the table below, to get
our \(\chi^2\) value of
3.339.
Genotype
|
ObsNo
|
ObsFreq
|
ExpFreq
|
ExpNo
|
(O-E)^2/E
|
MM
|
1126
|
0.36
|
0.36
|
1116
|
0.089
|
MN
|
1446
|
0.47
|
0.48
|
1488
|
1.185
|
NN
|
528
|
0.17
|
0.16
|
496
|
2.065
|
Total
|
3100
|
1.00
|
1.00
|
3100
|
3.339
|
Finally, we interpret our
value. This population is in Hardy-Weinberg Proportions for this locus
since the estimated \(\chi^2\) value is
less than the critical value of 3.84 for 1 df. Thus, the difference
between observed and expected values is due only to chance with P >
0.05, meaning that there is more than a 5% chance that the difference
between observed and expected is due to chance. Note that it’s important
to provide not just the estimated test statistic but also the P-value as
well as your interpretation of what this result means.
-
Consider the data presented by Jorde and Wooding (2004; only 6 pages
long). Do human populations differ genetically? How do we know?
Jorde and Wooding provide a
variety of ways to understand or measure genetic variation in the human
population. For example, they provide the estimate that a randomly
chosen pair of humans will differ from each other (on average) for about
1 in every 1000-1500 base pairs of DNA (this amounts to 2-3 million base
pairs in the whole genome). This is low compared to many species (see
the content video on genetic variation).
The numbers just provided
are about genetic variation in the whole human population but this
question is asking whether DIFFERENT human populations are also
genetically different from each other (measurably). So, if we consider
the base pairs (or nucleotides) in the genome where humans vary one from
another, what do we see? We find that there is some geographic signal in
the data – people from different locations in the world do tend to
differ for some places in the genome. In fact, about 10-15% of the
variation in the whole species is correlated with geographic location
(continental groups of Africa, Asia, Europe). Meanwhile, 85-90% of the
variation within humanity is common to most human populations, it is
shared by populations in different locations. As a result, we see that
there is very little genetic difference between different peoples and
much much more of our variation is found in all populations.
These data result from
sequencing different parts of the human genome and comparing them for
human genome samples from all over the world. Jorde and Wooding describe
3 kinds of genetic loci (not genes but locations in the genome, may be
non-coding DNA): STR polymorphisms (Short Tandem Repeats), restriction
site polymorphisms, and Alu insertion polymorphisms. All of these loci
represent sequences that do not code for proteins and thus will evolve
neutrally, not likely to have a functional effect on the organism’s
phenotype.
-
Refer to both Jorde and Wooding (2004) and/or Sankar and Cho (2002; only
2 pages long) to answer the following questions:
-
How might we define “race”? What does this term represent? Is it useful?
Harmful? Does it have any biological meaning? How might we decide
whether there’s biological relevance or what role biology might play
here? (I’m not looking for some right answer. I would like you to
reflect on this in light of the readings and your experiences—and also
as potential biologists!)
Responses will vary. Jorde
and Wooding seem to largely argue that “race” is pretty much entirely a
social construct that does more harm than good. They show evidence that
the way that “race” is defined has changed enormously over time and
given the contexts in which it is used is usually causing harm, defining
differences that are then used predjudically. Given the actual data on
human genetic variation, J&W seem to argue that race is not
biologically based or relevant.
-
What might be benefits or costs of using a term like “race” (or racial
descriptors) when discussing human genetic variation? If not by using a
term like race, do you think there are better ways we could describe
groups of humans that differ genetically from each other?
Responses will vary. As noted
above, J&W give evidence (and you may already be aware) of the ways
in which race has been used to cause harm historically as well as today.
Since race has been used to divide and justify power differentials, it
has great potential for harm. Those who argue for a benefit to using
race as a descriptor may argue that there are health benefits, say to
being aware that a given race has a higher or lower risk of some health
conditions compared to other races. However, higher or lower risk
doesn’t mean a person must get a condition or absolutely can’t have it.
Nor does it mean that a medication absolutely will or will not work on a
person of a given race. As noted above, there is TONS more variation
between people overall compared to the number of differences between
folks from different parts of the world. This reflects the fact that the
characeristics that define race (often primarily skin color) are related
to very few genes in the genome. Thus, it’s not logical to assume that
you know the rest of someone’s genome (and therefore health risks) based
just on their skin color. Given everything we’ve talked about all
semester, I hope you can appreciate that the environment has an equally
large role in affecting phenotype, especially for things like health. In
terms of other ways to describe groups of humans, J&W suggest that
the best way to do something like determine health risks is to examine
the individual’s genotype information as well as their lifestyle factors
that are known to impact a given health condition.
Remember to sign the Honor Code on your assignment.