19 Genetic structure of human populations

Tabitha Panmei

epgp books

 

Table of Contents

 

1.      Introduction

 

a)      Genotype frequencies

 

b)      Phenotype frequencies

 

2.      Approaches to population genetics

 

3.      Background Information

 

4.      Consequences of population subdivision and other types of nonrandom mating

 

a)      Hierarchical population structure

 

b)      Reduction in Heterozygosity

 

5.      Technical comment on genetic structure of human population

 

6.      Effect of population substructure

 

7.      Genetic structure change

 

a)      Mutation

 

b)      Migration

 

c)      Natural selection

 

d)      Genetic drift

 

e)      Non-random mating

 

8.      Importance of population structure and dynamics

 

9.      Quantifying population structure

 

a)      FST and genetic distance

 

b) Model-Based clustering Algorithms

 

 

Learning outcomes

 

After studying this module:

  • You should know how to quantify the genetic structure of a population.
  • You should know the different processes through which genetic structure changes.
  • You should understand the main approaches used in population genetics.

 

1. Introduction

 

Genetics is a term derived from the Greek root gen, which means to become or to grow into something. The term was introduced by William Bateson in 1906. Genetics is the branch of biological science that deals with the transmission of hereditary factors from one generation to the next, and with the ways in which these factors express themselves during the development and life of an individual. Genetics can be studied at various levels. Molecular genetics deals with the biochemical nature of heredity, specifically DNA and RNA, including the structure and function of genes and other DNA sequences.

Genetic structure is a pattern in the genetic makeup of individuals within a population. In the absence of genetic structure, one can infer little or nothing about the genetic makeup of an individual by studying other members of the population. Population genetics is concerned with genetic differences and similarities within and between populations, and with changes in genetic variation over time. Simple models often treat a species as a single, randomly mating (panmictic) population. This assumption is frequently wrong; population structure matters. The distribution of individuals and the gene flow connections between different areas can be very important in evolution.

By population structure, population geneticists mean that, instead of forming a single, simple population, populations are subdivided in some way. The overall “population of populations” is often called a meta-population, while the individual component populations are called subpopulations, local populations, or demes. In many real populations there may not be any obvious individual populations or substructure at all, and the populations are continuous. However, even in effectively continuous populations, different areas can have different gene frequencies, because the whole meta-population is not panmictic. For instance, among humans, Scotland, the North of England, and London have some quite major language differences, suggesting substructure, but you would be hard put to find an exact boundary where there is a changeover. Such populations are structured, but continuously, in space.

A useful working definition is that a population is structured when it shows deviations from Hardy-Weinberg proportions, or deviations from panmixia. If there is inbreeding, or selection, or if migration is important, then populations can be said to be structured in some way. There are two ways of describing genetic structure:

 

a) Genotype frequencies: Genetic variation in populations can be analyzed and quantified by the frequencies of genotypes and alleles. The genotype frequency in a population is the number of individuals with a given genotype divided by the total number of individuals in the population; it is therefore a proportion f with 0 ≤ f ≤ 1. Although allele and genotype frequencies are related, it is important to distinguish them clearly. Genotype frequencies may also be used in the future (for “genomic profiling”) to predict a person's risk of having a disease or even a birth defect. They can also be used to characterize ethnic diversity.
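
The distinction between genotype and allele frequencies can be made concrete with a small calculation. The following is a minimal sketch, assuming a single locus with two alleles written as "A" and "a" and an invented sample of 100 individuals; the function names are illustrative and not taken from any particular library.

```python
from collections import Counter


def genotype_frequencies(genotypes):
    """Return {genotype: proportion} for a list of diploid genotypes.

    Each genotype is a two-character string such as "AA", "Aa" or "aa".
    The two alleles are sorted so that "aA" and "Aa" count as one genotype.
    """
    counts = Counter("".join(sorted(g)) for g in genotypes)
    total = sum(counts.values())
    return {g: n / total for g, n in counts.items()}


def allele_frequencies(genotypes):
    """Return {allele: proportion}; every individual contributes two alleles."""
    counts = Counter(allele for g in genotypes for allele in g)
    total = sum(counts.values())
    return {a: n / total for a, n in counts.items()}


# Hypothetical sample: 36 AA, 48 Aa and 16 aa individuals.
sample = ["AA"] * 36 + ["Aa"] * 48 + ["aa"] * 16
geno = genotype_frequencies(sample)
alle = allele_frequencies(sample)
print(geno)
print(alle)
```

Genotype frequencies are counted over individuals, whereas allele frequencies are counted over gene copies (two per diploid individual), which is why the two sets of numbers differ even though they describe the same sample.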

 

b) Phenotype frequencies: The phenotype frequency is the proportion of individuals in a population that show a given phenotype. Phenotypes depend on the underlying genes that make up the genotype, but the expression of those genes in observable traits is also influenced, to varying degrees, by environmental factors.

 

2. Approaches to population genetics

 

Approaches used to investigate phenomena in population genetics, and in many other biological disciplines, can generally be separated into three basic types: empirical, experimental, and theoretical. The traditional empirical approach in population genetics comprises extensive observation of the genetic variation of a particular gene or genes in a population or populations, perhaps over time, and the measurement of related factors, such as environmental patterns, that may influence this genetic variation. These data may reveal associations between the patterns or levels of genetic variation and other factors, thereby suggesting potential problems for further study. The genetic variants used in these empirical investigations initially included morphological variants, blood group polymorphisms, and chromosomal inversions and then, starting in the 1960s, allozyme variation, that is, genetic variation in enzymes and proteins. Some of the classic examples of genetic polymorphism, such as color polymorphisms, still have aspects that need to be clarified for a comprehensive understanding of the factors influencing their frequencies. In recent years, similar empirical examinations have focused on DNA sequence variation, mainly between different species but also between individuals within the same species.

 

Generally, only experimental tests can provide support for hypotheses developed from empirical data about the effect of particular factors on levels and patterns of genetic variation. Traditional experiments are exemplified by moving or transplanting a population to a new environment and comparing it with a non-transplanted population to examine the effect of the environment on genetic variants. In recent years, however, the definition of an experiment in evolutionary genetics has become broader and includes, for example, comparisons of DNA sequences between organisms or genes that have different histories, functions, or other characteristics.

 

Fig. 1.1 shows the interconnections among these approaches to population genetics.

 

3. Background Information

 

Mid 1800’s Discoveries:

 

• Three major events in the mid-1800’s led directly to the development of modern genetics.

 

• 1859: Charles Darwin publishes The Origin of Species, which describes the theory of evolution by natural selection. This theory requires heredity to work.

 

• 1866: Gregor Mendel publishes Experiments in Plant Hybridization, which lays out the basic theory of genetics. It is widely ignored until 1900.

 

• 1871: Friedrich Miescher isolates “nucleic acid” from pus cells.

 

Events in the 20th Century:

 

•   1900: rediscovery of Mendel’s work by Carl Correns, Hugo de Vries, and Erich von Tschermak.

 

•   1902: Archibald Garrod discovers that alkaptonuria, a human disease, has a genetic basis.

 

•   1905: William Bateson and colleagues discover linkage between genes. Bateson also coins the word “genetics”, which he introduces publicly in 1906.

 

•   1910: Thomas Hunt Morgan proves that genes are located on the chromosomes (using Drosophila).

 

• 1918: R. A. Fisher begins the study of quantitative genetics by partitioning phenotypic variance into a genetic and an environmental component.

 

•   1926: Hermann J. Muller shows that X-rays induce mutations.

 

• 1944: Oswald Avery, Colin MacLeod and Maclyn McCarty show that DNA can transform bacteria, demonstrating that DNA is the hereditary material.

 

• 1953: James Watson and Francis Crick determine the structure of the DNA molecule, which leads directly to knowledge of how it replicates.

 

•   1966: Marshall Nirenberg solves the genetic code, showing that 3 DNA bases code for one amino acid.

 

• 1973: Stanley Cohen and Herbert Boyer combine DNA from two different species in vitro, then transform it into bacterial cells: the first DNA cloning.

 

•   2001: Sequence of the entire human genome is announced.

 

4.  Consequences of population subdivision and other types of nonrandom mating

 

a)   Hierarchical population structure: A population is said to have a hierarchical population structure if the subpopulations can be grouped into progressively inclusive levels in which, at each grouping, the lower levels are included within the next higher ones. To consider a concrete example, imagine we are interested in the population structure of a widespread species of freshwater fish. The lowest population level consists of a local interbreeding population within a stream. A stream may contain more than one such local population. The next-higher level in the hierarchy may be the organization of streams into groups feeding the same river. A still higher level may be watersheds within continents. The aggregation of subpopulations into progressively more inclusive groups may continue for as many levels as is convenient and informative. It is inevitably somewhat arbitrary how the groups at each level are combined to form the next higher level in the hierarchy. The objective of the classification is to be informative: one tries to group the subpopulations in such a way as to highlight the genetic similarities and differences among them. If there were so much migration of fish among subpopulations that all members of the species constituted essentially a single, random-mating population, then there would be no need to define a hierarchical population structure, because it would be uninformative. Most organisms do have significant population substructure.

 

b) Reduction in heterozygosity: One of the important consequences of population substructure is a reduction in the average proportion of heterozygous genotypes relative to that expected under random mating. The reason for the reduction in heterozygosity may be understood by considering the hypothetical example in Fig. 2.1, in which the outline is the floor plan of a large barn. The organisms of interest are mice, concentrated primarily into two subpopulations of equal size at the west and east ends of the barn. The movement of mice between the subpopulations is prevented by a large population of hungry and vigilant cats in the centre area. The occasional mouse that comes out of its refuge is quickly eaten. Because of chance effects in the founding of the subpopulations, the west and east subpopulations are completely homozygous for alternative alleles of a gene. All the mice in the west subpopulation are AA, and all those in the east subpopulation are aa. In technical terms, the west subpopulation is fixed for the A allele and the east subpopulation is fixed for the a allele. The genotype frequencies of AA, Aa, and aa are 1, 0, 0, respectively, in the west subpopulation and 0, 0, 1, respectively, in the east subpopulation. Within each subpopulation there is random mating, and the genotype frequencies, though extreme, still satisfy the Hardy-Weinberg principle. Therefore, within either subpopulation the frequency of heterozygotes equals that expected under Hardy-Weinberg equilibrium. In the barn as a whole, however, each allele has a frequency of 0.5, so random mating across the entire population would predict a heterozygote frequency of 2 × 0.5 × 0.5 = 0.5, whereas no heterozygotes are actually observed.
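
The barn example can be checked with a few lines of arithmetic. The sketch below assumes two subpopulations of equal size and a single locus with two alleles; the variable and function names are illustrative only.

```python
def expected_heterozygosity(p):
    """Hardy-Weinberg heterozygote frequency 2pq for allele frequency p."""
    return 2.0 * p * (1.0 - p)


# Frequency of allele A and observed heterozygote frequency in each
# subpopulation of the barn (equal subpopulation sizes are assumed).
subpopulations = [
    {"name": "west", "p_A": 1.0, "observed_het": 0.0},
    {"name": "east", "p_A": 0.0, "observed_het": 0.0},
]
k = len(subpopulations)

# Average heterozygosity expected within subpopulations.
h_within = sum(expected_heterozygosity(s["p_A"]) for s in subpopulations) / k

# Heterozygosity expected if the whole barn were one random-mating population.
p_pooled = sum(s["p_A"] for s in subpopulations) / k
h_pooled = expected_heterozygosity(p_pooled)

# Heterozygosity actually observed in the barn as a whole.
h_observed = sum(s["observed_het"] for s in subpopulations) / k

print(h_within, h_pooled, h_observed)
```

The within-subpopulation expectation matches the observation, while the pooled expectation does not; the gap between the pooled expectation and the observed value is the heterozygote deficit caused by subdivision.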

 

5. Technical comment on genetic structure of human population

  • Estimates of genetic variance components depend on the type of marker used, the definitions of geographic regions, the populations sampled within these regions, the relative sample sizes from the populations, and the way in which information is combined across loci.
  • For microsatellite markers, estimates also depend on whether the quantity whose variance is partitioned is an allele-size variable or an indicator variable for allelic presence or absence. A main purpose of our variance component estimation was to provide insight into the fine-scale population structure analysis. Because the structure algorithm uses only the identity and nonidentity of alleles, descriptive statistics that employ allelic indicator variables are more appropriate for understanding the dependence of structure-based inference on the “level of difference” among groups than are statistics that use allele size.
  • Excoffier and Hamilton performed a complementary variance component analysis, demonstrating that when a subset of our data is studied using allele sizes, estimates similar to those of earlier allele-size studies are obtained. Their smaller within-population variance component is consistent with the smaller estimates obtained from allele sizes in relation to microsatellite studies that used indicator variables.
  • Because previous indicator-based studies of microsatellites and other markers have not all been in full agreement, a difference in the nature of the variable cannot be the sole source of differing estimates. First, the homogenizing effect of the higher mutation rates of microsatellites, in contrast with those of other markers, probably explains some of the difference of our results from non-microsatellite indicator-based studies. Second, consistent with past observations, the high fraction of tetranucleotide loci in our data contributes to higher within-population variance component estimates than are seen in dinucleotide studies. Third, the estimates vary considerably across sampling schemes within regions, and in several cases, past microsatellite samples that included multiple groups per region used populations that are among the most differentiated of the groups.
  • Any estimate computed with the well-separated populations that contribute to the 83.4% within-population variance component obtained by Excoffier and Hamilton should be regarded as a lower bound.

 

6. Effect of population substructure

 

The multiplication rule used in DNA typing, in which genotype frequencies are multiplied across loci, makes a number of assumptions about human populations: 1) that the Hardy-Weinberg principle holds for each locus, 2) that each locus is statistically independent of the others, so that the multiplication across loci is justified, and 3) that the only level of population substructure that is important for DNA typing is that of race. Critics of the multiplication rule argued that genetically important subpopulations need not coincide with racial designations. For example, the term Hispanic includes a mixture of different subpopulations with variable amounts of Spanish, Native American, and African ancestry. Similarly, there are potentially important differences in allele frequency among Caucasian subpopulations and among Black subpopulations. Furthermore, if the allele frequencies of different VNTRs differ among subpopulations, then the loci are not statistically independent, even if they are genetically unlinked, and so the multiplication across loci is unjustified. Because of population substructure, DNA matches across multiple VNTRs could be more common among people within a particular ethnic group than among people drawn at random from the population as a whole, and so calculations of genotype frequency should be based on the ethnic group of the accused person and not on the racial group as a whole. On the other side, defenders of the multiplication rule argued that population substructure would have a relatively minor effect on the final outcome of the calculation, and that what matters most is not a high degree of accuracy but rather a general sense of whether a particular multi-locus genotype is rare or common.
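
The multiplication rule and the critics' objection can be illustrated numerically. The sketch below is only a toy calculation: the allele frequencies for the pooled population and for the subgroup are invented for illustration, and the function names are hypothetical. It applies Hardy-Weinberg proportions at each locus and multiplies across loci, exactly the two steps whose assumptions are debated above.

```python
from math import prod


def genotype_frequency(alleles):
    """Hardy-Weinberg frequency of a single-locus genotype.

    `alleles` holds the population frequencies of the alleles carried:
    one value (p,) for a homozygote, two values (p, q) for a heterozygote.
    """
    if len(alleles) == 1:
        return alleles[0] ** 2
    p, q = alleles
    return 2.0 * p * q


def profile_frequency(profile):
    """Multiply single-locus genotype frequencies across loci."""
    return prod(genotype_frequency(locus) for locus in profile)


# The same three-locus profile evaluated with two sets of hypothetical
# allele frequencies: the pooled population and one subgroup within it.
pooled = [(0.1, 0.2), (0.05,), (0.3, 0.1)]
subgroup = [(0.2, 0.3), (0.1,), (0.4, 0.2)]

f_pooled = profile_frequency(pooled)
f_subgroup = profile_frequency(subgroup)
ratio = f_subgroup / f_pooled
print(f_pooled, f_subgroup, ratio)
```

With these made-up numbers the profile is about 32 times more common in the subgroup than the pooled calculation suggests, although it remains rare in both, which is the point made by each side of the debate.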

 

7. Genetic structure change

 

 

Genetic structure changes over time in various ways.

 

a) Mutation: Mutation is the ultimate source of genetic variation in the form of new alleles. Mutation can result in several different types of change in DNA sequences; these can have no effect, alter the product of a gene, or prevent the gene from functioning. Mutations can involve large sections of DNA becoming duplicated, usually through genetic recombination. These duplications are a major source of raw material for evolving new genes, with tens to hundreds of genes duplicated in animal genomes every million years. Most genes belong to larger families of genes of shared ancestry. Novel genes are produced by several methods, commonly through the duplication and mutation of an ancestral gene, or by recombining parts of different genes to form new combinations with new functions. In the latter case, domains act as modules, each with a particular and independent function, that can be mixed together to produce genes encoding new proteins with novel properties. For example, the human eye uses four genes to make structures that sense light: three for color vision and one for night vision; all four arose from a single ancestral gene. Another advantage of duplicating a gene (or even an entire genome) is that this increases redundancy; this allows one gene in the pair to acquire a new function while the other copy performs the original function. Other types of mutation occasionally create new genes from previously noncoding DNA.

 

In addition to being a major source of variation, mutation may also function as a mechanism of evolution when there are different probabilities at the molecular level for different mutations to occur, a process known as mutation bias. If two genotypes, for example one with the nucleotide G and another with the nucleotide A in the same position, have the same fitness, but mutation from G to A happens more often than mutation from A to G, then genotypes with A will tend to evolve. Different insertion vs. deletion mutation biases in different taxa can lead to the evolution of different genome sizes. Developmental or mutational biases have also been observed in morphological evolution. For example, according to the phenotype-first theory of evolution, mutations can eventually cause the genetic assimilation of traits that were previously induced by the environment.

 

b) Migration: Many populations are not completely isolated and exchange genes with other populations of the same species. Individuals migrating into a new population may introduce new alleles into the gene pool and alter the frequencies of existing alleles. Thus migration has the potential to disrupt Hardy-Weinberg (H-W) equilibrium and may influence the evolution of allele frequencies within populations. Migration usually implies movement of organisms; in population genetics, however, we are interested in the movement of genes, which may or may not occur when organisms move. Movement of genes takes place only when organisms or gametes migrate and contribute their genes to the gene pool of the recipient population. This process is also referred to as gene flow.
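
How gene flow shifts allele frequencies can be sketched with the simplest textbook model of one-way migration. This model is an assumption introduced here for illustration, not something specified in the text above: each generation a fraction m of the recipient gene pool is replaced by migrants whose allele frequency stays constant. The function name and the numbers are hypothetical.

```python
def migrate(p_resident, p_migrant, m, generations=1):
    """Allele frequency in the recipient population after one-way gene flow.

    Each generation a fraction m of the gene pool comes from migrants with
    allele frequency p_migrant, and the remaining 1 - m from residents.
    """
    p = p_resident
    for _ in range(generations):
        p = (1.0 - m) * p + m * p_migrant
    return p


# Hypothetical values: residents start at 0.2, migrants carry the allele
# at frequency 0.8, and 10% of the gene pool is replaced per generation.
after_one = migrate(0.2, 0.8, 0.1)
after_many = migrate(0.2, 0.8, 0.1, generations=200)
print(after_one, after_many)
```

Under this model the recipient population converges on the migrant allele frequency, which is why sustained gene flow tends to erase differences between subpopulations.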

 

c) Natural selection: Natural selection is the process by which some traits make it more likely for an organism to survive and reproduce. Population genetics describes natural selection by defining fitness as a propensity or probability of survival and reproduction in a particular environment. Fitness is usually denoted by the symbol w, and the fitness of a disfavored genotype is written w = 1 - s, where s is the selection coefficient. Natural selection acts on phenotypes, or the observable characteristics of organisms, but the genetically heritable basis of any phenotype that gives a reproductive advantage will become more common in a population. In this way, natural selection converts differences in fitness into changes in allele frequency in a population over successive generations.

 

Before the advent of population genetics, many biologists doubted that small differences in fitness were sufficient to make a large difference to evolution. Population geneticists addressed this concern in part by comparing selection to genetic drift. Selection can overcome genetic drift when s is greater than 1 divided by the effective population size. When this criterion is met, the probability that a new advantageous mutant becomes fixed is approximately equal to 2s.
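
The two rules of thumb in this paragraph can be compared against a standard formula. The sketch below uses Kimura's diffusion approximation for the fixation probability of a single new mutant copy, which is an assumption brought in here and not given in the text; it further assumes a diploid population whose effective size equals its census size, and treats s as the selective advantage of the new mutant. Function names are illustrative.

```python
import math


def selection_beats_drift(s, n_e):
    """Rule of thumb from the text: selection dominates when s > 1 / Ne."""
    return s > 1.0 / n_e


def fixation_probability(s, n_e):
    """Kimura's diffusion approximation for one new mutant copy.

    Assumes a diploid population with effective size equal to census size:
    P = (1 - exp(-2s)) / (1 - exp(-4 * Ne * s)).
    For a neutral mutant (s = 0) this reduces to 1 / (2 * Ne).
    """
    if s == 0.0:
        return 1.0 / (2.0 * n_e)
    # expm1(x) = exp(x) - 1 keeps precision when x is close to zero.
    return math.expm1(-2.0 * s) / math.expm1(-4.0 * n_e * s)


strong = fixation_probability(0.01, 10_000)   # s well above 1 / Ne
weak = fixation_probability(1e-8, 1_000)      # s far below 1 / Ne
print(selection_beats_drift(0.01, 10_000), strong)
print(selection_beats_drift(1e-8, 1_000), weak)
```

When s is well above 1/Ne the result is close to 2s, as stated in the text; when s is far below 1/Ne it is close to the neutral value 1/(2Ne), meaning drift rather than selection decides the mutant's fate.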

 

d) Genetic drift: Genetic drift is a change in allele frequencies caused by random sampling. That is, the alleles in the offspring are a random sample of those in the parents. Genetic drift may cause gene variants to disappear completely, and thereby reduce genetic variability. In contrast to natural selection, which makes gene variants more common or less common depending on their reproductive success, the changes due to genetic drift are not driven by environmental or adaptive pressures, and may be beneficial, neutral, or detrimental to reproductive success. The effect of genetic drift is larger for alleles present in few copies than for alleles present in many copies. Scientists wage vigorous debates over the relative importance of genetic drift compared with natural selection.
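
Random sampling of alleles from parents to offspring can be simulated directly. The sketch below assumes the Wright-Fisher model (non-overlapping generations, constant population size, no selection, mutation, or migration), which is one common idealization and not something the text specifies; the function name and parameter values are hypothetical.

```python
import random


def wright_fisher(p0, n_diploid, generations, rng):
    """Simulate drift at one biallelic locus under the Wright-Fisher model.

    Each generation, 2 * n_diploid gene copies are drawn at random, each
    carrying the focal allele with probability equal to its current frequency.
    Returns the list of allele frequencies, starting with p0.
    """
    copies = 2 * n_diploid
    p = p0
    trajectory = [p]
    for _ in range(generations):
        count = sum(1 for _ in range(copies) if rng.random() < p)
        p = count / copies
        trajectory.append(p)
    return trajectory


rng = random.Random(1)  # fixed seed so that a run can be repeated
small = wright_fisher(0.5, n_diploid=10, generations=200, rng=rng)
large = wright_fisher(0.5, n_diploid=1000, generations=200, rng=rng)
print("small population, final frequency:", small[-1])
print("large population, final frequency:", large[-1])
```

Running the simulation with different seeds typically shows the small population wandering quickly to loss or fixation of the allele, while the large population stays closer to its starting frequency over the same number of generations.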

 

e) Non-random mating: Many populations do not mate randomly for some traits, and when non-random mating occurs, the genotypes will not be in H-W equilibrium.

  • Positive assortative mating: individuals with similar phenotypes mate preferentially.
  • Negative assortative mating: phenotypically dissimilar individuals mate more often than randomly chosen individuals.
  • Neither affects the allele frequencies, but both may affect the genotype frequencies if the phenotypes are genetically determined.

 

8. Importance of population structure and Dynamics

 

To interpret the structure of any human population, it is important to keep in mind its age structure and reproductive profile, as well as the dynamics of change over time. No population is stationary or simply mirrors its past, and habitual assumptions about a population can quickly become outdated.

 

Migration, differences in birth rates among groups, sub-replacement fertility, and admixture between groups all change the composition of a population, and their effects differ across age classes, so that the oldest and youngest cohorts of the same country can differ markedly in composition. The aging and death of large cohorts, such as those born between 1946 and 1964, likewise changes the makeup of a population within a few decades.

 

To understand these dynamics, it is helpful to employ a demographic tool known as the population pyramid. A population pyramid is a graphical representation of the distribution of age groups, usually by country or region, shaped like a pyramid when populations are young and growing.

 

There are three basic shapes:

 

The classic pyramid: A young, rapidly growing population with a high birth rate.

Fig. 2. Representative age pyramid for an expanding population (Angolan age pyramid, 2005)

 

The box: A stable, replacement-level population characterized by low infant mortality, little or no demographic growth, and long life expectancy.

 

The inverted pyramid: Low birth rate, declining population, long life expectancy.

 

Fig. 3. Hypothetical inverted age pyramid (declining population)

 

One must go beyond the static snapshot to the underlying dynamics. A population with a classic pyramid has built-in momentum for future growth, because its many young people will reproduce in the future even if total fertility gradually falls.

 

Contrast this with a rapidly aging population, characterized by many old and few young, many deaths and few births. A simple head count (census), though indispensable, does not convey an accurate picture of what is happening: the older cohorts that make up the largest part of an inverted pyramid, though still alive, do not contribute children to future generations because they are past their reproductive period.

 

9. Quantifying population structure

 

There are two ways of quantifying population structure:

 

a) FST and genetic distance: Nonrandom mating in a population with substructure has two consequences. First, preferential mating between individuals from the same subpopulation is a form of inbreeding, and has the effect of reducing genetic diversity (measured as, say, heterozygosity) in the overall population. Second, as the subpopulations experience independent genetic drift, allele frequencies at genetic markers tend to diverge. Originally introduced by Wright in 1921 to quantify the inbreeding effect of population substructure, FST has become one of the most widely used measures of genetic differentiation between predefined subpopulations. Consider the simple setting in which a population consists of several subpopulations. FST is defined as the decrease in heterozygosity within subpopulations (HS) relative to the heterozygosity in the total population (HT):

 

$$F_{ST} = \frac{H_T - H_S}{H_T} \qquad (20.3)$$

 

where HS is the expected heterozygosity computed under the assumption that mating is random within each subpopulation (Hardy-Weinberg equilibrium), while HT is computed analogously assuming random mating in the entire population without population structure. Alternatively, FST is often loosely interpreted as the proportion of variance in allele frequencies at a locus that is explained by the subpopulation level of organization. For example, if the frequency of an allele is 0 in one subpopulation and 1 in another, then FST = 1, meaning that the variance in allele frequency is completely explained by the population division. Under this framework, FST at a biallelic single nucleotide polymorphism (SNP) marker can be computed from the allele frequencies:

 

$$F_{ST} = \frac{\sigma_p^2}{\bar{p}\,(1 - \bar{p})} \qquad (20.4)$$

 

Here $\sigma_p^2$ is the variance of allele frequencies among subpopulations and $\bar{p}$ denotes the average allele frequency in the pooled population. It can be shown that (20.3) and (20.4) are mathematically equivalent for biallelic markers, but (20.4) is often computationally more convenient. FST is often taken as a genetic distance measure, with higher values of FST reflecting a greater level of genetic divergence. However, both (20.3) and (20.4) define FST for a specific locus, and FST can vary considerably from locus to locus. Moreover, a locus that is under population- or environment-specific selection can also exhibit unusually high FST. For example, across globally distributed human populations, functional polymorphisms in genes related to skin pigmentation show unusually high levels of FST (i.e., population differentiation) compared with the genome-wide distribution. To reduce the variance across markers and the bias due to a small number of strongly selected loci, when FST is reported as an index of genetic distance among subpopulations, it is often calculated by averaging both the numerator and the denominator in (20.3) or (20.4) across loci.

When one is interested in quantifying the degree of substructure among predefined populations, FST is a simple and useful measure of genetic distance. However, it is often the case that we are interested in using the genetic data itself to define the populations. In particular, if we are interested in detecting cryptic or hidden population structure, then we need to resort to other approaches. One method for detecting latent population structure, principal component analysis (PCA), was introduced in Sect. 6.4.4. In the next section, we explain a complementary approach, which defines subpopulations based on statistical genetic models for the data.
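
The two definitions (20.3) and (20.4), and the multi-locus averaging described above, can be checked with a short calculation. The sketch below assumes biallelic loci, subpopulations of equal size, and at least one polymorphic locus (otherwise the denominator is zero); the allele frequencies and function names are made up for illustration.

```python
def fst_heterozygosity(freqs):
    """FST at one biallelic locus via (20.3): (HT - HS) / HT."""
    k = len(freqs)
    h_s = sum(2.0 * p * (1.0 - p) for p in freqs) / k
    p_bar = sum(freqs) / k
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    return (h_t - h_s) / h_t


def fst_variance(freqs):
    """FST at one biallelic locus via (20.4): var(p) / (p_bar * (1 - p_bar))."""
    k = len(freqs)
    p_bar = sum(freqs) / k
    var_p = sum((p - p_bar) ** 2 for p in freqs) / k
    return var_p / (p_bar * (1.0 - p_bar))


def fst_multilocus(loci):
    """Average numerator and denominator of (20.4) across loci, then divide."""
    numerator = 0.0
    denominator = 0.0
    for freqs in loci:
        k = len(freqs)
        p_bar = sum(freqs) / k
        numerator += sum((p - p_bar) ** 2 for p in freqs) / k
        denominator += p_bar * (1.0 - p_bar)
    return numerator / denominator


fixed_difference = [0.0, 1.0]   # allele absent in one subpopulation, fixed in the other
moderate = [0.2, 0.6]           # hypothetical allele frequencies
print(fst_heterozygosity(fixed_difference), fst_variance(fixed_difference))
print(fst_heterozygosity(moderate), fst_variance(moderate))
print(fst_multilocus([fixed_difference, moderate]))
```

The two single-locus functions agree for each locus, as the text states. The multi-locus value is a ratio of sums rather than a mean of per-locus ratios, which keeps a single extreme locus from dominating the estimate.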

 

b) Model-based clustering algorithms: Cluster analysis refers to a large family of approaches whose goal is to simultaneously define subsets (called clusters) and to assign observational units to these clusters, so that members of the same cluster are similar by some criteria. In the context of inferring genetic structure, the data usually consist of individuals genotyped at multiple genetic markers (e.g., restriction fragment length polymorphisms (RFLPs), microsatellites, or SNPs). In the discrete population model, all alleles in an individual are assumed to be drawn randomly from one of the subpopulations, according to a set of allele frequencies that are specific to each subpopulation. The goal of the analysis is to simultaneously estimate subpopulation allele frequencies and group membership (i.e., which individuals are drawn from which subpopulation).

However, for many human populations, there is often no single group from which individuals derive their ancestry. That is, recent migration gives rise to genetically admixed individuals, whose genomes represent a mixture of alleles from multiple “ancestral” populations. Mathematically, this means that an individual may have partial membership in more than one cluster. These clusters are biologically interpreted as ancestral populations for the admixed individuals. For example, African Americans in the United States are a recently admixed group, deriving ancestry from European and West African ancestral populations. Under the admixture model, an African American individual’s population membership is characterized by the individual ancestry (IA) proportion, which is a vector representing the probability that a randomly selected allele from this individual originates from a European (or alternatively, an African) ancestor.

Under either the discrete or the admixture model, individuals’ memberships (or IA values) are jointly inferred with the allele frequencies in each subpopulation, using either maximum likelihood or Bayesian methods. We begin by explaining the maximum likelihood approach for the discrete subpopulation model, as this model illustrates the principles that underlie most of the model-based approaches. Let $G_{im} = (a_{(i,m)}, b_{(i,m)})$ denote the genotype of individual $i$ at marker $m$, with $a_{(i,m)}$ and $b_{(i,m)}$ being the unordered pair of alleles. Let $Z_i \in \{1, \ldots, K\}$ indicate the subpopulation membership of individual $i$, and let $P = \{p^{l}_{mk}\}$, where $p^{l}_{mk}$ is the frequency of allele $l$ at marker $m$ in population $k$. Under the assumptions that genotypes at different markers are independent conditional on an individual’s membership, and that all markers are in Hardy-Weinberg equilibrium within each subpopulation, the likelihood function, treating $Z$ and $P$ as parameters, is simply the product of the probabilities of observing each allele:

 

$$L(P, Z; G) \propto \prod_{i} \prod_{m} p_{m,\,a(i,m),\,Z_i} \; p_{m,\,b(i,m),\,Z_i} \qquad (20.5)$$
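
The likelihood in Equation (20.5) can be evaluated directly on a toy data set. The sketch below is illustrative only: the allele labels, the frequencies in `freqs`, and the helper names `log_likelihood` and `best_membership` are assumptions, and populations are indexed from 0 rather than 1. It works on the log scale, so the product over individuals and markers becomes a sum.

```python
import math

def log_likelihood(genotypes, membership, freqs):
    """Log of the product in Equation (20.5), up to an additive constant.

    genotypes  : genotypes[i][m] = (a, b), the unordered allele pair of
                 individual i at marker m
    membership : membership[i] = Z_i, the subpopulation of individual i
    freqs      : freqs[k][m][l] = frequency of allele l at marker m in
                 subpopulation k
    """
    total = 0.0
    for g_i, z_i in zip(genotypes, membership):
        for m, (a, b) in enumerate(g_i):
            total += math.log(freqs[z_i][m][a]) + math.log(freqs[z_i][m][b])
    return total

def best_membership(genotypes, freqs):
    """For fixed P, choose for each individual the Z_i that maximizes its likelihood."""
    return [
        max(range(len(freqs)), key=lambda k: log_likelihood([g_i], [k], freqs))
        for g_i in genotypes
    ]

# Two subpopulations, two markers; all frequencies are made-up values.
freqs = [
    [{"A": 0.9, "T": 0.1}, {"C": 0.8, "G": 0.2}],  # subpopulation 0
    [{"A": 0.2, "T": 0.8}, {"C": 0.3, "G": 0.7}],  # subpopulation 1
]
genotypes = [
    [("A", "A"), ("C", "G")],  # individual 0
    [("T", "T"), ("G", "G")],  # individual 1
]
assignments = best_membership(genotypes, freqs)
print(assignments, math.exp(log_likelihood(genotypes, assignments, freqs)))
```

For individual 0 the product is 0.9 × 0.9 × 0.8 × 0.2 = 0.1296 under subpopulation 0 but only 0.2 × 0.2 × 0.3 × 0.7 = 0.0084 under subpopulation 1, so the individual is assigned to subpopulation 0.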

 

For the admixture model, one can replace $Z_i$ with $(Z^a_{i,m}, Z^b_{i,m})$, the population origin of each allele, and model $Z^a_{i,m}$ and $Z^b_{i,m}$ as independent draws from the multinomial probability vector of individual ancestry. The inference of population structure then amounts to inference on $Z_i$ or, under the admixture model, on the genome-wide average of $(Z^a_{i,m}, Z^b_{i,m})$. In the maximum likelihood approach, the expectation-maximization (EM) algorithm can be used to find the maximum likelihood estimates of the parameters $(P, Z)$. Alternatively, Bayesian approaches combine prior distributions with the likelihood in order to evaluate the posterior distribution. Bayesian methods offer a flexible framework for incorporating more complex models of population history. For example, one widely used Bayesian program, STRUCTURE, includes useful features such as modeling linkage among loci and the ability to model correlated allele frequencies between evolutionarily related ancestral populations.
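The EM idea for the discrete model can be sketched in a few lines. The version below makes several assumptions that are not taken from the text: markers are biallelic and coded as reference-allele counts (0, 1, or 2), membership is treated as a latent variable with mixing proportions (one common mixture-model formulation), frequencies are clamped away from 0 and 1 to keep logarithms finite, and the toy genotypes and starting values are invented. It is not the algorithm used by any particular program.

```python
import math

def genotype_log_prob(g, p):
    """Log-probability of g reference alleles (0, 1 or 2) under Hardy-Weinberg equilibrium."""
    coef = 2.0 if g == 1 else 1.0
    return math.log(coef) + g * math.log(p) + (2 - g) * math.log(1.0 - p)

def em_discrete(genotypes, init_freqs, n_iter=50, eps=1e-6):
    """EM for a K-subpopulation mixture; returns (responsibilities, frequencies)."""
    n, m, k = len(genotypes), len(genotypes[0]), len(init_freqs)
    freqs = [list(row) for row in init_freqs]
    mix = [1.0 / k] * k  # mixing proportions (an assumption of this sketch)
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability that individual i belongs to cluster c.
        resp = []
        for g_i in genotypes:
            logs = [
                math.log(mix[c])
                + sum(genotype_log_prob(g, freqs[c][j]) for j, g in enumerate(g_i))
                for c in range(k)
            ]
            top = max(logs)  # subtract the maximum for numerical stability
            weights = [math.exp(v - top) for v in logs]
            total = sum(weights)
            resp.append([w / total for w in weights])
        # M-step: re-estimate mixing proportions and allele frequencies.
        for c in range(k):
            weight = max(sum(r[c] for r in resp), eps)
            mix[c] = weight / n
            for j in range(m):
                num = sum(resp[i][c] * genotypes[i][j] for i in range(n))
                freqs[c][j] = min(max(num / (2.0 * weight), eps), 1.0 - eps)
    return resp, freqs

# Toy data: three individuals rich in the reference allele, three poor in it.
genotypes = [
    [2, 2, 1, 2, 2], [2, 1, 2, 2, 2], [1, 2, 2, 2, 1],
    [0, 0, 1, 0, 0], [0, 1, 0, 0, 0], [1, 0, 0, 0, 1],
]
init_freqs = [[0.6] * 5, [0.4] * 5]  # arbitrary, slightly asymmetric start
resp, freqs = em_discrete(genotypes, init_freqs)
labels = [max(range(2), key=lambda c: r[c]) for r in resp]
print(labels)
```

EM only finds a local maximum, so in practice several starting points are usually tried. A perfectly symmetric start (identical frequencies in every cluster) would never separate the clusters, which is why the initial values here differ slightly.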

 

Summary

  • Genetics is the branch of biological science that deals with the transmission of hereditary factors from one generation to the next, and with the ways in which they express themselves during the development and life of an individual.
  • Genotype frequencies: genetic variation in populations can be analyzed and quantified by the frequency of alleles. The genotype frequency in a population is the number of individuals with a given genotype divided by the total number of individuals in the population.
  • Phenotype frequencies: phenotypes are based on the content of the underlying genes comprising the genotype.
  • The traditional empirical approach in population genetics comprises extensive observation of the genetic variation of a particular gene or genes in a population or populations, perhaps over time, and the measurement of related factors, such as environmental patterns, that may influence this genetic variation.
  • A population is said to have a hierarchical population structure if the sub-populations can be grouped into progressively inclusive levels in which, at each grouping, the next lower levels are included within the next higher ones.
  • One of the important consequences of population substructure is a reduction in the average proportion of heterozygous genotypes relative to that expected under random mating.
  • The multiplication rule makes a number of assumptions about human populations: 1) that the Hardy-Weinberg principle holds for each locus, 2) that each locus is statistically independent of the others, so that multiplication across loci is justified, and 3) that the only level of population substructure that is important for DNA typing is that of race.
  • Critics of the multiplication rule argued that genetically important subpopulations need not coincide with racial designations.
  • Mutation is the ultimate source of genetic variation in the form of new alleles. Mutation can result in several different types of change in DNA sequences; these can have no effect, alter the product of a gene, or prevent the gene from functioning.
  • Migration: many populations are not completely isolated, and exchange genes with other populations of the same species.
  • Natural selection occurs because some traits make an organism more likely to survive and reproduce.
  • Genetic drift is a change in allele frequencies caused by random sampling. That is, the alleles in the offspring are a random sample of those in the parents.
  • Many populations do not mate randomly for some traits, and when non-random mating occurs, the genotypes will not exist in H-W equilibrium.
  • There are two ways of quantifying population structure:
    a) Fst and genetic distance: Nonrandom mating in a population with substructure has two consequences. First, preferential mating between individuals from the same subpopulation is a form of inbreeding, and has the effect of reducing genetic diversity (measured as, say, heterozygosity) in the overall population. Second, as the subpopulations experience independent genetic drift, allele frequencies at genetic markers tend to diverge.
    b) Model-based clustering algorithms: Cluster analysis refers to a large family of approaches whose goal is to simultaneously define subsets (called clusters) and assign observational units to these clusters, so that members of the same cluster are similar by some criterion.
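
Several of the summary points (genotype frequency, allele frequency, reduced heterozygosity, and Fst) can be tied together in one small calculation. The sketch below assumes a single biallelic locus, two equally sized subpopulations, and the heterozygosity-based form Fst = (H_T − H_S) / H_T; the genotype counts are invented, and the definition given in the module's own Fst section takes precedence if it differs.

```python
def genotype_frequencies(counts):
    """Number of individuals with each genotype divided by the total number of individuals."""
    total = sum(counts.values())
    return {g: n / total for g, n in counts.items()}

def allele_frequency(counts):
    """Frequency of allele A from counts of the genotypes AA, Aa and aa."""
    total = sum(counts.values())
    return (2 * counts["AA"] + counts["Aa"]) / (2 * total)

def fst(sub_freqs):
    """Fst = (H_T - H_S) / H_T for equally sized subpopulations at a biallelic locus.

    Assumes the pooled locus is polymorphic (H_T > 0).
    """
    # H_S: average heterozygosity expected within subpopulations under random mating.
    h_s = sum(2 * p * (1 - p) for p in sub_freqs) / len(sub_freqs)
    # H_T: heterozygosity expected if the whole population mated at random.
    p_bar = sum(sub_freqs) / len(sub_freqs)
    h_t = 2 * p_bar * (1 - p_bar)
    return (h_t - h_s) / h_t

# Made-up genotype counts for two subpopulations of 100 individuals each.
sub1 = {"AA": 4, "Aa": 32, "aa": 64}
sub2 = {"AA": 64, "Aa": 32, "aa": 4}
p1, p2 = allele_frequency(sub1), allele_frequency(sub2)
print(genotype_frequencies(sub1), p1, p2, fst([p1, p2]))
```

Here the allele frequencies are 0.2 and 0.8, so H_S = 0.32 while H_T = 0.5: subdivision lowers the average heterozygosity by 36% relative to a single randomly mating population, which is the value of Fst.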

 


References: Suggested books

  • Daniel L. Hartl and Andrew G. Clark (1997) Principles of Population Genetics, 3rd edition. Sinauer Associates.
  • Philip W. Hedrick (2005) Genetics of Populations. Jones and Bartlett Publishers, Inc.
  • Matthew B. Hamilton (2009) Population Genetics. John Wiley and Sons Ltd.