## Testing Genetic Identity by Descent of Affected Siblings

This book summarizes an array of strategies for learning relations between expressed heritable traits and genes, the carriers of the genetic information for the formation of proteins. One of these strategies, called the affected sib-pairs (ASP) approach, calls for the collection of a large number of nuclear families, each with a pair of affected siblings that share the condition under investigation. Chapter 9 describes in detail the statistical issues related to this design. Here we consider an artificial, but somewhat simpler, scenario where all the sibling pairs are actually half-siblings, who share only one parent in common, and we concentrate on a single gene, which may or may not contribute to the disease. The aim is to test the null hypothesis of no contribution.

The gene may be embodied in any one of several variant forms, called alleles. On autosomal chromosomes an individual carries two homologous copies of the gene, one inherited from the mother and the other from the father. Therefore, each offspring carries two versions of the given gene, which may not be identical in form. Still, one of the genes is an identical copy of one of the two homologous genes in the common parent while the other is a copy of one of the homologous genes in the other parent. Concentrate on the copies in the half-siblings that originated from the common parent. There are two possibilities: both half-siblings' copies emerge from a common ancestral source or else each was inherited from a different source. In the former case we say that the two copies are identical by descent (IBD), and in the latter case we say that they are not IBD. It is natural to model the IBD status of a given pair as a Bernoulli trial, with an IBD event standing for success. Counting the number of half-sibling pairs for which a gene is inherited IBD would produce a binomial random variable.
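
The Bernoulli and binomial structure described above can be illustrated with a small simulation. The following is a minimal sketch in Python, not part of the original text. It assumes that the common parent transmits each of its two homologous copies with probability 1/2, independently for each child, as would hold at a locus unrelated to the trait. The function names `ibd_indicator` and `ibd_count`, the labels 0 and 1 for the parental copies, and the seed are illustrative choices.

```python
import random


def ibd_indicator(rng):
    """Return 1 if both half-siblings inherit the same copy from the common parent."""
    # Label the common parent's two homologous copies 0 and 1 (hypothetical labels).
    # Assumption: each child receives either copy with probability 1/2.
    first_child = rng.randrange(2)
    second_child = rng.randrange(2)
    return 1 if first_child == second_child else 0


def ibd_count(n_pairs, rng):
    """Number of IBD pairs among n_pairs independent half-sibling pairs."""
    return sum(ibd_indicator(rng) for _ in range(n_pairs))


rng = random.Random(2024)  # fixed seed so the run is reproducible
n_pairs = 100
count = ibd_count(n_pairs, rng)
print(f"{count} of {n_pairs} simulated pairs share the copy IBD")
```

Each call to `ibd_indicator` is one Bernoulli trial, and `ibd_count` is the resulting binomial count.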

At a locus unrelated to the trait, Mendel's laws governing segregation of genetic material from parent to offspring will produce IBD or not with equal probabilities, since whichever gene the first child inherited, the second child has a 50% chance of inheriting the same gene. This probability of IBD is the probability of success when the gene does not contribute to the trait. Suppose, however, that the gene does contribute to the trait. Since both siblings share the trait, one may reasonably expect an elevated level of sharing of genetic material within the pair, and thus an elevated probability of IBD. Denote by J the IBD count for a given pair, with J = 0 or J = 1, and let π be the probability that J = 1. A natural formulation of the statistical hypotheses is given by H0 : π = 1/2 versus H1 : π > 1/2. Given a sample of n pairs of half-siblings who share the trait, one may use as a test statistic the number of pairs that share an allele IBD. Since each pair can be regarded as a Bernoulli trial, if we also assume the parents are unrelated to one another, the trials are independent, so the sum is binomially distributed with parameters n and π. One may standardize this binomially distributed statistic by subtracting out the expectation and dividing by the standard deviation, both computed under the null distribution: B(n, 1/2). Under this distribution the expectation is n/2 and the standard deviation is the square root of n/4. Writing J_i for the value of J in the i-th pair, the standardized statistic is:

$$Z_n = \frac{\sum_{i=1}^{n} J_i - n/2}{\sqrt{n/4}}$$

A common recommendation is to reject the null hypothesis if Zn exceeds a threshold of z = 1.645, since according to the central limit theorem (discussed below) this will produce a significance level of about 0.05. Values of the test statistic Zn above that threshold lead to the rejection of the null hypothesis and to the conclusion that the gene contributes to the trait.
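
As a worked illustration of the decision rule, the sketch below (in Python, with hypothetical function names `standardized_statistic` and `reject_null`) subtracts the null expectation n/2 from the IBD count, divides by the null standard deviation, the square root of n/4, and compares the result with the threshold 1.645. The example counts of 60 and 55 are made up for illustration and are not data from the text.

```python
import math


def standardized_statistic(ibd_count, n_pairs):
    """Standardize the IBD count using the null distribution B(n, 1/2)."""
    null_mean = n_pairs / 2
    null_sd = math.sqrt(n_pairs / 4)
    return (ibd_count - null_mean) / null_sd


def reject_null(ibd_count, n_pairs, threshold=1.645):
    """One-sided test: reject H0 when the standardized count exceeds the threshold."""
    return standardized_statistic(ibd_count, n_pairs) > threshold


# Hypothetical counts out of n = 100 pairs.
for count in (60, 55):
    z = standardized_statistic(count, 100)
    print(count, z, reject_null(count, 100))
```

With n = 100 the null standard deviation is 5, so a count of 60 gives a standardized value of 2.0 and leads to rejection, while a count of 55 gives 1.0 and does not.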

Let us investigate the significance level of the proposed test. Assume that a total of n = 100 pairs were collected and that the probability of IBD is in fact 1/2, so that the null hypothesis holds. Then the results of the test may look like this:
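
A minimal simulation sketch in Python follows. The language, the seed, the number of replicates, and the function names `simulated_level` and `exact_level` are assumptions made for illustration, not taken from the text. It repeatedly draws the IBD count for n = 100 pairs under the null hypothesis, records how often the standardized statistic exceeds 1.645, and compares that fraction with the exact binomial tail probability.

```python
import math
import random


def standardized_statistic(ibd_count, n_pairs):
    """Standardize the IBD count using the null distribution B(n, 1/2)."""
    return (ibd_count - n_pairs / 2) / math.sqrt(n_pairs / 4)


def simulated_level(n_pairs, replicates, threshold, rng):
    """Fraction of simulated null data sets in which the test rejects."""
    rejections = 0
    for _ in range(replicates):
        # Each pair is IBD with probability 1/2 under the null hypothesis.
        ibd_count = sum(rng.random() < 0.5 for _ in range(n_pairs))
        if standardized_statistic(ibd_count, n_pairs) > threshold:
            rejections += 1
    return rejections / replicates


def exact_level(n_pairs, threshold):
    """Exact rejection probability under B(n, 1/2), summed over rejecting counts."""
    favorable = sum(
        math.comb(n_pairs, k)
        for k in range(n_pairs + 1)
        if standardized_statistic(k, n_pairs) > threshold
    )
    return favorable / 2 ** n_pairs


rng = random.Random(1)  # fixed seed so the run is reproducible
print("simulated level:", simulated_level(100, 10_000, 1.645, rng))
print("exact level:    ", exact_level(100, 1.645))
```

Because the binomial count is discrete, the exact level need not coincide with the nominal 0.05 suggested by the normal approximation, and the simulated value fluctuates around the exact one from run to run.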
