When a distribution contains a very large number of scores, it often approximates a pattern called a normal distribution. When plotted as a frequency polygon, a normal distribution forms a symmetrical, bell-shaped pattern often called a normal curve (see Figure 5.1). We say that the pattern approximates a normal distribution because a true normal distribution is a theoretical construct that is never observed exactly in real data.
The normal distribution is a theoretical frequency distribution that has certain special characteristics. First, it is bell-shaped and symmetrical—the right half is a mirror image of the left half. Second, the mean, median, and mode are equal and are located at the center of the distribution. Third, the normal distribution is unimodal—it has only one mode. Fourth, most of the observations are clustered around the center of the distribution, with far fewer observations at the ends, or “tails,” of the distribution. Lastly, when standard deviations are used on the x-axis, the percentage of scores falling between the mean and any point on the x-axis is the same for all normal curves. We will discuss the normal distribution more extensively in later lessons.
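The last characteristic can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `share_between_mean_and` and the three example curves are illustrative assumptions, not part of the text.

```python
from statistics import NormalDist

def share_between_mean_and(z, mu, sigma):
    """Proportion of scores between the mean and a point z standard deviations above it."""
    dist = NormalDist(mu=mu, sigma=sigma)
    return dist.cdf(mu + z * sigma) - dist.cdf(mu)

# Three hypothetical normal curves with different means and standard deviations.
curves = [(0, 1), (100, 15), (500, 100)]
shares = [share_between_mean_and(1.0, mu, sigma) for mu, sigma in curves]

for (mu, sigma), share in zip(curves, shares):
    print(f"mean={mu}, sd={sigma}: {share:.4f}")
```

Each curve gives roughly 0.3413, that is, about 34% of scores fall between the mean and one standard deviation above it, whatever the mean and standard deviation happen to be.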
Figure 5.1 Normal Curve
Although we typically think of the normal distribution as being similar to the curve depicted in Figure 5.1, there are variations in the shape of normal distributions. Kurtosis refers to how flat or peaked a normal distribution is. In other words, kurtosis refers to the degree of dispersion among the scores, or whether the distribution is tall and skinny or short and fat. The normal distribution depicted in Figure 5.1 is called mesokurtic—meso means “middle.” Mesokurtic curves have peaks of medium height and the distributions are moderate in breadth. Now look at the two distributions depicted in Figure 5.2.
The normal distribution on the left is leptokurtic—lepto means “thin.” Leptokurtic curves are tall and thin, with only a few scores in the middle of the distribution having a high frequency. Last, see the curve on the right side of Figure 5.2. This is a platykurtic curve—platy means “broad” or “flat.” Platykurtic curves are short and more dispersed (broader). In a platykurtic curve, there are many scores around the middle score that all have a similar frequency.
Figure 5.2 Kurtosis
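The contrast between peaked and flat shapes can be illustrated with a small calculation. The sketch below assumes one common numerical measure of kurtosis, the fourth central moment divided by the square of the second, for which a normal distribution scores 3. The two data sets and the function names are hypothetical.

```python
from statistics import fmean

def central_moment(data, k):
    """Average of the k-th powers of deviations from the mean."""
    center = fmean(data)
    return sum((x - center) ** k for x in data) / len(data)

def kurtosis(data):
    """Fourth central moment divided by the squared second central moment."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2

# Hypothetical scores: one sharply peaked set, one flat set.
peaked = [1, 5, 5, 5, 5, 5, 5, 5, 9]   # most scores pile up on the middle value
flat = [1, 2, 3, 4, 5, 6, 7, 8, 9]     # every score occurs equally often

print(kurtosis(peaked))  # above 3: leptokurtic
print(kurtosis(flat))    # below 3: platykurtic
```

Under this measure the peaked set comes out at about 4.5 and the flat set at about 1.77, on either side of the normal-curve value of 3.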
Positively Skewed Distributions
Most distributions do not approximate a normal or bell-shaped curve. Instead, they are skewed, or lopsided. In a skewed distribution, scores tend to cluster at one end or the other of the x-axis, with the tail of the distribution extending in the opposite direction. In a positively skewed distribution, the peak is to the left of the center point and the tail extends toward the right, or in the positive direction. (See Figure 5.3.)
Notice that the scores toward the right, or positive direction, are what skew the distribution, or throw it off center. A few individuals have extremely high scores that pull the distribution in that direction. Notice also what this does to the mean, median, and mode. These three measures do not have the same value, nor are they all located at the center of the distribution as they are in a normal distribution. The mode (the score with the highest frequency) is the high point on the distribution. The median divides the distribution in half. The mean is pulled in the direction of the tail of the distribution; that is, the few extreme scores pull the mean toward them and inflate it. As a result, in a positively skewed distribution the mean is typically the largest of the three measures, followed by the median and then the mode.
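A small set of made-up scores shows the effect of a few extreme high values on the three measures. This is only an illustrative sketch; the scores are hypothetical.

```python
from statistics import mean, median, mode

# Hypothetical scores: most are low, and a few are extremely high.
scores = [1, 2, 2, 2, 3, 3, 4, 5, 9, 20]

print("mode:  ", mode(scores))    # the most frequent score
print("median:", median(scores))  # the middle of the ordered scores
print("mean:  ", mean(scores))    # pulled toward the high scores in the tail
```

Here the mode is 2, the median is 3, and the mean is 5.1, so the mean sits farthest toward the right-hand tail.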
Negatively Skewed Distributions
The opposite of a positively skewed distribution is a negatively skewed distribution—a distribution in which the peak is to the right of the center point and the tail extends toward the left, or in the negative direction. The term negative refers to the direction of the skew. As can be seen in Figure 5.3, in a negatively skewed distribution, the mean is pulled toward the left by the few extremely low scores in the distribution. As in all distributions, the median divides the distribution in half, and the mode is the most frequently occurring score in the distribution.
Figure 5.3 Skewness
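The pull of a few extremely low scores can be sketched with a made-up data set in which most scores are high. The scores and the variable name `pull` are hypothetical.

```python
from statistics import mean, median, mode

# Hypothetical scores: most are high, and a few are extremely low.
scores = [1, 12, 16, 17, 18, 18, 19, 19, 19, 20]

# A negative difference means the mean lies to the left of the median.
pull = mean(scores) - median(scores)

print("mode:  ", mode(scores))
print("median:", median(scores))
print("mean:  ", mean(scores))
print("mean - median:", round(pull, 1))
```

The mean (15.9) falls below the median (18), which in turn falls below the mode (19): the low scores in the left tail drag the mean in the negative direction.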
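The kth central moment (or moment about the mean) of a data population is:

\[ \mu_k = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^k \]

where \(N\) is the population size and \(\mu\) is the population mean.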
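Similarly, the kth central moment of a data sample is:

\[ m_k = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^k \]

where \(n\) is the sample size and \(\bar{x}\) is the sample mean.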
In particular, the second central moment of a population is its variance.
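As a rough sketch, a central moment can be computed as the average of the kth powers of the deviations from the mean. The function name `central_moment` and the scores are hypothetical; `statistics.pvariance` from the Python standard library is used only to confirm that the second central moment matches the population variance.

```python
from statistics import fmean, pvariance

def central_moment(data, k):
    """Average of the k-th powers of deviations from the mean."""
    center = fmean(data)
    return sum((x - center) ** k for x in data) / len(data)

# Hypothetical scores treated as a complete population.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

first = central_moment(scores, 1)   # deviations from the mean always average to zero
second = central_moment(scores, 2)  # equals the population variance

print(first, second, pvariance(scores))
```

For these scores the mean is 5, the first central moment is 0, and the second central moment and the population variance are both 4.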
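The skewness of a data population is defined by the following formula, where \(\mu_2\) and \(\mu_3\) are the second and third central moments.

\[ \gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} \]

A positive value indicates a tail extending to the right (positive skew), a negative value indicates a tail extending to the left (negative skew), and a symmetrical distribution has a skewness of zero.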
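The sketch below assumes the standard moment-ratio form of skewness, the third central moment divided by the second central moment raised to the power 3/2. The function names and data sets are hypothetical.

```python
from statistics import fmean

def central_moment(data, k):
    """Average of the k-th powers of deviations from the mean."""
    center = fmean(data)
    return sum((x - center) ** k for x in data) / len(data)

def skewness(data):
    """Third central moment divided by the second central moment to the power 3/2."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

# Hypothetical data: a right-tailed set, its mirror image, and a symmetrical set.
right_tailed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 20]
left_tailed = [21 - x for x in right_tailed]
symmetrical = [1, 2, 3, 4, 5]

print(skewness(right_tailed))  # positive
print(skewness(left_tailed))   # negative, same magnitude
print(skewness(symmetrical))   # zero
```

Reflecting the scores reverses the sign of the result but not its size, which matches the idea that the sign of the skewness reports the direction of the tail.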
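The kurtosis of a univariate population is defined by the following formula, where \(\mu_2\) and \(\mu_4\) are the second and fourth central moments.

\[ \beta_2 = \frac{\mu_4}{\mu_2^{2}} \]

A normal distribution has \(\beta_2 = 3\), so many texts and software packages instead report the excess kurtosis, \(\beta_2 - 3\), which is zero for a normal distribution.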
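As an illustration, the sketch below computes kurtosis as the fourth central moment divided by the square of the second. This is an assumption about the convention, since some sources subtract 3 from the ratio and call the result excess kurtosis. The simulated sample, the seed, and the function names are hypothetical.

```python
import random
from statistics import fmean

def central_moment(data, k):
    """Average of the k-th powers of deviations from the mean."""
    center = fmean(data)
    return sum((x - center) ** k for x in data) / len(data)

def kurtosis(data):
    """Fourth central moment divided by the squared second central moment."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2

# Simulate a large, approximately normal sample (fixed seed for repeatability).
rng = random.Random(0)
sample = [rng.gauss(0, 1) for _ in range(100_000)]

k = kurtosis(sample)
print("kurtosis:       ", round(k, 2))      # close to 3 for a normal sample
print("excess kurtosis:", round(k - 3, 2))  # close to 0 for a normal sample
```

Values well above 3 under this convention correspond to leptokurtic shapes and values well below 3 to platykurtic shapes.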