1.1 The Why
The objectives of statistics can be divided into three parts:
 To describe
 To explain
 To predict
Description begins with careful observation of behavior. It allows us to learn about a behavior and when it occurs. Let’s say, for example, that you were interested in the channel-surfing behavior of males and females. Careful observation and description would be needed to determine whether there were any gender differences in channel-surfing. Description also allows us to observe that two events are systematically related to one another. Without description as a first step, predictions cannot be made.
Explanation allows us to identify the causes that determine when and why a behavior occurs. In order to explain a behavior, we need to demonstrate that we can manipulate the factors needed to produce or eliminate the behavior. For example, in our channel-surfing example, if gender predicts channel-surfing, what might cause it? It could be genetic or environmental. Maybe males have less tolerance for commercials and thus channel-surf at a greater rate. Maybe females are more interested. Maybe the attention span of females is greater. Maybe something associated with having a Y chromosome increases channel-surfing, or something associated with having two X chromosomes leads to less channel-surfing. Obviously the possible explanations are numerous and varied. As scientists, we test these possibilities to identify the best explanation of why a behavior occurs.
Prediction allows us to identify the factors that indicate when an event or events will occur. In other words, knowing the level of one variable allows us to predict the approximate level of the other variable.
Variable: An event or behavior that has at least two values
We know that if one variable is present at a certain level, then there is a greater likelihood that the other variable will be present at a certain level. For example, if we observed that males channel-surf with greater frequency than females, we could then make predictions about how often males and females might change channels when given the chance.
1.2 The What
So we want to describe, explain, or predict. The question is: what? We are interested in phenomena; that is, events, characteristics, or behaviors. We could be interested in the channel-surfing behavior of people or the relationship between traffic levels and weather patterns. To do that we have to make observations of the phenomena we want to describe, explain, or predict.
An observation is an instance of a variable
This is a good time to talk about variables. A variable is a phenomenon that can take on two or more values. The weight of the people in a classroom, for instance, is a variable: each person (an observation) has their own weight (a value of the variable).
Data, therefore, is a set of observations that is arranged in a meaningful way. We shall talk about the arrangement of data later. For now let us focus on the observations themselves.
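The relationship between observations, variables, and data can be sketched in a few lines of code. The example below is written in Python purely for illustration (the course itself uses R), and the names and weights are hypothetical values, not real measurements.

```python
# Hypothetical classroom data: each dictionary is one observation (a person),
# and each key is a variable recorded for that person.
observations = [
    {"name": "Ada", "weight_kg": 64.0},
    {"name": "Ben", "weight_kg": 72.5},
    {"name": "Chi", "weight_kg": 65.0},
]

# The variable "weight" is the collection of values it takes across observations.
weights = [row["weight_kg"] for row in observations]

# A variable must be able to take at least two different values.
is_variable = len(set(weights)) >= 2

print(weights)      # [64.0, 72.5, 65.0]
print(is_variable)  # True
```

Arranging the observations as rows and the variables as named fields is one simple way of putting data in a "meaningful" arrangement.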
Data is the focal point of all our statistical activity. How much and what kind of data we have will greatly influence what we can do. It may seem that we are spending a lot of time on details that look unnecessary, but these are the building blocks.
Characteristics or Properties of Data
Data have four properties: identity, magnitude, equal unit size, and absolute zero. When a measure has the property of identity, objects that are different receive different scores. For example, if members of this class had different heights, they would all receive different measurements. Measurements have the property of magnitude (also called ordinality) when the ordering of the numbers reflects the ordering of the variable. In other words, numbers are assigned in order so that some numbers represent more or less of the variable being measured than others.
Measurements have an equal unit size when a difference of 1 is the same amount throughout the entire scale. For example, the difference between people who are 64 kilos and 65 kilos is the same as the difference between people who are 72 kilos and 73 kilos. The difference in each situation (1 kg) is identical. Notice how this differs from the property of magnitude. Were we to simply line up and rank a group of individuals based on their weight, the scale would have the properties of identity and magnitude, but not equal unit size. Can you think about why this would be so? We would not actually measure people’s weight in kilos, but simply order them in terms of how big they appear, from smallest (the person receiving a score of 1) to biggest (the person receiving the highest score). Thus, our scale would not meet the criteria of equal unit size. In other words, the difference in weight between the two people receiving scores of 1 and 2 might not be the same as the difference in weight between the two people receiving scores of 3 and 4.
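The loss of equal unit size under ranking can be made concrete with a small sketch. It is written in Python for illustration only (the course itself uses R), and the four weights are hypothetical values chosen to make the point.

```python
# Hypothetical weights in kilograms for four people (illustrative values only).
weights_kg = [64.0, 65.0, 72.0, 90.0]

# Rank from lightest (1) to heaviest (4), as if ordering people by eye.
# This simple lookup assumes no two people share the same weight.
sorted_weights = sorted(weights_kg)
ranks = [sorted_weights.index(w) + 1 for w in weights_kg]

# Differences between neighbouring ranks are always 1 ...
rank_gaps = [b - a for a, b in zip(ranks, ranks[1:])]
# ... but the underlying differences in weight are not equal.
weight_gaps = [b - a for a, b in zip(sorted_weights, sorted_weights[1:])]

print(ranks)        # [1, 2, 3, 4]
print(rank_gaps)    # [1, 1, 1]
print(weight_gaps)  # [1.0, 7.0, 18.0]
```

The ranks keep identity and magnitude, but a step of 1 in rank corresponds to 1 kg in one place and 18 kg in another.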
Lastly, measures have an absolute zero when a score of zero indicates an absence of the variable being measured. For example, bank account balance would have the property of absolute zero because a score of 0 on this measure would mean an individual has no money in the bank. However, a score of 0 does not always indicate an absolute zero. As an example, think about a temperature scale such as Celsius or Fahrenheit. That measurement scale has a score of 0 (the thermometer can read 0 degrees), but does that score indicate an absence of temperature? No, it simply indicates a cold temperature. Hence, it does not have the property of absolute zero.
SCALES OF MEASUREMENT
Why are the properties of data important? They are important because data in itself might not be very useful. If we have a bunch of data (please do not use this phrase outside of this class!) we want to do something with it. That “something” is called manipulation. I am not talking about the evil and conniving manipulation of soap opera villains. Manipulation is simply the use of some techniques to convert one or more quantities into another quantity or quantities.
Every January, people pledge to join the gym because they have “added weight”. This means they had a starting weight and then gained some more to reach the weight they now have (say 65 kg + 2 kg = 67 kg). Conversely, let’s say that a group of men went to a workshop on how to become better men in society. Can we say they have added “manness”? No. The quantity gender cannot be added. Can you explain why, using the properties of data above?
As noted previously, the level or scale of measurement depends on the properties of the data. There are four scales of measurement (nominal, ordinal, interval, and ratio), and each of these scales has one or more of the properties described in the previous section. As we will see later on, it is important to establish the scale of measurement of your data in order to determine the appropriate statistical test to use when analyzing the data.
A nominal scale is one in which objects or individuals are broken into categories that have no numerical properties. Nominal scales have the characteristic of identity but lack the other properties. Variables measured on a nominal scale are often referred to as categorical variables because the measuring scale involves dividing the data into categories. However, the categories carry no numerical weight. Some examples of categorical variables, or data measured on a nominal scale, include ethnicity, gender, and political affiliation.
An ordinal scale is one in which objects or individuals are categorized and the categories form a rank order along a continuum. Data measured on an ordinal scale have the properties of identity and magnitude but lack equal unit size and absolute zero. Ordinal data are often referred to as ranked data because the data are ordered from highest to lowest, or biggest to smallest. For example, the number ranks students are given in school based on performance in an exam (number 1, 2, etc.) is an ordinal scale. This variable would carry identity and magnitude because each individual receives a rank (a number) that carries identity, and beyond simple identity it conveys information about order or magnitude (how many students performed better or worse in the class).
An interval scale is one in which the units of measurement (intervals) between the numbers on the scale are all equal in size. When using an interval scale, the properties of identity, magnitude, and equal unit size are met. For example, a temperature scale such as Celsius or Fahrenheit is an interval scale of measurement. A given temperature carries identity (days with different temperatures receive different scores on the scale), magnitude (cooler days receive lower scores and hotter days receive higher scores), and equal unit size (the difference between 20 and 21 degrees is the same as that between 30 and 31 degrees). However, such a temperature scale does not have an absolute zero. Because of this, we are not able to form ratios based on this scale (for example, 50 degrees is not twice as hot as 25 degrees).
A ratio scale is one in which, in addition to order and equal units of measurement, there is an absolute zero that indicates an absence of the variable being measured. Ratio data have all four properties of measurement—identity, magnitude, equal unit size, and absolute zero. Examples of ratio scales of measurement include weight, time, and height. Each of these scales has identity (individuals who weigh different amounts would receive different scores), magnitude (those who weigh less receive lower scores than those who weigh more), and equal unit size (1 kg is the same weight anywhere along the scale and for any person using the scale). These scales also have an absolute zero, which means a score of zero reflects an absence of that variable. This also means that ratios can be formed. For example, a weight of 100 kg is twice as much as a weight of 50 kg.
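The contrast between interval and ratio scales can be checked numerically. The sketch below is in Python for illustration only (the course itself uses R). It assumes the temperatures in the interval-scale example are in degrees Celsius, which the text does not state, and it uses an approximate conversion factor for pounds.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

KG_TO_LB = 2.20462  # approximate number of pounds in one kilogram

# Interval scale: the temperatures from the example, assumed to be Celsius.
temp_ratio_celsius = 50.0 / 25.0
temp_ratio_fahrenheit = celsius_to_fahrenheit(50.0) / celsius_to_fahrenheit(25.0)

# Ratio scale: the weights from the example.
weight_ratio_kg = 100.0 / 50.0
weight_ratio_lb = (100.0 * KG_TO_LB) / (50.0 * KG_TO_LB)

print(temp_ratio_celsius)               # 2.0
print(round(temp_ratio_fahrenheit, 2))  # 1.58
print(weight_ratio_kg)                  # 2.0
print(round(weight_ratio_lb, 2))        # 2.0
```

The temperature "ratio" changes when the unit changes, because the zero point of each scale is arbitrary. The weight ratio stays at 2 in any unit, because zero means no weight on every such scale.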
Table 1.1: SCALES OF MEASUREMENT

| | Nominal | Ordinal | Interval | Ratio |
|---|---|---|---|---|
| Example | Ethnicity, Religion, Gender | Class rank, Letter grade | Temperature | Weight, Height, Time |
| Properties | Identity | Identity, Magnitude | Identity, Magnitude, Equal unit size | Identity, Magnitude, Equal unit size, Absolute zero |
| Mathematical Operations | None | Rank order | Add, Subtract, Multiply, Divide | Add, Subtract, Multiply, Divide |
| Typical Statistics Used | Mode, Chi-square | Mode, Median, Wilcoxon test | Mode, Median, Mean, t test, ANOVA | Mode, Median, Mean, t test, ANOVA |
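Table 1.1 can also be read as a lookup: once the scale of a variable is known, its properties and the statistics typically used with it follow. Here is a minimal sketch of that idea in Python (for illustration only; the course itself uses R). The function names `properties_of` and `can_use` are hypothetical helpers, not part of any library.

```python
# Scales ordered from fewest to most properties, as in Table 1.1.
SCALES = ["nominal", "ordinal", "interval", "ratio"]
PROPERTIES = ["identity", "magnitude", "equal unit size", "absolute zero"]

# Typical statistics per scale, copied from Table 1.1.
TYPICAL_STATISTICS = {
    "nominal": ["mode", "chi-square"],
    "ordinal": ["mode", "median", "Wilcoxon test"],
    "interval": ["mode", "median", "mean", "t test", "ANOVA"],
    "ratio": ["mode", "median", "mean", "t test", "ANOVA"],
}

def properties_of(scale):
    """Return the properties a scale has; each scale adds one to the previous."""
    position = SCALES.index(scale)
    return PROPERTIES[: position + 1]

def can_use(statistic, scale):
    """Check whether Table 1.1 lists the statistic for the given scale."""
    return statistic in TYPICAL_STATISTICS[scale]

print(properties_of("ordinal"))       # ['identity', 'magnitude']
print(can_use("mean", "ordinal"))     # False
print(can_use("median", "interval"))  # True
```

Encoding the table this way makes the cumulative pattern explicit: every scale keeps the properties of the scale before it and adds one more.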
Another means of classifying variables is in terms of whether they are discrete or continuous in nature. Discrete variables usually consist of whole-number units or categories. They are made up of chunks or units that are detached and distinct from one another. A change in value occurs a whole unit at a time, and decimals do not make sense with discrete scales. Most nominal and ordinal data are discrete. For example, gender, political party, and ethnicity are discrete scales. Some interval or ratio data can be discrete. For example, the number of children someone has would be reported as a whole number (discrete data), yet it is also ratio data (you can have a true zero and form ratios).
Continuous variables usually fall along a continuum and allow for fractional amounts. The term continuous means that it “continues” between the whole-number units. Examples of continuous variables are age (22.7 years), height (64.5 inches), and weight (113.25 kg). Most interval and ratio data are continuous in nature.
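The difference between discrete and continuous variables shows up in the values themselves. The sketch below is in Python for illustration only (the course itself uses R), and the data are hypothetical. The helper `all_whole_numbers` is only a rough check on a sample: whether a variable is discrete depends on what is being measured, not just on the values that happen to be observed.

```python
# Hypothetical values, for illustration only.
children_per_family = [0, 2, 1, 3, 2]      # discrete: counted in whole units
ages_in_years = [22.7, 35.1, 41.0, 19.25]  # continuous: fractions make sense

def all_whole_numbers(values):
    """Return True if every value in the sample is a whole number."""
    return all(float(v).is_integer() for v in values)

print(all_whole_numbers(children_per_family))  # True
print(all_whole_numbers(ages_in_years))        # False
```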
1.3 The How
Psychiatrists say that we cannot remember the first three years of our lives. Our minds just blot those memories out. For most of us, it is not just these memories that we forget: where we put our keys, our friends’ birthdays, and that all-important meeting are just some of the things we regularly forget. Man is, by his very nature, inadequate. That is why he built machines!
All throughout school, you have been trained to memorize concepts. This is not one of those classes. I do not want you to memorize anything. I want you to understand the concepts. Anything else can always be found when needed. When you understand how to do it, you can always find the means and the tools to do the job required. Especially in exploratory analysis (you don’t have to know what this is now), the job lies in figuring out what you want to do, rather than in the doing itself.
I will try as much as possible to provide reference material that further explains the concepts introduced in this course. Nevertheless, we shall focus on the practical application of the concepts rather than the theoretical.
TOOLS
As mentioned earlier, man built tools to help make his work easier. Therefore, in this class, we shall also make use of statistical tools/packages to help make our work easier. For statistical analysis, we have several options:
 MS Excel
 SPSS
 SAS
 STATA
 R
The five tools above are the most commonly used for statistical analysis. Each one has its own advantages and disadvantages, and the choice of which one to use will depend on many factors, including the expertise of the user, the size of the task, and cost. In this class we shall be using R. There are many reasons for using R, but the main reason we are using it in this class is that it is free, while all the others have to be purchased.
Please see our R Tutorials. For this lesson, we need to be able to install and run R, create data and import data into R.
What is the difference between by-trait factorial analysis and by-person factorial analysis? How would this difference play out for ordinal measures with rank-order entries (entered through a Likert scale or semantic differential) where subjectivity is what is being measured?
I do believe what you are asking has more to do with Personality Theory. Yes, factor analysis is a statistical technique, but it is used in exploratory analysis more than in inferential statistics.