
SOCIOLOGY AND THE INTERNET
Exercise 5: Doing Data Analysis and Testing Hypotheses Online
Robert E. Wood
Department of Sociology, Anthropology and Criminal Justice
Spring Semester 2000

The purpose of this exercise is twofold. First, it aims to teach and/or reinforce skills
in elementary data analysis. Multivariate analysis looks at how different variables are
related to each other. In the class lab session and in this assignment, we will explore
how to produce and interpret cross-tabulations of two variables, and how to introduce a
third, control, variable. Second, we will learn how to do these types of multivariate
analysis over the internet using data from the General Social Survey. Knowing how to do this has many
potential practical applications. When writing papers for other classes, you can easily
call up and print out data on attitudes and behavior that can support or test your
arguments. In business and service agency settings, you can immediately access data to
help guide decisions about marketing, program design, and many other things. With not only
the General Social Survey but several other major data sets at this web site at your
instant command, you can become an essential resource person in a variety of group and
organizational settings.
Due Date: Tuesday, March 9th
Preparation: In class lab sessions in February, we will learn about several
interactive websites that allow you to select and analyze data over the internet.
Part of these sessions will involve a review of the basic
steps in quantitative data analysis, including:
- Formulating a two-variable hypothesis
- Operationalizing variables
- Creating a cross-tabulation
- Determining whether the data are statistically significant
- Deciding whether your hypothesis is supported or not
- Introducing a third, control variable, and seeing how it
affects the original relationship.
Before proceeding with the computing part of this exercise,
you should plan ahead what you are going to do. This involves two steps:
1) I want you to explore two ways in which social class
background is related to attitudes, behavior, or social attributes using data from the General Social Survey. As
you undoubtedly know from other sociology courses, social class is one of the most central
concepts in sociology, and one of the most important predictors of individual behavior and
life chances. To measure social class, we will use a self-identification question in which
the respondents are asked to choose between different class labels. The code for this
variable is "class". This will be your independent variable.
To choose your dependent variables, consult the "Standard Variable List" for the
1972-1994 General Social Survey Cumulative File. This is
available online, but I have printed it out and appended excerpts to these instructions
for your convenience. You can probably figure out what most of them involve, but you can
always call up the full question by going to the codebook site and clicking on the
question you are interested in. Choose two variables (from different sections of
the list) that you think are likely to be related to class, and think about what kind of
relationship you expect to find. For each of these, write down a hypothesis.
For example, if you were to choose "happy", think about whether you would expect
happiness to increase or decrease as you go up the class scale. Write down the underlined
code names for these two variables and your hypothesis about how each is related to social
class. In most cases, your hypothesis should be in the form: The
higher a person's social class position, the less likely (more likely) is the person
to.....
Worksheet
| List Your Two Dependent Variables | State Your Hypothesis For Each Variable |
| 1. | 1. |
| 2. | 2. |
Using the instructions below, you will generate
cross-tabulations of these two variables with the "class" variable and interpret
the results.
2) In the second part of this exercise, I want you to
introduce a "control" variable. Controlling for a third variable allows you to
see if the original relationship between two variables is altered when you compare people
who have similar characteristics on the third. In this exercise, the control variable you
will use is "race". What you want to see is whether controlling for race alters
the association you found between "class" and one of the variables you listed
above.
Running the Cross-Tabulations: If
you are reading a hardcopy version of these instructions, now proceed to a computer with
internet access. Access this page from the course home page at http://camden-www.rutgers.edu/~wood/445ex5.html,
then click on
SDA Survey Data Archive for Survey
Documentation and Analysis
You are now ready to proceed with your analysis.
You will note that there are actually several other databases you
can use, but for the purposes of this analysis, click on GSS Cumulative Data
File 1972-1996. At the webpage that appears, under "Select an
action," the default is to browse the codebook, but we don't have to do that since
you've already chosen your variables. So with the mouse, click on the icon to the left of
"Run Crosstabulation." Then click on the box marked "Start" and then
on "Continue Submission" if an intermediate screen appears.
You should now be at a screen that will read:
SDA Tables Program
(Selected study: GSS 1972-1996 Cumulative Datafile)
In the "Horizontal" box, type "class" as
your independent variable.
In the "Vertical" box, type the code name of the first dependent variable you
chose. Then:
Click on "Vertical" for "Percentaging"
Click on "Yes" for "Statistics"
Click on "Yes" for "Question Text?"
Then click on the box, "Run the Table."
In a few seconds, you will see the cross-tabulation you have
produced. Make sure that your independent variable is listed as the horizontal variable
and your dependent variable as the vertical variable. Print out your table.
(You may choose to save it as a text file and paste it into your final paper, although
you'll have to fix some of the lost formatting.)
Now click on "Back" and run and print out the table
for your other variable, entering your second variable code name in the
"Vertical" box and keeping "class" in the "Horizontal" box.
Be sure that "Percentaging", "Statistics", and "Question Text?" are set exactly
as above.
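If it helps to see what the table program is doing, here is a minimal sketch in Python of a cross-tabulation percentaged within each category of the independent variable. It is only an illustration, not the SDA program itself: the function name `crosstab_percent` and the eight respondents are invented for this example and are not General Social Survey data.

```python
from collections import Counter

# Hypothetical respondents as (class, happy) pairs. These values are
# invented for illustration; they are NOT General Social Survey data.
respondents = [
    ("working", "very happy"), ("working", "pretty happy"),
    ("working", "not too happy"), ("working", "pretty happy"),
    ("middle", "very happy"), ("middle", "very happy"),
    ("middle", "pretty happy"), ("middle", "not too happy"),
]

def crosstab_percent(pairs):
    """Return {column: {row: percent}}, percentaged within each column.

    The column is the independent variable (here, class) and the row is
    the dependent variable, so each column's percentages add up to 100.
    """
    counts = Counter(pairs)                         # count for each cell
    col_totals = Counter(col for col, _ in pairs)   # N in each class category
    table = {}
    for (col, row), n in counts.items():
        table.setdefault(col, {})[row] = 100.0 * n / col_totals[col]
    return table

table = crosstab_percent(respondents)
for col in sorted(table):
    for row in sorted(table[col]):
        print(f"{col:8s} {row:14s} {table[col][row]:5.1f}%")
```

Reading the result works the same way as reading your printout: compare the percentages across the class categories for one answer (for example, "very happy") to see whether they rise or fall as you move up the class scale.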
Before proceeding further, examine the chi-square
"p" value to make sure that at least one of your tables is statistically
significant (that is, there is no more than a one-in-twenty
probability that the results could be obtained by chance); "p" must be .05 or
less. If neither of your hypotheses produced statistically significant results,
experiment with alternative dependent variables until you get one relationship that meets
the test of statistical significance.
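For those curious about where the "p" value comes from, the following is a rough sketch in Python of a chi-square test on a table of counts. It assumes the standard Pearson chi-square test of independence; the website's own computation may differ in detail. The function names and the counts are invented for illustration, and the sketch assumes every row and column total is greater than zero.

```python
import math

def chi_square(observed):
    """Pearson chi-square statistic and degrees of freedom for a table of counts.

    `observed` is a list of rows; assumes no row or column total is zero.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # count expected if unrelated
            stat += (obs - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

def chi_square_p(stat, df):
    """Probability of a chi-square value at least this large arising by chance."""
    y = stat / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)               # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))    # closed form for df = 1
    while a < df / 2.0:                         # step up two degrees of freedom at a time
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

# Hypothetical counts: rows are three answers, columns are two class categories.
observed = [[30, 50], [40, 30], [30, 20]]
stat, df = chi_square(observed)
p = chi_square_p(stat, df)
print(f"chi-square = {stat:.3f}, df = {df}, p = {p:.4f}")
print("statistically significant" if p <= 0.05 else "not statistically significant")
```

The last line applies the same rule you will use on your printouts: a "p" of .05 or less counts as statistically significant.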
You now have just one more computer task to perform, which is
to rerun one of your cross-tabulations using a control variable. Choose one of your two
variables; it should be one where the differences were statistically significant (p = .05 or
less). Set up your variables exactly as before (using your chosen variable and class
again), but this time type in "race" in the "Control" box. Now run
this table (which will come out as four, one for each racial category and one
for all combined) and print these tables out also. (The electronically-savvy among you may
want to save the tables as a text file, in order subsequently to paste them electronically
into your paper.) Assuming you have three sets of tables in hand, you are now finished
with the computer work necessary for this project. If you are doing this at the computer
lab, be sure to exit fully out of your account.
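To make the idea of a control variable concrete, here is a small Python sketch of what happens when a third variable is introduced: the same cross-tabulation is run separately for each category of the control variable, plus once for everyone combined. The function names, the control categories "group_a" and "group_b", and all of the records are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

def column_percents(pairs):
    """Percent of each (column, row) cell within its column."""
    counts = Counter(pairs)
    totals = Counter(col for col, _ in pairs)
    return {cell: 100.0 * n / totals[cell[0]] for cell, n in counts.items()}

def controlled_tables(records):
    """One table per control category, plus an 'all' table for everyone.

    Each record is (control, independent, dependent). Assumes no control
    category is itself named 'all'.
    """
    groups = defaultdict(list)
    for control, col, row in records:
        groups[control].append((col, row))
        groups["all"].append((col, row))
    return {name: column_percents(pairs) for name, pairs in groups.items()}

# Hypothetical records; the control categories and values are invented.
records = [
    ("group_a", "working", "happy"), ("group_a", "working", "unhappy"),
    ("group_a", "middle", "happy"), ("group_a", "middle", "happy"),
    ("group_b", "working", "happy"), ("group_b", "working", "unhappy"),
    ("group_b", "middle", "happy"), ("group_b", "middle", "unhappy"),
]

tables = controlled_tables(records)
for name in sorted(tables):
    print(name)
    for (col, row), pct in sorted(tables[name].items()):
        print(f"  {col:8s} {row:8s} {pct:5.1f}%")
```

In this made-up data the class difference appears in one control category but not in the other, which is the kind of pattern to look for when you compare your own sub-tables.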

Interpreting and Writing Up Your Findings: In
your written report, I want you to provide a brief summary of your findings, table by
table. Label your printouts Tables 1-3 (the last table should include the two sub-tables
for whites and for blacks--omit the others) and cut and paste them into your report
(scissors and tape are fine; advanced computer users may download the tables and insert
them into a word-processing program if they wish). In the body of your report, for each of
the first two tables, be sure to include the following:
- State your hypothesis and explain briefly your reasoning
behind it. [For example: The higher the social class of a person, the more likely he or
she is to be happy. This seems likely because people of higher social classes are more
likely to have the material and other resources to secure the things that are pleasurable
to them.]
- Paste in the relevant table.
- State whether the results were statistically significant.
[That is, is the chi-square p value .05 or less?]
- Was your hypothesis supported? If the differences in the data
were not statistically significant, the hypothesis is rejected. If the data were
statistically significant, you still need to examine the table to see if the data go in
the direction of your hypothesis. State your conclusion about whether your hypothesis was
supported. Summarize briefly what the data say.
- Repeat this for your second variable.
For the final set of tables (with the control variable),
examine the tables for whites and blacks to see if controlling for race produces different
results, and respond to these questions:
- Do the class differences hold when race is held constant? Or
does race significantly modify the relationship you found in your earlier table?
- If the introduction of race does modify your original
findings, explain how. Are the findings changed for both whites and blacks or just for one
group?
- Based on your findings when you control for race, which
variable would you conclude is more important for explaining variation in the dependent
variable: class or race? Summarize your interpretation of the data.
Jan. 6, 2000