
Example Data Sets for Data Analysis Courses

Right click and “save target as” or “save file as” to download these files.

 

Data Analysis II (Correlation, Regression, & Logistic Regression)

HW 1  Urban Mobility Report 2009 on traffic delays and population for 90 urban areas.

 

HW 2  Urban Mobility Report 2009 (new version) Population and traffic delay data with two regional indicators:  EASTWEST (0=East, 1=West) and REGION (1=Northeast, 2=South, 3=Midwest/West, 4=Southwest)

HW 3 Urban Mobility Report 2009 (same as HW 2).  Population, density, and traffic delay data with two regional indicators:  EASTWEST (0=East, 1=West) and REGION (1=Northeast, 2=South, 3=Midwest/West, 4=Southwest) 
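Because REGION is a nominal code (1-4), it would normally enter a regression as a set of 0/1 indicator variables rather than as a single numeric predictor. The following is a minimal Python sketch of that recoding. The function name `dummy_code`, the indicator names, and the choice of Northeast (1) as the reference category are illustrative assumptions, not part of the assignment or the data file.

```python
# REGION codes as documented for the HW 2 / HW 3 data.
REGION_LABELS = {1: "Northeast", 2: "South", 3: "Midwest/West", 4: "Southwest"}


def dummy_code(region, reference=1):
    """Return 0/1 indicators for every REGION level except the reference."""
    return {
        f"REGION_{level}": int(region == level)
        for level in sorted(REGION_LABELS)
        if level != reference
    }


# One example row per REGION code (hypothetical, not rows from the data file).
for region in [1, 2, 3, 4]:
    print(REGION_LABELS[region], dummy_code(region))
```

With four categories there are three indicators; the reference category is the one for which all three are 0.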

Chilean plebiscite data. VOTE is the survey respondent’s preference six months before the election (0 = Pinochet, 1 = new government). SEX is the sex of the respondent (0=female, 1=male), AGE is the age of the respondent, EDUC is a variable for three levels of education (primary, secondary, post-secondary), INCOME is income in Chilean pesos, and STATQUO is a standardized score on a measure of political support for the status quo.

Data Analysis I (Significance Testing, t-tests, Chi-square, Correlation, Reliability, ANOVA)

HW1 no downloads.

HW2    

Portland police racial profiling data.  Contains 2004 data on the minority status of each driver stopped and whether the driver was arrested.

 

NPC School Survey Data for Reliability Analysis.  (Right click and “save target as” or “save file as” to download  this file). Responses to the following items were on a 4-point scale (0=NO! 1=no 2=yes 4=YES!).

 

CRIME, How much does "crime" and/or "drug selling" describe your neighborhood?

FIGHTS, How much does "fights" describe your neighborhood?

BLDINGS, How much does "lots of empty or abandoned buildings" describe your neighborhood?

GRAFFITI, How much does "lots of graffiti" describe your neighborhood?

GETOUT, I'd like to get out of my neighborhood.
MOVE, People move in and out of my neighborhood.    
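
For reference, one common reliability coefficient for a set of items like these is Cronbach's alpha. The sketch below computes it in plain Python from the usual formula, alpha = (k / (k - 1)) * (1 - sum of item variances / variance of the total score). The responses are made up for illustration and are not taken from the survey file; the function name `cronbach_alpha` is likewise only an assumption for this example.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])                       # number of items
    items = list(zip(*rows))               # one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical responses in the order CRIME, FIGHTS, BLDINGS, GRAFFITI,
# GETOUT, MOVE (illustrative values only, not rows from the data file).
responses = [
    [0, 0, 0, 0, 1, 1],
    [1, 1, 0, 1, 1, 2],
    [2, 2, 1, 2, 2, 2],
    [2, 2, 2, 2, 2, 2],
    [1, 0, 1, 1, 0, 1],
]
print(round(cronbach_alpha(responses), 3))
```

The function uses sample variances (n - 1 in the denominator) and assumes complete data with all items scored in the same direction.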

HW3 
Urban Socioeconomic Income and Growth Data for 50 largest cities with 4 regions (Problems 1-2)

U.S. and Canada artificial car and truck mpg data (Problems 4-5)

Urban Socioeconomic Income and Growth data for 50 largest cities 1960-1970 (Problem 6)

Multilevel Regression

HW 1
School alcohol and drug survey (SPSS).  HLM format of the data file (.mdm).  (Right click and “save target as” or “save file as” to download these files.)

The data set used in this homework is from a real survey about drug use and violence collected from 11th grade students in Oregon by NPC Research, Inc. Twenty-eight schools were selected for these analyses, with 2,283 individuals in total.  Students responded to a wide variety of questions about their drug use, alcohol use, violence, community, and family.  In this homework, we will focus on the possible school-level and individual-level predictors of alcohol use.

Alcohol use over the last year:

0 “none”

1 “1-2 times”

2 “3-5 times”

3 “6-9 times”

4 “10-19 times”

5 “20-39 times”

6 “40+ times”

 

Neighborhood support:

QC7  My neighbors notice when I am doing a good job and let me know.

QC9  There are people in my neighborhood who encourage me to do my best.

QC10  There are people in my neighborhood who are proud of me when I do something well.

QC12  There are lots of adults in my neighborhood I could talk to about something important.

 

Neighborhood erosion:

QC1A How much does "crime and/or drug selling" describe your neighborhood?

QC1B  How much does "fights" describe your neighborhood?

QC1C  How much does "lots of empty or abandoned buildings" describe your neighborhood?

QC1D  How much does "lots of graffiti" describe your neighborhood?

 

HW 2   (Right click and “save target as” or “save file as” to download  these files.)

ABA neighborhood data. HLM format data (.mdm), SPSS format data (.sav).  Note that in the HLM data set, SERVICES is not centered, but it is group-mean centered in the SPSS data set.
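
As a reminder of what group-mean centering does, here is a minimal Python sketch: each respondent's score has the mean of his or her own group subtracted from it. The ratings and neighborhood labels are hypothetical, and the function name `group_mean_center` is an assumption for illustration; it is not how the SPSS file was actually produced.

```python
from collections import defaultdict


def group_mean_center(values, groups):
    """Subtract each group's mean from its members' scores."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    means = {g: sums[g] / counts[g] for g in sums}
    centered = [v - means[g] for v, g in zip(values, groups)]
    return centered, means


# Hypothetical SERVICES ratings for residents of two neighborhoods, "A" and "B".
services = [2.0, 3.0, 4.0, 4.0, 5.0]
neighborhood = ["A", "A", "A", "B", "B"]

centered, means = group_mean_center(services, neighborhood)
print(means)     # one mean per neighborhood
print(centered)  # deviations from each respondent's own neighborhood mean
```

After centering, the scores within every group average zero, so the centered variable carries only within-group variation; the between-group variation sits in the group means.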

 

Descriptive statistics for the HLM data file:

 

                 LEVEL-1 DESCRIPTIVE STATISTICS

 VARIABLE NAME     N       MEAN         SD    MINIMUM    MAXIMUM
 NHRATING        396       3.46       1.05       1.00       5.00
 PROBLEMS        396       2.16       0.77       1.00       4.00
 SERVICES        396       3.44       0.86       1.00       5.00

                 LEVEL-2 DESCRIPTIVE STATISTICS

 VARIABLE NAME     N       MEAN         SD    MINIMUM    MAXIMUM
 MEANSERV         42       3.45       0.43       2.55       4.38

 

 

HW 3   (Right click and “save target as” or “save file as” to download  these files.)

Problems 1-4:  Social Relationships data files:  SPSS, HLM

5-wave longitudinal data collected at 6-month intervals.  The following are descriptive statistics for the HLM data file:

                 LEVEL-1 DESCRIPTIVE STATISTICS

 VARIABLE NAME     N       MEAN         SD    MINIMUM    MAXIMUM
 TIME           1500       2.00       1.41       0.00       4.00
 TIMESQ         1500       6.00       5.90       0.00      16.00
 SUPPORT        1500       2.39       0.78       0.01       4.00
 HEALTH         1500       2.14       1.06       0.00       4.00

                 LEVEL-2 DESCRIPTIVE STATISTICS

 VARIABLE NAME     N       MEAN         SD    MINIMUM    MAXIMUM
 EDUC            300       4.74       1.95       1.00       9.00
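
The TIME and TIMESQ rows can be checked by hand. The sketch below assumes the file is balanced, that is, each of the 300 respondents contributes all five waves (300 x 5 = 1500 level-1 records), with TIME coded 0 through 4 and TIMESQ equal to TIME squared. Under that assumption the means and standard deviations depend only on the five wave codes.

```python
from statistics import mean, pstdev

waves = [0, 1, 2, 3, 4]              # TIME codes for the five waves
waves_sq = [t ** 2 for t in waves]   # TIMESQ = TIME squared: 0, 1, 4, 9, 16

# Population SD is used here; with N = 1500 the n - 1 version
# differs only beyond the second decimal place.
print(mean(waves), round(pstdev(waves), 2))
print(mean(waves_sq), round(pstdev(waves_sq), 2))
```

The results agree with the table to two decimals: TIME has mean 2 and SD of about 1.41, and TIMESQ has mean 6 and SD of about 5.90.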

 

Problems 5-6:  British Elections Study 2005 data files:  SPSS, HLM.

Study of election attitudes in England, Scotland, and Wales.  The following are descriptive statistics for the HLM data file:

                 LEVEL-1 DESCRIPTIVE STATISTICS

 VARIABLE NAME     N       MEAN         SD    MINIMUM    MAXIMUM
 VOTED          2278       0.76       0.43       0.00       1.00
 AGE            2278      46.75      16.01      18.00      99.00
 COLLEGE        2278       0.23       0.42       0.00       1.00
 TRUSTPOL       2278       3.88       2.17       0.00      10.00
 MEANCON        2278       0.12       0.11       0.00       0.57

                 LEVEL-2 DESCRIPTIVE STATISTICS

 VARIABLE NAME     N       MEAN         SD    MINIMUM    MAXIMUM
 MEANCON         255       0.11       0.12       0.00       0.57

 

 

Structural Equation Modeling

HW1

Census undercount data.  From E.P. Ericksen, J.B. Kadane, & J.W. Tukey (1989), "Adjusting the 1980 Census of Population and Housing," Journal of the American Statistical Association, 84:927-944.  Variables are area (AREA), the estimate of the undercount in the area (UNDER), poverty rate (POVERTY), and crimes in the area per 1000 (CRIME).

HW 2

School Survey data from NPC Research, Inc. on neighborhood quality.  ASCII data file and Mplus input statements.

HW 3

School survey data with Gender.  SPSS file, ASCII data file, and Mplus input statements for reading the data:  Problem 1b, Problem 1c, Problems 2a & 2b.

Social relationships data.  ASCII data file and Mplus input statements for reading the data:  Problem 3a, Problem 3b.