Example Data Sets for Newsom's Data Analysis Courses

 

Right click and "save target as" or "save file as" to download these files.

Structural Equation Modeling

HW1 Census undercount data (ASCII data file). Mplus input file. (Store the data and input file in the same folder.) From E.P. Ericksen, J.B. Kadane, & J.W. Tukey (1989), "Adjusting the 1980 Census of Population and Housing," Journal of the American Statistical Association, 84:927-944. Variables are area (AREA), the estimates of undercounts in the area (UNDER), poverty rate (POVERTY), and crimes in the area per 1000 (CRIME).

School Survey data from NPC Research, Inc. on neighborhood quality. ASCII data file and Mplus input statements.

Data Analysis II (Correlation, Regression, & Logistic Regression)

HW 1 From the Socio-Economic Indicators for Functional Urban Regions in the United States, 1820-1970 study (available through the Inter-University Consortium for Political and Social Research; ICPSR). I have selected a few variables for the 100 largest cities for use in these problems: per capita income from 1970 (INCOME); population growth between 1960 and 1970 (given in thousands, so that -81 indicates a decline of 81,000 and 501 is an increase of 501,000); the percentage of the population that are adults in 1970 (ADULTS).

HW 2 New version of the Socio-Economic Indicators for Functional Urban Regions in the United States, 1820-1970 data set with two region variables: a dichotomous variable (cities in the eastern U.S. are coded 0 and cities in the western U.S. are coded 1) and a four-category region variable (REGION), with 1 = Northeast, 2 = Southeast, 3 = Southwest, and 4 = Midwest and West.
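A four-category variable such as REGION is usually entered into a regression as a set of indicator (dummy) variables rather than as the raw 1-4 codes. The sketch below shows one way this recoding might be done. The function name `dummy_code`, the output names `REGION_2` to `REGION_4`, the choice of Northeast as the reference category, and the sample values are all assumptions made for illustration; none of them come from the data file or the homework.

```python
# Region codes as documented for the REGION variable.
REGION_LABELS = {1: "Northeast", 2: "Southeast", 3: "Southwest", 4: "Midwest and West"}


def dummy_code(region, reference=1):
    """Return k-1 indicator variables (0/1) for a single REGION value.

    The reference category gets 0 on every indicator. The default
    reference (1 = Northeast) is an arbitrary choice for this sketch.
    """
    if region not in REGION_LABELS:
        raise ValueError(f"unknown REGION code: {region}")
    return {
        f"REGION_{code}": int(region == code)
        for code in sorted(REGION_LABELS)
        if code != reference
    }


# Hypothetical REGION values for five cities (not taken from the data file).
sample_regions = [1, 4, 2, 3, 4]
coded = [dummy_code(r) for r in sample_regions]

for r, row in zip(sample_regions, coded):
    print(r, REGION_LABELS[r], row)
```

With four categories, three indicators carry all of the information; a city in the reference region is identified by zeros on all three.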

HW 3 Socio-Economic Indicators for Functional Urban Regions in the United States, 1820-1970 (same version as HW 2).

Chilean plebiscite data. VOTE is the survey respondent's preference six months before the election (0 = Pinochet, 1 = new government). SEX is the sex of the respondent (0=female, 1=male), AGE is the age of the respondent, EDUC is a variable for three levels of education (primary, secondary, post secondary), INCOME is income in Chilean Pesos, and STATQUO is a score on a measure of political support for the status quo (standardized scores).

Data Analysis I (Significance Testing, t-tests, Chi-square, Correlation, Reliability, ANOVA)

HW1 no downloads.

HW2

Portland police racial profiling data. Contains 2004 data on the minority status of the driver stopped and whether the driver was arrested.

 

HW3

NPC School Survey Data for Reliability Analysis. Responses to the following items were on a 4-point scale (0=NO! 1=no 2=yes 4=YES!).

 

CRIME, How much does "crime" and/or "drug selling" describe your neighborhood?

FIGHTS, How much does "fights" describe your neighborhood?

BLDINGS, How much does "lots of empty or abandoned buildings" describe your neighborhood?

GRAFFITI, How much does "lots of graffiti" describe your neighborhood?
GETOUT, I'd like to get out of my neighborhood.
MOVE, People move in and out of my neighborhood.
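The six items above are supplied for a reliability analysis. One widely used internal-consistency coefficient is Cronbach's alpha; the page does not say which statistic the homework asks for, so the following is only a minimal sketch of that one coefficient. The function name `cronbach_alpha` and the response values are hypothetical and are not drawn from the survey file.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Made-up responses for five students, in the item order
# CRIME, FIGHTS, BLDINGS, GRAFFITI, GETOUT, MOVE.
responses = [
    [0, 0, 0, 1, 0, 1],
    [1, 1, 0, 1, 1, 2],
    [2, 1, 1, 2, 2, 2],
    [0, 1, 0, 0, 1, 1],
    [2, 2, 1, 2, 2, 1],
]

print(round(cronbach_alpha(responses), 3))
```

Items that rise and fall together make the variance of the total score large relative to the sum of the item variances, which pushes alpha toward 1.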

Socioeconomic data set with 4 regions. From the Socio-Economic Indicators for Functional Urban Regions in the United States, 1820-1970 study (available from ICPSR).

Artificial data set on gas mileage in US and Canada.

Socioeconomic data with 1960 and 1970 income and positive/negative growth indicator.

Multilevel Regression

HW 1 ABA neighborhood data: A community survey of 378 respondents from 42 neighborhoods conducted by the American Bar Association (ABA). SPSS Data File, HLM Data File

Descriptive Statistics

LEVEL-1

VARIABLE NAME      N     MEAN       SD    MINIMUM    MAXIMUM
NHRATING         378     3.45     1.05       1.00       5.00
DRUGS            378     2.04     0.74       1.00       3.00
FEAR             378     2.28     1.12       1.00       4.00

LEVEL-2

VARIABLE NAME      N     MEAN       SD    MINIMUM    MAXIMUM
MEANPROB          42     2.14     0.55       1.16       3.31

 

HW 2 Head Start mental health consultation survey. 559 teachers in 66 programs. SPSS Data File, HLM Data File

 

LEVEL-1

VARIABLE NAME      N         MEAN           SD    MINIMUM      MAXIMUM
HELPED           559         2.96         0.71       1.00         4.00
SHARE            559         3.54         0.66       1.00         4.00
CULTCOMP         559         3.34         0.57       1.00         4.00

LEVEL-2

VARIABLE NAME      N         MEAN           SD    MINIMUM      MAXIMUM
MONEY             66     26125.98     42147.09       0.00    235000.00

 

HW 3 Social relationships study. SPSS Data File, HLM Data File

 

Descriptive statistics

LEVEL-1

VARIABLE NAME       N     MEAN       SD    MINIMUM    MAXIMUM
TIME             1500     2.00     1.41       0.00       4.00
TIMESQ           1500     6.00     5.90       0.00      16.00
SUPPORT          1500     2.39     0.78       0.01       4.00
HEALTH           1500     2.14     1.06       0.00       4.00

LEVEL-2

VARIABLE NAME       N     MEAN       SD    MINIMUM    MAXIMUM
EDUC              300     4.74     1.95       1.00       9.00
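The TIME and TIMESQ rows are consistent with a balanced design in which each of the 300 level-2 units is observed at five occasions coded 0 through 4, with TIMESQ equal to TIME squared. The page does not state this design, so treat it as an assumption inferred from the Ns (1500 = 300 x 5) and the minimum and maximum values. Under that assumption, the following sketch rebuilds the two variables and recomputes their descriptive statistics.

```python
from statistics import mean, stdev

N_PERSONS = 300          # level-2 N reported in the table
WAVES = [0, 1, 2, 3, 4]  # assumed occasion codes (TIME minimum 0, maximum 4)

# One record per person per occasion: 300 x 5 = 1500 level-1 rows.
time = WAVES * N_PERSONS
timesq = [t ** 2 for t in time]  # assumed definition: TIMESQ = TIME squared

for name, values in (("TIME", time), ("TIMESQ", timesq)):
    print(
        name,
        len(values),
        round(mean(values), 2),
        round(stdev(values), 2),  # sample SD (n - 1 denominator)
        min(values),
        max(values),
    )
```

The recomputed means and standard deviations round to the tabled values (2.00 and 1.41 for TIME, 6.00 and 5.90 for TIMESQ), which supports, but does not prove, the assumed coding.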

 

British election study. HLM Data File

 

Descriptive statistics

LEVEL-1

VARIABLE NAME       N     MEAN       SD    MINIMUM    MAXIMUM
VOTED            2278     0.76     0.43       0.00       1.00
AGE              2278    46.75    16.01      18.00      99.00
COLLEGE          2278     0.23     0.42       0.00       1.00
TRUSTPOL         2278     3.88     2.17       0.00      10.00
MEANCON          2278     0.12     0.11       0.00       0.57

LEVEL-2

VARIABLE NAME       N     MEAN       SD    MINIMUM    MAXIMUM
MEANCON           255     0.11     0.12       0.00       0.57

 

 

 

 

 

Data Analysis II (Correlation, Regression, & Logistic Regression)

HW 1 Urban Mobility Report 2009 on traffic delays and population for 90 urban areas.

 

HW 2 Fertility rate data. Contraceptive use and fertility in developing countries collected by Robey, Shea, Rutstein, and Morris (1992). Variables are COUNTRY, REGION (1=Central and Southern Africa, 2=Asia and Pacific Islands, 3=Latin America and Caribbean, 4=Near East and North Africa), FERTRATE, CNTRCPT.

 

HW 3

School survey. The data set used in this homework is from a real survey about drug use and violence collected from 11th grade students in Oregon by NPC Research, Inc. Twenty-eight schools were selected for these analyses, with 2,433 individuals total. Students responded to a wide variety of questions about their drug use, alcohol use, violence, community, and family. The data set contains three dichotomous variables, GUN (0=no, 1=yes), RACE (0=white, 1=nonwhite), and GANG (0=no, 1=yes), as well as the following three measures of alcohol use (ALCOHOL), neighborhood support (SUPPORT), and the condition of the neighborhood (EROSION):

ALCOHOL: Alcohol use over the last year:

0 "none"

1 "1-2 times"

2 "3-5 times"

3 "6-9 times"

4 "10-19 times"

5 "20-39 times"

6 "40+ times"

 

SUPPORT—Neighborhood support:

QC7 My neighbors notice when I am doing a good job and let me know.

QC9 There are people in my neighborhood who encourage me to do my best.

QC10 There are people in my neighborhood who are proud of me when I do something well.

QC12 There are lots of adults in my neighborhood I could talk to about something important.

 

EROSION—Neighborhood erosion:

QC1A How much does "crime and/or drug selling" describe your neighborhood?

QC1B How much does "fights" describe your neighborhood?

QC1C How much does "lots of empty or abandoned buildings" describe your neighborhood?

QC1D How much does "lots of graffiti" describe your neighborhood?

 

 
