| NHAPS.Data {heR.ActivityData} | R Documentation |
The complete database with variable labels for the USEPA's 1992-94 National Human Activity Pattern Survey (NHAPS)
data(nhaps.main)
data(nhaps.diary)
...
| nhaps.main | NHAPS main data |
| nhaps.diary | NHAPS 24-hour diary data |
| nhaps.main.labels | Variable labels for the main data, also embedded in nhaps.main as comments |
| nhaps.diary.labels | Variable labels for the diary data, also embedded in nhaps.diary as comments |
| nhaps.weights.klepeis | NHAPS weights from Klepeis et al. (1996) |
| nhaps.diarycodes.klepeis | NHAPS diary codes from Klepeis et al. (1996) |
| nhaps.ca.weights.klepeis | NHAPS-California weights from Tsang and Klepeis (1998) |
| nhaps.loc.codes.short | NHAPS location codes with short names |
| nhaps.loc.codes.long | NHAPS location codes with long names |
| nhaps.act.codes.short | NHAPS activity codes with short names |
| nhaps.act.codes.long | NHAPS activity codes with long names |
R data-frame objects with varying numbers of variables and records.
Variable labels are provided as comments for each variable, and value
labels are provided as factors for each categorical variable or
as $value.label attributes.
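As a minimal sketch of how these labels can be inspected, the example below builds a small stand-in data frame and reads the labels back with base R's comment(), levels(), and attr(). The variable names sex and zip, their values, and their labels are hypothetical and are not taken from NHAPS. The attribute name value.labels is also an assumption; use whichever name the loaded objects actually carry.

```r
# Toy stand-in for an NHAPS-style data frame; names and values are hypothetical.
toy <- data.frame(
  sex = factor(c(1, 2, 2), levels = c(1, 2), labels = c("MALE", "FEMALE")),
  zip = c(94305, 99999, 20742)
)

# Variable labels stored as comments on each column.
comment(toy$sex) <- "Respondent sex"
comment(toy$zip) <- "Zip code of residence"

# Value labels for a non-factor variable, stored as a named numeric vector
# (the attribute name "value.labels" is assumed for this sketch).
attr(toy$zip, "value.labels") <- c("DK/REF" = 99999)

# Collect all variable labels into one named character vector.
var.labels <- sapply(toy, comment)
print(var.labels)

# Value labels: factor levels for categorical variables, attribute otherwise.
print(levels(toy$sex))
print(attr(toy$zip, "value.labels"))
```

The same calls should apply to a column of nhaps.main or nhaps.diary once the data have been loaded with data().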
The United States Environmental Protection Agency (USEPA) sponsored a 1992-94 national study of human activity patterns (n=9386) conducted by the University of Maryland's Survey Research Center. It has come to be called the National Human Activity Pattern Survey, or NHAPS. The study was intended to provide data for use in models of human exposure to household pollutants in air and water. It was conducted by telephone and includes a minute-by-minute "day-before" diary in which respondents recounted their activities and the locations they visited over a full 24-hour period.
For various descriptions and analyses of the data, including two published journal articles, please consult the references listed below.
The main NHAPS reference is Klepeis et al. (2001) in the Journal of Exposure Analysis and Environmental Epidemiology. Please contact the lead author, Neil Klepeis (also the author of this R package), to obtain a reprint.
The NHAPS data have been included in the USEPA's Consolidated Human Activity Pattern Database (CHAD), which is available on the world wide web at http://www.epa.gov/chadnet1. A number of other activity pattern databases are part of CHAD, including the two CARB studies that are part of the present R package.
For detailed information on the data sets, variable codes, weights, survey questionnaires, and general survey methods, please consult the following documentation files, which were originally distributed by the University of Maryland's Survey Research Center in May 1995 along with the raw data in this package:
diary.codebook
main.codebook
survey.methods
adult.quest.a
adult.quest.b
child.quest.a
child.quest.b
proxy.quest.a
proxy.quest.b
Each of the data frames nhaps.main and
nhaps.diary contains encoded factors for variables with 12 or fewer
distinct values, as derived from the
variable value labels in the original SPSS data files. Continuous
variables and variables with more than 12 distinct values,
such as location codes, activity codes, and zipcodes, retain their
numeric values, with any value labels stored in a $value.label attribute.
Some of these variables have labels for only a small number of
the possible values (e.g., 99999 = "DK/REF" for zipcodes).
For non-factor variables with value labels,
the $value.labels attribute of the
given variable contains the labels and numeric values in the form of
a named numeric vector.
The nhaps.diarycodes.klepeis data frame
also contains a $value.label attribute giving the recoded
location and activity codes. In addition, the separate
nhaps.xxx.codes.xxxx data objects in this package contain
numeric codes and labels (both short and long)
for each original NHAPS location and activity.
The $value.label attribute for non-factor variables can be
used to encode the variable data as a factor by using the numerical
values as the levels and the vector names as the labels, i.e.,
factor(<variable>, levels=<value.labels>, labels=names(<value.labels>)). Be careful
to add levels and labels for any distinct values in the data that you want to keep but that
are not already labeled. Values without a level will be converted
to NA.
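The following is a minimal sketch of that encoding on made-up data. The vector loc, its codes, and its labels are hypothetical, and the attribute name value.labels is assumed. It first shows an unlabeled value turning into NA, then one possible way to keep such values by using the numeric code itself as the label.

```r
# Hypothetical coded variable with a value-label attribute (not real NHAPS data).
loc <- c(1, 2, 2, 3, 7)
attr(loc, "value.labels") <- c("Kitchen" = 1, "Bedroom" = 2, "Office" = 3)

vl <- attr(loc, "value.labels")

# Direct encoding: the unlabeled value 7 is not among the levels and becomes NA.
f1 <- factor(loc, levels = vl, labels = names(vl))
print(f1)

# Extend the labels with any observed values that lack one, using the
# numeric value itself as the label, so that no observation is lost.
extra <- setdiff(unique(loc), vl)
vl.all <- c(vl, setNames(extra, as.character(extra)))
f2 <- factor(loc, levels = vl.all, labels = names(vl.all))
print(f2)
```

Whether unlabeled codes should be kept, relabeled, or left as NA depends on the analysis; the second form simply makes the choice explicit.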
If you find that a particular variable has a large number of missing values, it may be that the variable had 12 or fewer distinct values, not all of which had associated value labels, so that the unlabeled values were converted to missing values when the variable was encoded as a factor. This is unlikely, so before reaching this conclusion please check that the missing data have not arisen because of the questionnaire version that was used. Note that not every question was included on both versions A and B of the questionnaire, and that some variables depend on a "YES" answer for another variable. Also check other reported statistics from NHAPS; see the references below for previously reported results of NHAPS data analyses.
The SPSS code used by the Survey Research Center of the University of Maryland to process the ASCII raw data contained commands (included below) to create new and recoded variables for education, weights, and age.
The composite weight variable WEIGHT is
the adjusted sum of the original ADWKWTA and KIDWKWTA
weight variables. These original weighting variables are the adult and child week weights,
which contain adjustments to correct for stratification across quarter,
stratification by Census region,
oversampling of weekends, increased probability of selection for
multi-phone households,
and decreased probability of selection in multi-adult or multi-child
households. The adult and child weights can be summed because they are zero
for the opposite group (e.g., adults have values of zero for KIDWKWTA),
so WEIGHT can apparently be used for analyses across
all groups. This weight might also be used as a basis to create
post-stratification adjustment weights for individual days of the week,
gender, age, and other important subgroups,
as was done by Klepeis, Tsang, and Behar (1996) for the nation and Tsang and Klepeis (1998) for Californians. These post-stratification weights are included in the nhaps.weights.klepeis and nhaps.ca.weights.klepeis data frame objects. Please note that none of the original NHAPS
weights included in nhaps.main incorporate post-stratification
adjustments.
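To illustrate why the two weights can be summed, here is a small sketch on made-up respondents. The data values and the minutes variable are hypothetical; only the variable names ADWKWTA, KIDWKWTA, and WEIGHT and the 9386/9387 rescaling factor come from the SPSS code in this documentation. It uses weighted.mean() from base R.

```r
# Hypothetical respondents: two adults and two children (values are made up).
toy <- data.frame(
  ADWKWTA  = c(1.2, 0.8, 0.0, 0.0),  # adult week weight, zero for children
  KIDWKWTA = c(0.0, 0.0, 1.5, 0.5),  # child week weight, zero for adults
  minutes  = c(60, 120, 200, 90)     # hypothetical time spent in some location
)

# Because each weight is zero for the opposite group, the sum gives every
# respondent exactly one nonzero weight. The rescaling factor follows the
# COMPUTE WEIGHT command in the SPSS code.
toy$WEIGHT <- (toy$ADWKWTA + toy$KIDWKWTA) * (9386 / 9387)

# Weighted mean across all respondents, adults and children together.
wm <- weighted.mean(toy$minutes, toy$WEIGHT)
print(wm)
```

Note that a constant rescaling factor cancels in a weighted mean, so it matters only for weighted totals.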
The SPSS code for creating the new variables is as follows:
COMPUTE REduc=Educ.
FORMAT REduc (F2).
RECODE REduc (LOWEST THRU 11 = 1) (12 = 2) (13 THRU 15 = 3) (16 = 4) (17 THRU 18 = 5) (99 = 9).
VALUE LABELS REduc 1 'LESS THAN HIGH SCHOOL' 2 'HIGH SCHOOL GRAD' 3 'SOME COLLEGE' 4 'COLLEGE GRAD' 5 'POST GRAD' 9 'REF'.
MISSING VALUES REduc (9).
COMPUTE WEIGHT=(ADWKWTA+KIDWKWTA)*(9386/9387).
compute Q=quarter.
recode Q(1 thr 5=1) (6 thr 8=6).
if (q=1) age1=yob.
recode age1 (89 thr 93=1) (82 thr 88=2) (76 thr 81=3) (69 thr 75=4) (59 thr 68=5) (49 thr 58=6) (39 thr 48=7) (29 thr 38=8) (19 thr 28=9) (0 thr 18=10).
if (q=6) age2=yob.
recode age2 (90 thr 94=1) (83 thr 89=2) (77 thr 82=3) (70 thr 76=4) (60 thr 69=5) (50 thr 59=6) (40 thr 49=7) (30 thr 39=8) (20 thr 29=9) (0 thr 19=10).
compute age=age1.
if (q=6) age=age2.
format age (f2).
value label age 1 '<5' 2 '5-11' 3 '12-17' 4 '18-24' 5 '25-34' 6 '35-44' 7 '45-54' 8 '55-64' 9 '65-74' 10 '75+' 99 'Ref'.
missing value age (99).
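For readers without SPSS, the education and age recodes above can be approximated in R. This is a sketch under stated assumptions, not the code used to build the package: the function names recode.educ and recode.age are hypothetical, the inputs are assumed to be integer years of education, a quarter number from 1 to 8, and a two-digit year of birth. Values not listed in the SPSS RECODE commands become NA here, whereas SPSS would leave them unchanged.

```r
# Education recode (REduc). Values above 18, including 99 ("REF"), become NA.
recode.educ <- function(educ) {
  cut(educ, breaks = c(-Inf, 11, 12, 15, 16, 18),
      labels = c("LESS THAN HIGH SCHOOL", "HIGH SCHOOL GRAD",
                 "SOME COLLEGE", "COLLEGE GRAD", "POST GRAD"))
}

age.labels <- c("<5", "5-11", "12-17", "18-24", "25-34", "35-44",
                "45-54", "55-64", "65-74", "75+")

# Age recode from quarter and two-digit year of birth (yob).
recode.age <- function(quarter, yob) {
  # Upper bounds of the year-of-birth bands for quarters 1-5, oldest band first.
  upper <- c(18, 28, 38, 48, 58, 68, 75, 81, 88, 93)
  # Quarters 6-8 use the same bands shifted up by one year.
  shift <- ifelse(quarter >= 6, 1, 0)
  band <- cut(yob - shift, breaks = c(-Inf, upper), labels = rev(age.labels))
  # Reorder the levels from youngest to oldest, as in the SPSS value labels.
  factor(as.character(band), levels = age.labels)
}

# Example calls on made-up inputs.
educ.out <- recode.educ(c(8, 12, 14, 16, 18, 99))
age.out  <- recode.age(quarter = c(1, 6, 1, 6, 1), yob = c(93, 94, 18, 19, 59))
print(educ.out)
print(age.out)
```

Shifting the year of birth by one for the later quarters reproduces the second set of SPSS bands without repeating the break points.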
Neil E. Klepeis compiled this R package. Please report errors or other problems to http://klepeis.net, http://exposurescience.org
I received the NHAPS raw ASCII data, SPSS data files (with encoded variable and value labels), codebooks, methods report, and questionnaires (all on CD-ROM) from Timothy Triplett of the University of Maryland, Survey Research Center, in fall 1995.
Glen, G., Lakkadi, Y., Tippett, J.A., and del Valle-Torres, M. Development of NERL/CHAD: the national exposure research laboratory consolidated human activity database. Report for ManTech Environmental Technology, Inc: Research Triangle Park, NC, 1997.
Klepeis, N.E., Nelson, W.C., Ott, W.R., Robinson, J.P., Tsang, A.M., Switzer, P., Behar, J.V., Hern, S.C., and Engelmann, W.H. The National Human Activity Pattern Survey (NHAPS): a resource for assessing exposure to environmental pollutants. Journal of Exposure Analysis and Environmental Epidemiology. 2001: 11(3): 231-252.
Klepeis, N.E. An introduction to the indirect exposure assessment approach: modeling human exposure using microenvironmental measurements and the recent national human activity pattern survey. Environ. Health Perspect. 1999: 107(Suppl. 2): 365-374.
Klepeis, N.E., Tsang, A.M., and Behar, J.V. Analysis of the National Human Activity Pattern Survey (NHAPS) Respondents from a Standpoint of Exposure Assessment. Final EPA Report, EPA/600/R-96/074: Washington, D.C., 1996.
McCurdy, T., Glen, G., Smith, L., and Lakkadi, Y. The national exposure research laboratory's consolidated human activity database. J. Expos. Anal. Environ. Epidemiol. 2000: 10(6): 566-578.
Nelson, W.C., Ott, W.R., and Robinson, J.P. The national human activity pattern survey (NHAPS): use of nationwide activity data for human exposure assessment. Paper number 94-WA75A.01 presented at the A&WMA 87th Annual Meeting: Cincinnati, OH, 1994; and EPA Report, 600/A-94/147, AREAL: Research Triangle Park, O-67, 1994.
Robinson, J.P., and Blair, J. Estimating exposure to pollutants through human activity pattern data: the national micro-environmental activity pattern survey. Annual Report, Survey Research Center, University of Maryland: College Park, Maryland, 1995.
Tsang, A.M. and Klepeis, N.E. Descriptive statistics tables from a detailed analysis of the national human activity pattern survey (NHAPS) data. Final EPA Report, EPA/600/R-96/148: Washington, D.C., 1996.
Tsang, A.M. and Klepeis, N.E. Three Telephone Surveys of Human Activities in California: The 1992-94 National Human Activity Pattern Survey; The 1987-88 California Activity Pattern Survey of Adults and Teenagers; and the 1989-90 California Activity Pattern Survey of Children. Internal EPA Report: Las Vegas, NV, 1998.