This repository contains the functions that clean and check the raw data files and that ultimately create the final working dataset (PROMISE). It also contains the data dictionary and a description of the methods.

Important: Before contributing to the PROMISE dataset, please read the CONTRIBUTING.md file.

This repository mainly stores the code that creates the final merged PROMISE dataset (and the individual datasets), along with documentation on the dataset. It does not contain the data collection forms, the scrubbing and other functions, or the questionnaires, nor does it contain the Access database with the original raw data.

Companion packages

There are several packages that are companions to preparing the PROMISE dataset.

Installation

I would suggest you install PROMISE.data using this method:

  1. Go to this link for version v0.3.0 (You’ll need access to the repository).
  2. Click the download button, choosing the tar.gz version.
  3. After the download is complete open an R console in the Downloads folder and run:
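# install.packages("devtools")
# Find the downloaded tarball (dots are escaped so they match literally)
promise_gzip <- list.files(pattern = "^PROMISE-v0\\.3\\.0.*\\.tar\\.gz$")
# Unpack it into the current folder
untar(promise_gzip)
# Drop the .tar.gz extension to get the name of the unpacked folder
promise_pkg <- sub("\\.tar\\.gz$", "", promise_gzip)
# Install the package from that folder, along with its dependencies
devtools::install(promise_pkg, dependencies = TRUE, upgrade_dependencies = TRUE)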
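
To confirm that the installation worked, a quick check along these lines may help. This is a minimal sketch using only base R, and it assumes the installed package is named PROMISE.data, as in the usage section below.

```r
# Check whether the package can be found in your library
# (assumed package name: PROMISE.data)
pkg_installed <- requireNamespace("PROMISE.data", quietly = TRUE)

if (pkg_installed) {
    # Report the installed version so you can compare it with the release you downloaded
    message("PROMISE.data version: ",
            as.character(utils::packageVersion("PROMISE.data")))
} else {
    message("PROMISE.data was not found; re-check the installation steps.")
}
```

If the package is not found, the most likely causes are running the commands outside the Downloads folder or a failed dependency install.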

Basic usage

One of the main goals of the PROMISE.data package is to make the data and manuals easily accessible to the end user (us graduate students!). Here are some of the basic commands:

Accessing the dataset is simple:

PROMISE.data::PROMISE

Or to load it into your environment:

library(PROMISE.data)
data('PROMISE')

Other datasets are also available. For a list of them, see ?PROMISE or the datasets vignette (see below for how to view it).

If you want to combine datasets, do something like:

combine_datasets(msd, ogtt)
combine_datasets(PROMISE, dhq)
combine_datasets(PROMISE, form012)
# Or specific variables from specific datasets
library(dplyr)
combine_datasets(
    PROMISE,
    form012 %>%
        select(SID, VN, matches("Neuro")) # for Neuropathy measures
    )
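
For intuition about what combining two datasets involves, here is a rough base R sketch on toy data. It assumes the datasets share the key columns SID and VN (the two columns kept in the select() call above) and that every row from both inputs is kept; whether combine_datasets behaves exactly this way is an assumption, and the toy data frames and their values are made up for illustration.

```r
# Toy stand-ins for two datasets. SID and VN are assumed to be the shared
# key columns; Weight and NeuroScore are hypothetical variables.
visits <- data.frame(
    SID = c(1, 1, 2),
    VN = c(1, 2, 1),
    Weight = c(70.2, 71.0, 82.5)
)
neuro <- data.frame(
    SID = c(1, 2),
    VN = c(1, 1),
    NeuroScore = c(3, 5)
)

# Keep every row from both inputs, matching rows on SID and VN.
# Rows without a match in the other data frame are filled with NA.
combined <- merge(visits, neuro, by = c("SID", "VN"), all = TRUE)
combined
```

Here the row for SID 1, VN 2 has no neuropathy record, so its NeuroScore is NA in the combined result.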

To view manuals:

# See a list of manuals available:
list_manuals()
view_manual('dictionary')
view_manual('methods')
view_manual('datasets')

If you need the dataset in a different format (e.g. to import it into SAS or some other program), you can export it as a .csv file:

export(PROMISE, 'promise-data.csv')
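
If you would rather not rely on export, base R can write a .csv file as well. The sketch below uses a made-up toy data frame in place of PROMISE and a temporary file path, so it runs without the package installed; swap in the real dataset and file name as needed.

```r
# Toy data frame standing in for the PROMISE dataset (values are made up)
toy <- data.frame(SID = c(1, 2), VN = c(1, 1), Weight = c(70.2, 82.5))

# Write to a temporary file so the example leaves nothing behind
csv_file <- tempfile(fileext = ".csv")
write.csv(toy, csv_file, row.names = FALSE)

# Read the file back in to confirm that it was written as expected
toy_again <- read.csv(csv_file)
```

Setting row.names = FALSE keeps R's row numbers out of the file, which tends to make the import into other programs cleaner.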

If you want to add new data or a new variable to the dataset, or to run some quality control commands, see the CONTRIBUTING.md file.