Part 2 of a case study in three parts, illustrating how we work with longitudinal student-level records.
Goals. Introducing the study.
Data. Transforming the data to yield the observations of interest.
Results. Summary statistics, metric, chart, and table.
Method
Our data processing goal is to reduce the source data tables to the specific observations needed to compute our metrics. The data processing tasks include filtering observations, creating, renaming, and recoding variables, and joining data frames.
The analysis is organized to produce two data frames—students ever enrolled in the programs and students graduating from the programs—that are joined and written to file as a starting point for developing the results.
Reminder. midfielddata datasets are for practice, not research.
Load data
Start. If you are writing your own script to follow along, we use these packages in this article:
Load. Practice datasets. View data dictionaries via
?student
, ?term
, ?degree
.
# Load practice data
data(student, term, degree)
Initial processing
Select (optional). Reduce the number of columns to the minimum needed by the midfieldr functions.
# Work with required midfieldr variables only
student <- select_required(student)
term <- select_required(term)
degree <- select_required(degree)
Initialize. Assign a working data frame. We often start
with the term
dataset.
# Working data frame
DT <- copy(term)
DT
#> mcid institution term cip6 level
#> <char> <char> <char> <char> <char>
#> 1: MCID3111142225 Institution B 19881 140901 01 First-year
#> 2: MCID3111142283 Institution J 19881 240102 01 First-year
#> 3: MCID3111142283 Institution J 19883 240102 01 First-year
#> ---
#> 639913: MCID3112898894 Institution B 20181 451001 01 First-year
#> 639914: MCID3112898895 Institution B 20181 302001 01 First-year
#> 639915: MCID3112898940 Institution B 20181 050103 01 First-year
The result has 639,915 observations. In the case study, we will typically note the number of observations as they change.
Filter for data sufficiency
Some student records near the lower and upper terms that bound the available data must be excluded to prevent false summaries involving timely degree completion. To apply this filter, we first determine the timely completion term.
- timely completion term
-
The last term in which a student’s degree completion would be considered timely. In many cases the timely completion (TC) term is 6 years after admission. The TC term can be adjusted to account for transfer credits. (Currently, there is no mechanism for extending the TC term for co-ops or migrators.)
Add variables. Using information in term
, we
add the timely_term
variable as well as supporting
variables used in its construction.
# Determine a timely completion term for every student
DT <- add_timely_term(DT, term)
DT
#> mcid institution term cip6 level term_i
#> <char> <char> <char> <char> <char> <char>
#> 1: MCID3111142225 Institution B 19881 140901 01 First-year 19881
#> 2: MCID3111142283 Institution J 19881 240102 01 First-year 19881
#> 3: MCID3111142283 Institution J 19883 240102 01 First-year 19881
#> ---
#> 639913: MCID3112898894 Institution B 20181 451001 01 First-year 20181
#> 639914: MCID3112898895 Institution B 20181 302001 01 First-year 20181
#> 639915: MCID3112898940 Institution B 20181 050103 01 First-year 20181
#> level_i adj_span timely_term
#> <char> <num> <char>
#> 1: 01 First-year 6 19933
#> 2: 01 First-year 6 19933
#> 3: 01 First-year 6 19933
#> ---
#> 639913: 01 First-year 6 20233
#> 639914: 01 First-year 6 20233
#> 639915: 01 First-year 6 20233
Add variables. Using information in term
, we
add the data_sufficiency
variable as well as supporting
variables used in its construction.
# Determine data sufficiency for every student
DT <- add_data_sufficiency(DT, term)
DT
#> mcid institution term cip6 level level_i
#> <char> <char> <char> <char> <char> <char>
#> 1: MCID3111142225 Institution B 19881 140901 01 First-year 01 First-year
#> 2: MCID3111142283 Institution J 19881 240102 01 First-year 01 First-year
#> 3: MCID3111142283 Institution J 19883 240102 01 First-year 01 First-year
#> ---
#> 639913: MCID3112898894 Institution B 20181 451001 01 First-year 01 First-year
#> 639914: MCID3112898895 Institution B 20181 302001 01 First-year 01 First-year
#> 639915: MCID3112898940 Institution B 20181 050103 01 First-year 01 First-year
#> adj_span timely_term term_i lower_limit upper_limit data_sufficiency
#> <num> <char> <char> <char> <char> <char>
#> 1: 6 19933 19881 19881 20181 exclude-lower
#> 2: 6 19933 19881 19881 20096 exclude-lower
#> 3: 6 19933 19881 19881 20096 exclude-lower
#> ---
#> 639913: 6 20233 20181 19881 20181 exclude-upper
#> 639914: 6 20233 20181 19881 20181 exclude-upper
#> 639915: 6 20233 20181 19881 20181 exclude-upper
- data sufficiency criterion
-
Student records are limited to those for which available data are sufficient to assess timely completion without biased counts of completers or non-completers.
Filter. We filter to retain observations for which the data are sufficient then drop all but the ID variable.
# Retain observations having sufficient data
DT <- DT[data_sufficiency == "include"]
DT <- DT[, .(mcid)]
DT <- unique(DT)
DT
#> mcid
#> <char>
#> 1: MCID3111142689
#> 2: MCID3111142782
#> 3: MCID3111142881
#> ---
#> 76873: MCID3112785480
#> 76874: MCID3112800920
#> 76875: MCID3112870009
The result has 76,875 observations.
Filter for degree seeking
- inner join
-
A merge operation between two data frames
X
andY
that returns observations (rows) that match specified conditions in both. The data.table syntax isY[X, j, on]
wherej
can be used to select columns.
Filter. Use an inner join with student
to
retain degree-seeking students only. Select the ID column.
# Filter for degree seeking, output unique IDs
DT <- student[DT, .(mcid), on = c("mcid"), nomatch = NULL]
DT <- unique(DT)
DT
#> mcid
#> <char>
#> 1: MCID3111142689
#> 2: MCID3111142782
#> 3: MCID3111142881
#> ---
#> 76873: MCID3112785480
#> 76874: MCID3112800920
#> 76875: MCID3112870009
The result has 76,875 observations. (No change is expected in this example because all students in the midfielddata practice data are degree-seeking.) We preserve this data frame as a baseline set of IDs to be used again.
baseline <- copy(DT)
Identify programs
In MIDFIELD datasets, the cip6
variable identifies the
6-digit code for the program in which a student is enrolled in a given
term.
- CIP
-
Classification of Instructional Programs, a taxonomy of academic programs curated by the US Department of Education (NCES 2010). The 2010 codes are included with midfieldr in the data set
cip
.
We have already searched cip
to obtain the codes for the
four programs in our case study. The first four digits of the 6-digit
CIP codes are:
- Civil Engineering 1408
- Electrical Engineering 1410
- Mechanical Engineering 1419
- Industrial/Systems Engineering 1427, 1435, 1436, and 1437.
From cip
, we obtain all codes that start with any of the
selected 4-digit codes.
# Four engineering programs using 4-digit CIP codes
selected_programs <- filter_cip(c("^1408", "^1410", "^1419", "^1427", "^1435", "^1436", "^1437"))
selected_programs
#> cip2 cip2name cip4 cip4name cip6
#> <char> <char> <char> <char> <char>
#> 1: 14 Engineering 1408 Civil Engineering 140801
#> 2: 14 Engineering 1408 Civil Engineering 140802
#> 3: 14 Engineering 1408 Civil Engineering 140803
#> ---
#> 13: 14 Engineering 1435 Industrial Engineering 143501
#> 14: 14 Engineering 1436 Manufacturing Engineering 143601
#> 15: 14 Engineering 1437 Operations Research 143701
#> cip6name
#> <char>
#> 1: Civil Engineering, General
#> 2: Geotechnical Engineering
#> 3: Structural Engineering
#> ---
#> 13: Industrial Engineering
#> 14: Manufacturing Engineering
#> 15: Operations Research
Add a variable. User-defined program names are nearly always required. Add a variable to label each of these 15 programs with one of the four conventional program abbreviations we will use in comparing metrics, i.e., Civil (CE), Electrical (EE), Mechanical (ME), and Industrial/Systems Engineering (ISE).
# Recode program labels. Edit as required.
selected_programs[, program := fcase(
cip6 %like% "^1408", "CE",
cip6 %like% "^1410", "EE",
cip6 %like% "^1419", "ME",
cip6 %chin% c("142701", "143501", "143601", "143701"), "ISE"
)]
Confirm that the abbreviations match the original 4-digit CIP names.
We also illustrate using options()
to change the number of
data.table rows to print.
# Preserve settings
op <- options()
# Edit number of rows to print
options(datatable.print.nrows = 15)
# Confirm that abbreviations match the longer program names
selected_programs[, .(cip4name, program)]
#> cip4name program
#> <char> <char>
#> 1: Civil Engineering CE
#> 2: Civil Engineering CE
#> 3: Civil Engineering CE
#> 4: Civil Engineering CE
#> 5: Civil Engineering CE
#> 6: Civil Engineering CE
#> 7: Electrical, Electronics and Communications Engineering EE
#> 8: Electrical, Electronics and Communications Engineering EE
#> 9: Electrical, Electronics and Communications Engineering EE
#> 10: Electrical, Electronics and Communications Engineering EE
#> 11: Mechanical Engineering ME
#> 12: Systems Engineering ISE
#> 13: Industrial Engineering ISE
#> 14: Manufacturing Engineering ISE
#> 15: Operations Research ISE
Having checked that the new abbreviations correctly represent the programs, we can finalize the data frame of program CIPs and names.
selected_programs <- selected_programs[, .(cip6, program)]
selected_programs
#> cip6 program
#> <char> <char>
#> 1: 140801 CE
#> 2: 140802 CE
#> 3: 140803 CE
#> 4: 140804 CE
#> 5: 140805 CE
#> 6: 140899 CE
#> 7: 141001 EE
#> 8: 141003 EE
#> 9: 141004 EE
#> 10: 141099 EE
#> 11: 141901 ME
#> 12: 142701 ISE
#> 13: 143501 ISE
#> 14: 143601 ISE
#> 15: 143701 ISE
# Restore original settings
options(op)
Gather ever-enrolled
Reset The data frame of baseline IDs is the intake for this section.
# IDs of data-sufficient, degree-seeking students
DT <- copy(baseline)
DT
#> mcid
#> <char>
#> 1: MCID3111142689
#> 2: MCID3111142782
#> 3: MCID3111142881
#> ---
#> 76873: MCID3112785480
#> 76874: MCID3112800920
#> 76875: MCID3112870009
The result has 76,875 observations.
- left join
-
A merge operation between two data frames
X
andY
which returns all observations (rows) ofX
and all matching rows inY
. The data.table syntax isY[X, j, on]
wherej
can be used to select columns.
Left join (add a variable). Returns all rows from
DT
and rows from term
that match on
mcid
—in effect, adding the cip6
variable to
DT
. Additionally, because term
contains
multiple rows per ID, the merged data frame also has the possibility of
multiple rows per ID.
# Left-outer join from term to DT
DT <- term[DT, .(mcid, cip6), on = c("mcid")]
DT <- unique(DT)
DT
#> mcid cip6
#> <char> <char>
#> 1: MCID3111142689 090401
#> 2: MCID3111142782 260101
#> 3: MCID3111142881 450601
#> ---
#> 127347: MCID3112800920 240102
#> 127348: MCID3112800920 240199
#> 127349: MCID3112870009 240102
The result has 127,349 observations.
Inner join (add a variable, filter observations). Returns
rows in DT
and study_programs
that match on
cip6
. In effect, we add a column of program labels to
DT
and simultaneously filter DT
to retain rows
that match the four case study programs only.
# Join program names and retain desired programs only
DT <- study_programs[DT, on = c("cip6"), nomatch = NULL]
DT
#> cip6 program mcid
#> <char> <char> <char>
#> 1: 141001 EE MCID3111142965
#> 2: 141001 EE MCID3111145102
#> 3: 141001 EE MCID3111146537
#> ---
#> 5655: 141901 ME MCID3112641399
#> 5656: 141901 ME MCID3112641535
#> 5657: 141901 ME MCID3112698681
The result has 5657 observations.
Filter. Because students can change CIP codes but remain
within the same labeled group (e.g., ISE), we drop the cip6
code and filter for unique combinations of ID and program label.
# Filter for unique ID-program combinations
DT[, cip6 := NULL]
DT <- unique(DT)
DT
#> program mcid
#> <char> <char>
#> 1: EE MCID3111142965
#> 2: EE MCID3111145102
#> 3: EE MCID3111146537
#> ---
#> 5651: ME MCID3112641399
#> 5652: ME MCID3112641535
#> 5653: ME MCID3112698681
The result has 5653 observations.
Copy. Set aside the ever enrolled information under a new name to use later for joining with graduates.
# Prepare for joining
setcolorder(DT, c("mcid", "program"))
ever_enrolled <- copy(DT)
ever_enrolled
#> mcid program
#> <char> <char>
#> 1: MCID3111142965 EE
#> 2: MCID3111145102 EE
#> 3: MCID3111146537 EE
#> ---
#> 5651: MCID3112641399 ME
#> 5652: MCID3112641535 ME
#> 5653: MCID3112698681 ME
Gather graduates
Reset The data frame of baseline IDs is the intake for this section. As before, the result has 76,875 observations.
# IDs of data-sufficient, degree-seeking students
DT <- copy(baseline)
DT
#> mcid
#> <char>
#> 1: MCID3111142689
#> 2: MCID3111142782
#> 3: MCID3111142881
#> ---
#> 76873: MCID3112785480
#> 76874: MCID3112800920
#> 76875: MCID3112870009
Add variables. We use term
to again add the
timely_term
variable and its supporting variables.
# Add timely completion term
DT <- add_timely_term(DT, term)
DT
#> mcid term_i level_i adj_span timely_term
#> <char> <char> <char> <num> <char>
#> 1: MCID3111142689 19883 01 First-year 6 19941
#> 2: MCID3111142782 19883 01 First-year 6 19941
#> 3: MCID3111142881 19893 01 First-year 6 19951
#> ---
#> 76873: MCID3112785480 20071 01 First-year 6 20123
#> 76874: MCID3112800920 20101 01 First-year 6 20153
#> 76875: MCID3112870009 19951 01 First-year 6 20003
Add variables. We use degree
to add the
completion_status
variable and its supporting
variables.
# Add completion status
DT <- add_completion_status(DT, degree)
DT
#> mcid term_i level_i adj_span timely_term term_degree
#> <char> <char> <char> <num> <char> <char>
#> 1: MCID3111142689 19883 01 First-year 6 19941 19913
#> 2: MCID3111142782 19883 01 First-year 6 19941 19903
#> 3: MCID3111142881 19893 01 First-year 6 19951 19894
#> ---
#> 76873: MCID3112785480 20071 01 First-year 6 20123 <NA>
#> 76874: MCID3112800920 20101 01 First-year 6 20153 <NA>
#> 76875: MCID3112870009 19951 01 First-year 6 20003 <NA>
#> completion_status
#> <char>
#> 1: timely
#> 2: timely
#> 3: timely
#> ---
#> 76873: <NA>
#> 76874: <NA>
#> 76875: <NA>
- timely completion criterion
-
Completing a program in no more than a specified span of years, in many cases, within 6 years after admission (150% of the “normal” 4-year span), or possibly less for some transfer students.
Filter. Retain observations of timely completers only. Drop unnecessary variables.
# Retain timely completers
DT <- DT[completion_status == "timely"]
DT <- DT[, .(mcid)]
DT
#> mcid
#> <char>
#> 1: MCID3111142689
#> 2: MCID3111142782
#> 3: MCID3111142881
#> ---
#> 40438: MCID3112692944
#> 40439: MCID3112694738
#> 40440: MCID3112730841
The result has 40,440 observations.
Left join (add variables). We use a left-join with
degree
to add the CIP codes and terms of the degrees
earned.
DT <- degree[DT, .(mcid, term_degree, cip6), on = c("mcid")]
DT
#> mcid term_degree cip6
#> <char> <char> <char>
#> 1: MCID3111142689 19913 090401
#> 2: MCID3111142782 19903 260101
#> 3: MCID3111142881 19894 450601
#> ---
#> 40538: MCID3112692944 20153 090101
#> 40539: MCID3112694738 20143 230101
#> 40540: MCID3112730841 20164 040401
The result has 40,540 observations.
Inner join (add a variable, filter observations) Again, add a column of program labels and filter by program.
# Join programs
DT <- study_programs[DT, on = c("cip6"), nomatch = NULL]
DT
#> cip6 program mcid term_degree
#> <char> <char> <char> <char>
#> 1: 141001 EE MCID3111142965 19901
#> 2: 141001 EE MCID3111145102 19893
#> 3: 141001 EE MCID3111146537 19913
#> ---
#> 3264: 141901 ME MCID3112618976 20153
#> 3265: 141001 EE MCID3112619484 20133
#> 3266: 141901 ME MCID3112641535 20143
The result has 3266 observations.
Filter. Students may have earned multiple degrees in different terms. We retain degrees earned in their first degree term only.
DT <- DT[, .SD[which.min(term_degree)], by = "mcid"]
DT
#> mcid cip6 program term_degree
#> <char> <char> <char> <char>
#> 1: MCID3111142965 141001 EE 19901
#> 2: MCID3111145102 141001 EE 19893
#> 3: MCID3111146537 141001 EE 19913
#> ---
#> 3262: MCID3112618976 141901 ME 20153
#> 3263: MCID3112619484 141001 EE 20133
#> 3264: MCID3112641535 141901 ME 20143
The result has 3264 observations.
Filter. Drop unnecessary variables and filter for unique observations of ID and program label.
# Filter for unique ID-program combinations
DT[, c("cip6", "term_degree") := NULL]
DT <- unique(DT)
DT
#> mcid program
#> <char> <char>
#> 1: MCID3111142965 EE
#> 2: MCID3111145102 EE
#> 3: MCID3111146537 EE
#> ---
#> 3262: MCID3112618976 ME
#> 3263: MCID3112619484 EE
#> 3264: MCID3112641535 ME
Copy. Set aside the graduates information under a new name to use for joining with ever enrolled.
# Prepare for joining
setcolorder(DT, c("mcid", "program"))
graduates <- copy(DT)
graduates
#> mcid program
#> <char> <char>
#> 1: MCID3111142965 EE
#> 2: MCID3111145102 EE
#> 3: MCID3111146537 EE
#> ---
#> 3262: MCID3112618976 ME
#> 3263: MCID3112619484 EE
#> 3264: MCID3112641535 ME
Add groupings
We plan to group the data by program, bloc, race/ethnicity, and sex. Program is already present. Bloc labels are added next.
- bloc
-
A grouping of student-level data dealt with as a unit, for example, starters, students ever-enrolled, graduates, transfer students, traditional and non-traditional students, migrators, etc.
Add a variable. We add a bloc
variable to the
ever enrolled and graduates data frames before joining.
ever_enrolled[, bloc := "ever_enrolled"]
graduates[, bloc := "graduates"]
Join. Combine the two data frames by rows, binding by matching column names.
# Combine two data frames
DT <- rbindlist(list(ever_enrolled, graduates), use.names = TRUE)
DT
#> mcid program bloc
#> <char> <char> <char>
#> 1: MCID3111142965 EE ever_enrolled
#> 2: MCID3111145102 EE ever_enrolled
#> 3: MCID3111146537 EE ever_enrolled
#> ---
#> 8915: MCID3112618976 ME graduates
#> 8916: MCID3112619484 EE graduates
#> 8917: MCID3112641535 ME graduates
The result has 8917 observations.
- grouping variables
-
Detailed information in the student-level data that further characterize a bloc of records, typically used to create bloc subsets for comparison, for example, program, race/ethnicity, sex, age, grade level, grades, etc.
Add variables. Use a left join, matching on
mcid
, to add race/ethnicity and sex to the data frame.
# Join race/ethnicity and sex
cols_we_want <- student[, .(mcid, race, sex)]
DT <- cols_we_want[DT, on = c("mcid")]
DT
#> mcid race sex program bloc
#> <char> <char> <char> <char> <char>
#> 1: MCID3111142965 International Male EE ever_enrolled
#> 2: MCID3111145102 White Male EE ever_enrolled
#> 3: MCID3111146537 Asian Female EE ever_enrolled
#> ---
#> 8915: MCID3112618976 White Male ME graduates
#> 8916: MCID3112619484 White Male EE graduates
#> 8917: MCID3112641535 White Male ME graduates
Verify prepared data. study_observations
,
included with midfieldr, contains the case study information developed
above. Here we verify that the two data frames have the same
content.
# Demonstrate equivalence
check_equiv_frames(DT, study_observations)
#> [1] FALSE
In this form, the observations are the starting point for part 3 of the case study.
Closer look
We examine the study observations for a few specific students to better illustrate the structure of these data.
Example 1. This ID yields one observation only. The student was enrolled in Electrical Engineering but did not complete one of the four case study programs.
# Display one student by ID
mcid_we_want <- "MCID3111171519"
DT[mcid == mcid_we_want]
#> mcid race sex program bloc
#> <char> <char> <char> <char> <char>
#> 1: MCID3111171519 White Male EE ever_enrolled
A closer look at the student’s term
record confirms the
result: the student was enrolled in CIP 141001 (Electrical Engineering)
but switched to CIP 110701 (Computer Science). The degree
record indicates that the student graduated in Computer Science.
# Closer look at term
term[mcid == mcid_we_want]
#> mcid institution term cip6 level
#> <char> <char> <char> <char> <char>
#> 1: MCID3111171519 Institution B 19883 149999 02 Second-year
#> 2: MCID3111171519 Institution B 19891 141001 02 Second-year
#> 3: MCID3111171519 Institution B 19893 141001 03 Third-year
#> 4: MCID3111171519 Institution B 19901 141001 03 Third-year
#> 5: MCID3111171519 Institution B 19903 110701 04 Fourth-year
#> 6: MCID3111171519 Institution B 19913 110701 04 Fourth-year
#> 7: MCID3111171519 Institution B 19921 110701 05 Fifth-year Plus
#> 8: MCID3111171519 Institution B 19923 110701 05 Fifth-year Plus
#> 9: MCID3111171519 Institution B 19924 110701 05 Fifth-year Plus
# Closer look at degree
degree[mcid == mcid_we_want]
#> mcid institution term_degree cip6
#> <char> <char> <char> <char>
#> 1: MCID3111171519 Institution B 19924 110701
#> degree
#> <char>
#> 1: Bachelor of Science in Computer Science
Example 2. This ID yields two observations indicating that the student was enrolled in Industrial/Systems Engineering and a timely graduate of that program.
# Display one student by ID
mcid_we_want <- "MCID3111150194"
DT[mcid == mcid_we_want]
#> mcid race sex program bloc
#> <char> <char> <char> <char> <char>
#> 1: MCID3111150194 Black Male ISE ever_enrolled
#> 2: MCID3111150194 Black Male ISE graduates
The term
and degree
excerpts confirm those
observations.
# Closer look at terms
term[mcid == mcid_we_want]
#> mcid institution term cip6 level
#> <char> <char> <char> <char> <char>
#> 1: MCID3111150194 Institution J 19883 140102 01 First-year
#> 2: MCID3111150194 Institution J 19891 140102 02 Second-year
#> 3: MCID3111150194 Institution J 19893 140102 02 Second-year
#> 4: MCID3111150194 Institution J 19903 143501 03 Third-year
#> 5: MCID3111150194 Institution J 19911 143501 04 Fourth-year
#> 6: MCID3111150194 Institution J 19913 143501 04 Fourth-year
#> 7: MCID3111150194 Institution J 19921 143501 05 Fifth-year Plus
#> 8: MCID3111150194 Institution J 19923 143501 05 Fifth-year Plus
# Closer look at degree
degree[mcid == mcid_we_want]
#> mcid institution term_degree cip6
#> <char> <char> <char> <char>
#> 1: MCID3111150194 Institution J 19923 143501
#> degree
#> <char>
#> 1: Bachelor of Science in Industrial Engineering
Example 3. This ID yields two observations indicating that the student was enrolled in Electrical Engineering and in Civil Engineering but a timely graduate of neither program.
# Display one student by ID
mcid_we_want <- "MCID3111264877"
DT[mcid == mcid_we_want]
#> mcid race sex program bloc
#> <char> <char> <char> <char> <char>
#> 1: MCID3111264877 White Male EE ever_enrolled
#> 2: MCID3111264877 White Male CE ever_enrolled
The term
excerpt agrees; the degree
record
shows they graduated in CIP 261399 (Biological and Biomedical
Sciences).
# Closer look at term
term[mcid == mcid_we_want]
#> mcid institution term cip6 level
#> <char> <char> <char> <char> <char>
#> 1: MCID3111264877 Institution B 19901 141001 01 First-year
#> 2: MCID3111264877 Institution B 19903 140201 02 Second-year
#> 3: MCID3111264877 Institution B 19911 140201 02 Second-year
#> 4: MCID3111264877 Institution B 19913 140801 03 Third-year
#> 5: MCID3111264877 Institution B 19914 140801 03 Third-year
#> 6: MCID3111264877 Institution B 19921 240199 03 Third-year
#> 7: MCID3111264877 Institution B 19923 261399 04 Fourth-year
#> 8: MCID3111264877 Institution B 19931 261399 04 Fourth-year
#> 9: MCID3111264877 Institution B 19933 261399 05 Fifth-year Plus
#> 10: MCID3111264877 Institution B 19941 261399 05 Fifth-year Plus
# Closer look at degree
degree[mcid == mcid_we_want]
#> mcid institution term_degree cip6
#> <char> <char> <char> <char>
#> 1: MCID3111264877 Institution B 19941 261399
#> degree
#> <char>
#> 1: Bachelor of Science in Evolution and Population Biology
Example 4. This ID yields four observations indicating that the student was enrolled in Civil, Electrical, and Mechanical Engineering and a timely graduate of Mechanical.
# Display one student by ID
mcid_we_want <- "MCID3112470255"
DT[mcid == mcid_we_want]
#> mcid race sex program bloc
#> <char> <char> <char> <char> <char>
#> 1: MCID3112470255 White Male CE ever_enrolled
#> 2: MCID3112470255 White Male EE ever_enrolled
#> 3: MCID3112470255 White Male ME ever_enrolled
#> 4: MCID3112470255 White Male ME graduates
The term
and degree
excerpts confirm those
observations.
# Closer look at term
term[mcid == mcid_we_want]
#> mcid institution term cip6 level
#> <char> <char> <char> <char> <char>
#> 1: MCID3112470255 Institution C 20101 140801 01 First-year
#> 2: MCID3112470255 Institution C 20103 141001 01 First-year
#> 3: MCID3112470255 Institution C 20111 141901 02 Second-year
#> 4: MCID3112470255 Institution C 20113 141901 02 Second-year
#> 5: MCID3112470255 Institution C 20121 141901 03 Third-year
#> 6: MCID3112470255 Institution C 20123 141901 03 Third-year
#> 7: MCID3112470255 Institution C 20124 141901 03 Third-year
#> 8: MCID3112470255 Institution C 20131 141901 04 Fourth-year
#> 9: MCID3112470255 Institution C 20133 141901 04 Fourth-year
#> 10: MCID3112470255 Institution C 20141 141901 05 Fifth-year Plus
#> 11: MCID3112470255 Institution C 20143 141901 05 Fifth-year Plus
# Closer look at degree
degree[mcid == mcid_we_want]
#> mcid institution term_degree cip6
#> <char> <char> <char> <char>
#> 1: MCID3112470255 Institution C 20143 141901
#> degree
#> <char>
#> 1: Bachelor of Science in Mechanical Engineering