To a data frame keyed by student ID, add a column indicating that an institution's data range is sufficient to reliably assess a student's program completion. Columns of supporting information are also added. Unrelated columns are dropped.
Value
Data frame with the following properties:
Data frame class is preserved. Groups and keys are not preserved.
Rows are filtered for unique
mcidvalues.Columns
{mcid, term_i, timely_term}are retained (all other columns are dropped). New columns added:institution.Character. Institution in which the student is enrolled in the given term. Extracted frommidfield_rec.The limits given in the next two columns are specific to the institution.lower_limit.Character. Initial term of an institution's data range, encodedYYYYT. Extracted frommidfield_rec.Compared toterm_ito determine the lower-limit exclusion.upper_limit.Character. Final term of an institution's data range, encodedYYYYT. Extracted frommidfield_rec.Compared totimely_termto determine upper-limit exclusion.data_sufficiency.Character. Possible values are "include", if the data are sufficient; and "exclude-lower" or "exclude-upper" if not, indicating at which boundary of the data range the ambiguity occurs.
Details
Because the time span of MIDFIELD term data varies by institution, each has their own lower and upper bounds. When assessing a student's program completion, an unavoidable ambiguity arises for student records at or near these bounds. Such records must be identified and in most cases excluded to prevent false summary counts.
The data sufficiency criterion states that student records are limited to those for which available data are sufficient to assess timely completion without biased counts of completers or non-completers. In practice, the criteria is implemented via two filters. Rows are labeled for exclusion when: 1) a student ID is extant in the non-summer lower limit of an institution's data range; or 2) a student ID has a timely completion term that exceeds the upper limit of the institution's data range.
The goal of determining data sufficiency is to refine a population, that
is, obtain a data frame of IDs that satisfy our constraints. Thus
data_sufficiency() yields a column of data sufficiency values and
columns of supporting information keyed by ID. All other columns in
dframe (if any) are dropped.
The supporting information in the output is provided so that the user can review the findings. After review, we usually delete all columns except the IDs, yielding the refined population that was our goal.
Examples
# Start with an excerpt from the student data set
dframe <- toy_student[1:10, .(mcid)]
# Timely term column is required to add data sufficiency column
dframe <- timely_term(dframe, toy_term)
# Add data sufficiency column
data_sufficiency(dframe, toy_term)
#> mcid term_i timely_term institution lower_limit upper_limit
#> <char> <char> <char> <char> <char> <char>
#> 1: MCID3111158953 19881 19933 Institution J 19881 20096
#> 2: MCID3111158953 19881 19933 Institution J 19881 20096
#> 3: MCID3111158953 19881 19933 Institution J 19881 20096
#> 4: MCID3111158953 19881 19933 Institution J 19881 20096
#> 5: MCID3111158953 19881 19933 Institution J 19881 20096
#> 6: MCID3111158953 19881 19933 Institution J 19881 20096
#> 7: MCID3111158953 19881 19933 Institution J 19881 20096
#> 8: MCID3111158953 19881 19933 Institution J 19881 20096
#> 9: MCID3111158953 19881 19933 Institution J 19881 20096
#> 10: MCID3111158953 19881 19933 Institution J 19881 20096
#> 11: MCID3111159270 19881 19933 Institution J 19881 20096
#> 12: MCID3111159270 19881 19933 Institution J 19881 20096
#> 13: MCID3111159270 19881 19933 Institution J 19881 20096
#> 14: MCID3111159270 19881 19933 Institution J 19881 20096
#> 15: MCID3111159270 19881 19933 Institution J 19881 20096
#> 16: MCID3111159270 19881 19933 Institution J 19881 20096
#> 17: MCID3111159270 19881 19933 Institution J 19881 20096
#> 18: MCID3111159270 19881 19933 Institution J 19881 20096
#> 19: MCID3111160513 19881 19933 Institution J 19881 20096
#> 20: MCID3111160513 19881 19933 Institution J 19881 20096
#> 21: MCID3111160513 19881 19933 Institution J 19881 20096
#> 22: MCID3111160513 19881 19933 Institution J 19881 20096
#> 23: MCID3111160513 19881 19933 Institution J 19881 20096
#> 24: MCID3111160513 19881 19933 Institution J 19881 20096
#> 25: MCID3111160513 19881 19933 Institution J 19881 20096
#> 26: MCID3111160513 19881 19933 Institution J 19881 20096
#> 27: MCID3111160513 19881 19933 Institution J 19881 20096
#> 28: MCID3111160513 19881 19933 Institution J 19881 20096
#> 29: MCID3111162677 19881 19933 Institution J 19881 20096
#> 30: MCID3111162677 19881 19933 Institution J 19881 20096
#> 31: MCID3111162677 19881 19933 Institution J 19881 20096
#> 32: MCID3111162677 19881 19933 Institution J 19881 20096
#> 33: MCID3111162677 19881 19933 Institution J 19881 20096
#> 34: MCID3111162677 19881 19933 Institution J 19881 20096
#> 35: MCID3111162677 19881 19933 Institution J 19881 20096
#> 36: MCID3111162677 19881 19933 Institution J 19881 20096
#> 37: MCID3111162677 19881 19933 Institution J 19881 20096
#> 38: MCID3111162677 19881 19933 Institution J 19881 20096
#> 39: MCID3111164287 19881 19933 Institution J 19881 20096
#> 40: MCID3111164287 19881 19933 Institution J 19881 20096
#> 41: MCID3111164287 19881 19933 Institution J 19881 20096
#> 42: MCID3111164287 19881 19933 Institution J 19881 20096
#> 43: MCID3111164287 19881 19933 Institution J 19881 20096
#> 44: MCID3111164287 19881 19933 Institution J 19881 20096
#> 45: MCID3111164287 19881 19933 Institution J 19881 20096
#> 46: MCID3111171205 19881 19933 Institution B 19881 20181
#> 47: MCID3111171205 19881 19933 Institution B 19881 20181
#> 48: MCID3111171205 19881 19933 Institution B 19881 20181
#> 49: MCID3111171205 19881 19933 Institution B 19881 20181
#> 50: MCID3111171205 19881 19933 Institution B 19881 20181
#> 51: MCID3111171205 19881 19933 Institution B 19881 20181
#> 52: MCID3111171205 19881 19933 Institution B 19881 20181
#> 53: MCID3111171205 19881 19933 Institution B 19881 20181
#> 54: MCID3111172083 19881 19933 Institution B 19881 20181
#> 55: MCID3111172083 19881 19933 Institution B 19881 20181
#> 56: MCID3111172083 19881 19933 Institution B 19881 20181
#> 57: MCID3111172083 19881 19933 Institution B 19881 20181
#> 58: MCID3111172083 19881 19933 Institution B 19881 20181
#> 59: MCID3111172083 19881 19933 Institution B 19881 20181
#> 60: MCID3111172083 19881 19933 Institution B 19881 20181
#> 61: MCID3111172083 19881 19933 Institution B 19881 20181
#> 62: MCID3111213943 19891 19943 Institution B 19881 20181
#> 63: MCID3111213943 19891 19943 Institution B 19881 20181
#> 64: MCID3111213943 19891 19943 Institution B 19881 20181
#> 65: MCID3111213943 19891 19943 Institution B 19881 20181
#> 66: MCID3111248941 19901 19953 Institution J 19881 20096
#> 67: MCID3111248941 19901 19953 Institution J 19881 20096
#> 68: MCID3111248941 19901 19953 Institution J 19881 20096
#> 69: MCID3111248941 19901 19953 Institution J 19881 20096
#> 70: MCID3111248941 19901 19953 Institution J 19881 20096
#> 71: MCID3111248941 19901 19953 Institution J 19881 20096
#> 72: MCID3111248941 19901 19953 Institution J 19881 20096
#> 73: MCID3111248941 19901 19953 Institution J 19881 20096
#> 74: MCID3111248941 19901 19953 Institution J 19881 20096
#> 75: MCID3111248941 19901 19953 Institution J 19881 20096
#> 76: MCID3111248941 19901 19953 Institution J 19881 20096
#> 77: MCID3111250695 19901 19953 Institution J 19881 20096
#> 78: MCID3111250695 19901 19953 Institution J 19881 20096
#> 79: MCID3111250695 19901 19953 Institution J 19881 20096
#> mcid term_i timely_term institution lower_limit upper_limit
#> <char> <char> <char> <char> <char> <char>
#> data_sufficiency
#> <char>
#> 1: exclude-lower
#> 2: exclude-lower
#> 3: exclude-lower
#> 4: exclude-lower
#> 5: exclude-lower
#> 6: exclude-lower
#> 7: exclude-lower
#> 8: exclude-lower
#> 9: exclude-lower
#> 10: exclude-lower
#> 11: exclude-lower
#> 12: exclude-lower
#> 13: exclude-lower
#> 14: exclude-lower
#> 15: exclude-lower
#> 16: exclude-lower
#> 17: exclude-lower
#> 18: exclude-lower
#> 19: exclude-lower
#> 20: exclude-lower
#> 21: exclude-lower
#> 22: exclude-lower
#> 23: exclude-lower
#> 24: exclude-lower
#> 25: exclude-lower
#> 26: exclude-lower
#> 27: exclude-lower
#> 28: exclude-lower
#> 29: exclude-lower
#> 30: exclude-lower
#> 31: exclude-lower
#> 32: exclude-lower
#> 33: exclude-lower
#> 34: exclude-lower
#> 35: exclude-lower
#> 36: exclude-lower
#> 37: exclude-lower
#> 38: exclude-lower
#> 39: exclude-lower
#> 40: exclude-lower
#> 41: exclude-lower
#> 42: exclude-lower
#> 43: exclude-lower
#> 44: exclude-lower
#> 45: exclude-lower
#> 46: exclude-lower
#> 47: exclude-lower
#> 48: exclude-lower
#> 49: exclude-lower
#> 50: exclude-lower
#> 51: exclude-lower
#> 52: exclude-lower
#> 53: exclude-lower
#> 54: exclude-lower
#> 55: exclude-lower
#> 56: exclude-lower
#> 57: exclude-lower
#> 58: exclude-lower
#> 59: exclude-lower
#> 60: exclude-lower
#> 61: exclude-lower
#> 62: include
#> 63: include
#> 64: include
#> 65: include
#> 66: include
#> 67: include
#> 68: include
#> 69: include
#> 70: include
#> 71: include
#> 72: include
#> 73: include
#> 74: include
#> 75: include
#> 76: include
#> 77: include
#> 78: include
#> 79: include
#> data_sufficiency
#> <char>
# Existing data_sufficiency column, if any, is replaced
dframe[, data_sufficiency := NA_character_][]
#> mcid term_i level_i adj_span timely_term data_sufficiency
#> <char> <char> <char> <num> <char> <char>
#> 1: MCID3111158953 19881 01 First-year 6 19933 <NA>
#> 2: MCID3111159270 19881 01 First-year 6 19933 <NA>
#> 3: MCID3111160513 19881 01 First-year 6 19933 <NA>
#> 4: MCID3111162677 19881 01 First-year 6 19933 <NA>
#> 5: MCID3111164287 19881 01 First-year 6 19933 <NA>
#> 6: MCID3111171205 19881 01 First-year 6 19933 <NA>
#> 7: MCID3111172083 19881 01 First-year 6 19933 <NA>
#> 8: MCID3111213943 19891 01 First-year 6 19943 <NA>
#> 9: MCID3111248941 19901 01 First-year 6 19953 <NA>
#> 10: MCID3111250695 19901 01 First-year 6 19953 <NA>
data_sufficiency(dframe, toy_term)
#> mcid term_i timely_term institution lower_limit upper_limit
#> <char> <char> <char> <char> <char> <char>
#> 1: MCID3111158953 19881 19933 Institution J 19881 20096
#> 2: MCID3111158953 19881 19933 Institution J 19881 20096
#> 3: MCID3111158953 19881 19933 Institution J 19881 20096
#> 4: MCID3111158953 19881 19933 Institution J 19881 20096
#> 5: MCID3111158953 19881 19933 Institution J 19881 20096
#> 6: MCID3111158953 19881 19933 Institution J 19881 20096
#> 7: MCID3111158953 19881 19933 Institution J 19881 20096
#> 8: MCID3111158953 19881 19933 Institution J 19881 20096
#> 9: MCID3111158953 19881 19933 Institution J 19881 20096
#> 10: MCID3111158953 19881 19933 Institution J 19881 20096
#> 11: MCID3111159270 19881 19933 Institution J 19881 20096
#> 12: MCID3111159270 19881 19933 Institution J 19881 20096
#> 13: MCID3111159270 19881 19933 Institution J 19881 20096
#> 14: MCID3111159270 19881 19933 Institution J 19881 20096
#> 15: MCID3111159270 19881 19933 Institution J 19881 20096
#> 16: MCID3111159270 19881 19933 Institution J 19881 20096
#> 17: MCID3111159270 19881 19933 Institution J 19881 20096
#> 18: MCID3111159270 19881 19933 Institution J 19881 20096
#> 19: MCID3111160513 19881 19933 Institution J 19881 20096
#> 20: MCID3111160513 19881 19933 Institution J 19881 20096
#> 21: MCID3111160513 19881 19933 Institution J 19881 20096
#> 22: MCID3111160513 19881 19933 Institution J 19881 20096
#> 23: MCID3111160513 19881 19933 Institution J 19881 20096
#> 24: MCID3111160513 19881 19933 Institution J 19881 20096
#> 25: MCID3111160513 19881 19933 Institution J 19881 20096
#> 26: MCID3111160513 19881 19933 Institution J 19881 20096
#> 27: MCID3111160513 19881 19933 Institution J 19881 20096
#> 28: MCID3111160513 19881 19933 Institution J 19881 20096
#> 29: MCID3111162677 19881 19933 Institution J 19881 20096
#> 30: MCID3111162677 19881 19933 Institution J 19881 20096
#> 31: MCID3111162677 19881 19933 Institution J 19881 20096
#> 32: MCID3111162677 19881 19933 Institution J 19881 20096
#> 33: MCID3111162677 19881 19933 Institution J 19881 20096
#> 34: MCID3111162677 19881 19933 Institution J 19881 20096
#> 35: MCID3111162677 19881 19933 Institution J 19881 20096
#> 36: MCID3111162677 19881 19933 Institution J 19881 20096
#> 37: MCID3111162677 19881 19933 Institution J 19881 20096
#> 38: MCID3111162677 19881 19933 Institution J 19881 20096
#> 39: MCID3111164287 19881 19933 Institution J 19881 20096
#> 40: MCID3111164287 19881 19933 Institution J 19881 20096
#> 41: MCID3111164287 19881 19933 Institution J 19881 20096
#> 42: MCID3111164287 19881 19933 Institution J 19881 20096
#> 43: MCID3111164287 19881 19933 Institution J 19881 20096
#> 44: MCID3111164287 19881 19933 Institution J 19881 20096
#> 45: MCID3111164287 19881 19933 Institution J 19881 20096
#> 46: MCID3111171205 19881 19933 Institution B 19881 20181
#> 47: MCID3111171205 19881 19933 Institution B 19881 20181
#> 48: MCID3111171205 19881 19933 Institution B 19881 20181
#> 49: MCID3111171205 19881 19933 Institution B 19881 20181
#> 50: MCID3111171205 19881 19933 Institution B 19881 20181
#> 51: MCID3111171205 19881 19933 Institution B 19881 20181
#> 52: MCID3111171205 19881 19933 Institution B 19881 20181
#> 53: MCID3111171205 19881 19933 Institution B 19881 20181
#> 54: MCID3111172083 19881 19933 Institution B 19881 20181
#> 55: MCID3111172083 19881 19933 Institution B 19881 20181
#> 56: MCID3111172083 19881 19933 Institution B 19881 20181
#> 57: MCID3111172083 19881 19933 Institution B 19881 20181
#> 58: MCID3111172083 19881 19933 Institution B 19881 20181
#> 59: MCID3111172083 19881 19933 Institution B 19881 20181
#> 60: MCID3111172083 19881 19933 Institution B 19881 20181
#> 61: MCID3111172083 19881 19933 Institution B 19881 20181
#> 62: MCID3111213943 19891 19943 Institution B 19881 20181
#> 63: MCID3111213943 19891 19943 Institution B 19881 20181
#> 64: MCID3111213943 19891 19943 Institution B 19881 20181
#> 65: MCID3111213943 19891 19943 Institution B 19881 20181
#> 66: MCID3111248941 19901 19953 Institution J 19881 20096
#> 67: MCID3111248941 19901 19953 Institution J 19881 20096
#> 68: MCID3111248941 19901 19953 Institution J 19881 20096
#> 69: MCID3111248941 19901 19953 Institution J 19881 20096
#> 70: MCID3111248941 19901 19953 Institution J 19881 20096
#> 71: MCID3111248941 19901 19953 Institution J 19881 20096
#> 72: MCID3111248941 19901 19953 Institution J 19881 20096
#> 73: MCID3111248941 19901 19953 Institution J 19881 20096
#> 74: MCID3111248941 19901 19953 Institution J 19881 20096
#> 75: MCID3111248941 19901 19953 Institution J 19881 20096
#> 76: MCID3111248941 19901 19953 Institution J 19881 20096
#> 77: MCID3111250695 19901 19953 Institution J 19881 20096
#> 78: MCID3111250695 19901 19953 Institution J 19881 20096
#> 79: MCID3111250695 19901 19953 Institution J 19881 20096
#> mcid term_i timely_term institution lower_limit upper_limit
#> <char> <char> <char> <char> <char> <char>
#> data_sufficiency
#> <char>
#> 1: exclude-lower
#> 2: exclude-lower
#> 3: exclude-lower
#> 4: exclude-lower
#> 5: exclude-lower
#> 6: exclude-lower
#> 7: exclude-lower
#> 8: exclude-lower
#> 9: exclude-lower
#> 10: exclude-lower
#> 11: exclude-lower
#> 12: exclude-lower
#> 13: exclude-lower
#> 14: exclude-lower
#> 15: exclude-lower
#> 16: exclude-lower
#> 17: exclude-lower
#> 18: exclude-lower
#> 19: exclude-lower
#> 20: exclude-lower
#> 21: exclude-lower
#> 22: exclude-lower
#> 23: exclude-lower
#> 24: exclude-lower
#> 25: exclude-lower
#> 26: exclude-lower
#> 27: exclude-lower
#> 28: exclude-lower
#> 29: exclude-lower
#> 30: exclude-lower
#> 31: exclude-lower
#> 32: exclude-lower
#> 33: exclude-lower
#> 34: exclude-lower
#> 35: exclude-lower
#> 36: exclude-lower
#> 37: exclude-lower
#> 38: exclude-lower
#> 39: exclude-lower
#> 40: exclude-lower
#> 41: exclude-lower
#> 42: exclude-lower
#> 43: exclude-lower
#> 44: exclude-lower
#> 45: exclude-lower
#> 46: exclude-lower
#> 47: exclude-lower
#> 48: exclude-lower
#> 49: exclude-lower
#> 50: exclude-lower
#> 51: exclude-lower
#> 52: exclude-lower
#> 53: exclude-lower
#> 54: exclude-lower
#> 55: exclude-lower
#> 56: exclude-lower
#> 57: exclude-lower
#> 58: exclude-lower
#> 59: exclude-lower
#> 60: exclude-lower
#> 61: exclude-lower
#> 62: include
#> 63: include
#> 64: include
#> 65: include
#> 66: include
#> 67: include
#> 68: include
#> 69: include
#> 70: include
#> 71: include
#> 72: include
#> 73: include
#> 74: include
#> 75: include
#> 76: include
#> 77: include
#> 78: include
#> 79: include
#> data_sufficiency
#> <char>