Subset a CIP data frame, retaining rows that match or partially match a
vector of character strings. Search terms can include regular expressions.
Matching uses grepl(), so non-character columns (if any) that can be
coerced to character are also searched.
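The matching behavior described above can be sketched in base R. This is an illustration only, not the package's implementation: the `toy` data frame and the helper `any_column_matches()` are hypothetical names, and the numeric `cip6` column is an assumption made to show coercion.

```r
# Toy stand-in for a CIP table (hypothetical data); cip6 is numeric on purpose.
toy <- data.frame(
  cip6name = c("History, General", "Applied Mathematics", "Art History"),
  cip6 = c(540101, 270301, 500703),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE for each row in which any column matches.
any_column_matches <- function(dframe, pattern) {
  per_column <- lapply(dframe, function(col) {
    # grepl() coerces non-character input with as.character() before matching
    grepl(pattern, col, ignore.case = TRUE)
  })
  Reduce(`|`, per_column)
}

keep_name <- any_column_matches(toy, "history") # matches found in cip6name
keep_code <- any_column_matches(toy, "^54")     # match found in numeric cip6

toy[keep_name, ]
toy[keep_code, ]
```

Because every column is searched, a keyword and a code prefix can both be used as search terms without naming the column they apply to.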
Arguments
- dframe
Data frame of CIP program codes to be searched, typically
cip that loads with midfieldr.
- pattern
Character vector of search strings for retaining rows, not case-sensitive. Can include regular expressions.
- ...
Not used for passing values; forces subsequent arguments to be referable only by name.
- negate
Logical. If TRUE, retains only the rows that match none of the search strings. Default FALSE.
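The roles of `pattern`, `...`, and `negate` can be illustrated with a small base R sketch. The helper `match_rows()` is hypothetical, and joining the search strings with `"|"` is an assumption about how several patterns might be combined, not a statement about the package's internals.

```r
# Hypothetical helper, not the package's implementation.
match_rows <- function(dframe, pattern, ..., negate = FALSE) {
  # Assumption: several search strings are combined as alternatives.
  combined <- paste(pattern, collapse = "|")
  hit <- Reduce(`|`, lapply(dframe, function(col) {
    grepl(combined, col, ignore.case = TRUE)
  }))
  # With negate = TRUE, keep the rows in which nothing matched.
  if (negate) hit <- !hit
  dframe[hit, , drop = FALSE]
}

# Hypothetical toy data.
toy <- data.frame(
  cip6name = c("Mathematics Teacher Education", "Biostatistics",
               "Applied Mathematics"),
  cip6 = c("131311", "261102", "270301"),
  stringsAsFactors = FALSE
)

# negate must be named because it follows `...` in the signature.
dropped <- match_rows(toy, c("bio", "educ"), negate = TRUE)

# An unnamed TRUE is absorbed by `...`, so negate keeps its default FALSE.
unnamed <- match_rows(toy, c("bio", "educ"), TRUE)
```

In this sketch an unnamed third argument is silently ignored; a function could instead raise an error when anything is passed through `...`.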
Value
A data frame of the same type as dframe. The output has the
following properties:
Rows are a subset of the input, but appear in the same order.
Columns are not modified.
Groups are not preserved.
Data frame attributes are preserved for classes
data.frame, data.table, or tbl_df.
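The properties listed above can be checked on any result. The sketch below uses plain logical subsetting on a hypothetical `toy` data frame as a stand-in for the function, so it only shows the kind of checks one might run, not the package's behavior.

```r
# Hypothetical toy data; rows are deliberately not sorted by code.
toy <- data.frame(
  cip6name = c("Asian History", "Statistics, General", "History, General"),
  cip6 = c("540106", "270501", "540101"),
  stringsAsFactors = FALSE
)

# Stand-in for a row filter: logical subsetting keeps the input order.
keep <- grepl("history", toy$cip6name, ignore.case = TRUE)
result <- toy[keep, , drop = FALSE]

# Rows are a subset of the input and appear in the same relative order.
same_order <- identical(result$cip6, toy$cip6[toy$cip6 %in% result$cip6])

# Columns and class are unchanged.
same_columns <- identical(names(result), names(toy))
same_class <- identical(class(result), class(toy))
```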
Examples
# Subset using keywords
filter_cip_rows(cip, pattern = "history")
#> cip6name cip6
#> <char> <char>
#> 1: Architectural History and Criticism 040801
#> 2: History Teacher Education 131328
#> 3: Theatre Literature, History and Criticism 500505
#> 4: Art History, Criticism and Conservation 500703
#> 5: Music History, Literature and Theory 500902
#> 6: History, General 540101
#> 7: American History (United States) 540102
#> 8: European History 540103
#> 9: History and Philosophy of Science and Technology 540104
#> 10: Public, Applied History and Archival Administration 540105
#> 11: Asian History 540106
#> 12: Canadian History 540107
#> 13: Military History 540108
#> 14: History, Other 540199
#> cip4name
#> <char>
#> 1: Architectural History and Criticism
#> 2: Teacher Education and Professional Development, Specific Subject Areas
#> 3: Drama, Theatre Arts and Stagecraft
#> 4: Fine and Studio Art
#> 5: Music
#> 6: History
#> 7: History
#> 8: History
#> 9: History
#> 10: History
#> 11: History
#> 12: History
#> 13: History
#> 14: History
#> cip4 cip2name cip2
#> <char> <char> <char>
#> 1: 0408 Architecture and Related Services 04
#> 2: 1313 Education 13
#> 3: 5005 Visual and Performing Arts 50
#> 4: 5007 Visual and Performing Arts 50
#> 5: 5009 Visual and Performing Arts 50
#> 6: 5401 History 54
#> 7: 5401 History 54
#> 8: 5401 History 54
#> 9: 5401 History 54
#> 10: 5401 History 54
#> 11: 5401 History 54
#> 12: 5401 History 54
#> 13: 5401 History 54
#> 14: 5401 History 54
# Subset using codes
filter_cip_rows(cip, pattern = "^54")
#> cip6name cip6 cip4name cip4
#> <char> <char> <char> <char>
#> 1: History, General 540101 History 5401
#> 2: American History (United States) 540102 History 5401
#> 3: European History 540103 History 5401
#> 4: History and Philosophy of Science and Technology 540104 History 5401
#> 5: Public, Applied History and Archival Administration 540105 History 5401
#> 6: Asian History 540106 History 5401
#> 7: Canadian History 540107 History 5401
#> 8: Military History 540108 History 5401
#> 9: History, Other 540199 History 5401
#> cip2name cip2
#> <char> <char>
#> 1: History 54
#> 2: History 54
#> 3: History 54
#> 4: History 54
#> 5: History 54
#> 6: History 54
#> 7: History 54
#> 8: History 54
#> 9: History 54
# Multiple passes to narrow the results
first_pass <- filter_cip_rows(cip, "math")[, .(cip6name, cip6)]
first_pass
#> cip6name cip6
#> <char> <char>
#> 1: Mathematics Teacher Education 131311
#> 2: Biometry, Biometrics 261101
#> 3: Biostatistics 261102
#> 4: Bioinformatics 261103
#> 5: Computational Biology 261104
#> 6: Biomathematics and Bioinformatics, Other 261199
#> 7: Mathematics, General 270101
#> 8: Algebra and Number Theory 270102
#> 9: Analysis and Functional Analysis 270103
#> 10: Geometry, Geometric Analysis 270104
#> 11: Topology and Foundations 270105
#> 12: Mathematics, Other 270199
#> 13: Applied Mathematics 270301
#> 14: Computational Mathematics 270303
#> 15: Computational and Applied Mathematics 270304
#> 16: Financial Mathematics 270305
#> 17: Mathematical Biology 270306
#> 18: Applied Mathematics, Other 270399
#> 19: Statistics, General 270501
#> 20: Mathematical Statistics and Probability 270502
#> 21: Mathematics and Statistics 270503
#> 22: Statistics, Other 270599
#> 23: Mathematics and Statistics, Other 279999
#> 24: Multi, Interdisciplinary Studies - Mathematics and Computer Science 300801
#> 25: Developmental, Remedial Mathematics 320104
#> 26: Theoretical and Mathematical Physics 400810
#> 27: Aromatherapy 513701
#> cip6name cip6
#> <char> <char>
second_pass <- filter_cip_rows(first_pass, c("bio", "educ"), negate = TRUE)
second_pass
#> cip6name cip6
#> <char> <char>
#> 1: Mathematics, General 270101
#> 2: Algebra and Number Theory 270102
#> 3: Analysis and Functional Analysis 270103
#> 4: Geometry, Geometric Analysis 270104
#> 5: Topology and Foundations 270105
#> 6: Mathematics, Other 270199
#> 7: Applied Mathematics 270301
#> 8: Computational Mathematics 270303
#> 9: Computational and Applied Mathematics 270304
#> 10: Financial Mathematics 270305
#> 11: Applied Mathematics, Other 270399
#> 12: Statistics, General 270501
#> 13: Mathematical Statistics and Probability 270502
#> 14: Mathematics and Statistics 270503
#> 15: Statistics, Other 270599
#> 16: Mathematics and Statistics, Other 279999
#> 17: Multi, Interdisciplinary Studies - Mathematics and Computer Science 300801
#> 18: Developmental, Remedial Mathematics 320104
#> 19: Theoretical and Mathematical Physics 400810
#> 20: Aromatherapy 513701
#> cip6name cip6
#> <char> <char>
third_pass <- filter_cip_rows(second_pass, c("^27", "^30"))
third_pass
#> cip6name cip6
#> <char> <char>
#> 1: Mathematics, General 270101
#> 2: Algebra and Number Theory 270102
#> 3: Analysis and Functional Analysis 270103
#> 4: Geometry, Geometric Analysis 270104
#> 5: Topology and Foundations 270105
#> 6: Mathematics, Other 270199
#> 7: Applied Mathematics 270301
#> 8: Computational Mathematics 270303
#> 9: Computational and Applied Mathematics 270304
#> 10: Financial Mathematics 270305
#> 11: Applied Mathematics, Other 270399
#> 12: Statistics, General 270501
#> 13: Mathematical Statistics and Probability 270502
#> 14: Mathematics and Statistics 270503
#> 15: Statistics, Other 270599
#> 16: Mathematics and Statistics, Other 279999
#> 17: Multi, Interdisciplinary Studies - Mathematics and Computer Science 300801