authors_refine
This function takes the author list output after it has been
reviewed for incorrect author matches. It has a similarity score
cutoff like read_authors; here, however, the cutoff serves to further
constrain the list. New values ARE NOT created; instead, the function
filters by the sim_score column in the output file.
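As a rough illustration of filtering on a similarity score (not the package's actual implementation, which is not shown here), the sketch below applies a cutoff to a toy data.frame. The table `toy_review` and its values are hypothetical; only the `sim_score` column name comes from the description above.

```r
# Toy stand-in for a review table. All values are made up for illustration.
toy_review <- data.frame(
  authorID  = 1:4,
  groupID   = c(1, 1, 2, 3),
  sim_score = c(0.97, 0.85, 0.92, 0.60)
)

# Keep only rows whose similarity score meets the cutoff.
# No columns or values are created; rows are only filtered.
cutoff <- 0.90
toy_refined <- toy_review[toy_review$sim_score >= cutoff, ]
```

Raising the cutoff keeps fewer rows, and the surviving rows are unchanged copies of the originals.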
Examples
## First gather the authors data.frame from authors_clean
data(BITR)
BITR_authors <- authors_clean(BITR)
#>
#> Splitting author records
#>
#> Splitting addresses
#>
#> Matching authors
#>
#> Pruning groupings...
BITR_review_df <- BITR_authors$review
BITR_prelim_df <- BITR_authors$prelim
## If accepting the preliminary disambiguation
## from authors_clean() without review:
refine_df <- authors_refine(BITR_review_df, BITR_prelim_df,
                            sim_score = 0.90, confidence = 5)
## Note that 'sim_score' and 'confidence' are optional arguments and are
## only required if changing the default values.
refine_df <- authors_refine(BITR_review_df, BITR_prelim_df)
## To change groupID or authorID in the "_review.csv" file, make the
## edits in a text editor, save the corrections under a new file name,
## load that file into R, and run `authors_refine()` with the
## corrected data as the review argument.
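The workflow in the comment above might look like the sketch below. The file name `BITR_review_corrected.csv` is hypothetical, and the code assumes the package that provides `authors_refine()` is loaded and that `BITR_prelim_df` exists from the earlier steps. The `file.exists()` guard skips the call when no corrected file is present.

```r
## Hypothetical file name: use whatever name the corrected review file
## was saved under.
corrected_path <- "BITR_review_corrected.csv"

if (file.exists(corrected_path)) {
  ## Load the hand-corrected review table.
  corrected_review_df <- read.csv(corrected_path, stringsAsFactors = FALSE)
  ## Pass the corrected table as the review argument.
  refine_df <- authors_refine(corrected_review_df, BITR_prelim_df)
}
```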