Duplicate Search function

The Duplicate Search function finds duplicates across datasets, with parameters that can be modified to widen or narrow down the intended search. The results are presented in a package format that allows users to easily move information from the duplicate records to be deleted into the final master record.

1. Accessing the Duplicate Search function

The Duplicate Search can be accessed through the Analysis menu in the main menu bar.

      

2. Search with default settings

Enter a text string in the search box and click on the magnifying glass. You can also run the search with the search box left empty.

       

The default search, with none of the options under the Settings and Advanced menus selected, looks for perfect duplicates in English, excludes records marked as non-duplicates, excludes deleted records, searches in open mode and compares only the term/title.
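As a rough illustration, the defaults described above could be pictured as a settings object in which every option starts switched off. This is only a sketch: the class name SearchSettings and all field names are hypothetical and are not taken from the portal.

```python
from dataclasses import dataclass


@dataclass
class SearchSettings:
    """Hypothetical model of the Duplicate Search options (names are assumptions)."""
    language: str = "English"              # default comparison language
    include_non_duplicates: bool = False   # "Non-duplicates" option
    include_imperfect: bool = False        # "Non-perfect" option
    spelling_variations: bool = False      # "Spelling variations" option
    closed_search: bool = False            # False means open mode
    include_deleted: bool = False          # "Deleted records" option
    all_entries: bool = False              # False means term/title only


# A default search corresponds to an object with nothing changed.
defaults = SearchSettings()

# Selecting options in the Settings menu corresponds to overriding fields.
wider = SearchSettings(include_imperfect=True, all_entries=True)
```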

3. Settings menu

The search parameters can be modified through the Settings menu.

       

Click on Settings to display a dropdown menu with these options:

  • Non-duplicates: to include records previously marked as non-duplicates in the search.
  • Non-perfect: when selected, the system looks for both perfect duplicates (identical strings, ignoring capitalization) and imperfect duplicates (variations in string order, stop words, punctuation and stemming).
  • Spelling variations: to look for duplicate records with spelling differences.
  • Closed search: when selected, duplicates are detected only within the defined search criteria.
  • Deleted records: to include deleted records in the search.
  • All entries: to include all terms, titles, variants, alternates, synonyms, short forms and acronyms in the comparison.
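The difference between perfect and imperfect duplicates can be sketched as two comparison keys. This is a minimal illustration under stated assumptions, not the portal's actual matching logic: the stop-word list, the crude_stem helper and both key functions are hypothetical, and a real system would likely use a proper stemmer.

```python
import re

# Hypothetical stop-word list; the portal's real list is not documented here.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def perfect_key(text: str) -> str:
    """Perfect duplicates: identical strings, ignoring capitalization."""
    return text.casefold()


def crude_stem(word: str) -> str:
    """Rough stand-in for a stemmer: drops a plural 's' from longer words."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def imperfect_key(text: str) -> tuple:
    """Imperfect duplicates: ignore punctuation, stop words, word order and
    (crudely) word endings."""
    tokens = re.findall(r"\w+", text.casefold())   # splitting drops punctuation
    kept = [crude_stem(t) for t in tokens if t not in STOP_WORDS]
    return tuple(sorted(kept))                     # sorting ignores word order
```

With these keys, "Human Rights" and "human rights" share a perfect key, while "Rights of the Child" and "Child rights." match only on the imperfect key.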

4. Advanced menu

The Advanced menu allows you to refine your search.

       

Click on Advanced next to Settings. A new line will be displayed with these menu options:

  • Reset button: click to reset the options selected.
  • DB: pick the databases that you want to compare.
  • Language: select the language that you are comparing for duplication.
  • Categorization: pick bodies and subjects to which you want to limit the search.
  • Type: limit your search to certain record types (term, title, phraseology, proper name, country, footnote).
  • Status: limit the search to languages that have the selected status. This can be combined with the search language chosen under the Language menu (for example, search for a string in English and look for duplicates whose Chinese has the status Validated).

5. Searching specific datasets

To search for duplicates across specific datasets, select Closed search under Settings and the corresponding datasets under Advanced > DB. You can also look for duplicates within one dataset only.

       


6. Search results

For each comparison string found, the system displays all the matching records grouped in a batch, with an indication of the dataset in which each one was found.
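The grouping of results into batches can be sketched as follows. The record layout (id, dataset, term) and the function name group_duplicates are assumptions made for illustration, and the comparison here only ignores capitalization.

```python
from collections import defaultdict

# Hypothetical sample records from two datasets.
records = [
    {"id": 1, "dataset": "DB-A", "term": "Human rights"},
    {"id": 2, "dataset": "DB-B", "term": "human rights"},
    {"id": 3, "dataset": "DB-A", "term": "Climate change"},
]


def group_duplicates(records):
    """Group records by comparison string and keep only the strings
    that occur in more than one record."""
    batches = defaultdict(list)
    for record in records:
        batches[record["term"].casefold()].append(record)
    return {key: batch for key, batch in batches.items() if len(batch) > 1}


for string, batch in group_duplicates(records).items():
    # Each batch lists its records together with their source dataset.
    print(string, [(r["id"], r["dataset"]) for r in batch])
```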

      

7. Sorting results

The search results can be sorted using the dropdown menu next to the magnifying glass: by relevancy, alphabetically or by duplicate term string.

      

8. Working the duplicate results

To work with a given duplicated string, follow the steps below:

1) Click on Compare to see all the records found under a given string.

      

The duplicate records found under the string will be displayed side by side.

      

2) Identify the records that are not duplicates because they refer to a different concept and mark them as Non-doublons (non-duplicates). Click on the Non-doublon button and select all the records of which they are not duplicates.

      

You can also remove records from the package without affecting their status by clicking on the red x button shown below.

      

3) Identify the record that will become the master record and click on the Master button inside the record. The master record is the one that will remain available, while the rest of the duplicate records will be deleted.

      

4) Check whether the records to be deleted contain any information that you would like to move into the master record. Click on the move button next to the text you'd like to move. The information will automatically appear in the master record.

      

5) Click on the Delete button in each duplicate record. The deleted records will be moved into a Deleted record space within the portal. Although these records will no longer appear in search results, they will be retrievable from the Deleted records space and could be reinstated if necessary.
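Steps 3 to 5 can be pictured as a small data model. The sketch below is illustrative only: the Record class, the field names and the three helper functions are hypothetical, and it assumes that moving a field overwrites any value the master record already holds under the same name, which the portal may handle differently.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical record model (not the portal's real schema)."""
    record_id: int
    term: str
    fields: dict = field(default_factory=dict)
    deleted: bool = False


def move_field(source: Record, master: Record, name: str) -> None:
    """Step 4: copy one piece of information into the master record."""
    if name in source.fields:
        master.fields[name] = source.fields[name]


def delete_record(record: Record, deleted_space: list) -> None:
    """Step 5: the record leaves search results but is kept for retrieval."""
    record.deleted = True
    deleted_space.append(record)


def reinstate(record: Record, deleted_space: list) -> None:
    """Bring a record back from the Deleted records space."""
    deleted_space.remove(record)
    record.deleted = False


deleted_space = []
master = Record(1, "human rights", {"definition": "sample definition"})
duplicate = Record(2, "Human rights", {"source": "sample source"})

move_field(duplicate, master, "source")   # keep the useful information
delete_record(duplicate, deleted_space)   # then delete the duplicate
```

Because the duplicate is only flagged and moved to a separate list, reinstate(duplicate, deleted_space) can restore it later.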