DedupEndNote - deduplicate 2 files

1a. OLD records


1b. NEW records


2. START

3. RESULT


Steps

(The names of the EndNote databases and the RIS files are just examples)

  1. Export your existing EndNote database OLD as a RIS file OLD_RECORDS.txt
  2. Import the results from the second query into a new EndNote database NEW
  3. Export the NEW EndNote database as NEW_RECORDS.txt
  4. Upload these 2 RIS files on this page
  5. DedupEndNote will deduplicate both files, and save only the records from NEW_RECORDS.txt which are not present in OLD_RECORDS.txt. If a record occurs multiple times in NEW_RECORDS.txt, it will be saved only once
  6. Save the result file as a local file (NEW_RECORDS_deduplicated.txt)
  7. Import NEW_RECORDS_deduplicated.txt into a new EndNote database if you only want to see the new records. Otherwise, import it into EndNote database OLD
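
The effect of step 5 can be illustrated with a minimal Python sketch. It is not DedupEndNote's actual matching logic, which is not described on this page. The sketch assumes that two records are duplicates when they share a DOI (or, without a DOI, a normalized title); the names parse_ris, match_key and new_only are hypothetical.

```python
import re

# A RIS line is a two-character tag, two spaces, a hyphen and the value.
TAG_LINE = re.compile(r"^([A-Z][A-Z0-9])  - ?(.*)$")

def parse_ris(text):
    """Split RIS text into records; each record maps a tag to a list of values."""
    records, current = [], {}
    for line in text.splitlines():
        match = TAG_LINE.match(line)
        if not match:
            continue  # blank and continuation lines are ignored in this sketch
        tag, value = match.group(1), match.group(2).strip()
        if tag == "ER":  # "ER" closes a record
            if current:
                records.append(current)
            current = {}
        else:
            current.setdefault(tag, []).append(value)
    if current:
        records.append(current)
    return records

def match_key(record):
    """Assumed matching rule: DOI if present, otherwise the normalized title."""
    doi = record.get("DO", [""])[0].lower()
    if doi:
        return "doi:" + doi
    title = (record.get("TI") or record.get("T1") or [""])[0]
    return "title:" + re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()

def new_only(old_records, new_records):
    """Keep records from NEW that are not in OLD, each of them only once."""
    seen = {match_key(record) for record in old_records}
    result = []
    for record in new_records:
        key = match_key(record)
        if key not in seen:
            seen.add(key)
            result.append(record)
    return result
```

A record that occurs twice in the NEW file ends up in the result once, and a record that is also in the OLD file does not end up in the result at all.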

Why deduplicate 2 files?

  • You have executed a query in a bibliographic database (e.g. PubMed) and imported the results into an EndNote database. Some time later you execute that query again (maybe after changing the query) in the same bibliographic database, or another query in another bibliographic database. You want to know which results from the second query are not yet present in the existing EndNote database.
  • You have assigned the records in the original EndNote database to several groups. If you add records from a new query to that EndNote database and deduplicate all records with DedupEndNote, you would lose that grouping information: EndNote export files do not retain grouping information!
  • You have results from several bibliographic databases (PubMed, Cochrane, EMBASE, ...) and no EndNote database yet. You prefer PubMed records over Cochrane records over EMBASE records ... (when a duplicate set has PubMed, Cochrane and EMBASE records, you want the PubMed record; when it has Cochrane and EMBASE records, you want the Cochrane record, ...).
    Steps:
    1. import the results into EndNote databases PUBMED, COCHRANE and EMBASE
    2. export these EndNote databases as PUBMED_RECORDS.txt, COCHRANE_RECORDS.txt, EMBASE_RECORDS.txt
    3. deduplicate PUBMED_RECORDS.txt as 1 file, import PUBMED_RECORDS_deduplicated.txt into a new EndNote database ALL
    4. export the EndNote database ALL as ALL_RECORDS.txt
    5. deduplicate ALL_RECORDS.txt and COCHRANE_RECORDS.txt as 2 files, and import COCHRANE_RECORDS_deduplicated.txt into EndNote database ALL
    6. repeat steps 4 and 5 for the other EndNote export files from step 2

    See also the FAQ for another way to achieve this.
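
The priority workflow above amounts to adding the sources one at a time, highest priority first, and keeping a record only if no earlier source already supplied a duplicate. A minimal Python sketch of that idea follows. The function merge_by_priority, the dictionary records and the DOI-based key are assumptions made for illustration; they are not part of DedupEndNote.

```python
def merge_by_priority(sources, key):
    """Merge lists of records given in priority order (highest first).

    A record is kept only if no record with the same key has been kept
    before, so the highest-priority source wins within each duplicate set.
    """
    merged, seen = [], set()
    for records in sources:
        for record in records:
            record_key = key(record)
            if record_key not in seen:
                seen.add(record_key)
                merged.append(record)
    return merged

# Hypothetical records: the DOI stands in for real duplicate detection.
pubmed = [{"source": "PubMed", "doi": "10.1/a"}]
cochrane = [
    {"source": "Cochrane", "doi": "10.1/a"},
    {"source": "Cochrane", "doi": "10.1/b"},
]
embase = [
    {"source": "EMBASE", "doi": "10.1/b"},
    {"source": "EMBASE", "doi": "10.1/c"},
]

merged = merge_by_priority([pubmed, cochrane, embase], key=lambda r: r["doi"])
for record in merged:
    print(record["source"], record["doi"])
```

In this example the record shared by PubMed and Cochrane is kept from PubMed, and the record shared by Cochrane and EMBASE is kept from Cochrane.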

Mark Mode

In Mark mode the duplicate records in the file with new records are marked with the ID of the first record of their set of duplicate records. If that first record was found in the file with old records, the ID is preceded by "-". A record without duplicates gets no ID. The input file with new records is copied to the output file, but with the IDs for the duplicate records written to the Label field ("LB"). The original content of the Label field is overwritten! The DOI and Pages fields are not changed.

After importing the result file ("..._mark.txt") into a new EndNote database, making the Label field visible, and sorting on the Label field:

  • records without Label content were unique in both files
  • records with a negative Label were already present in the file with old records
  • records with a positive Label had duplicates in the file with new records
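
The same three-way split can be made outside EndNote by reading the "LB" field directly from the result file. The following Python sketch assumes the result file is plain RIS text in which every record ends with an "ER" line; the function names and the category names are hypothetical.

```python
import re
from collections import Counter

# The Label field as written in a RIS file: "LB  - <value>".
LABEL_LINE = re.compile(r"^LB  - ?(.*)$")

def classify_label(label):
    """Map the content of the Label field to one of the three cases."""
    label = label.strip()
    if not label:
        return "unique"            # no Label content
    if label.startswith("-"):
        return "in_old"            # negative Label
    return "duplicate_in_new"      # positive Label

def label_counts(ris_text):
    """Count the records per category in Mark mode output (RIS text)."""
    counts = Counter()
    label = ""
    for line in ris_text.splitlines():
        match = LABEL_LINE.match(line)
        if match:
            label = match.group(1)
        elif line.startswith("ER  -"):  # end of record: classify and reset
            counts[classify_label(label)] += 1
            label = ""
    return counts
```

Calling label_counts on the text of a "..._mark.txt" file gives a quick overview of how many records fall into each category before the file is imported.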

Caveat

  • If you have deleted records in the original EndNote database, and the second set contains records which are duplicates of the deleted records, these records from the second set will be present in the deduplication results. See the FAQ for a solution with new projects.
  • DO NOT use the output file of a deduplication as an input file for another deduplication. The first output must first be imported into an EndNote database (acquiring IDs) and exported again. This export file (with the IDs) can be used as an input file for another deduplication.
  • Updates in the new file for ahead-of-print publications in the old file will NOT appear in the result file!