Deduplicate 1 file

Use Deduplicate 2 files to compare a new set of records against an existing one.

1. Input File

Select an EndNote or Zotero export in RIS format (.ris or .txt)

2. Start


3. Result

DedupEndNote is a tool for removing duplicate records from EndNote or Zotero databases exported in RIS format. It is more flexible than the built-in deduplication features in EndNote or Zotero, and identifies more duplicates.

You can:

  • Deduplicate a single file (this page)
  • Compare a new file against an existing file (see Deduplicate 2 files)

The program has been tested on EndNote and Zotero databases with records from: CINAHL (EBSCOHost), ClinicalTrials.gov, Cochrane Library, EMBASE (OVID and Embase.com), Medline (OVID), PsycINFO (OVID), PubMed, Scopus, Web of Science.

DedupEndNote compares each pair of records in up to five stages, stopping early if a mismatch is found:

  1. Publication Year: Matches if the years are the same or differ by at most one year.
  2. Starting Page or DOI: Compares page numbers and DOIs with preprocessing to handle variations.
  3. Authors: Uses Jaro-Winkler similarity on up to the first 40 authors, with name normalization.
  4. Title: Compares normalized or reversed titles.
  5. ISBN / ISSN / Journal: Matches exact ISBNs or ISSNs or compares normalized journal titles.

If a pair scores YES in every applicable comparison, the two records are considered duplicates.
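The staged comparison with early exit can be illustrated in Python. This is a minimal sketch, not DedupEndNote's own code: it covers only the year and title stages, the record layout (`year`, `title` keys) and the 0.9 threshold are assumptions, and `difflib.SequenceMatcher` stands in for the tool's actual title comparison.

```python
from difflib import SequenceMatcher


def years_match(a, b):
    """Stage 1: the years are the same or differ by at most one."""
    return abs(a - b) <= 1


def normalize(text):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    kept = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return " ".join(kept.split())


def titles_match(a, b, threshold=0.9):
    """Stage 4 stand-in: difflib ratio on normalized titles.

    The threshold is a hypothetical value chosen for illustration.
    """
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold


def is_duplicate(r1, r2):
    """Run the stages in order; all() stops at the first mismatch."""
    stages = [
        lambda: years_match(r1["year"], r2["year"]),
        lambda: titles_match(r1["title"], r2["title"]),
    ]
    return all(stage() for stage in stages)
```

Ordering the cheap year check first means the more expensive string comparison only runs for pairs that are still candidates.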

By default, only the first record in each duplicate set is kept. The output file:

  • Preserves all unique records
  • Enriches them with missing data from duplicates (e.g., DOI, journal name, publication year, pages)
  • Normalizes certain fields (e.g., DOI format, page ranges)
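Enrichment can be pictured as filling the kept record's empty fields from the other records in its duplicate set. The sketch below is an illustration under assumptions: `enrich` is a hypothetical helper, records are plain dictionaries, and the field names are invented for the example rather than taken from DedupEndNote.

```python
def enrich(kept, duplicates, fields=("doi", "journal", "year", "pages")):
    """Return a copy of `kept` with empty fields filled from `duplicates`.

    Existing values in `kept` are never overwritten; for each missing
    field, the first duplicate that has a value supplies it.
    """
    result = dict(kept)
    for field in fields:
        if result.get(field):
            continue  # the kept record already has this field
        for dup in duplicates:
            if dup.get(field):
                result[field] = dup[field]
                break
    return result
```

Returning a copy leaves the input records unchanged, which makes it easy to compare a record before and after enrichment.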

Typical workflow:

  • Export an EndNote / Zotero database into a file in RIS format
  • Upload this file in DedupEndNote
  • Save the results file with deduplicated records
  • Import this results file into a new EndNote / Zotero database
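A quick sanity check on this workflow is to count the records in the exported file and in the results file. The sketch below is not part of DedupEndNote; `count_ris_records` is a hypothetical helper that relies only on the RIS convention that every record starts with a `TY  - ` line.

```python
def count_ris_records(text):
    """Count records in RIS text by counting lines that open a record."""
    count = 0
    for line in text.splitlines():
        # Some exports begin with a byte order mark; ignore it.
        if line.lstrip("\ufeff").startswith("TY  -"):
            count += 1
    return count
```

The difference between the two counts is the number of records that were removed as duplicates.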

Instead of removing duplicates, Mark Mode labels them for manual review:

  • The ID of the first record in each duplicate set is copied to the Label (LB) field of all duplicates.
  • The original Label content is overwritten.
  • No enrichment or normalization is performed.
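The labeling rule can be sketched as follows. This is an assumption-laden illustration, not the tool's code: `mark_duplicates` is a hypothetical helper, records are dictionaries with invented `id` and `LB` keys, and the sketch assumes the first record of each set receives the label as well.

```python
def mark_duplicates(duplicate_sets):
    """Write the first record's ID into the LB field of each set member.

    Any existing LB value is overwritten. Assumption: the first record
    in the set is labeled too, so the whole set shares one label.
    """
    for records in duplicate_sets:
        first_id = records[0]["id"]
        for record in records:
            record["LB"] = first_id
```

After import, sorting or searching on the Label field groups the members of each duplicate set together.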

This mode is useful if you want to merge records manually in EndNote.

If you use DedupEndNote, please cite:

Lobbestael, G. (2026). DedupEndNote (Version 1.1.4) [Computer software]. https://github.com/globbestael/DedupEndNote

If you have questions about the tool or run into a problem while using it, please raise an issue in the GitHub repository.