The most recently implemented Rails migration deletes duplicate entries
within a single feed, identifying duplicates by an MD5 hash of
title+summary+content, not just by guid. The oldest entry in each set of
duplicates in a feed is kept in the DB; newer duplicates are deleted.
To determine which entry is oldest (and should not be deleted), entries
are ordered by publish date first; if there's a tie, created_at is
used.
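
The selection rule described above can be sketched in plain Ruby. This is only an illustration, not the actual migration: the Entry struct, its field names (feed_id, published, and so on) and the helper names content_hash and duplicates_to_delete are all hypothetical, and concatenating the three fields without a separator is an assumption about how the hash is built.

```ruby
require "digest"

# Hypothetical stand-in for the real entry model; field names are assumptions.
Entry = Struct.new(:id, :feed_id, :guid, :title, :summary, :content,
                   :published, :created_at, keyword_init: true)

# MD5 over title+summary+content. Nil fields are treated as empty strings.
# Assumption: the fields are concatenated directly, with no separator.
def content_hash(entry)
  Digest::MD5.hexdigest([entry.title, entry.summary, entry.content].map(&:to_s).join)
end

# Returns the entries that would be deleted: within each feed, every entry
# sharing a content hash with another one, except the oldest of the set.
# Assumes published and created_at are non-nil and comparable.
def duplicates_to_delete(entries)
  entries
    .group_by { |e| [e.feed_id, content_hash(e)] }
    .values
    .flat_map do |group|
      # Oldest first: publish date decides, created_at breaks ties.
      sorted = group.sort_by { |e| [e.published, e.created_at] }
      sorted.drop(1) # keep the oldest, mark the rest for deletion
    end
end
```

Grouping by feed and hash together keeps the comparison inside a single feed, so identical entries in two different feeds are never treated as duplicates of each other.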