Removing Duplicate E-mail Messages From A Mailbox

Occasionally your mail delivery scheme might hiccup, leaving you with duplicate copies of email messages sitting in your mailboxes. I find this happens when something goes wrong with fetchmail: if you kill the fetchmail process before it has expunged the deleted email from the remote POP3 server, the next run of fetchmail downloads a second copy of each email. This is a simple process that I came up with to remove duplicate email messages from a maildir format mailbox.

As a bit of background, a maildir mailbox is a small directory tree:

$ du .boxes.xml-dev
4       .boxes.xml-dev/tmp
124     .boxes.xml-dev/new
52340   .boxes.xml-dev/cur
52920   .boxes.xml-dev
$

Hierarchy is represented by components of the mailbox name separated by dots, so the mailbox above is called xml-dev and it sits inside the boxes mailbox. Messages are files in either the new or cur directories. Transport agents place messages into the new directory. When a user agent opens a mailbox, it moves all the messages from new to cur. If you’re accessing your mail through an IMAP server like Courier-IMAP, the IMAP server will deal with this for you.
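To make the layout concrete, here is a minimal sketch that counts the message files in each of the three subdirectories. The function name maildir_counts is made up for this example and is not part of any mail tool; it assumes file names contain no newlines.

```shell
# Print "<subdir> <count>" for each maildir subdirectory of the mailbox in $1.
# maildir_counts is a hypothetical helper name used only for illustration.
maildir_counts() {
    for sub in tmp new cur; do
        # find copes with empty directories, where a bare glob would not
        n=$(find "$1/$sub" -type f | wc -l)
        # $((n)) strips the padding some wc implementations add
        echo "$sub $((n))"
    done
}
```

Running maildir_counts .boxes.xml-dev from the parent directory prints one line per subdirectory. A non-zero count for new means the mailbox has not been opened by a user agent since the last delivery.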

  1. Make sure there’s nothing sitting in the new subdirectory.
    $ ls new
    $


    If there are messages in the new subdirectory, open the mailbox in a user agent to get it to move them into cur.
  2. See how many messages you have:
    $ ls cur | wc -l
    842
    $
  3. Check they all have Message-IDs:
    $ for i in cur/*; do reformail -x Message-ID: <$i; done | wc -l
    842
    $
  4. See how many you have if you filter out duplicate Message-IDs:
    $ for i in cur/*; do reformail -x Message-ID: <$i; done | sort -u | wc -l
    698
    $
  5. See how many we’re going to delete:
    $ rm -f /tmp/dups
    $ for i in cur/*; do reformail -D 20000 /tmp/dups <$i && echo $i; done | wc -l
    144
    $ expr 698 + 144
    842
    $

    If this total doesn’t match the count from step 2, increase the 20000: reformail’s Message-ID cache in /tmp/dups is too small to remember enough Message-IDs to spot all the duplicates.
  6. Delete the messages and check things look right afterwards:
    $ rm -f /tmp/dups
    $ for i in cur/*; do reformail -D 20000 /tmp/dups <$i && rm $i; done
    $ ls cur | wc -l
    698
    $
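If reformail isn’t installed, the same idea can be sketched with awk alone. This is not the reformail method used above but a stand-in: it reads the Message-ID header of each file in cur and lists every file whose ID has already appeared in an earlier file. The function names msgid and list_dups are made up for this example, and it assumes file names contain no tabs or newlines. Messages with no Message-ID are never listed.

```shell
# Print the Message-ID of the message file $1, reading the headers only.
# msgid and list_dups are hypothetical helper names, not standard tools.
msgid() {
    awk '
        /^\r?$/  { exit }                          # blank line: end of headers
        /^[ \t]/ { if (hit) val = val $0; next }   # folded continuation line
                 { hit = 0 }                       # any other line starts a new header
        tolower($0) ~ /^message-id:/ { hit = 1; val = substr($0, 12) }
        END      { gsub(/[ \t\r]/, "", val); print val }
    ' "$1"
}

# List the files in $1/cur whose Message-ID was already seen in an earlier file.
# The glob expands in sorted order, so the first copy of each message is kept.
list_dups() {
    for f in "$1"/cur/*; do
        [ -f "$f" ] || continue
        printf '%s\t%s\n' "$(msgid "$f")" "$f"
    done | awk -F '\t' '$1 != "" && seen[$1]++ { print $2 }'
}

# Dry run, then delete (uncomment to use):
#   list_dups .boxes.xml-dev | wc -l
#   list_dups .boxes.xml-dev | while IFS= read -r f; do rm -- "$f"; done
```

As with the reformail version, compare the dry-run count against the numbers from steps 2 and 4 before deleting anything.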
