How to split single mail with procmail?

Question

I have a quarantine folder that I periodically have to download and split by recipient inbox or, even better, split so that each message ends up in its own text file. I get about 10,000 mails per day and I'm coding something with fetchmail and procmail. The problem is that I can't find out how to split message by message in procmail; they all end up in the same inbox.

I tried to pass every message to a script via a recipe like:

    :0
    | script_processing_messages.sh

Which contained

    read varname
    echo "$varname" > test_file

I wanted to see whether I could get a single message into the $varname variable, but no: I only get a single line of a message each time.

Right now I use

    fetchmail --keep

where .fetchmailrc is

    poll mail.mymta.my protocol pop3 username "my@inbox.com" password "****" mda "procmail /root/.procmailrc"

and .procmailrc is

    VERBOSE=0
    DEFAULT=/root/inbox.quarantine

I would like to obtain a file for each message, so:

1.txt
2.txt
3.txt
[...]
10000.txt

I have many recipients and many domains, so I can't write, say, 5000 rules to match every recipient. It would be good if there was some kind of

^To: $USER

that redirects to

/$USER.inbox

so that procmail itself takes care of reading the recipient and dynamically creating these inboxes.

I'm not very experienced with fetchmail and procmail recipes. I'm trying hard but I'm not getting very far.

Start with [RedHat - Procmail Recipes](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/deployment_guide/s2-email-procmail-recipes) — David C. Rankin, Jul 04 '19 at 12:28
I already did, but I didn't find anything that helped. I mean, there is probably something in the pipe rule, but I couldn't get it to work, as I showed in the question — Wyatt Gillette, Jul 04 '19 at 12:38
Look at the *"direct messages to a default location."* portion where that page is directing `* ^From: spammer@domain.com` to `/dev/null`. If I understand, you need to use the `^To:` field to direct the message to the users folder. That should get you to the right section of the reference. I use procmail for user-filtering into folders, but don't have a solution off the top of my head for your issue. — David C. Rankin, Jul 04 '19 at 12:43
Yes, I should have been more accurate in describing my problem. I have many recipients and many domains, so I can't write, say, 5000 rules to match every recipient. It would be good if there was some kind of `^To: $USER` that redirects to `/$USER.inbox` so that procmail itself takes care of reading the recipient and dynamically creating these inboxes... and that's where I can't find the answer — Wyatt Gillette, Jul 04 '19 at 12:58
It's not really clear what you are asking. Do you want to split a single Berkeley mbox file into messages and pass each through Procmail? That's `formail -s procmail` — tripleee, Jul 04 '19 at 14:53
Or are you asking what to put in the `.procmailrc` file to save by recipient? That very much begs the question "which recipient exactly" - Procmail doesn't really know which of several headers to examine, and neither can we unless you tell us in more detail. Sometimes the local recipient address cannot be inferred from the headers (see the [Bcc FAQ](http://www.iki.fi/era/procmail/mini-faq.html#bcc)). — tripleee, Jul 04 '19 at 14:53
If you can enumerate the addresses you want to match and all of them will be matched in *some* `^TO_` header then `:0` `* ^TO_\/(first@example\.com|second@example\.net)` `$MATCH.mbox` will save to `first@example.com.mbox` if that's matched, else to `second@example.net.mbox` if that's matched, etc if you add more address regexes to the parenthesized list in the condition. — tripleee, Jul 04 '19 at 14:57
To clarify: I'm asking what to put in the `.procmailrc` file to save by recipient, or what to put in a script to save by recipient when procmail passes the mail via a pipe. For example: right now I have a script that retrieves each recipient from my mail server and writes a recipe for each recipient into `.procmailrc`. Not very elegant or clean, but it's working. Now the problem is how to split mail by mail, which I discovered is the best thing to do and which I still don't know how to do. — Wyatt Gillette, Jul 05 '19 at 07:40
You seem to be rephrasing the same vague requirement but you are not really clarifying anything. Can you provide an example along the lines of *"this input ... should produce this output"?* More than one example might be useful for illustrating the scope of the problem (multiple addresses in multiple domains? How many? Or just all in one domain? Or messages are all local and there is not necessarily a domain name?) — tripleee, Jul 08 '19 at 10:15

tripleee · Accepted Answer · 2019-07-09T13:57:30.610

You seem to have two or three different questions; proper etiquette on Stack Overflow would be to ask each one separately - this also helps future visitors who have just one of your problems.

First off, to split a Berkeley mbox file containing multiple messages and run Procmail on each separately, try

    formail -s procmail -m <file.mbox

You might need to read up on the mailbox formats supported by Procmail. A Berkeley mailbox is a single file which contains multiple messages, simply separated by a line beginning with From (with a space after the four alphabetic characters). This separator has to be unique, and so a message which contains those five characters at beginning of a line in the body will need to be escaped somehow (typically by writing a > before From).
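As a rough illustration of the separator convention, the following shell sketch counts the messages in an mbox file by counting the lines that begin with `From ` followed by a space. The function name `count_messages` and the sample addresses are made up for this example, and a plain `grep` is only an approximation of what a real mbox parser such as formail does.

```shell
#!/bin/sh
# Count messages in a Berkeley mbox by counting "From " separator lines.
# count_messages is a hypothetical helper, not part of procmail or formail.
count_messages() {
    grep -c '^From ' "$1"
}

# Hypothetical two-message sample; the body line starting with "From"
# has been escaped as ">From", so it is not counted as a separator.
sample=$(mktemp)
cat >"$sample" <<'EOF'
From alice@example.com Thu Jul  4 12:00:00 2019
Subject: first

Hello.
>From here on, this body line is escaped.

From bob@example.com Thu Jul  4 12:05:00 2019
Subject: second

Bye.
EOF

count_messages "$sample"   # prints 2
rm -f "$sample"
```

If the escaping were missing, the body line would be counted as a third message, which is why the separator has to be unique.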

To save each message in a separate file, choose a different mailbox format than the single-file Berkeley format. Concretely, if the destination is a directory, Procmail will create a new file in that directory. How exactly the new file is named depends on how the directory is specified: a trailing slash selects Maildir format (the new file is written to the tmp subdirectory and then moved into new, in accordance with Maildir naming conventions); a trailing slash and dot selects MH format (the files are numbered sequentially); and an existing directory named without either suffix gets plain mail directory format.
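A minimal sketch of a directory-style destination, assuming a hypothetical folder name `quarantine` relative to MAILDIR. The trailing slash and dot selects MH format, where each message becomes its own numbered file, which is probably the closest match to the 1.txt, 2.txt, ... layout asked for (though without the .txt suffix).

```
# Hypothetical folder name; adjust to taste.
# Trailing "/." = MH folder: one file per message, named 1, 2, 3, ...
:0
quarantine/.
```

No lock file is needed on this recipe because each delivery creates a new file rather than appending to a shared one.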

Saving to one mailbox per recipient has a number of pesky corner cases. What if the message was sent to more than one of your local recipients? What if the recipient address is not visible in the headers? etc (the Procmail Mini-FAQ has a section about this, in the context of virtual hosting of a domain, which this is basically a variation of). But if we simply ignore these, you might be able to pull it off with something like

    :0  # whitespace before ] is a literal tab
    * ^TO_\/[^ @	]+@(yourdomain\.example|example\.info)\>
    {
        # Trim domain part from captured MATCH
        :0
        * MATCH ?? ^\/[^@]+
        ./$MATCH/
    }

This will capture into $MATCH the first address which matches the regex, then perform another regex match on the captured string to capture just the part before the @ sign. This obviously requires that the addresses you want to match are all in a set of specific domains (here, I used yourdomain.example and example.info; obviously replace those with your actual domain names) and that capturing the first matching address is sufficient (so if a message was To: alice@yourdomain.example and Cc: bob@example.info, whichever one of those is closer to the top of the message will be picked out by this recipe, and the other one will be ignored).

In some more detail, the \/ special token causes Procmail to copy the text which matched the regex after this point into the internal variable MATCH. As this recipe demonstrates, you can then perform a regex match on that variable itself to extract a substring of it (or, in other words, discard part of the captured match).
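To get a feel for what the two-stage capture ends up with, here is a rough approximation in shell. This is not how Procmail evaluates the recipe: the helper name `first_local_part` is made up, only `To:` and `Cc:` are examined (Procmail's `^TO_` covers more headers), folded header lines are ignored, and `grep -o` is assumed to be available (it is in GNU and BSD grep).

```shell
#!/bin/sh
# Rough approximation only; not how Procmail itself evaluates the recipe.
# Reads one message on stdin and prints the local part of the first
# To:/Cc: address in the assumed domains.
#   1. sed '/^$/q'   keeps the headers (stops at the first blank line)
#   2. grep -i -E    keeps To: and Cc: header lines
#   3. grep -o -E    extracts addresses in the two example domains
#   4. head -n 1     keeps the first match, like the \/ capture
#   5. sed           trims the domain part, like the second condition
first_local_part() {
    sed '/^$/q' |
    grep -i -E '^(To|Cc):' |
    grep -o -E '[^[:space:]@<,]+@(yourdomain\.example|example\.info)' |
    head -n 1 |
    sed 's/@.*//'
}

printf 'To: alice@yourdomain.example\nCc: bob@example.info\n\nHi\n' |
    first_local_part   # prints alice
```

With the headers in the opposite order, the same input yields bob, which mirrors the "whichever is closer to the top" behavior described above.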

The action ./$MATCH/ uses the captured string in MATCH as the name of the folder to save into. The leading ./ specifies the current directory (which is equal to the value of the Procmail variable MAILDIR) and the trailing / selects Maildir format; Procmail creates the necessary directories if they do not exist yet.

If your expected recipients cannot be constrained to be in a specific set of domains or otherwise matched by a single regex, my recommendation would be to ask a new question with more limited scope, and enough details to actually identify what you want to accomplish.

score 0 · Answer 2 · answered Jul 05 '19 at 08:13

I found a solution to a part of my problem.

It seems that there is no way to have procmail itself recognize the For recipient without specifying it in a recipe, so I just obtained a list and created a huge recipe file.

But then I discovered that, to save single mails and avoid huge mailboxes filled with a lot of mails, one can just write a recipe like:

    :0
    * ^To: recipient@mail.it
    /inbox/folder/recipient@mail.it/

Note the / at the end: this makes procmail create a folder structure with one file per message instead of writing everything to a single file.
