Bounce processing when the SMTP server doesn’t return bounces properly


What bounce processing is

When you send an email, it may be returned to you (it “bounces”) because of technical problems or mistakes made when writing the email.

One of these mistakes may be that the recipient’s email address is wrong.

CRM sending mass emails

Many CRMs (Customer Relationship Management systems) are capable of sending mass mailings.

They can send the same newsletter to hundreds or thousands of recipients.

CRM processing bounces

If an email is returned because of any problem, the CRM will process it.

If an email is returned because the email address of the recipient is wrong, the CRM will put the contact on hold.

CiviCRM and bounce processing

I’m using CiviCRM as my CRM. When I send a mailing to many contacts, CiviCRM sets the “Return-Path” header in each email to something like: return+b.19.5.9c232c48e0ba219a@emanuelesantanche.com.

When an email is returned because of errors, it should end up in the return@emanuelesantanche.com mailbox.

The additional part b.19.5.9c232c48e0ba219a in the email address will tell CiviCRM which mailing and contact the original email was about.

Using this information, CiviCRM puts the contact on hold, sometimes immediately and sometimes only after several failures.
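To make the structure of that address concrete, here is a minimal sketch that splits it into its parts. The function name parseBounceAddress, the default ‘return+’ prefix and ‘.’ separator, and the array keys are assumptions made for illustration; this is not CiviCRM’s own code. The article only says that the numbers identify the mailing and the contact, so the sketch gives them neutral names.

```php
<?php
// Hypothetical helper (not part of CiviCRM): split a tracking return
// address such as return+b.19.5.9c232c48e0ba219a@example.com into parts.
function parseBounceAddress(string $address, string $localpart = 'return+', string $separator = '.'): ?array
{
    $sep = preg_quote($separator, '/');
    $pattern = '/^' . preg_quote($localpart, '/')
        . '(b)' . $sep . '(\d+)' . $sep . '(\d+)' . $sep
        . '([0-9a-f]{16})@(.+)$/';

    if (!preg_match($pattern, $address, $m)) {
        // No tracking token, e.g. a plain sender address.
        return null;
    }

    return [
        'action'    => $m[1],       // "b" for bounce
        'first_id'  => (int) $m[2], // neutral name: the article does not say which id this is
        'second_id' => (int) $m[3], // neutral name, same reason
        'hash'      => $m[4],       // 16 hexadecimal characters
        'domain'    => $m[5],
    ];
}

var_dump(parseBounceAddress('return+b.19.5.9c232c48e0ba219a@emanuelesantanche.com'));
var_dump(parseBounceAddress('emanuele@emanuelesantanche.com')); // NULL
```

The second call shows what happens when a bounce arrives at an ordinary address: there is no token to extract, so nothing ties the bounce to a mailing or a contact.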

SMTP server not returning bounces correctly

As it happens, the SMTP server I’m using, like many others, doesn’t handle bounces properly.

It should send the failure message to return+b.19.5.9c232c48e0ba219a@emanuelesantanche.com, and the email should then show up in the return@emanuelesantanche.com mailbox.

All CiviCRM should then have to do is check the recipient, return+b.19.5.9c232c48e0ba219a@emanuelesantanche.com, and do its processing.

Instead, the SMTP server sends the email back to the address it came from, emanuele@emanuelesantanche.com.

If CiviCRM checks the recipient, it doesn’t find the information it needs to process the bounce.

Scanning the entire email’s body

The solution is to scan the entire body of the returned email, because the address CiviCRM needs will appear somewhere in it.

CiviCRM uses a regular expression like:

  '/Return-Path: ' . preg_quote($dao->localpart) . '(b)' . $twoDigitString . '([0-9a-f]{16})@' . preg_quote($dao->domain) . '/'

Let’s say that the returned email contains the headers of the original email:

  Return-Path:return+b.19.5.9c232c48e0ba219a@emanuelesantanche.com
  Received:from crm.emanuelesantanche.com (cpc19-finc14-2-0-cust228.4-2.cable.virginm.net [82.28.208.229]) by mx.zohomail.com
    with SMTPS id 1444752405240749.4338448238343; Tue, 13 Oct 2015 09:06:45 -0700 (PDT)
  Date:Tue, 13 Oct 2015 17:06:44 +0100

The regular expression above is supposed to find the line:

  Return-Path:return+b.19.5.9c232c48e0ba219a@emanuelesantanche.com

so that CiviCRM can process the bounce correctly.

Problem solved?

Not quite. As you may have guessed, the regular expression above won’t find the line with the return path, because it expects a space after ‘Return-Path:’ and the returned email has none.

It should be:

  '/Return-Path:' . preg_quote($dao->localpart) . '(b)' . $twoDigitString . '([0-9a-f]{16})@' . preg_quote($dao->domain) . '/'

Note that I removed the space after ‘Return-Path:’.

Now the regular expression will match.
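The difference is easy to check outside CiviCRM. The sketch below builds both patterns and runs them against the sample header line. The values of $localpart, $domain and $twoDigitString are assumptions, since the article doesn’t show how CiviCRM defines them; here $twoDigitString is taken to mean “dot, digits, dot, digits, dot”.

```php
<?php
// Assumed values; in CiviCRM they come from the mail settings ($dao).
$localpart = 'return+';
$domain = 'emanuelesantanche.com';
$twoDigitString = '\.(\d+)\.(\d+)\.'; // assumption: ".<digits>.<digits>."

// Everything after the "Return-Path:" prefix is identical in both patterns.
$tail = preg_quote($localpart, '/') . '(b)' . $twoDigitString
    . '([0-9a-f]{16})@' . preg_quote($domain, '/') . '/';

$withSpace = '/Return-Path: ' . $tail;   // original pattern
$withoutSpace = '/Return-Path:' . $tail; // pattern with the space removed

// The header exactly as it appears in the returned email.
$header = 'Return-Path:return+b.19.5.9c232c48e0ba219a@emanuelesantanche.com';

$matchesWithSpace = preg_match($withSpace, $header);
$matchesWithoutSpace = preg_match($withoutSpace, $header);

var_dump($matchesWithSpace);    // int(0): the mandatory space is missing
var_dump($matchesWithoutSpace); // int(1)
```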

Fixing CiviCRM code

I had to patch the CiviCRM code for the regular expression to work correctly.

I changed this PHP file: /sites/all/modules/civicrm/CRM/Utils/Mail/EmailProcessor.php

This is the change:

  // a tighter regex for finding bounce info in soft bounces’ mail bodies
  // EMS 2015-10-15 fixing return path regex
  //$rpRegex = '/Return-Path: ' . preg_quote($dao->localpart) . '(b)' . $twoDigitString . '([0-9a-f]{16})@' . preg_quote($dao->domain) . '/';
  $rpRegex = '/Return-Path:\s*' . preg_quote($dao->localpart) . '(b)' . $twoDigitString . '([0-9a-f]{16})@' . preg_quote($dao->domain) . '/';

I used the regular expression:

  '/Return-Path:\s*' . preg_quote($dao->localpart) . '(b)' . $twoDigitString . '([0-9a-f]{16})@' . preg_quote($dao->domain) . '/';

so that it matches whether or not there is a space.

More precisely, ‘\s*’ matches zero or more whitespace characters between ‘Return-Path:’ and the address that follows.
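As a rough check of that behaviour, the sketch below scans a whole message body with the ‘\s*’ version of the pattern and tries three spellings of the header: no space, one space, and a tab. The function findBounceToken and the definition of $twoDigitString are assumptions made for this example, not CiviCRM’s own code.

```php
<?php
// Hypothetical helper: look for the tracking return path anywhere in a
// bounce message body and return the captured parts, or null if absent.
function findBounceToken(string $body, string $localpart, string $domain): ?array
{
    $twoDigitString = '\.(\d+)\.(\d+)\.'; // assumption: ".<digits>.<digits>."
    $rpRegex = '/Return-Path:\s*' . preg_quote($localpart, '/') . '(b)' . $twoDigitString
        . '([0-9a-f]{16})@' . preg_quote($domain, '/') . '/';

    if (!preg_match($rpRegex, $body, $m)) {
        return null;
    }
    // Drop the full match and keep only the four captured groups.
    return array_slice($m, 1);
}

$address = 'return+b.19.5.9c232c48e0ba219a@emanuelesantanche.com';
$variants = [
    'Return-Path:' . $address,   // no whitespace
    'Return-Path: ' . $address,  // one space
    "Return-Path:\t" . $address, // a tab
];

foreach ($variants as $line) {
    $body = "Delivery failed.\n" . $line . "\nDate:Tue, 13 Oct 2015 17:06:44 +0100\n";
    $token = findBounceToken($body, 'return+', 'emanuelesantanche.com');
    echo ($token === null ? 'no match' : implode(' ', $token)), "\n";
}
```

All three variants yield the same four captured values, so the pattern no longer depends on how a particular mail server formats the header.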