
I have the body of a plain-text email and, for various reasons, I am trying to read information out of the headers of an email lower in the chain (the email has been forwarded). The string looks like this:

Jonathan Nathanson

Technology Consultant

Excell One Number: 0203 176 1025



From: Matthew Smith [mailto:Matthew.Smith@playspace.co.uk]

Sent: 22 May 2015 16:28

To: 'janine@greatplaces.co.uk'

Cc: Mark McIntyre;

Subject: GG.505 Treadmore Place - GreatPlaces 



Dear Janine,



Please find attached an o....

I want to read the name and email address from the 'From:' line, the email address from the 'To:' line, and the contents of the 'Subject:' line.

Does anyone have any pointers that could at least help me on my way?

Thanks very much.

  • Take a look into php regular expressions. You should be able to parse out text following "From:", "To:", and "Subject:". http://php.net/manual/en/function.preg-match.php – Aiias May 24 '15 at 01:51
  • I have been trying to use substr() and strpos() but without much luck. I will look at regular expressions. Thank you. – Jonathan Nathanson May 24 '15 at 01:58
  • possible duplicate of [Reference - What does this regex mean?](http://stackoverflow.com/questions/22937618/reference-what-does-this-regex-mean) – Nathan Tuggy May 24 '15 at 02:03
  • The regex '^From[ :]*(.*)' would do the job (supposing you read line by line). Get the value back with $1. (If it's still unsolved when I wake up, I'll finish it.) – Falt4rm May 24 '15 at 04:48
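
The line-by-line approach suggested in the comments above could be sketched roughly as follows. This is only an illustration: the function name `extract_forwarded_headers` and the fixed list of labels are assumptions, and the sample body is a shortened copy of the text in the question.

```php
<?php
// Hypothetical helper: collect header-style lines from a forwarded plain-text body.
function extract_forwarded_headers(string $body): array
{
    $headers = [];
    // \R matches any newline sequence, so Windows and Unix bodies both work.
    foreach (preg_split('/\R/', $body) as $line) {
        // A header line starts with one of the assumed labels followed by a colon.
        if (preg_match('/^(From|Sent|To|Cc|Subject):\s*(.*)$/', $line, $m)) {
            $key = strtolower($m[1]);
            // Keep only the first occurrence, i.e. the topmost forwarded message.
            if (!isset($headers[$key])) {
                $headers[$key] = trim($m[2]);
            }
        }
    }
    return $headers;
}

// Shortened copy of the sample from the question.
$body = "Excell One Number: 0203 176 1025\n\n"
      . "From: Matthew Smith [mailto:Matthew.Smith@playspace.co.uk]\n\n"
      . "Sent: 22 May 2015 16:28\n\n"
      . "To: 'janine@greatplaces.co.uk'\n\n"
      . "Cc: Mark McIntyre;\n\n"
      . "Subject: GG.505 Treadmore Place - GreatPlaces \n\n"
      . "Dear Janine,\n";

$headers = extract_forwarded_headers($body);
print_r($headers);
```

Each value is still the raw remainder of its line, so the name and address inside the 'From:' value need a second step to separate them.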

1 Answer


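The following patterns pull the sender's name and address from the 'From:' line, the address from the 'To:' line, and the code and the name after the hyphen from the 'Subject:' line: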

// From: capture the display name, then the address inside [mailto:...]
$re1 = '~(?<=From: )(.*?)(?: \[mailto:)(.*?)(?=\])~';
// To: match the address between the single quotes
$re2 = "~(?<=To: ').*(?=')~";
// Subject: capture the leading code, then the text after " - "
$re3 = "~(?<=Subject:\s)(.*?)(?=\s)(?:.*\s\-\s)(.*)~";

if (preg_match($re1, $str, $matches1)) {
    $from_name = $matches1[1];
    $from_email = $matches1[2];
}
if (preg_match($re2, $str, $matches2)) {
    $to_email = $matches2[0];
}
if (preg_match($re3, $str, $matches3)) {
    $Subject_code = $matches3[1];
    $Subject_nameAfterHyphen = $matches3[2];
}
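
As a quick check, the three patterns can be run against a copy of the sample from the question. The sketch below is illustrative only: the `$result` array and its keys are assumptions, and the comments describe what the lookbehind, non-capturing group, and lookahead in the first pattern contribute.

```php
<?php
// Header lines copied from the question (blank lines between them kept).
$str = "From: Matthew Smith [mailto:Matthew.Smith@playspace.co.uk]\n\n"
     . "Sent: 22 May 2015 16:28\n\n"
     . "To: 'janine@greatplaces.co.uk'\n\n"
     . "Cc: Mark McIntyre;\n\n"
     . "Subject: GG.505 Treadmore Place - GreatPlaces \n";

// (?<=From: )     lookbehind: the match must be preceded by "From: ",
//                 but that prefix is not part of the match.
// (?: \[mailto:)  non-capturing group: consumed, but gets no index in the result.
// (?=\])          lookahead: the match stops right before the closing bracket.
$re1 = '~(?<=From: )(.*?)(?: \[mailto:)(.*?)(?=\])~';
$re2 = "~(?<=To: ').*(?=')~";
$re3 = "~(?<=Subject:\s)(.*?)(?=\s)(?:.*\s\-\s)(.*)~";

preg_match($re1, $str, $m1);
preg_match($re2, $str, $m2);
preg_match($re3, $str, $m3);

// Hypothetical result array, used here only to show what each pattern yields.
$result = [
    'from_name'            => $m1[1],
    'from_email'           => $m1[2],
    'to_email'             => $m2[0],
    'subject_code'         => $m3[1],
    // The last group runs to the end of the line, so trim the trailing space.
    'subject_after_hyphen' => trim($m3[2]),
];
print_r($result);
```

Because `.` does not match a newline by default, each pattern stays within its own header line.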
someOne
  • Hi, thank you this is very helpful, can you explain how those two regex rules work? I am trying to work out what the logic is using guides online but it's very confusing. I want to replicate this for the sent date and then use it to pull specific information out of the subject line. – Jonathan Nathanson May 24 '15 at 10:57
  • Sent: data was really easy - just by editing $re2: `$re3 = "~(?<=Sent: ).*(?=)~"; ` – Jonathan Nathanson May 24 '15 at 11:07
  • The last thing I am trying to do is break down the following `GG.505 Metal Box Factory - Good Innovations (AD4U)` into its component parts: `GG.505` (or any combination of letters, numbers, and possibly more letters after), then `Metal Box Factory` (or some other name), then the company name after the hyphen, `Good Innovations`, discarding the last bit in brackets. It's very confusing! – Jonathan Nathanson May 24 '15 at 11:18
  • @JonathanNathanson So the "Subject" entry follows this pattern: two uppercase Latin characters followed by a period, then exactly three digits, then some characters up to the first (or maybe the last) occurrence of a dash character, and then some characters to the end of the line. Is that right? – someOne May 24 '15 at 11:25
  • So `GG.505` could be `AB.12` or `CD.123.F` - the length of this code varies and is alphanumerical - I don't actually need the Metal Box Name after this as the two uppercase Latin characters are actually a code for that name. It's just that code I need to pull out - plus whatever name is after the hyphen. – Jonathan Nathanson May 24 '15 at 11:29
  • @JonathanNathanson Then you may accept it as an answer! :) – someOne May 24 '15 at 12:44
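
The subject breakdown discussed in the comments above might be sketched like this. It is a rough illustration under stated assumptions: the code is taken to be the first whitespace-delimited token, the company name follows a hyphen surrounded by spaces, and a trailing bracketed part is optional. The function name `split_subject` and the array keys are hypothetical.

```php
<?php
// Hypothetical helper: split a subject such as
// "GG.505 Metal Box Factory - Good Innovations (AD4U)" into its parts.
function split_subject(string $subject): ?array
{
    // code:    first run of non-whitespace characters (assumed layout)
    // site:    everything up to the first " - "
    // company: everything after it, minus an optional trailing "(...)"
    $pattern = '~^(?<code>\S+)\s+(?<site>.*?)\s+-\s+(?<company>.*?)(?:\s*\([^)]*\))?\s*$~';
    if (!preg_match($pattern, $subject, $m)) {
        return null; // subject does not follow the assumed layout
    }
    return [
        'code'    => $m['code'],
        'site'    => $m['site'],
        'company' => $m['company'],
    ];
}

$parts = split_subject('GG.505 Metal Box Factory - Good Innovations (AD4U)');
print_r($parts);
```

Matching the code as `\S+` rather than a fixed shape keeps variants such as `AB.12` or `CD.123.F` working, at the cost of accepting any first token.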