I am using Outlook on Windows 10 as my email client and am trying to process some emails with the RDCOMClient package. Some of the emails are in Russian, and I cannot get the Russian text out in a usable form. For now, I am focusing only on the subject lines. When I extract a subject and print it, every Cyrillic character comes back as a question mark; only the Latin characters in the subject survive. I have tried setting the encoding and using iconv, without success. However, iconv did provide a useful clue. For the reproducible example below, showing the raw bytes gives:

iconv(SUBJECT, toRaw=TRUE)
 [1] 53 74 61 63 6b 4f 76 65 72 66 6c 6f 77 54 65 73 74 4d 65 73 73 61 67 65 3a
[26] 20 3f 3f 3f 3f 3f 3f 3f 3f 20 3f 3f 3f 3f 3f 3f 3f 3f 3f

Notice all of the 3f bytes at the end: 3f is the code for a question mark. RDCOMClient is actually returning literal ??? characters from Outlook, so this is not a case of R mislabeling the encoding of otherwise intact bytes.
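
The byte dump can be checked directly in base R. The following is a minimal sketch, not part of the original code; `tail_bytes`, `intended`, and `utf8_bytes` are illustrative names. It decodes the trailing bytes and compares them with what intact UTF-8 for the first Russian word would look like.

```r
## Bytes copied from the tail of the iconv() output:
## a space, eight 3f bytes, a space, nine 3f bytes
tail_bytes <- as.raw(c(0x20, rep(0x3f, 8), 0x20, rep(0x3f, 9)))
rawToChar(tail_bytes)          # plain ASCII question marks

## The first Russian word of the test subject, written with Unicode
## escapes so the script itself stays pure ASCII
intended <- "\u0422\u0435\u0441\u0442\u043e\u0432\u043e\u0435"
utf8_bytes <- charToRaw(enc2utf8(intended))
utf8_bytes
length(utf8_bytes)             # 16: two bytes per Cyrillic letter in UTF-8
```

Intact UTF-8 would show two bytes per Cyrillic letter, none of them 3f, whereas the dump has exactly one 3f per letter. Calling `Encoding<-` on the returned string therefore has nothing to recover.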

I have looked at many RDCOMClient posts on SO, but do not see anything that deals with this problem.

Is the RDCOMClient<->Outlook connection just broken? Or is there some way around this?

Attempt at a Reproducible Example

Since we are talking about accessing email, I don't see how to make a really easy reproducible example, but here is a reproducible way to test this. Of course, you have to have Outlook on Windows for this to make sense.

  1. Send yourself an email with the subject line: StackOverflowTestMessage: Тестовое сообщение

  2. Run the R code below. We need to find the email first, and most of the code does that. Then we inspect the subject.


## Connect to Outlook
library(RDCOMClient)
OutApp <- COMCreate("Outlook.Application")
outlookNameSpace = OutApp$GetNameSpace("MAPI")

## Find the Inbox
INBOX = outlookNameSpace$GetDefaultFolder(6)
INBOX$Name()            ## Confirm
emails <- INBOX$Items

## Find the relevant email
NumEmail = emails()$Count()
MessageNumber = 0
for(i in NumEmail:1) {
    SUBJ = emails(i)$Subject()
    if(grepl("StackOverflowTestMessage", SUBJ)) {
        MessageNumber = i
        break
    }
}

## Now try to get the subject line
SUBJECT = emails(MessageNumber)$Subject()
Encoding(SUBJECT) = 'UTF-8'
SUBJECT
[1] "StackOverflowTestMessage: ???????? ?????????"
iconv(SUBJECT, toRaw=TRUE)
[[1]]
 [1] 53 74 61 63 6b 4f 76 65 72 66 6c 6f 77 54 65 73 74 4d 65 73 73 61 67 65 3a
[26] 20 3f 3f 3f 3f 3f 3f 3f 3f 20 3f 3f 3f 3f 3f 3f 3f 3f 3f
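
One thing that may be worth ruling out is whether R's native encoding can represent Cyrillic at all. This assumes that the COM string is converted to the native encoding somewhere between Outlook and R, which nothing in this post establishes. The following is a minimal sketch using only base R; `info`, `intended`, and `native` are illustrative names.

```r
## Report the locale and code page that R treats as "native"
Sys.getlocale("LC_CTYPE")
info <- l10n_info()
info

## The first Russian word of the test subject, written with Unicode escapes
intended <- "\u0422\u0435\u0441\u0442\u043e\u0432\u043e\u0435"

## Convert to the current locale's encoding (to = ""); with sub = NA,
## iconv() returns NA when the text cannot be converted
native <- iconv(intended, from = "UTF-8", to = "", sub = NA)
is.na(native)
```

If `is.na(native)` is `TRUE`, the session's native encoding has no Cyrillic characters. That would be consistent with the question marks in the output, though it would not prove that this is where they come from.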

Eugene Astafiev
G5W
  • Do you get the same results if other programming languages or packages are used? Is the issue specific to the RDCOMClient? – Eugene Astafiev May 16 '20 at 12:38
  • This is specific to RDCOMClient. However, I also tried using Perl with Win32::OLE and had a completely different set of problems. This question is for R. I would be open to solutions using other R packages. – G5W May 16 '20 at 13:39
  • So, it is not related to Outlook. – Eugene Astafiev May 16 '20 at 14:08
  • Incorrect. I specifically want to get my email out of Outlook, using R to automate the process. The question is about an interaction between R and Outlook. – G5W May 16 '20 at 14:12
  • The Outlook object model is common to all kinds of applications. If you see the same issue in other programming languages, then it is a problem with Outlook. If you don't get the same issue, then the problem is in R (or the frameworks you are using). – Eugene Astafiev May 17 '20 at 13:36

0 Answers