I am using Outlook on Windows 10 as my email client and am trying to process some emails with the RDCOMClient library. Some of the emails are in Russian, and I am having trouble getting the Russian text out in a usable form. For now I am focusing only on the subject lines. When I extract a subject and print it, every character except the Latin ones comes out as a question mark. I have tried setting the encoding and using iconv, without success, but iconv did provide a useful clue. For the reproducible example below, showing the raw bytes gives:
iconv(SUBJECT, toRaw = TRUE)
[1] 53 74 61 63 6b 4f 76 65 72 66 6c 6f 77 54 65 73 74 4d 65 73 73 61 67 65 3a
[26] 20 3f 3f 3f 3f 3f 3f 3f 3f 20 3f 3f 3f 3f 3f 3f 3f 3f 3f
Notice all of the 3f's at the end: 3f is the byte code for a question mark. RDCOMClient is actually returning literal question marks from Outlook, so this does not appear to be an encoding issue inside R.
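For reference, the reading of those bytes can be checked in base R. This is a minimal sketch, independent of Outlook and RDCOMClient; the variable names `qmark`, `cyr`, and `utf8_bytes` are only illustrative, and the Cyrillic sample is the first four letters of the test subject, written with `\u` escapes so the script stays ASCII.

```r
## 0x3f is the ASCII code for a literal question mark
qmark <- charToRaw("?")
print(qmark)
## [1] 3f

## "Тест" written with \u escapes
cyr <- "\u0422\u0435\u0441\u0442"

## Correctly encoded UTF-8 Cyrillic uses two bytes per letter,
## and none of those bytes is 3f
utf8_bytes <- charToRaw(enc2utf8(cyr))
print(utf8_bytes)
## [1] d0 a2 d0 b5 d1 81 d1 82
```

If the subject had arrived as UTF-8 with only the wrong encoding mark, the raw bytes would look like the second output rather than a run of 3f.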
I have looked at many RDCOMClient posts on SO, but do not see anything that deals with this problem.
Is the RDCOMClient<->Outlook connection just broken? Or is there some way around this?
Attempt at Reproducible Example
Since we are talking about accessing email, I don't see how to make a really easy reproducible example, but here is a reproducible way to test this. Of course, you have to have Outlook on Windows for this to make sense.
Send yourself an email with the subject line: StackOverflowTestMessage: Тестовое сообщение
R code
We need to find the email first, and most of the code does that. Then we inspect the subject.
library(RDCOMClient)

## Connect to Outlook
OutApp <- COMCreate("Outlook.Application")
outlookNameSpace = OutApp$GetNameSpace("MAPI")
## Find the Inbox (6 is the default Inbox folder)
INBOX = outlookNameSpace$GetDefaultFolder(6)
INBOX$Name() ## Confirm
emails <- INBOX$Items
## Find the relevant email, searching from the newest message backwards
NumEmail = emails()$Count()
MessageNumber = 0   ## stays 0 if no matching message is found
for(i in NumEmail:1) {
    SUBJ = emails(i)$Subject()
    if(grepl("StackOverflowTestMessage", SUBJ)) {
        MessageNumber = i
        break
    }
}
## Now try to get the subject line
SUBJECT = emails(MessageNumber)$Subject()
Encoding(SUBJECT) = 'UTF-8'
SUBJECT
[1] "StackOverflowTestMessage: ???????? ?????????"
iconv(SUBJECT, toRaw = TRUE)
[[1]]
[1] 53 74 61 63 6b 4f 76 65 72 66 6c 6f 77 54 65 73 74 4d 65 73 73 61 67 65 3a
[26] 20 3f 3f 3f 3f 3f 3f 3f 3f 20 3f 3f 3f 3f 3f 3f 3f 3f 3f
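As a further check that does not need Outlook, the printed subject can be inspected on its own. This is only a sketch: `count_qmarks` is a hypothetical helper, and `printed` is the subject copied from the output above. It also prints the session's native encoding, which may be relevant to where the substitution happens, though that is an assumption and not something the output above proves.

```r
## Hypothetical helper: count literal '?' bytes (0x3f) in a string
count_qmarks <- function(x) sum(charToRaw(x) == as.raw(0x3f))

## The subject exactly as it was printed above
printed <- "StackOverflowTestMessage: ???????? ?????????"

## One question mark per Cyrillic letter: 8 for the first word, 9 for the second
n_qmarks <- count_qmarks(printed)
print(n_qmarks)
## [1] 17

## Native encoding and locale of the current R session
print(l10n_info())
print(Sys.getlocale("LC_CTYPE"))
```

The count matches the number of Cyrillic letters in the test subject, so each letter was replaced by exactly one question mark.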