TL;DR
Here's a portable approach that uses cURL and doesn't require mounting remote filesystems:
> install.packages("curl")
> require("curl")
> handle <- new_handle()
> handle_setopt(handle, username = "domain\\username")
> handle_setopt(handle, password = "secret") # If needed
> request <- curl_fetch_memory("smb://host.example.com/share/file.txt", handle = handle)
> contents <- rawToChar(request$content)
If we need to read the contents as CSV, like in the question, we can stream the file through another function:
> stream <- curl("smb://host.example.com/share/file.txt", handle = handle)
> contents <- read.csv(stream)
Let's take a look at a more robust way to access remote files through smb:// URLs than mounting the remote filesystem, which is the approach described in other answers. Unfortunately, I'm a bit late to this one, but I hope this helps future readers.
In some cases, we may not have the privileges needed to mount a filesystem (this requires admin or root access on many systems), or we simply may not want to mount an entire filesystem just to read a single file. We'll use the cURL library to read the file instead. This approach improves the flexibility and portability of our programs because we don't need to depend on the existence of an externally mounted filesystem. We'll examine two different ways: through a system() call, and by using a package that provides a cURL API.
Some background: for those not familiar with it, cURL provides tools used to transfer data over various protocols. Since version 7.40, cURL supports the SMB/CIFS protocol typically used for Windows file-sharing services. cURL includes a command-line tool that we can use to fetch the contents of a file:
$ curl -u 'domain\username' 'smb://host.example.com/share/file.txt'
The command above reads and outputs (to STDOUT) the contents of file.txt from the remote server host.example.com authenticating as the specified user on the domain. The command will prompt us for a password if needed. We can remove the domain portion from the username if our network doesn't use a domain.
System Call
We can achieve the same functionality in R by using the system() function:
system("curl -u 'domain\\username' 'smb://host.example.com/share/file.txt'")
Note the double backslash in domain\\username. This escapes the backslash character so that R doesn't interpret it as an escape character in the string. We can capture file contents from the command output into a variable by setting the intern parameter of the system() function to TRUE:
contents <- system("curl -u 'domain\\username' 'smb://host.example.com/share/file.txt'", intern = TRUE)
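...or by calling system2() instead, which takes the command and its arguments separately and handles process redirection more consistently between platforms. Note that system2() quotes only the command name, not the arguments, so on Unix-like systems the shell still interprets the backslash, which is why it is doubled once more here: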
contents <- system2('curl', c("-u", "domain\\\\username", "smb://host.example.com/share/file.txt"), stdout = TRUE)
The curl command will still prompt us for a password if required by the remote server. While we can specify a password using -u 'domain\\username:password' to avoid the prompt, doing so exposes the plain-text password in the command string. For a more secure approach, read the section below that describes the usage of a package.
We can also add the -s or --silent flag to the curl command to suppress the progress status output. Note that doing so also hides error messages, so we may want to add -S (--show-error) as well. The contents variable will contain a vector of the lines of the file (similar to the value returned by readLines("file.txt")) that we can squash back together using paste(contents, collapse = "\n").
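The flags and the line-joining step above can be combined into a small helper. This is only a sketch: curl_args() and join_lines() are hypothetical names, not part of R or cURL. It uses shQuote() to quote each argument for the shell, so the username needs only the single R-level escaped backslash.

```r
# Hypothetical helper: build the argument vector for the curl command-line tool.
# shQuote() wraps each value in shell quotes so that the backslash in the
# username and any special characters in the URL reach curl unchanged.
curl_args <- function(user, url, silent = TRUE) {
  args <- c("-u", shQuote(user), shQuote(url))
  if (silent) {
    # -s hides the progress meter; -S keeps error messages visible
    args <- c("-s", "-S", args)
  }
  args
}

# Hypothetical helper: squash a vector of lines back into a single string.
join_lines <- function(lines) paste(lines, collapse = "\n")

args <- curl_args("domain\\username", "smb://host.example.com/share/file.txt")

# Not run here, because it needs a reachable SMB server:
# contents <- system2("curl", args, stdout = TRUE)
# text <- join_lines(contents)
```

Keeping the argument construction in one place makes it easier to switch the silent flags on and off while debugging.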
cURL API
While this all works fine, we can improve upon this approach by using a dedicated cURL library. The curl package provides R bindings to libcurl so that we can use the cURL API in our program directly. First we need to install the package:
install.packages("curl")
require("curl")
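(Linux users who build the package from source will need to install the libcurl development files first, for example libcurl4-openssl-dev on Debian/Ubuntu or libcurl-devel on Fedora/RHEL.)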
Then, we can read the remote file into a variable using the curl_fetch_memory() function:
handle <- new_handle()
handle_setopt(handle, username = "domain\\username")
handle_setopt(handle, password = "secret") # If needed
request <- curl_fetch_memory("smb://host.example.com/share/file.txt", handle = handle)
content <- rawToChar(request$content)
First we create a handle to configure the request by setting any authentication options needed. Then, we execute the request and assign the contents of the file to a variable. As shown, set the password CURLOPT if needed.
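To keep the password out of the script itself, one option is to read it from an environment variable when building the handle. A minimal sketch, assuming the password was exported beforehand as SMB_PASSWORD (an assumed name) and using a hypothetical helper called smb_handle():

```r
library(curl)

# Hypothetical helper: build a handle that carries the credentials.
# The password defaults to the SMB_PASSWORD environment variable (an assumed
# name), so it never has to appear in the source code.
smb_handle <- function(username, password = Sys.getenv("SMB_PASSWORD")) {
  handle <- new_handle()
  handle_setopt(handle, username = username)
  if (nzchar(password)) {
    # Only set the password option when one was actually provided
    handle_setopt(handle, password = password)
  }
  handle
}

handle <- smb_handle("domain\\username")

# Not run here, because it needs a reachable SMB server:
# request <- curl_fetch_memory("smb://host.example.com/share/file.txt", handle = handle)
# contents <- rawToChar(request$content)
```

The same handle can be reused for several requests against the same share.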
To process a remote file with a function like read.csv(), we need to create a streaming connection. The curl() function creates a connection object that we can pass to any function that accepts a connection like the one returned by the standard url() function. For example, here's a way to read the remote file as CSV, like in the question:
handle <- new_handle()
...
stream <- curl("smb://host.example.com/share/file.txt", handle = handle)
contents <- read.csv(stream)
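To try the streaming approach without an SMB server at hand, we can point curl() at a local file instead. The sketch below uses a temporary CSV file and a file:// URL as a stand-in for the remote share; it assumes a Unix-like path and a libcurl build with file:// support, and the sample data is made up for illustration.

```r
library(curl)

# Stand-in for the remote share: write a small CSV file locally and address
# it with a file:// URL (assumes a Unix-like absolute path).
path <- tempfile(fileext = ".csv")
writeLines(c("id,name", "1,alpha", "2,beta"), path)
url <- paste0("file://", normalizePath(path))

handle <- new_handle()

# curl() only creates the connection; nothing is transferred yet.
stream <- curl(url, handle = handle)

# read.csv() opens the connection, reads the data, and closes it again.
contents <- read.csv(stream)
```

Replacing the file:// URL with an smb:// URL and setting the credentials on the handle should give the same kind of data frame for a file on the share.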
Of course, the concepts described above apply to fetching the contents or response body over any protocol supported by cURL, not just SMB/CIFS. If needed, we can also use these tools to download files to the filesystem instead of just reading the contents into memory.
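For downloading to the filesystem, the curl package provides curl_download(). Here is a minimal sketch that again uses a local file:// URL as a stand-in for the remote share (assuming a Unix-like path and a libcurl build with file:// support); the file contents are made up for illustration.

```r
library(curl)

# Stand-in for the remote file: a temporary text file behind a file:// URL.
source_path <- tempfile(fileext = ".txt")
writeLines("hello from the share", source_path)
url <- paste0("file://", normalizePath(source_path))

# Download to a separate location on disk instead of reading into memory.
destination <- tempfile(fileext = ".txt")
handle <- new_handle()
curl_download(url, destfile = destination, handle = handle)

saved <- readLines(destination)
```

With a real share, the url would be an smb:// URL and the handle would carry the username and password options shown earlier.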