1

I'm trying to import a large database table into R to do some global analysis. I connect to an Oracle DB with ROracle and use dbGetQuery. I make the minimum selection and put the necessary WHERE clauses directly in the query to reduce the scope of the dataset, but it is still 40 columns for 12 million rows.
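
Roughly what I'm doing now (connection details, table and column names are placeholders):

    library(ROracle)

    drv <- dbDriver("Oracle")
    con <- dbConnect(drv, username = "my_user", password = "my_pwd", dbname = "my_db")

    # only the columns I need plus WHERE clauses, but still ~12 million rows
    data <- dbGetQuery(con, "SELECT col1, col2, col3
                             FROM my_big_table
                             WHERE business_date >= DATE '2019-01-01'")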

My PC has only 8 GB of RAM, so how can I handle this? Is there a way to keep the data on disk rather than in RAM, or something along those lines? The same thing done in SAS works fine.

Any ideas?

RevivedPicard
  • 129
  • 11
  • also check [this previous question](https://stackoverflow.com/questions/1727772/quickly-reading-very-large-tables-as-dataframes) – mischva11 May 06 '19 at 13:04
  • SAS partly runs on disk. Curious, what are you doing in R with that initial, large load? Due to limited resources, consider running calculations, aggregations, etc. in Oracle and importing the final resultset into R for end-use needs such as graphing, modeling, etc. Databases today have various statistical and analytical functions. – Parfait May 06 '19 at 13:13
  • In my case I have to read directly from the DB; I cannot download files from the DB and then read them in R – RevivedPicard May 06 '19 at 13:14

2 Answers

2

A few ideas:

  1. Maybe some aggregation could be done on the server side?

  2. You are going to do something with this data in R, right? So you can try not to load the data at all, but instead create a tbl object and do the manipulations and aggregations from R, letting the database run them (see the sketch below)

    library(dplyr)

    my_tbl <- 'SELECT ... FROM ...' %>% sql() %>% tbl(con, .)

where con is your connection
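
For example, a minimal sketch of that lazy approach (table and column names are made up, and with an ROracle connection you also need dbplyr so the dplyr verbs can be translated to Oracle SQL):

    library(dplyr)
    library(dbplyr)

    # con is your existing DBI connection (e.g. from ROracle)
    my_tbl <- tbl(con, sql("SELECT col1, col2, amount FROM my_big_table WHERE region = 'EU'"))

    # these verbs are translated to SQL and executed inside Oracle;
    # nothing is pulled into R yet
    summary_tbl <- my_tbl %>%
      group_by(col1) %>%
      summarise(total = sum(amount, na.rm = TRUE))

    # only the (much smaller) aggregated result is loaded into RAM
    result <- collect(summary_tbl)

The query is only sent to the database when you call collect() (or print the tbl), so the heavy lifting stays on the Oracle side.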

Yuriy Barvinchenko
  • 1,083
  • 1
  • 6
  • 13
  • Unfortunately the whole dataset is needed in R for aggregation, partial export and so on ... Please explain method 2; it seems interesting for my case – RevivedPicard May 06 '19 at 13:30
0

Here are a couple of ideas for you to consider. One is to push the filtering into the SQL and pull back only the rows you need over ODBC (this example is against SQL Server; adjust the driver and connection string for Oracle):

    library(RODBC)
    dbconnection <- odbcDriverConnect("Driver=ODBC Driver 11 for SQL Server;Server=Server_Name;Database=DB_Name;Uid=;Pwd=;trusted_connection=yes")
    initdata <- sqlQuery(dbconnection, "select * from MyTable where Name = 'Asher';")
    odbcClose(dbconnection)
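
If even the filtered result is too large to hold at once, here is a rough (untested) sketch of fetching it in batches with RODBC's lower-level functions:

    library(RODBC)

    dbconnection <- odbcDriverConnect("Driver=ODBC Driver 11 for SQL Server;Server=Server_Name;Database=DB_Name;Uid=;Pwd=;trusted_connection=yes")

    # send the query without pulling the whole result set back
    odbcQuery(dbconnection, "select * from MyTable where Name = 'Asher'")

    repeat {
      # fetch at most 100,000 rows at a time
      chunk <- sqlGetResults(dbconnection, max = 100000)
      if (!is.data.frame(chunk) || nrow(chunk) == 0) break
      # aggregate the chunk or append it to a file on disk here
    }

    odbcClose(dbconnection)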

If you can export the table as a CSV file...

    require(sqldf)
    df <- read.csv.sql("C:\\your_path\\CSV1.csv", "select * from file where Name='Asher'")
    df
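
Note that read.csv.sql loads the file into a temporary SQLite database and only returns the result of the query, so the full table never has to fit into R's memory at once.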
ASH
  • 15,523
  • 6
  • 50
  • 116