23

I am on windows 7 64 bit. I have a csv file 'data.csv'. I want to import data to a postgresql table 'temp_unicommerce_status' via a python script.

My Script is:

import psycopg2
conn = psycopg2.connect("host='localhost' port='5432' dbname='Ekodev' user='bn_openerp' password='fa05844d'")
cur = conn.cursor()
cur.execute("""truncate table "meta".temp_unicommerce_status;""")
cur.execute("""Copy temp_unicommerce_status from 'C:\Users\n\Desktop\data.csv';""")
conn.commit()
conn.close()

I am getting this error

Traceback (most recent call last):
  File "C:\Users\n\Documents\NetBeansProjects\Unicommerce_Status_Update\src\unicommerce_status_update.py", line 5, in <module>
cur.execute("""Copy temp_unicommerce_status from     'C:\\Users\\n\\Desktop\\data.csv';""")
psycopg2.ProgrammingError: must be superuser to COPY to or from a file
HINT:  Anyone can COPY to stdout or from stdin. psql's \copy command also works for anyone.
Vivek S.
  • 15,777
  • 6
  • 54
  • 78
Manish Gupta
  • 3,771
  • 9
  • 48
  • 87

8 Answers8

30

Use the copy_from cursor method

f = open(r'C:\Users\n\Desktop\data.csv', 'r')
cur.copy_from(f, temp_unicommerce_status, sep=',')
f.close()

The file must be passed as an object.

Since you are coping from a csv file it is necessary to specify the separator as the default is a tab character

Clodoaldo Neto
  • 98,807
  • 21
  • 191
  • 235
  • Doesn't this (and copy_expert) require a super user role though? Its not piping from STDIN, which avoids the Super User role. Copying from a file requires the user, which it seems like this does. My issue is this runs, but does nothing, and I think there's a super user error failing silently. – yekta Jan 07 '16 at 18:01
  • 1
    I can confirm that copy_from works for an aws rds-postgresql instance, where indeed you're only granted rds_superuser, but not superuser – mork Sep 14 '16 at 10:15
  • I was able to propagate a database via csv file using copy_expert using STDIN. This removes the need to give the postgres user role SUPERUSER privileges within postgres. See the answer here: https://stackoverflow.com/questions/51850512/postgressqlfile-permissions-error-using-copy/51866826#51866826 – jonnyjandles Aug 16 '18 at 17:50
  • this is only a viable solution if code is running on the same server as database. otherwise you're loading the file in memory and passing it over the network. tons of unnecessary IO – Sonic Soul Jun 12 '19 at 20:23
  • Need to be careful here since the psycopg2 `copy_from` function uses the text format and not the CSV format. – Gohn67 Oct 10 '20 at 20:40
7

The way I solved this problem particular to use psychopg2 cursor class function copy_expert (Docs: http://initd.org/psycopg/docs/cursor.html). copy_expert allows you to use STDIN therefore bypassing the need to issue a superuser privilege for the postgres user. Your access to the file then depends on the client (linux/windows/mac) user's access to the file

From Postgres COPY Docs (https://www.postgresql.org/docs/current/static/sql-copy.html):

Do not confuse COPY with the psql instruction \copy. \copy invokes COPY FROM STDIN or COPY TO STDOUT, and then fetches/stores the data in a file accessible to the psql client. Thus, file accessibility and access rights depend on the client rather than the server when \copy is used.

You can also leave the permissions set strictly for access to the development_user home folder and the App folder.

csv_file_name = '/home/user/some_file.csv'
sql = "COPY table_name FROM STDIN DELIMITER '|' CSV HEADER"
cursor.copy_expert(sql, open(csv_file_name, "r"))
Community
  • 1
  • 1
jonnyjandles
  • 3,744
  • 4
  • 24
  • 34
  • Yes, I think the psycopg2 should make the `copy_from` a bit clearer in that it expects a text format and not CSV format. Using copy_expert seems to be the only way to pass in a CSV file correctly. – Gohn67 Oct 10 '20 at 20:42
4
#sample of code that worked for me

import psycopg2 #import the postgres library

#connect to the database
conn = psycopg2.connect(host='localhost',
                       dbname='database1',
                       user='postgres',
                       password='****',
                       port='****')  
#create a cursor object 
#cursor object is used to interact with the database
cur = conn.cursor()

#create table with same headers as csv file
cur.execute("CREATE TABLE IF NOT EXISTS test(**** text, **** float, **** float, **** 
text)")

#open the csv file using python standard file I/O
#copy file into the table just created 
with open('******.csv', 'r') as f:
next(f) # Skip the header row.
    #f , <database name>, Comma-Seperated
    cur.copy_from(f, '****', sep=',')
    #Commit Changes
    conn.commit()
    #Close connection
    conn.close()


f.close()
1

Here is an extract from relevant PostgreSQL documentation : COPY with a file name instructs the PostgreSQL server to directly read from or write to a file. The file must be accessible to the server and the name must be specified from the viewpoint of the server. When STDIN or STDOUT is specified, data is transmitted via the connection between the client and the server

That's the reason why the copy command to or from a file a restricted to a PostgreSQL superuser : the file must be present on server and is loaded directly by the server process.

You should instead use :

cur.copy_from(r'C:\Users\n\Desktop\data.csv', temp_unicommerce_status)

as suggested by this other answer, because internally it uses COPY from stdin.

Community
  • 1
  • 1
Serge Ballesta
  • 121,548
  • 10
  • 94
  • 199
0

You can use d6tstack which makes this simple

import d6tstack
import glob

c = d6tstack.combine_csv.CombinerCSV([r'C:\Users\n\Desktop\data.csv']) # single-file
c = d6tstack.combine_csv.CombinerCSV(glob.glob('*.csv')) # multi-file
c.to_psql_combine('postgresql+psycopg2://psqlusr:psqlpwdpsqlpwd@localhost/psqltest', 'tablename')

It also deals with data schema changes, create/append/replace table and allows you to preprocess data with pandas.

citynorman
  • 3,467
  • 2
  • 25
  • 31
0

I know this question has been answered, but here are my two cent. I am adding little more description:

You can use cursor.copy_from method :

First you have to create a table with same no of columns as your csv file.

Example:

My csv looks like this:

Name,       age , college , id_no , country , state   , phone_no

demo_name   22  , bdsu    , 1456  , demo_co , demo_da , 9894321_

First create a table:

import psycopg2
from psycopg2 import Error

connection = psycopg2.connect(user = "demo_user",
                                  password = "demo_pass",
                                  host = "127.0.0.1",
                                  port = "5432",
                                  database = "postgres")
cursor = connection.cursor()


create_table_query = '''CREATE TABLE data_set
(Name  TEXT NOT NULL ,
age  TEXT NOT NULL ,
college  TEXT NOT NULL ,
id_no TEXT NOT NULL ,
country TEXT NOT NULL ,
state TEXT NOT NULL ,
phone_no TEXT NOT NULL);'''

cursor.execute(create_table_query)
connection.commit()

Now you can simply use cursor.copy_from where you need three parameters :

first file object , second table_name , third sep type

you can copy now :

f = open(r'final_data.csv', 'r')
cursor.copy_from(f, 'data_set', sep=',')
f.close()

done

Aaditya Ura
  • 9,140
  • 4
  • 35
  • 62
0

I am going to post some of the errors I ran into trying to copy a csv file to a database on a linux based system....

here is an example csv file:

Name Age Height
bob  23   59
tom  56   67
  1. You must install the library psycopg2 (i.e. pip install psycopg2 or sudo apt install python3-psycopg2 )

  2. You must have postgres installed on your system before you can use psycopg2 (sudo apt install postgresql-server postgresql-contrib )

  3. Now you must create a database to store the csv unless you already have postgres setup with a pre-existing database

COPY CSV USING POSTGRES COMMANDS

  • After installing postgres it creates a default user account which gives you access to postgres commands

  • To switch to the postgres account issue: sudo -u postgres psql

  • Acess the prompt by issuing: psql

    #command to create a database create database mytestdb; #connect to the database to create a table \connect mytestdb; #create a table with same csv column names create table test(name char(50), age char(50), height char(50)); #copy csv file to table copy mytestdb 'path/to/csv' with csv header;

COPY CSV USING PYTHON The main issue I ran into with copying the CSV file to a database was I didn't have the database created yet, however this can be done with python still.

import psycopg2 #import the Postgres library

#connect to the database
conn = psycopg2.connect(host='localhost',
                       dbname='mytestdb',
                       user='postgres',
                       password='')  
#create a cursor object 
#cursor object is used to interact with the database
cur = conn.cursor()

#create table with same headers as csv file
cur.execute('''create table test(name char(50), age char(50), height char(50));''')

#open the csv file using python standard file I/O
#copy file into the table just created 
f = open('file.csv','r')
cursor.copy_from(f, 'test', sep=',')
f.close()
Kundan Kumar
  • 113
  • 1
  • 9
-2

Try to do the same as the root user - postgres. If it were linux system, you could change file's permissions or move the file to /tmp. The problem results from missing credentials to read from the filesystem.

percy
  • 698
  • 5
  • 7