
I am executing a SQL query from a Python script to retrieve data from Snowflake on Windows 10, but the resulting output is missing the column names; they are replaced by 0, 1, 2, 3, and so on. Executing the same query in the Snowflake web interface and downloading a CSV gives the column names in the file. I am passing the column names as aliases in my query.

Below is the code:

import pandas as pd

def _CONSUMPTION(con):
    data2 = con.cursor().execute("""select sd.sales_force_lvl_1_code "Plan-To Code",sd.sales_force_lvl_1_desc "Plan-To Description",pd.matl_code "Product Code",pd.matl_desc "Product Description",pd.ean_upc_code "UPC",dd.fiscal_week_desc "Fiscal Week Description",f.unit_sales_qty "Sales Units",f.incr_units_qty "Incremental Units"
                                    from DW.consumption_fact1 f, DW.market_dim md, DW.matl_dim pd, DW.fiscal_week_dim dd, (select sales_force_lvl_1_code,max(sales_force_lvl_1_desc) sales_force_lvl_1_desc from DW.mv_us_sales_force_dim group by sales_force_lvl_1_code) sd
                                    where dd.fiscal_week_key = f.fiscal_week_key
                                    and pd.matl_key = f.matl_key
                                    and md.market_key = f.market_key
                                    and sd.sales_force_lvl_1_code = md.curr_sales_force_lvl_1_code
                                    and dd.fiscal_week_key between (select curr_fy_week_key-6 from DW.curr_date_lkp) and (select curr_fy_week_key-1 from DW.curr_date_lkp)
                                    and f.company_key = 6006
                                    and (f.unit_sales_qty <> 0 and f.sales_amt <> 0)
                                    and md.curr_sales_force_lvl_1_code is not null
                                    UNION
                                    select '5000016240' "Plan-To Code", 'AWG TOTAL' "Plan-To Description",pd.matl_code "Product Code",pd.matl_desc "Product Description",pd.ean_upc_code "UPC",dd.fiscal_week_desc "Fiscal Week Description",f.unit_sales_qty "Sales Units",f.incr_units_qty "Incremental Units"
                                    from DW.consumption_fact1 f, DW.market_dim md, DW.matl_dim pd, DW.fiscal_week_dim dd
                                    where dd.fiscal_week_key = f.fiscal_week_key
                                    and pd.matl_key = f.matl_key
                                    and md.market_key = f.market_key
                                    and dd.fiscal_week_key between (select curr_fy_week_key-6 from DW.curr_date_lkp) and (select curr_fy_week_key-1 from DW.curr_date_lkp)
                                    and f.company_key = 6006
                                    and (f.unit_sales_qty <> 0 and f.sales_amt <> 0)
                                    and md.market_code = '20267'""").fetchall()

    df = pd.DataFrame(data2)
    df.head(5)
    df.to_csv('CONSUMPTION.csv', index=False)
  • Would you be able to update the post with your query and the code you're trying to execute? It's hard to say what exactly is wrong. – J.H Sep 03 '20 at 17:19
  • I have added the code snippet. – himanshu maskara Sep 03 '20 at 17:27
  • That's because you are using a pandas DataFrame. You need to set the columns as well. – hopeIsTheonlyWeapon Sep 03 '20 at 18:51
  • Check if this SO question helps: https://stackoverflow.com/questions/53114687/how-can-i-populate-a-pandas-dataframe-with-the-result-of-a-snowflake-sql-query – hopeIsTheonlyWeapon Sep 03 '20 at 18:53
  • I'd recommend loading directly into a pandas dataframe using the fetch_pandas_all() function. https://docs.snowflake.com/en/user-guide/python-connector-pandas.html#reading-data-from-a-snowflake-database-to-a-pandas-dataframe – Mike Walton Sep 03 '20 at 20:49
  • Also, if your goal is to just turn around and create a CSV file, you might want to consider just doing a COPY INTO location (file export) from Snowflake; see the sketch after these comments. – Mike Walton Sep 03 '20 at 20:50
  • Another angle to this: If you use a UNION, column names will be dropped in favour of column numbers (this is because the column names from one side of the UNION are not guaranteed to be the same on the other side of it). The only way I have found to "force" column naming for UNION queries - I work with Teradata DBMS - is to explicitly name the columns on one or other side of the UNION: select col_1 as COL_1, col2 as COL_2 from xxx UNION select col_1, col2 from yyy; The above should return column names for all returned rows as COL_1 and COL_2, respectively. YMMV. – v0rl0n Oct 12 '20 at 17:31
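Following up on Mike Walton's COPY INTO suggestion: a minimal sketch of unloading the query result to an internal stage and downloading it, executed through the Python connector. The stage name @my_stage and the local path are assumptions, not from the original post.

cur = con.cursor()
# Unload the result as a single CSV with a header row (COPY INTO <location>).
# @my_stage is a hypothetical internal stage; replace with your own.
cur.execute("""
    COPY INTO @my_stage/consumption.csv
    FROM (select sd.sales_force_lvl_1_code "Plan-To Code", ...)  -- your query here
    FILE_FORMAT = (TYPE = CSV COMPRESSION = NONE)
    HEADER = TRUE
    SINGLE = TRUE
""")
# Download the unloaded file from the stage to a local folder (path is a placeholder).
cur.execute("GET @my_stage/consumption.csv file://C:/temp/")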

3 Answers


Looking at the docs, it seems the easiest way is to use the cursor method .fetch_pandas_all():

query = "SELECT 1 a, 2 b, 'a' c UNION ALL SELECT 7,4,'snow'"
cur = connection.cursor()
cur.execute(query).fetch_pandas_all()
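For completeness, a minimal sketch of obtaining that connection object; all credentials are placeholders. Note that fetch_pandas_all() requires the connector's pandas extra (pip install "snowflake-connector-python[pandas]").

import snowflake.connector

# Placeholder credentials; fill in your own account details.
connection = snowflake.connector.connect(
    user='<user>',
    password='<password>',
    account='<account>'
)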


Or if you want to dump the results into a CSV, just do so as in the question:

query = "SELECT 1 a, 2 b, 'a' c UNION ALL SELECT 7,4,'snow'"
cur = connection.cursor()
df = cur.execute(query).fetch_pandas_all()
df.to_csv('x.csv', index = False)


– Felipe Hoffa

It looks like you haven't passed the column names when constructing the DataFrame.

My recommendation is to set the column names on the DataFrame, e.g. via df.columns or the columns= argument.

In addition, refer to the Snowflake docs page for details:

https://docs.snowflake.com/en/user-guide/python-connector-pandas.html

Try this:

import pandas as pd

def fetch_pandas_old(cur, sql):
    cur.execute(sql)
    rows = 0
    while True:
        # Fetch in batches to keep memory bounded.
        dat = cur.fetchmany(50000)
        if not dat:
            break
        # cur.description is a list of tuples; the first element of each
        # tuple is the column name.
        df = pd.DataFrame(dat, columns=[col[0] for col in cur.description])
        rows += df.shape[0]
    print(rows)
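A hypothetical call, assuming cur is an open Snowflake cursor and the table name is a placeholder:

fetch_pandas_old(cur, 'select * from my_schema.my_table')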
– ewong

A nice way to extract the column headings from the cursor description and save them in a pandas df using the Snowflake connector (it also works for psycopg2, btw) is as follows:


import snowflake.connector
import pandas as pd

# Create the connection and a cursor
def connect_snowflake(uname, pword, acct, role_name, whouse, dbase, schema_name):
    conn = snowflake.connector.connect(
        user=uname,
        password=pword,
        account=acct,
        role=role_name,
        warehouse=whouse,
        database=dbase,
        schema=schema_name
    )

    cur = conn.cursor()

    return conn, cur

Then execute your query. The cur.description attribute returns a list of tuples, the first element of each being the column name :)

conn, cur = connect_snowflake(username, password, account_name, role, warehouse, database, schema)
cur.execute('select * from my_schema.my_table')
result = cur.fetchall()
# Extract the column names from the cursor description
col_names = []
for elt in cur.description:
    col_names.append(elt[0])
df = pd.DataFrame(result, columns=col_names)
cur.close()
conn.close()
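As a side note, the column-name loop above can be written as a one-line list comprehension with identical behavior:

col_names = [desc[0] for desc in cur.description]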