Pandas dataframe how to escape null integers to float

Question

I'm dynamically generating queries from 11 different tables in SQL Server and storing that into S3 CSV file.

However, when I store null integer fields in CSV it converts them to float so when I'm doing a copy command it returns an error.

I really need to avoid that. is there and option for that?

for object in table_list:
    if args.load_type == "full":
        query_load = object["query_full"]
    else:
        query_load = object["query_active"]

    df = pd.read_sql_query(query_load, sql_server_conn)
    df = df.replace(",", " ", regex=True)
    df = df.replace("\n", " ", regex=True)


    #print(df)
    #df = df * 1
    #print(df.dtypes)
    #print(df.info())


    df = df.assign(extraction_dttm=currentdate)

    csv_buffer = StringIO()
    df.to_csv(csv_buffer, index=False)
    folder_name = "{}".format(object["lake_table_name"])
    file_name = "{}_{}.csv".format(object["lake_table_name"], currentdate.strftime("%Y%m%d"))
    full_path_to_file = DATALAKE_PATH + "/" + folder_name + "/" + file_name

    # print("{} - Storing files in {} ... ".format(dt.utcnow(), datalake_bucket))
    s3_resource.Object(datalake_bucket, full_path_to_file).put(Body=csv_buffer.getvalue())

It's a bit off-topic, but doesn't most SQL language have an option to export into csv? Like this in MySQL: https://stackoverflow.com/questions/356578/how-to-output-mysql-query-results-in-csv-format — AerysS, Jun 02 '20 at 08:10
Ideally this would be a lambda. so I'm not storing csv files anywhere locally. I'm reading from SQL server and pushing directly to S3 — RosaNegra, Jun 02 '20 at 08:16
I'm only guessing here, but your DataFrame.replace calls are probably converting all of the fields to strings before DataFrame.to_csv even gets a look-in. DataFrame.to_csv does have options to deal with null values (na_rep, for example), floats (float_format) and quoting (quoting, quotechar, etc) when given the correct values. — AlwaysLearning, Jun 02 '20 at 08:21
Nope. It does not. I commented out the df.replace method and printed the df.info(). It still reads my that one column as a float instead of integer. — RosaNegra, Jun 02 '20 at 08:27

Pandas dataframe how to escape null integers to float

0 Answers0