How to set row numbers in query return to equal the column names?

Question

I am calculating the correlation between columns in a large table using the in-database Python engine in SQL Server 2019 and, since this calculation returns a diagonal matrix, would like to be able to view the results in SSMS with the rows labeled mirroring the column names.

I know the basics of SQL querying, but not much beyond, so maybe I'm not phrasing my searches precisely.

Here is an example of my code:

execute sp_execute_external_script 
@language = N'Python',
@script = N'
import pandas as pd
from pandas import DataFrame

df = InputDataSet.corr()
OutputDataSet = df

',
@input_data_1 = N'select GHI ,
MNO,
JKL
from PIVOTED_TIME_ID_MATRIX'

with result sets ((GHI float,
MNO float,
JKL float))

This returns:

***** GHI | MNO | JKL
Row 1   1   0.5   0.5
Row 2 0.5     1   0.5
Row 3 0.5   0.5     1

and I would like to see:

***** GHI | MNO | JKL
GHI     1   0.5   0.5
MNO   0.5     1   0.5
JKL   0.5   0.5     1

Is this possible?

Can you try `df.set_index(df.columns)` below `df = InputDataSet.corr()`? — ResidentSleeper, Apr 08 '19 at 18:49
Yes, it outputs nicely to the console window with a print(), but it doesn't pass the dataframe index to the actual result set where I can save the output into a CSV. Thanks for the suggestion though, I might have to figure out a way to use that as a workaround if there isn't a way in SQL to make the index mirror the columns. Just trying to automate this for my team as much as possible. — b00m1, Apr 08 '19 at 19:45
Try `df.set_index(df.columns, inplace=True)` this should work. — ResidentSleeper, Apr 09 '19 at 01:29
Thanks @ResidentSleeper! I ended up combining this solution with a workaround to produce the output I'm after. Since the DataFrame's index doesn't appear to get passed back into the result set to SSMS, I appended the results of the call to df.columns to a new column, then shifted it to the front of the DataFrame and added a new column in the result set. — b00m1, Apr 10 '19 at 14:33

score 0 · Accepted Answer · answered Apr 10 '19 at 14:49

I ended up combining the advice above to use df.columns, as well as a method for rearranging columns from here with a workaround to produce the output I was looking for.

...'
df = InputDataSet.corr()
#puts the names of the existing columns into a new column on the end of df
df["columns"] = df.columns 
cols = df.columns.tolist()
#shift the "columns" column to the front of the dataframe
cols = cols[-1:] + cols[:-1]
df = df[cols]
OutputDataSet = df

',
@input_data_1 = N'select GHI ,
MNO,
JKL
from PIVOTED_TIME_ID_MATRIX'

with result sets ((column_names varchar(max), --add a new column in the result set
GHI float,
MNO float,
JKL float))

How to set row numbers in query return to equal the column names?

1 Answers1