0

I tried to make the dtype parameter of the to_sql command pull from a json file I get the error "json.decoder.JSONDecodeError: Expecting value". Is there a way to do this?

Instead of the below code:

import sqlalchemy


resource_read.to_sql(tablename,engine, 
schema = None, 
if_exists='append', 
dtype={'ID': sqlalchemy.types.VARCHAR(length=50), 
'NAME': sqlalchemy.types.VARCHAR(length=255)})

I want to do something like this where the dtype dictionary parameter is a key value pair in a json file: the config json file would look like this:

{
    "dtypes": {"ID": sqlalchemy.types.VARCHAR(length=50), "NAME": sqlalchemy.types.VARCHAR(length=255) }
}

Then the code is:

import sqlalchemy

config_path = os.path.join("C:\configs")

my_parser.add_argument('-c','-configname', dest='configname', help="Configname file")

args = my_parser.parse_args()
    
    

if args.configname:
        
   with open(os.path.join(config_path, args.configname), 'r') as f:
          
       config_params = json.load(f)

resource = os.path.join(resource_path, filename)

resource_read = pd.read_csv(resource, encoding='utf8', na_values='', dtype=str)

resource_read.to_sql(tablename,engine,schema = None,if_exists='append',dtype=config_params["dtypes"])

The issue is that the value for the dictionaries in the dtypes key is not in quotes, when I put it in quotes I don't get the json issue but I get a sqlalchemy error "ValueError: The type of ID is not a SQLAlchemy type".

Do I somehow convert the value to non-quotes when reading them in? If I just put in the dictionary within the script and I don't put quotes around the values I don't get an error and it runs fine. But I want it in a json file.

Any help would be appreciated. Thank you!!

hope288
  • 495
  • 7
  • 20
  • 1
    *it doesn't work* ... please be specific. Any errors or undesired results? Can't see how the two lines should differ with Python's interpreter. Try running again for reproducibility. – Parfait Nov 08 '20 at 01:24
  • hi I just updated my question above. Thank you. – hope288 Nov 08 '20 at 18:30
  • Please show how `config` is assigned even if it means reading from a `.json` file. We want to be able to reproduce your issue. Not clear what you mean by quotes. – Parfait Nov 08 '20 at 20:14
  • Hi, I added more info. Sorry, I didn't realize you needed that information. Hopefully I covered everything now. Thank you for your help – hope288 Nov 08 '20 at 23:01
  • Hmmmm...that is not a valid JSON and should raise an error when parsing. Specifically, on `json.load(f)` line, I get `JSONDecodeError: Expecting value`. Please post sample of actual JSON if you do not get this error. – Parfait Nov 09 '20 at 00:28
  • yea it does error, that's why I had to add the double quotes I was mentioning earlier which you said wasn't clear. But the double quotes just fixes the json but doesn't run sqlalchemy. I read your solution and will try it the way you described and let you know. Thanks so much! – hope288 Nov 10 '20 at 21:41

1 Answers1

2

Assuming actual JSON quotes the sqlalchemy types:

{
    "dtypes": {"ID": "sqlalchemy.types.VARCHAR(length=50)", "NAME": "sqlalchemy.types.VARCHAR(length=255)" }
    
}

An immediate solution is to call eval in a dict comprehension to build dtypes dictionary. But warning: use this powerful function with caution and not open to unknown user input.

sql_dtypes = { k:eval(v) for k,v in config_params['dtypes'].items() }

print(sql_dtypes)
# {'ID': VARCHAR(length=50), 'NAME': VARCHAR(length=255)}

...

resource_read.to_sql(tablename,engine,schema = None,if_exists='append', dtype=sql_dtypes )

Alternatively, if you have ability to adjust config file, consider saving its underlying types and attributes and not any Python module variables. (See added DECIMAL type for demonstration.)

{
    "dtypes": {
               "ID": ["VARCHAR", 50], 
               "DATE": ["DATETIME"],
               "NAME": ["VARCHAR", 255], 
               "VALUE": ["DECIMAL", 10, 6]
              }
    
}

Then, you can call sqlalchemy.types dynamically with getattr (nested for additional attributes of types) and pass needed parameters.

sql_dtypes = { k: getattr(getattr(sqlalchemy, 'types'), v[0])(v[1] if len(v) >= 1 else None, 
                                                              v[2] if len(v) >= 2 else None)
                        for k,v in config_params['dtypes2'].items() }

print(sql_dtypes)
# {'ID': VARCHAR(length=50), 'DATE': DATETIME, 'NAME': VARCHAR(length=255), 'VALUE': DECIMAL(precision=10, scale=6)}

...

resource_read.to_sql(tablename,engine,schema = None,if_exists='append', dtype=sql_dtypes )
Parfait
  • 87,576
  • 16
  • 87
  • 105
  • wow I would never have known to go about it this way, thank you so much. I will test this out and let you know. – hope288 Nov 10 '20 at 21:39
  • Great to hear and glad to help! Happy coding! – Parfait Nov 11 '20 at 22:06
  • Hi I was able to test it out and it works great, one thing I am not sure how to handle are DateTime() dtypes. How do I save that in the json "end_date": ["DateTime"] ? – hope288 Nov 16 '20 at 20:46
  • also when you write in the code to check if the length is 2 and if it's not then grab the third element in v, when is that actually used and what is it applicable for to add a third element in the json? so that it works for datetime should I change it to say "else v[0]" ? Thank you! – hope288 Nov 16 '20 at 20:48
  • Usually datetimes would not have additional args so do not include in `json` and then in Python check if length is 1 in the first argument of outermost `getattr`. Reversed [ternary operator](https://stackoverflow.com/questions/394809/does-python-have-a-ternary-conditional-operator) to read better. See edit. – Parfait Nov 16 '20 at 21:28