
I have this JSON file that I want to convert to CSV using pandas:

  {
        "partes": [
            {
                "processo": "1001824-89.2019.8.26.0493",
                "tipo": "Reqte: ",
                "nome": "Sérgio Izaias Massaranduba  Advogada: Mariana Pretel E Pretel      ",
                "cnpj_cpf": "Não encontrado",
                "oab": "Não encontrado"
            },
            {
                "processo": "1001824-89.2019.8.26.0493",
                "tipo": "Reqda: ",
                "nome": "CLARO S/A   ",
                "cnpj_cpf": "Não encontrado",
                "oab": "Não encontrado"
            }
        ],
        "movimentacoes": [
            {
                "processo": "1001824-89.2019.8.26.0493",
                "data": "28/10/2019",
                "tem_anexo": "",
                "movimentacao": " Distribuído Livremente (por Sorteio) (movimentação exclusiva do distribuidor)  "
            }
        ]
    }

When I call read_json on it, I get this error: ValueError: arrays must all be same length

Here is my code:

import pandas as pd
import json
import os

os.chdir('C:\\Users\\Suporte\\Desktop\\AUT\\autonomation')


df = pd.read_json('file.json')

df_ = df.to_csv('file.csv', sep=';',index=False)

I don't know why it can't read the file.

user158433

1 Answer

  • pandas works with tabular data: rows that all share the same set of columns.
  • Taken as a whole, the JSON presented here is not tabular, so read_json cannot turn it into a single DataFrame.
  • Read each top-level key into its own DataFrame instead.
  • Reading it as one table would only work if partes and movimentacoes had the same length.
    • Here partes has 2 entries while movimentacoes has 1, which is what triggers the ValueError.
  • Given the following data, in a file named test1.json:

Data:

{
    "partes": [{
            "processo": "1001824-89.2019.8.26.0493",
            "tipo": "Reqte: ",
            "nome": "Sérgio Izaias Massaranduba  Advogada: Mariana Pretel E Pretel      ",
            "cnpj_cpf": "Não encontrado",
            "oab": "Não encontrado"
        }, {
            "processo": "1001824-89.2019.8.26.0493",
            "tipo": "Reqda: ",
            "nome": "CLARO S/A   ",
            "cnpj_cpf": "Não encontrado",
            "oab": "Não encontrado"
        }
    ],
    "movimentacoes": [{
            "processo": "1001824-89.2019.8.26.0493",
            "data": "28/10/2019",
            "tem_anexo": "",
            "movimentacao": " Distribuído Livremente (por Sorteio) (movimentação exclusiva do distribuidor)  "
        }
    ]
}

Code:

from pathlib import Path
import pandas as pd
import json

# path to file
p = Path(r'c:\some_path_to_data\test1.json')

# read the JSON file in (assumes the file is UTF-8 encoded, since the
# values contain accented characters)
with p.open('r', encoding='utf-8') as f:
    data = json.load(f)

# create the dataframe
df_partes = pd.DataFrame.from_dict(data['partes'])
print(df_partes)

                  processo     tipo                                                                  nome         cnpj_cpf              oab
 1001824-89.2019.8.26.0493  Reqte:   Sérgio Izaias Massaranduba  Advogada: Mariana Pretel E Pretel        Não encontrado  Não encontrado
 1001824-89.2019.8.26.0493  Reqda:                                                           CLARO S/A     Não encontrado  Não encontrado

df_movimentacoes = pd.DataFrame.from_dict(data['movimentacoes'])
print(df_movimentacoes)

                  processo        data tem_anexo                                                                         movimentacao
 1001824-89.2019.8.26.0493  28/10/2019             Distribuído Livremente (por Sorteio) (movimentação exclusiva do distribuidor)

# save to csv
df_partes.to_csv('partes.csv', index=False)
df_movimentacoes.to_csv('movimentacoes.csv', index=False)
  • If the JSON has many keys, consider making a dictionary of dataframes as follows:
df_dict = {key: pd.DataFrame.from_dict(data[key]) for key in data.keys()}

# Access a specific dataframe just like a regular dictionary
df_dict['partes']

# save to csv
for key in df_dict.keys():
    df_dict[key].to_csv(f'{key}.csv', index=False)
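
To see the length mismatch without touching any file, here is a minimal sketch that uses a trimmed-down, in-memory copy of the JSON from the question (only two fields per record are kept, which is my simplification). It passes the whole object to pd.DataFrame, which I'm assuming fails for the same reason read_json does, and then builds one frame per key.

```python
import pandas as pd

# Trimmed-down, in-memory stand-in for the JSON in the question
data = {
    "partes": [
        {"processo": "1001824-89.2019.8.26.0493", "tipo": "Reqte: "},
        {"processo": "1001824-89.2019.8.26.0493", "tipo": "Reqda: "},
    ],
    "movimentacoes": [
        {"processo": "1001824-89.2019.8.26.0493", "data": "28/10/2019"},
    ],
}

# Passing the whole object makes each top-level key a column. The columns
# would hold 2 and 1 entries, so pandas refuses to build the frame.
try:
    pd.DataFrame(data)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Building one frame per key avoids the mismatch.
df_partes = pd.DataFrame(data["partes"])
df_movimentacoes = pd.DataFrame(data["movimentacoes"])

print(error_message)
print(df_partes.shape, df_movimentacoes.shape)
```

The exact wording of the error message depends on the pandas version, so it is printed here rather than compared against a fixed string.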
Trenton McKinney
  • I corrected it, but now it returns ValueError: The DataFrame builder was not called correctly! – user158433 Oct 29 '19 at 17:03
  • @user158433 I can only help you with the information you've provided. The code has been tested against the JSON you provided, so I know the syntax is correct. However, if the entire JSON is significantly different than what's been provided, there may be issues I can't account for. – Trenton McKinney Oct 29 '19 at 17:07
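
One possible cause of the error in the comment above, offered only as a guess since the full file was never posted: if the real JSON also has top-level keys whose values are plain strings or numbers, handing one of those to the DataFrame constructor raises a ValueError. The sketch below builds frames only for keys that hold a list of records and reports the rest. The helper name frames_from_json and the key consultado_em are hypothetical, made up for illustration.

```python
import pandas as pd


def frames_from_json(data):
    """Return (frames, skipped): one DataFrame per key holding a list of dicts."""
    frames, skipped = {}, []
    for key, value in data.items():
        is_records = isinstance(value, list) and all(
            isinstance(item, dict) for item in value
        )
        if is_records:
            frames[key] = pd.DataFrame(value)
        else:
            # Scalars, strings and other shapes are not tabular; leave them out
            skipped.append(key)
    return frames, skipped


# Hypothetical input: the question's structure plus one scalar key
sample = {
    "consultado_em": "29/10/2019",  # hypothetical key, not in the posted JSON
    "partes": [{"processo": "1001824-89.2019.8.26.0493", "tipo": "Reqte: "}],
    "movimentacoes": [{"processo": "1001824-89.2019.8.26.0493", "data": "28/10/2019"}],
}

frames, skipped = frames_from_json(sample)

# Write one CSV per tabular key, as in the answer
for key, frame in frames.items():
    print(key, frame.shape)
print("skipped:", skipped)
```

If skipped comes back non-empty on the real file, those keys are the ones to inspect by hand before deciding how they should be exported.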