I had a list containing a single long string and wanted to print it in a particular form (see "convert list to a particular json in python").

But after the conversion, the order of the data changed. How can I maintain the original order?

input_data = 

 [
  "21:15-21:30 IllegalAgrumentsException 1,
   21:15-21:30 NullPointerException 2,
   22:00-22:15 UserNotFoundException 1,
   22:15-22:30 NullPointerException 1
   ....."
]

Code to convert the data into a particular JSON form:

    import collections
    import re

    input_data = input[0]  # input is a list containing a single long string
    # Split the string into one "time exception count" entry per item.
    input_data = re.split(r',\s*', input_data)
    output = collections.defaultdict(collections.Counter)
    for line in input_data:
        time, error, count = line.split(None, 2)
        output[time][error] += int(count)
    print(output)
    response = [
        {
            "time": time,
            "logs": [
                {"exception": exception, "count": count}
                for (exception, count) in counter.items()
            ],
        }
        for (time, counter) in output.items()
    ]

    print(response)

My output:

{
    "response": [
        {
            "logs": [
                {
                    "count": 1,
                    "exception": "UserNotFoundException"
                }
            ],
            "time": "22:45-23:00"
        },
        {
            "logs": [
                {
                    "count": 1,
                    "exception": "NullPointerException"
                }
            ],
            "time": "23:00-23:15"
        }...
    ]
}

So the order has changed, but I need the data in the same order as the input, i.e. starting from 21:15-21:30 and so on. How can I maintain that order?

kelte

2 Answers

Your timestamps are already sortable, so if you don't care about the order of individual exceptions, you can just do:

for (time, counter) in sorted(output.items())

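which will do a lexicographical sort by time. Because every time key is unique, the sort never needs to compare the counters. You can do sorted(output.items(), key=lambda x: x[0]) if you want to make the sort by time explicit. To also order the exceptions within each time slot by count descending, sort counter.items() with key=lambda x: -x[1].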
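
A minimal sketch of that approach, assuming `output` is the `defaultdict` of `Counter`s built in the question. The sample values are made up for illustration:

```python
import collections

# Hypothetical sample data in the same shape as the question's `output`.
output = collections.defaultdict(collections.Counter)
output["22:00-22:15"]["UserNotFoundException"] += 1
output["21:15-21:30"]["NullPointerException"] += 2
output["21:15-21:30"]["IllegalArgumentException"] += 1

response = [
    {
        "time": time,
        "logs": [
            {"exception": exception, "count": count}
            # Highest count first; ties are broken by exception name.
            for exception, count in sorted(
                counter.items(), key=lambda item: (-item[1], item[0])
            )
        ],
    }
    # Plain string sort on the time key.
    for time, counter in sorted(output.items(), key=lambda item: item[0])
]
```

Note that a string sort only matches chronological order when the hours are zero-padded and the data does not wrap past midnight.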

Faboor
  • This will sort by time, starting at `0:00-0:15`, but my data starts from `21:15-21:30`, so it doesn't work for me. – kelte Jul 31 '20 at 10:39

The data is read into a dictionary, a defaultdict to be precise:

output[time][error] += int(count)

This data structure groups the data by time and by error type, which implies that there may be multiple items with the same time and the same error type. There is no way to keep the "same order" once the data has been regrouped like that.

On the other hand, you probably expect the times to be ordered in the input, and even if they are not, you want the output ordered by time. So you just need to do that. Instead of this:
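
To illustrate the regrouping, here is a small sketch with made-up input lines in which one time slot and exception occur twice. The two entries are merged into a single count, so the original line-by-line order no longer exists:

```python
import collections

# Made-up lines: the same time slot and exception appear twice.
lines = [
    "21:15-21:30 NullPointerException 2",
    "22:00-22:15 UserNotFoundException 1",
    "21:15-21:30 NullPointerException 3",
]

output = collections.defaultdict(collections.Counter)
for line in lines:
    time, error, count = line.split(None, 2)
    output[time][error] += int(count)

# Three input lines collapse into two time slots.
grouped = {time: dict(counter) for time, counter in output.items()}
```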

for (time, counter) in output.items()

do this:

for time in sorted(output)

and then get the counter as

counter = output[time]

EDIT: The times are in order but do not start at 0:00, so sorting by the time string is not correct. Instead, sort the times by the order in which they first appear in the input.

Therefore, remember the original time order:

time_order = []

for line in input_data:
    time, error, count = line.split(None, 2)
    output[time][error] += int(count)
    time_order.append(time)

Then later sort by it:

for time in sorted(output, key=time_order.index)
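
Putting those pieces together, a minimal end-to-end sketch might look like this. The sample string is made up (including the wrap past midnight), and sorting the exceptions alphabetically within a slot is an assumption about the desired output:

```python
import collections
import re

# Made-up sample modelled on the question's input format.
raw = (
    "23:45-00:00 NullPointerException 1, "
    "23:45-00:00 IllegalArgumentException 2, "
    "00:00-00:15 UserNotFoundException 1"
)

output = collections.defaultdict(collections.Counter)
time_order = []  # every time slot, in the order it was read
for line in re.split(r",\s*", raw):
    time, error, count = line.split(None, 2)
    output[time][error] += int(count)
    time_order.append(time)

response = [
    {
        "time": time,
        "logs": [
            {"exception": exception, "count": count}
            # Assumption: exceptions are listed alphabetically per slot.
            for exception, count in sorted(output[time].items())
        ],
    }
    # list.index returns the first position of each slot in the input.
    for time in sorted(output, key=time_order.index)
]
```

On Python 3.7 and later, dictionaries keep insertion order, so iterating over `output` directly should give the same first-seen order; the explicit list mainly matters on older interpreters.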
zvone
  • My input is already sorted by time, and if multiple exceptions are present in a time range, the exception names should be sorted lexicographically within that range. – kelte Jul 31 '20 at 10:36
  • So my input data is already sorted and nothing more is required, but the operation above changes the order, and if I sort back by time it starts from 0:00-0:15, which I don't want. – kelte Jul 31 '20 at 10:38
  • @kelte Ok... I added a solution for that too ;) – zvone Jul 31 '20 at 11:33