1

I copied a table from a webpage, and when I paste it into a text file (or Excel) the table becomes a flat list of values. Here is an example list:

['1', '42', 'Konya', '40.838', '42', '62', 'Tunceli', '7.582']

I want the 0th item in column 1, the 1st item in column 2, the 2nd item in column 3, and the 3rd item in column 4, and the same again for every following group of four values.

Below is a long way of doing it (I assume):

import pandas as pd

mylist = ['1', '42', 'Konya', '40.838', '42', '62', 'Tunceli', '7.582']
index = []
code = []
city = []
area = []
# Route each value to its column by its position within a group of four.
for i, line in enumerate(mylist):
    if i % 4 == 0:
        index.append(line)
    elif i % 4 == 1:
        code.append(line)
    elif i % 4 == 2:
        city.append(line)
    else:
        area.append(line)
data = {'index': index, 'code': code, 'city': city, 'area': area}
df = pd.DataFrame(data)

I am looking for a shorter way to write the code above. I am sure someone has a clever way of doing it; I just cannot find it.

Siva Shanmugam
hgv
  • Please show an output example – umn May 21 '19 at 07:08
  • Have you considered reading the table from the webpage directly into Python? For example, using Pandas [read_html](https://pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.read_html.html). – Dylan_w May 21 '19 at 07:13

3 Answers

1

Convert the values to a NumPy array, reshape it to four columns, and finally pass it to the DataFrame constructor:

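import numpy as np
import pandas as pd

L = ['1', '42', 'Konya', '40.838', '42', '62', 'Tunceli', '7.582']

# reshape(-1, 4): four columns, with the number of rows inferred from the length
df = pd.DataFrame(np.array(L).reshape(-1, 4), columns=['code1','code2','city','area'])
print(df)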
  code1 code2     city    area
0     1    42    Konya  40.838
1    42    62  Tunceli   7.582
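
To illustrate what the reshape step does on its own, here is a minimal sketch using only NumPy. The -1 asks NumPy to infer the number of rows from the array length, so the call only works when the length is a multiple of 4. The `short` list is a made-up input, used here just to show the failure case.

```python
import numpy as np

L = ['1', '42', 'Konya', '40.838', '42', '62', 'Tunceli', '7.582']

# -1 lets NumPy infer the row count: 8 values / 4 columns = 2 rows.
arr = np.array(L).reshape(-1, 4)
print(arr.shape)

# Hypothetical incomplete input: 7 values cannot fill rows of 4,
# so reshape raises ValueError instead of padding.
short = L[:-1]
try:
    np.array(short).reshape(-1, 4)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)
```

If the pasted data might be incomplete, checking len(L) % 4 == 0 before reshaping gives a clearer error message.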
jezrael
0

I guess you could use pd.DataFrame(list_of_lists, columns=labels),

together with the following to get list_of_lists from your list:

import pandas as pd

def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in range(0, len(l), n):
        yield l[i:i + n]

mylist = ['1', '42', 'Konya', '40.838', '42', '62', 'Tunceli', '7.582']
labels = ['index', 'code', 'city', 'area']

df = pd.DataFrame(chunks(mylist, 4), columns=labels)
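
To make it easier to see what the generator hands over, here is a small sketch that materializes the chunks with list(). It uses only the standard library. The `ragged` input is a made-up example showing that an incomplete list produces a shorter final chunk rather than an error.

```python
def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in range(0, len(l), n):
        yield l[i:i + n]

mylist = ['1', '42', 'Konya', '40.838', '42', '62', 'Tunceli', '7.582']

# Each chunk becomes one row of the future DataFrame.
rows = list(chunks(mylist, 4))
print(rows)

# Hypothetical incomplete input: the last chunk is simply shorter.
ragged = list(chunks(mylist[:-1], 4))
print(ragged[-1])
```

Because a short final chunk passes silently, it may be worth checking len(mylist) % 4 == 0 before building the DataFrame.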
ichafai
0

Several solutions are possible.

You said you copied your data to a text file. The simplest way to build the DataFrame is then to call read_csv. It takes the name of the file as its argument, and you can also specify the separator between elements. Here is an example. Suppose I have the following text file:

Temp.txt :

index, code, city, area
1, 42, Konya, 40.838
42, 62, Tunceli, 7.582

Python:

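import pandas as pd

# skipinitialspace=True drops the blank that follows each comma in the file
df = pd.read_csv(r"..\temp.txt", sep=',', skipinitialspace=True)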
print(df)
#   index code     city    area
# 0     1   42    Konya  40.838
# 1    42   62  Tunceli   7.582
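
If the pasted file actually holds one value per line, as the question describes, it would first need regrouping into rows before read_csv can parse it as a table. The following is a rough sketch using only the standard library. The `pasted` string and the column names are assumptions standing in for the real file contents.

```python
import csv
import io

# Assumed file contents: one value per line, as pasted from the webpage.
pasted = "1\n42\nKonya\n40.838\n42\n62\nTunceli\n7.582\n"
values = pasted.splitlines()

# Regroup every 4 consecutive values into one row.
rows = [values[i:i + 4] for i in range(0, len(values), 4)]

# Write a header plus the rows as CSV text. An in-memory buffer is used
# here; a real file opened with newline="" could be used the same way.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["index", "code", "city", "area"])
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)
```

The resulting text could then be saved to a file for read_csv, or handed to it directly wrapped in io.StringIO(csv_text), since read_csv also accepts file-like objects.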

However, if you already have your data in Python (in a list, for example), Ismail provides a solution. Here is another one: you can reshape your one-dimensional list into a two-dimensional list. Here is the code:

Python:

import pandas as pd

mylist = ['1', '42', 'Konya', '40.838', '42', '62', 'Tunceli', '7.582']

def to_matrix(l, n):
    # Split the flat list l into consecutive rows of n items each
    return [l[i:i + n] for i in range(0, len(l), n)]

my_list_reshape = to_matrix(mylist, 4)
print(my_list_reshape)
# [['1',  '42', 'Konya',   '40.838'],
#  ['42', '62', 'Tunceli', '7.582' ]]
df = pd.DataFrame(my_list_reshape, columns=['index', 'code', 'city', 'area'])
print(df)
#   index code     city    area
# 0     1   42    Konya  40.838
# 1    42   62  Tunceli   7.582
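
The modulo test in the question's loop can also be expressed with stride slicing, where mylist[k::4] picks every fourth value starting at position k. This is a minimal sketch of that idea, not something taken from the answers here. It builds a plain dict of columns, which is the same shape of input the question passes to pd.DataFrame.

```python
mylist = ['1', '42', 'Konya', '40.838', '42', '62', 'Tunceli', '7.582']
labels = ['index', 'code', 'city', 'area']

# mylist[k::4] takes items k, k+4, k+8, ... which matches the
# i % 4 == k test in the question's loop.
columns = {name: mylist[k::len(labels)] for k, name in enumerate(labels)}
print(columns)
```

Passing `columns` to pd.DataFrame should then give one DataFrame column per label, as in the question's original code.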
Alexandre B.