I have a dataset as shown below:

Cola                 colb
1.2/1.4/1.5/1.6      A
3.3/5.6              B

I want each value in Cola on its own row, with the matching colb value repeated:

Cola  colb
1.2   A
1.4   A
1.5   A
1.6   A
3.3   B
5.6   B

How do I do this in Python?

yatu
Rudra
  • What format do you have the initial dataset in? What have you tried so far? Post some code. – clubby789 Jan 15 '20 at 11:49
  • Are you using `pandas` and this data is in a DataFrame? Or is what you're presenting a native excel spreadsheet and you need to read from that? – Jon Clements Jan 15 '20 at 11:50
  • Please show what you've tried and ask a more specific question. As your question stands, it's really difficult to understand what part of this problem you are having trouble with, or any of the context of your problem. If it's basic Python knowledge you lack (opening a file, reading text, etc.) then I recommend searching some online Python documentation and tutorial resources. – lurker Jan 15 '20 at 11:51

1 Answer

Load the data into a pandas DataFrame, then use str.split and explode (DataFrame.explode requires pandas 0.25 or later):

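# Split each "/"-separated string into a list of strings
df['Cola'] = df['Cola'].str.split('/')
# Give every list element its own row; colb is repeated for each one
df.explode('Cola')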

  Cola colb
0  1.2    A
0  1.4    A
0  1.5    A
0  1.6    A
1  3.3    B
1  5.6    B
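
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is built by hand from the sample in the question; the float conversion and the index reset are optional extras that the question did not ask for, and the name `out` is only illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Cola": ["1.2/1.4/1.5/1.6", "3.3/5.6"],
    "colb": ["A", "B"],
})

out = (
    df.assign(Cola=df["Cola"].str.split("/"))  # strings -> lists, df itself is not modified
      .explode("Cola")                         # one row per list element
      .astype({"Cola": float})                 # optional: the split pieces are still strings
      .reset_index(drop=True)                  # optional: explode repeats the original index
)

print(out)
```

Using `assign` instead of overwriting `df['Cola']` keeps the original frame intact, which makes it easier to compare before and after.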
yatu
  • 1
    Almost beat you to it! Happy to have thought the same way as you! – Celius Stingher Jan 15 '20 at 11:54
  • 1
    Yeah... was thinking along the lines of `df.assign(version=df.Cola.str.split('/')).explode('version')` that way you can also avoid overwriting the original DF frame and it's something to check against... but it doesn't look like that's required here... – Jon Clements Jan 15 '20 at 12:03
  • Thank you for this. However, I get blank values in Cola after using split. Is there any particular reason for that? – Rudra Jan 15 '20 at 12:32
  • Please share a sample of the actual dataframe @rudra – yatu Jan 15 '20 at 12:58
  • Actual DF: small_area Median_BER 217099001/217112002/217112003/217112005/217112 212.9 047041005/047041001/047041004/047041002/047041. 271.3 157041002/157041004/157041003/157041001/157129. 222 – Rudra Jan 15 '20 at 13:27
  • After the split I see: small_area Median_BER NaN 212.9 NaN 271.3 NaN 222.5 – Rudra Jan 15 '20 at 13:27
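
The thread does not establish why the split produced NaN here. One possible cause, offered only as a guess: `.str.split` returns NaN for any element that is not a string, so running the split a second time on a column that already holds lists turns every value into NaN. A minimal sketch of that behaviour, using two shortened rows taken from the comment above (the names `first`, `second` and `fixed` are illustrative):

```python
import pandas as pd

# Two shortened rows based on the sample posted in the comments
df = pd.DataFrame({
    "small_area": ["217099001/217112002", "047041005/047041001"],
    "Median_BER": [212.9, 271.3],
})

first = df["small_area"].str.split("/")   # each element is now a list of strings
second = first.str.split("/")             # elements are lists, not strings, so each becomes NaN

print(second.isna().all())

# Splitting exactly once from the original strings and then exploding works as expected
fixed = df.assign(small_area=first).explode("small_area")
print(fixed)
```

If this is what happened, reloading the data and running the split only once should be enough. If the column was read in as something other than strings, checking `df['small_area'].dtype` first may help narrow it down.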