I have a pandas DataFrame and a list of lists, where each sublist has three items: [name, seq, qual]. I want to check whether the first item of each sublist matches a value in the ReadName column of the df and, if it does, add item[1] and item[2] to the matching row in new columns.
To set it up:
import pandas as pd
reads = [['read1', 'ACTG', 'FFFF'], ['read2', 'TTTT', 'FF:F'], ['read3', 'ATGC', 'F:FF']]
df = pd.DataFrame(reads, columns=['ReadName', 'Sequence1', 'Qual1'])
reads2 = [['read3', 'CGCG', 'F::F'], ['read1', 'TGTG', 'F:FF'], ['read2', 'AAAA', 'FFFF']]
What I've tried:
for item in reads2:
    if item[0] in df['ReadName']:
        df['Sequence2'] = item[1]
        df['Qual2'] = item[2]
but the resultant df looks like:
  ReadName Sequence1 Qual1 Sequence2 Qual2
0    read1      ACTG  FFFF      CGCG  F::F
1    read2      TTTT  FF:F      CGCG  F::F
2    read3      ATGC  F:FF      CGCG  F::F
So the values from a single sublist (here the first one, ['read3', 'CGCG', 'F::F']) end up in every row of the df, instead of each row getting the values that belong to its own name. I would expect it to look like:
  ReadName Sequence1 Qual1 Sequence2 Qual2
0    read1      ACTG  FFFF      TGTG  F:FF
1    read2      TTTT  FF:F      AAAA  FFFF
2    read3      ATGC  F:FF      CGCG  F::F
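
For reference, one possible way to get that expected frame is to turn reads2 into its own DataFrame and merge it on ReadName. This is only a minimal sketch, assuming every name appears at most once in each list; the names df2 and merged are made up for illustration.

```python
import pandas as pd

reads = [['read1', 'ACTG', 'FFFF'], ['read2', 'TTTT', 'FF:F'], ['read3', 'ATGC', 'F:FF']]
df = pd.DataFrame(reads, columns=['ReadName', 'Sequence1', 'Qual1'])
reads2 = [['read3', 'CGCG', 'F::F'], ['read1', 'TGTG', 'F:FF'], ['read2', 'AAAA', 'FFFF']]

# Hypothetical helper frame: same name column, new column names for seq and qual.
df2 = pd.DataFrame(reads2, columns=['ReadName', 'Sequence2', 'Qual2'])

# A left merge keeps every row of df in its original order and pulls in the
# Sequence2/Qual2 values from the row of df2 with the same ReadName.
# Names missing from reads2 would get NaN in the new columns.
merged = df.merge(df2, on='ReadName', how='left')
print(merged)
```

Two things seem relevant to the loop attempt: `in` applied to a Series tests the index labels rather than the values, and assigning a single string to `df['Sequence2']` sets that string on every row, so a row-by-row match is not what the loop expresses.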