
I have a pandas dataframe that looks something like this:

            0                   1                   2              3  \
0  UICEX_0001   path/to/bam_T.bam   path/to/bam_N.bam     chr1:10000   
1  UICEX_0002  path/to/bam_T2.bam  path/to/bam_N2.bam  chr54:4958392   

              4  
0  chr4:4958392  
1           NaN 

I am trying to loop through each row and print text to be used as input for another program. I need to print the first three columns (with some other text), then go through the rest of the columns and print something different depending on whether each value is NaN.

This works mostly:

Current code

def CreateIGVBatchScript(x):
    for row in x.iterrows():
        print("\nnew")
        sample = x[0]
        bamt = x[1]
        bamn = x[2]
        print("\nload", bamt.to_string(index=False), "\nload", bamn.to_string(index=False))
        for col in range(3, len(x.columns)):
            position = x[col]
            if position.isnull().values.any():
                print("\n")
            else:
                position = position.to_string(index=False)
                print("\ngoto ", position, "\ncollapse\nsnapshot ", sample.to_string(index=False), "_", position,".png\n")

CreateIGVBatchScript(data)

but the output looks like this:

Actual Output

new
load path/to/bam_T.bam
path/to/bam_T2.bam 
load path/to/bam_N.bam
path/to/bam_N2.bam

goto  chr1:10000
chr54:4958392 
collapse
snapshot  UICEX_0001 **<-- ISSUE: it's printing both rows at the same time**
UICEX_0002 _ chr1:10000
chr54:4958392 .png

new

load path/to/bam_T.bam
path/to/bam_T2.bam 
load path/to/bam_N.bam
path/to/bam_N2.bam

goto  chr1:10000
chr54:4958392 
collapse
snapshot  UICEX_0001   **<-- ISSUE: it's printing both rows at the same time**
UICEX_0002 _ chr1:10000
chr54:4958392 .png

The first part seems fine, but when I start to iterate over the columns, all rows are printed. I can't seem to figure out how to fix this. Here's what I want one of those parts to look like:

Partial Wanted Output

goto chr1:10000
collapse
snapshot UICEX_0001_chr1:10000.png
goto chr54:4958392
collapse
snapshot UICEX_0001_chr54:495832.png

Extra information

Incidentally, I'm actually trying to adapt this from an R script in order to better learn Python. Here's the R code, in case that helps:

CreateIGVBatchScript <- function(x){
     for(i in 1:nrow(x)){
          cat("\nnew")
          sample = as.character(x[i, 1])
          bamt = as.character(x[i, 2])
          bamn = as.character(x[i, 3])
          cat("\nload",bamt,"\nload",bamn)
          for(j in 4:ncol(x)){
               if(x[i, j] == "" | is.na(x[i, j])){ cat("\n") }
               else{
                    cat("\ngoto ", as.character(x[i, j]),"\ncollapse\nsnapshot ", sample, "_", x[i,j],".png\n", sep = "")
               }
          }
     }
     cat("\nexit")
}
CreateIGVBatchScript(data)
Gaius Augustus

1 Answer


I've come up with the answer. There are a few problems here:

  1. I was using iterrows() incorrectly.

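iterrows() yields an (index, row) pair for each row, where row is a Series holding that single row's values. Indexing into that Series gives one value, whereas indexing into the DataFrame itself (as my original code did with x[0]) gives a whole column.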

for index, row in x.iterrows():
    sample = row[0]

stores the value from column 0 of the current row in sample.
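To make the difference concrete, here is a minimal sketch of what iterrows() hands back. The frame `df` is a made-up two-row stand-in for the data in the question, not the real input.

```python
import pandas as pd

# Hypothetical stand-in for the question's data (integer column labels 0..2).
df = pd.DataFrame([
    ["UICEX_0001", "path/to/bam_T.bam", "path/to/bam_N.bam"],
    ["UICEX_0002", "path/to/bam_T2.bam", "path/to/bam_N2.bam"],
])

# Each item produced by iterrows() is an (index, row) tuple; row is a Series.
first = next(df.iterrows())
index, row = first

samples_from_frame = df[0]  # whole column: one value per row of the frame
sample_from_row = row[0]    # single value taken from the current row

print(type(first).__name__, type(row).__name__)
print(len(samples_from_frame), sample_from_row)
```

Writing `for row in df.iterrows()` without unpacking binds the whole tuple to `row`, which is probably why indexing the frame seemed like the only way to reach the values.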

  2. Iterate over the columns

Within each row, a simple for loop over the column positions works, much as in my original code, as long as it indexes the row rather than the whole DataFrame.

for col in range(3, len(x.columns)):
    position = row[col]

stores the value from that column of the current row in position.
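As an illustration of checking the trailing columns one cell at a time, here is a small sketch. The helper name `positions_for_row` and the sample frame are assumptions made for this example. It uses `pd.isna` to skip missing cells.

```python
import numpy as np
import pandas as pd


def positions_for_row(row, first_col=3):
    """Return the non-missing values of `row` from position `first_col` onward."""
    return [value for value in row.iloc[first_col:] if not pd.isna(value)]


# Hypothetical stand-in for the question's data.
df = pd.DataFrame([
    ["UICEX_0001", "path/to/bam_T.bam", "path/to/bam_N.bam", "chr1:10000", "chr4:4958392"],
    ["UICEX_0002", "path/to/bam_T2.bam", "path/to/bam_N2.bam", "chr54:4958392", np.nan],
])

for index, row in df.iterrows():
    print(index, positions_for_row(row))
```

Slicing with `row.iloc[3:]` is positional, so it would also work if the columns had names instead of integer labels.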

The final Python code is:

def CreateIGVBatchScript(x):
    # Replace NaN with a sentinel value so missing positions can be detected below.
    x = x.fillna(value=999)
    # iterrows() yields (index, row); row is a Series holding a single row.
    for index, row in x.iterrows():
        print("\nnew", sep="")
        sample = row[0]
        bamt = row[1]
        bamn = row[2]
        print("\nload ", bamt, "\nload ", bamn, sep="")
        # Columns 3 onward hold the positions to visit.
        for col in range(3, len(x.columns)):
            position = row[col]
            if position == 999:
                print("\n")
            else:
                print("\ngoto ", position, "\ncollapse\nsnapshot ", sample, "_", position, ".png\n", sep="")

CreateIGVBatchScript(data)
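The 999 sentinel works, but it could collide with real data. One possible variant tests each cell with `pd.isna` instead. The sketch below is an assumption-laden rewrite rather than the code I actually ran: `build_igv_batch` and the sample frame are hypothetical names. It returns the text instead of printing it, skips missing or empty cells rather than emitting blank lines, and appends the `exit` line that the R version writes.

```python
import numpy as np
import pandas as pd


def build_igv_batch(frame):
    """Build IGV batch text: one 'new' block per row, followed by 'exit'."""
    lines = []
    for _, row in frame.iterrows():
        sample, bam_t, bam_n = row.iloc[0], row.iloc[1], row.iloc[2]
        lines += ["new", f"load {bam_t}", f"load {bam_n}"]
        for position in row.iloc[3:]:
            # Skip missing or empty cells instead of relying on a sentinel value.
            if pd.isna(position) or position == "":
                continue
            lines += [f"goto {position}", "collapse",
                      f"snapshot {sample}_{position}.png"]
    lines.append("exit")
    return "\n".join(lines)


# Hypothetical stand-in for the question's data.
data = pd.DataFrame([
    ["UICEX_0001", "path/to/bam_T.bam", "path/to/bam_N.bam", "chr1:10000", "chr4:4958392"],
    ["UICEX_0002", "path/to/bam_T2.bam", "path/to/bam_N2.bam", "chr54:4958392", np.nan],
])

print(build_igv_batch(data))
```

Returning a string makes it easy to write the result to a file for the other program, or to check it in a test.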

