I am getting the error "Error: line contains NULL byte" when trying to read a CSV column and convert it into a list in Python.

I have followed the question "list contains NULL byte, CSV DictReader", but its answer doesn't seem to work with the code I have written.

import pandas as pd
import csv
from collections import defaultdict

columns = defaultdict(list) 

with open('file.csv') as f:
    reader = csv.DictReader(f) 
    for row in reader: 
        for (k,v) in row.items(): 
            columns[k].append(v) 
                                 

# print() returns None, so keep the list first and print it separately
keywords = columns['Keyword']
print(keywords)

Any help would be appreciated! Thanks.

JackNesbitt

1 Answer

As the error says, the problem is a NULL byte ('\0') somewhere in the file, probably in an otherwise empty row or cell. The answer you linked works around it by replacing each '\0' with an empty string ''.
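That replacement can be sketched as follows. This is a minimal sketch, not the linked answer's exact code: `read_columns` is a hypothetical helper name, and it assumes the file is otherwise valid text in your default encoding.

```python
import csv
from collections import defaultdict


def read_columns(path):
    """Collect each CSV column into a list, dropping NUL bytes from every line."""
    columns = defaultdict(list)
    with open(path, newline="") as f:
        # csv.DictReader accepts any iterable of strings, so a generator
        # can clean each line before the parser sees it.
        cleaned = (line.replace("\0", "") for line in f)
        for row in csv.DictReader(cleaned):
            for key, value in row.items():
                columns[key].append(value)
    return columns
```

With that in place, `read_columns("file.csv")["Keyword"]` would give the "Keyword" column as a list.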

Does 'file.csv' contain anything else that could be causing this error? Can you check its contents first? I suspect you will need to replace some other problematic string with '' as well.

An easy way to check is to print all the unique characters in the file and look for problematic sequences such as '\0', '\0\0', or '\0,\0'.
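One way to do that check is to open the file in binary mode, so nothing is decoded or rejected before you can look at it. This is a rough sketch; `inspect_bytes` is a hypothetical helper name, and it reports raw byte values rather than decoded characters.

```python
def inspect_bytes(path):
    """Summarise the raw bytes of a file without decoding it."""
    with open(path, "rb") as f:
        data = f.read()
    return {
        "head": data[:4],                # first bytes, where a byte-order mark would sit
        "nul_count": data.count(b"\0"),  # how many NUL bytes the file contains
        "unique": sorted(set(data)),     # every distinct byte value, as integers
    }
```

Calling `print(inspect_bytes("file.csv"))` shows whether 0 appears among the unique byte values. A head starting with b'\xff\xfe' or b'\xfe\xff' would point to a UTF-16 file rather than stray NUL bytes.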

If you only want to convert one column to a list, you can try this:

import pandas as pd

df = pd.read_csv("file.csv")
keywords = df["Keyword"].tolist()  # the "Keyword" column as a plain list
sam
  • Yeah, the CSV I am reading has multiple rows and columns. I just want the content of the "Keyword" column from my script, converted into a list. – JackNesbitt May 31 '20 at 10:43
  • If that is the case, can you try df = pd.read_csv("file.csv") and df["Keyword"].tolist()? – sam May 31 '20 at 10:48
  • Thanks so far, now getting a "UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte" error when reading the file. – JackNesbitt May 31 '20 at 16:25
  • It seems the file you have uses a different encoding. See https://stackoverflow.com/questions/33819557/unicodedecodeerror-utf-8-codec-while-reading-a-csv-file/33819765 – sam May 31 '20 at 16:51
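
A byte 0xff at position 0 is consistent with a UTF-16 byte-order mark, and a UTF-16 file read as single-byte text would also explain the NULL bytes in the original error. That is a guess, since the file itself is not shown. A minimal sketch that picks the encoding from the byte-order mark before reading, where `guess_encoding` and `read_column` are hypothetical helper names:

```python
import codecs
import csv


def guess_encoding(path, default="utf-8"):
    """Guess a text encoding from the file's byte-order mark, if it has one."""
    with open(path, "rb") as f:
        head = f.read(4)
    # Check UTF-32 first: its little-endian BOM begins with the UTF-16 one.
    if head.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    return default


def read_column(path, name):
    """Return one CSV column as a list, decoding with the guessed encoding."""
    with open(path, encoding=guess_encoding(path), newline="") as f:
        return [row[name] for row in csv.DictReader(f)]
```

`read_column("file.csv", "Keyword")` would then return the column as a list. pandas takes the same information through the `encoding` argument, for example `pd.read_csv("file.csv", encoding="utf-16")`. A file with no byte-order mark falls back to the default here, so this will not catch every encoding.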