I am trying to build a program that calculates the similarity between documents. First, I want to read the document contents from two different directories. The code below works, but it only reads documents from one directory, whereas I want to compare the documents in Folder 1 with the documents in Folder 2.

from os import listdir
from os.path import isfile, join

BASE_INPUT_DIR = "./folder1/"

def returnListOfFilePaths(folderPath):
    fileInfo = []
    listOfFileNames = [fileName for fileName in listdir(folderPath) if isfile(join(folderPath, fileName))]
    listOfFilePaths = [join(folderPath, fileName) for fileName in listdir(folderPath) if isfile(join(folderPath, fileName))]
    fileInfo.append(listOfFileNames)
    fileInfo.append(listOfFilePaths)
    return fileInfo

fileNames, filePaths = returnListOfFilePaths(BASE_INPUT_DIR)
print(fileNames, "\n", filePaths)

def create_docContentDict(filePaths):
    rawContentDict = {}
    for filePath in filePaths:
        with open(filePath, "r") as ifile:
            fileContent = ifile.read()
        rawContentDict[filePath] = fileContent
    return rawContentDict
rawContentDict = create_docContentDict(filePaths)
print(rawContentDict)

I made some changes, but I am still getting documents from only one directory. Here are my changes:

BASE_INPUT_DIR = "./folder1/"
BASE_INPUT_DIR2 = "./folder2/"

def returnListOfFilePaths(folderPath,folderPath2):
    fileInfo = []
    fileInfo2 = []
    
    listOfFileNames = [fileName for fileName in listdir(folderPath) if isfile(join(folderPath, fileName))]
    listOfFilePaths = [join(folderPath, fileName) for fileName in listdir(folderPath) if isfile(join(folderPath, fileName))]
    
    listOfFileNames2 = [fileName for fileName in listdir(folderPath2) if isfile(join(folderPath2, fileName))]
    listOfFilePaths2 = [join(folderPath2, fileName) for fileName in listdir(folderPath2) if isfile(join(folderPath2, fileName))]
    fileInfo.append(listOfFileNames)
    fileInfo.append(listOfFilePaths)
    fileInfo2.append(listOfFileNames2)
    fileInfo2.append(listOfFilePaths2)
    return fileInfo
    return fileinfo2


fileNames, filePaths = returnListOfFilePaths(BASE_INPUT_DIR,BASE_INPUT_DIR2  )
print(fileNames, "\n", filePaths)

def create_docContentDict(filePaths):
    rawContentDict = {}
    for filePath in filePaths:
        with open(filePath, "r") as ifile:
            fileContent = ifile.read()
        rawContentDict[filePath] = fileContent
    return rawContentDict
rawContentDict = create_docContentDict(filePaths)
print(rawContentDict)

Thanks

  • Welcome to SO! Do you mean "directory" rather than "direction"? If you have a function that works on one directory and gives you the result you want, why not call it again with a different parameter for the second directory? Adding another parameter any time you want a different result sort of defeats the purpose of functions. – ggorlen Oct 01 '20 at 22:10
  • Hello, thanks for the correction. Yes, I meant directory, not direction. It did work when I called the function again with a different argument. But I want to learn whether it is possible to do this with a function that takes two parameters, one for Folder1 and one for Folder2, so that the function gets the document contents from both directories. – Diyar Alzuhairi Oct 01 '20 at 23:55
  • I would recommend reading https://ericlippert.com/2014/03/05/how-to-debug-small-programs/. – AMC Oct 02 '20 at 00:24
  • @DiyarAlzuhairi OK, then take all the code in the function, duplicate it, add a 2 on the end of everything like you're doing, and return a tuple of two values instead of one: `return fileInfo, fileInfo2`, then unpack it in the caller into two variables. Note that the second `return` in your version never runs, because a function exits at the first `return` it reaches. I can assure you it's a totally pointless exercise though, one that offers no advantage and many disadvantages over writing code once and calling it as many times as you need it. Hopefully I've discouraged you from doing this by now... the original code is far superior. – ggorlen Oct 02 '20 at 00:57
  • Does this answer your question? [How do I return multiple values from a function?](https://stackoverflow.com/questions/354883/how-do-i-return-multiple-values-from-a-function) – ggorlen Oct 02 '20 at 01:19
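
The two suggestions from the comments can be sketched roughly as follows. This is only an illustration, not the asker's final code: the names `list_files`, `read_contents`, `list_files_in_both`, and `demo` are hypothetical, and temporary directories stand in for `./folder1` and `./folder2` so the snippet runs anywhere. The one-directory function is written once, and the two-parameter variant simply calls it twice and returns both results in a single tuple.

```python
import tempfile
from os import listdir
from os.path import isfile, join


def list_files(folder_path):
    """Return (file_names, file_paths) for regular files directly inside folder_path."""
    # sorted() makes the order predictable; listdir() returns names in arbitrary order.
    file_names = sorted(
        name for name in listdir(folder_path) if isfile(join(folder_path, name))
    )
    file_paths = [join(folder_path, name) for name in file_names]
    return file_names, file_paths


def read_contents(file_paths):
    """Map each path to the text of the file at that path."""
    contents = {}
    for path in file_paths:
        with open(path, "r") as handle:
            contents[path] = handle.read()
    return contents


def list_files_in_both(folder_a, folder_b):
    """Two-parameter variant: a single return statement yields a tuple of two results."""
    return list_files(folder_a), list_files(folder_b)


def demo():
    # Temporary directories are stand-ins for ./folder1 and ./folder2 (an assumption).
    with tempfile.TemporaryDirectory() as dir1, tempfile.TemporaryDirectory() as dir2:
        with open(join(dir1, "a.txt"), "w") as handle:
            handle.write("first document")
        with open(join(dir2, "b.txt"), "w") as handle:
            handle.write("second document")

        # Unpack the tuple of two (names, paths) pairs in the caller.
        (names1, paths1), (names2, paths2) = list_files_in_both(dir1, dir2)
        docs1 = read_contents(paths1)
        docs2 = read_contents(paths2)
        return names1, names2, list(docs1.values()), list(docs2.values())


if __name__ == "__main__":
    print(demo())
```

With real folders, the call would presumably look like `(names1, paths1), (names2, paths2) = list_files_in_both(BASE_INPUT_DIR, BASE_INPUT_DIR2)`, followed by one `read_contents` call per list of paths.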
