I'm trying to convert an excel file into a tab separated text file via Apache POI. The excel has some cells formatted with formulas and some empty cells.
Here's a sample of original excel file:
Here's an extract of the final output:
'US' 'USORACLEAP' SYSTEMREFERENCE SUPPLIERID SUPPLIERNAME CLASSIFICATION VENDOR_SITE_CODE SUPPLIERADDRESS1 SUPPLIERADDRESS2 STATE ZIPCODE COUNTRY SOURCE INVOICENUM INVOICEDATE PAYMENTDATE LINE_DESC GL_COMPANY GL_CODE GL_DESCR COSTCENTER CC_DESCR CURRENCY_CODE CHECK_NUMBER NUM_DOCS SPEND TERM PAYMENT_METHOD SYSTEM_APPROVED PO_DISTRIBUTION_ID WALKER_COST_CENTER RGL_LEDGER_ENTITY
US US Oracle AP RANDBETWEEN(3000,100000) "TEXT "&D2 VENDOR "TEXT "&D3 "TEXT "&D3 "TEXT "&D3 ONTARIO RIGHT(D2,5) US "TEXT "&D3 "TEXT "&D3 RANDBETWEEN(43831, 44150) RANDBETWEEN(44105,44135) "TEXT "&D3 RIGHT("000"&RANDBETWEEN(1,999),3) RANDBETWEEN(55000, 60000) "TEXT "&D3 "TEXT "&D3 "TEXT "&D3 USD RANDBETWEEN(2000000,2100000) RANDBETWEEN(1,4) RANDBETWEEN(1,100000)/100 IMMEDIATE Check "TEXT"&D2 X2
US US Oracle AP 31836 "TEXT "&D3 1099 "TEXT "&D4 "TEXT "&D4 "TEXT "&D4 NY RIGHT(D3,5) US "TEXT "&D4 "TEXT "&D4 RANDBETWEEN(43831,44150) RANDBETWEEN(44105,44135) "TEXT "&D4 RIGHT("000"&RANDBETWEEN(1,999),3) RANDBETWEEN(55000,60000) "TEXT "&D4 "TEXT "&D4 "TEXT "&D4 USD RANDBETWEEN(2000000,2100000) RANDBETWEEN(1,4) RANDBETWEEN(1,100000)/100 IMMEDIATE Check GSUEDCM03 AF2
US US Oracle AP 3504 "TEXT "&D4 VENDOR "TEXT "&D5 "TEXT "&D5 "TEXT "&D5 NY RIGHT(D4,5) US "TEXT "&D5 "TEXT "&D5 RANDBETWEEN(43831,44150) RANDBETWEEN(44105,44135) "TEXT "&D5 RIGHT("000"&RANDBETWEEN(1,999),3) RANDBETWEEN(55000,60000) "TEXT "&D5 "TEXT "&D5 "TEXT "&D5 USD RANDBETWEEN(2000000,2100000) RANDBETWEEN(1,4) RANDBETWEEN(1,100000)/100 IMMEDIATE ACH GSUEIT001 AF3
US US Oracle AP 3504 "TEXT "&D5 VENDOR "TEXT "&D6 "TEXT "&D6 "TEXT "&D6 NY RIGHT(D5,5) US "TEXT "&D6 "TEXT "&D6 RANDBETWEEN(43831,44150) RANDBETWEEN(44105,44135) "TEXT "&D6 RIGHT("000"&RANDBETWEEN(1,999),3) RANDBETWEEN(55000,60000) "TEXT "&D6 "TEXT "&D6 "TEXT "&D6 USD RANDBETWEEN(2000000,2100000) RANDBETWEEN(1,4) RANDBETWEEN(1,100000)/100 IMMEDIATE ACH GSUEIT001 AF4
US US Oracle AP 3504 "TEXT "&D6 VENDOR "TEXT "&D7 "TEXT "&D7 "TEXT "&D7 NY RIGHT(D6,5) US "TEXT "&D7 "TEXT "&D7 RANDBETWEEN(43831,44150) RANDBETWEEN(44105,44135) "TEXT "&D7 RIGHT("000"&RANDBETWEEN(1,999),3) RANDBETWEEN(55000,60000) "TEXT "&D7 "TEXT "&D7 "TEXT "&D7 USD RANDBETWEEN(2000000,2100000) RANDBETWEEN(1,4) RANDBETWEEN(1,100000)/100 IMMEDIATE ACH GSUEIT001 AF5
As you can see, the 1st row represents column headers. Some of the cells (D1
) have been converted to the actual formula. The 3rd column doesn't have any values so the whole content shifted towards left in the text file.
Here's the code:
private void convertXlsToText(InputStream inputStream, String delimiter, File targetFile) throws IOException {
StringBuilder sb = new StringBuilder();
setMinInflateRatio(0);
try (Workbook wb = create(inputStream)) {
Sheet firstSheet = wb.getSheetAt(0);
for (Row nextRow : firstSheet) {
Iterator<Cell> cellIterator = nextRow.cellIterator();
while (cellIterator.hasNext()) {
Cell cell = cellIterator.next();
switch (cell.getCellType()) {
case STRING:
sb.append(cell.getStringCellValue()).append(delimiter);
break;
case BOOLEAN:
sb.append(cell.getBooleanCellValue()).append(delimiter);
break;
case NUMERIC:
sb.append(cell.getNumericCellValue()).append(delimiter);
break;
case FORMULA:
sb.append(cell.getCellFormula()).append(delimiter);
break;
default:
sb.append(EMPTY).append(delimiter);
}
}
sb.append(DEFAULT_LINE_END);
}
}
dumpStringBuilderToFile(sb, targetFile);
}
Can someone please point out what changes should i be making in my code to fix the alignments and the formula issue?
PS: I'm using TAB (\t)
as my delimiter.
UPDATE: Here's the updated code after suggestions.
private void convertXlsToText(InputStream inputStream, String delimiter, File targetFile) throws IOException {
StringBuilder sb = new StringBuilder();
setMinInflateRatio(0);
try (Workbook wb = create(inputStream)) {
Sheet firstSheet = wb.getSheetAt(0);
FormulaEvaluator evaluator = wb.getCreationHelper().createFormulaEvaluator();
DataFormatter formatter = new DataFormatter();
for (Row nextRow : firstSheet) {
Iterator<Cell> cellIterator = nextRow.cellIterator();
while (cellIterator.hasNext()) {
Cell cell = cellIterator.next();
if (cell != null) {
sb.append(format("%-20s", formatter.formatCellValue(cell, evaluator))).append(delimiter);
} else {
sb.append(format("%-20s", EMPTY)).append(delimiter);
}
}
sb.append(DEFAULT_LINE_END);
}
}
dumpStringBuilderToFile(sb, targetFile);
}