
I am working on a Single Page Application in AngularJS.

There is a text area where users can paste some rows from Excel. All rows will have the same number of columns. I need to process this pasted data and convert it into a JavaScript array. This is not a problem on its own, as I can split the string into rows by \r\n, \n, or \r and split each row into fields by \t.
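For reference, the straightforward split described above might look like the following sketch. The function name splitPastedRows is made up for illustration, and the code assumes that no cell contains a tab or a line break.

```javascript
// Naive split: assumes no tab or line break occurs inside a cell.
// splitPastedRows is a hypothetical name used only for this illustration.
function splitPastedRows(text) {
    return text
        .split(/\r\n|\n|\r/) // row boundaries: CRLF first, then bare LF or CR
        .filter(function (line) { return line !== ""; }) // drop empty lines, e.g. after a trailing newline
        .map(function (line) { return line.split("\t"); }); // fields are tab-separated
}

console.log(splitPastedRows("a\tb\r\nc\td\n"));
```

As soon as a quoted cell contains a line break, this approach cuts the cell in two and produces extra, shorter rows.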

The problem arises if a new line character exists in one of the columns. But in that case, the column is enclosed by quotation marks (").

I understand that a regex can be helpful here. However, I am struggling to construct a regex that ignores the newline characters within quotation marks.

//PastedItemData will have the following data
1PS133-0FGD61\t"Text with\r\nmultiple\r\nlines"\t1\t5932.2\r\n2PS133-0FGD61\t"Simple text with no new lines"\t2\t1234.5

var PastedItemDataArray = PastedItemData.split("\r\n"); ///use regex here?

I am sure someone else has faced a similar issue with ignoring newlines inside data. Perhaps there is a different way than using a regex. Does anyone have any ideas?

Thanks.

Vikram

1 Answer


I could not do it with a regular expression, but this code works.

var PastedItemData = '1PS133-0FGD61\t"Text with\r\nmultiple\r\nlines"\t1\t5932.2\r\n2PS133-0FGD61\t"Simple text with no new lines"\t2\t1234.5';
var i, c, ignore, cr;
var rows, row, str;
ignore = false; // true while inside a quoted column
cr = false;     // true when the previous character was "\r"
str = "";
rows = [];
row = [];
for (i = 0; i < PastedItemData.length; i++) {
    c = PastedItemData[i];
    if (ignore === true && c !== '"') {
        str += c;
    } else {
        switch (c) {
            case '"':
                ignore = !ignore;
                str += c;
                cr = false;
                break;
            case "\r":
                cr = true;
                break;
            case "\n":
                if (cr === true) {
                    row.push(str);
                    rows.push(row);
                    row = [];
                    str = "";
                    c = "";
                    cr = false;
                }
                break;
            case "\t":
                row.push(str);
                str = "";
                cr = false;
                break;
            default:
                str += c;
                cr = false;
                break;
        }
    }
}
row.push(str);
rows.push(row);
console.log(rows);
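A possible variation on the loop above, offered as a sketch rather than a definitive parser: it also treats a bare \n or \r as a row terminator, strips the surrounding quotation marks, and turns a doubled quote ("") inside a quoted column into a single quote. The function name parsePastedTable is made up for illustration, and the doubled-quote convention is an assumption about how the pasted data escapes quotes.

```javascript
// Character-by-character parser for tab-separated text with quoted columns.
// parsePastedTable is a hypothetical name; "" as an escaped quote is an assumption.
function parsePastedTable(text) {
    var rows = [];
    var row = [];
    var field = "";
    var inQuotes = false;
    for (var i = 0; i < text.length; i++) {
        var c = text.charAt(i);
        if (inQuotes) {
            if (c !== '"') {
                field += c; // tabs and line breaks are kept verbatim inside quotes
            } else if (text.charAt(i + 1) === '"') {
                field += '"'; // doubled quote stands for one literal quote
                i++;
            } else {
                inQuotes = false; // closing quote
            }
        } else if (c === '"' && field === "") {
            inQuotes = true; // opening quote at the start of a column
        } else if (c === "\t") {
            row.push(field);
            field = "";
        } else if (c === "\r" || c === "\n") {
            if (c === "\r" && text.charAt(i + 1) === "\n") {
                i++; // consume CRLF as a single row terminator
            }
            row.push(field);
            rows.push(row);
            row = [];
            field = "";
        } else {
            field += c;
        }
    }
    // Flush the last row when the input does not end with a line break.
    if (field !== "" || row.length > 0) {
        row.push(field);
        rows.push(row);
    }
    return rows;
}

console.log(parsePastedTable('a\t"two\r\nlines"\t1\r\nb\tplain\t2'));
```

Because the quote state is tracked per character, any number of line breaks inside one quoted column stays within that column.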
user2182349
  • Yup. That works. However, it would fail if there was \r\n inside one of the columns. I am trying to find a way to ignore that particular \r\n or \r or \n. Say if 2nd column had "Text with \r\n multiple \r\n lines" – Vikram Aug 04 '16 at 00:07
  • Hi, it works, but only for a single occurrence of a newline character. I think it only ignores the last occurrence of a newline within quotes. – Vikram Aug 04 '16 at 07:14
  • Thanks for your help. If you do come across a way to extend it to ignore multiple occurrences of new line within quotation marks, let me know. It would be of great help. I am trying it out on following regexr link: (http://regexr.com/3dut6) – Vikram Aug 05 '16 at 07:22
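For the regex route raised in the question and comments, one possible sketch matches one column plus its trailing delimiter at a time instead of splitting on line breaks. The names parseWithRegex and unquote are made up for illustration, and the code assumes that a quote inside a quoted column is written as a doubled quote ("").

```javascript
// Remove surrounding quotes and collapse doubled quotes (assumed escape convention).
function unquote(field) {
    if (field.length >= 2 && field.charAt(0) === '"' && field.charAt(field.length - 1) === '"') {
        return field.slice(1, -1).replace(/""/g, '"');
    }
    return field;
}

// parseWithRegex is a hypothetical name used only for this illustration.
function parseWithRegex(text) {
    // Group 1: a quoted column (may span several lines) or an unquoted column.
    // Group 2: the delimiter that follows it (tab, line break, or end of input).
    var fieldPattern = /("(?:[^"]|"")*"|[^\t\r\n]*)(\t|\r\n|\n|\r|$)/g;
    var rows = [];
    var row = [];
    var match;
    // Drop one trailing line break so it does not yield an extra empty row.
    text = text.replace(/(?:\r\n|\n|\r)$/, "");
    while ((match = fieldPattern.exec(text)) !== null) {
        row.push(unquote(match[1]));
        if (match[2] !== "\t") {
            rows.push(row); // a line break or the end of input closes the row
            row = [];
        }
        if (match[2] === "") {
            break; // end of input reached; stop before a zero-length match repeats
        }
    }
    return rows;
}

console.log(parseWithRegex('a\t"two\r\nlines"\t1\r\nb\tplain\t2'));
```

The quoted alternative is tried first, so every line break between a pair of quotation marks is consumed as part of the column, however many there are.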