36

How can I remove empty lines in a string in C#?

I am generating some text files in C# (Windows Forms), and for some reason they contain some empty lines. How can I remove them after the string is generated (using StringBuilder and TextWriter)?

Example text file:

THIS IS A LINE



THIS IS ANOTHER LINE AFTER SOME EMPTY LINES!
Peter Mortensen
Saeid Yazdani
  • 1
    Is removing the lines after generation really what you want to do? I think you should look at why you are generating extra lines. If you use the WriteLine(...) methods they will write the new line for you. The Write(...) methods do not write a new line sequence. – Mesh Oct 04 '11 at 12:25
  • 1
    Well it is not my fault, I am extracting text from some text files and that is the problem! – Saeid Yazdani Oct 04 '11 at 12:32
  • http://stackoverflow.com/questions/4973524/how-to-remove-extra-returns-and-spaces-in-a-string-by-regex/4974031#4974031 – Allen Jan 29 '15 at 15:56

11 Answers

94

If you also want to remove lines that only contain whitespace, use

resultString = Regex.Replace(subjectString, @"^\s+$[\r\n]*", string.Empty, RegexOptions.Multiline);

^\s+$ will remove everything from the first blank line to the last (in a contiguous block of empty lines), including lines that only contain tabs or spaces.

[\r\n]* will then remove the last CRLF (or just LF which is important because the .NET regex engine matches the $ between a \r and a \n, funnily enough).
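A minimal sketch of the pattern above in use. The class name, method name, and sample input are made up for illustration, and the input assumes CRLF line endings. Note that with LF-only input a single, completely empty line has no whitespace character before its line break for `\s+` to consume, which is the case discussed in the comments.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BlankLineDemo
{
    // Removes each contiguous block of empty or whitespace-only lines.
    public static string RemoveBlankLines(string subjectString)
    {
        return Regex.Replace(
            subjectString,
            @"^\s+$[\r\n]*",          // the pattern from the answer above
            string.Empty,
            RegexOptions.Multiline);  // ^ and $ match at line boundaries
    }
}
```

Calling `BlankLineDemo.RemoveBlankLines("THIS IS A LINE\r\n\r\n \t\r\n\r\nTHIS IS ANOTHER LINE")` leaves the two text lines separated by a single CRLF.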

Community
Tim Pietzcker
  • 3
    This almost works, however I have one problem : The last line is empty and it isn't removed. I'm lousy at regex so I'm not sure why? – Robin Rye Jul 09 '12 at 08:48
  • 4
    @RobinRye: This is because it requires at least one whitespace character to match. If you change the `\s+` to `\s*`, then it should also remove the last line. – Tim Pietzcker Jul 09 '12 at 09:08
  • 5
    Thanks Tim, I thought so too after researching Regex a bit but it didn't help. Changed to \s* but the last line was still left in the result string. I used str.Trim() to get rid of it. – Robin Rye Jul 09 '12 at 10:18
  • 1
    This removes the last empty line too: Regex.Replace(subjectString, @"[\r\n]*^\s*$[\r\n]*", "", RegexOptions.Multiline); – Diana May 10 '16 at 04:57
  • 1
    @Diana: This might have a side effect. In some cases, too many newlines are removed with this method. – roland Nov 07 '16 at 16:40
  • @RobinRye Using str.Trim() will remove space and tab characters from the beginning of the first line of text. You may want to use str.TrimEnd() instead. If you also want to preserve spaces/tabs at the end of the last line of text, use str.TrimEnd('\r','\n'). – Collin K Feb 12 '18 at 18:12
  • @RicardoFontana: Can you elaborate how it‘s not working? This answer is rather specific to .NET regexes - how are you using that under Unix? – Tim Pietzcker Jun 25 '18 at 14:26
  • @TimPietzcker I wrote a test method: `[Theory] [InlineData("\nText sample")] // Unix line break [InlineData("\r\nText sample")] // Windows line break public void RemoveBlankLinesInLinuxAndWindows(string text) { resultString = Regex.Replace(text, @"^\s+$[\r\n]*", string.Empty, RegexOptions.Multiline); Assert.Equal("Text sample", resultString); }` – Ricardo Fontana Jun 25 '18 at 14:53
  • @TimPietzcker Changing the regex to oobe's `@"^\s*$\n|\r"` made it work. – Ricardo Fontana Jun 25 '18 at 14:55
16

Tim Pietzcker - it is not working for me. I had to change it a little bit, but thanks!

Ehhh, C# Regex... I had to change it again, but this is working well:

private string RemoveEmptyLines(string lines)
{
  return Regex.Replace(lines, @"^\s*$\n|\r", string.Empty, RegexOptions.Multiline).TrimEnd();
}

Example: http://regex101.com/r/vE5mP1/2

Peter Mortensen
oobe
10

You could try String.Replace("\n\n", "\n");

user807566
  • well thanks but this is not a general solution, would not include tabs, spaces and stuff like that – Saeid Yazdani Oct 04 '11 at 12:30
  • 19
    Your question didn't say anything about that. In fact you specifically said "empty lines." – user807566 Oct 04 '11 at 12:39
  • I tacked `Trim()` on as well. But still, it won't work in cases of `\n\n\n`. – HappyNomad Oct 06 '13 at 01:40
  • Well, that does not actually remove all empty lines. I faced a situation where I had a variable number of line breaks in a row. In that case we need to iterate through the text several times. – Arsinclair Feb 21 '17 at 06:02
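The repeated passes mentioned in these comments could be sketched as below. This is only an illustration: the class and method names are hypothetical, the input is assumed to be non-null and to use LF-only line endings, and lines containing spaces or tabs are not treated as empty.

```csharp
using System;

public static class NewlineCollapser
{
    public static string CollapseBlankLines(string text)
    {
        // A single Replace pass shortens a run of line breaks but can
        // leave pairs behind, so repeat until no "\n\n" remains.
        while (text.Contains("\n\n"))
        {
            text = text.Replace("\n\n", "\n");
        }

        return text;
    }
}
```

For example, `NewlineCollapser.CollapseBlankLines("a\n\n\n\nb")` needs two passes to reach `"a\nb"`.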
4

Try this

Regex.Replace(subjectString, @"^\r?\n?$", "", RegexOptions.Multiline);
Michał Powaga
Narendra Yadala
1
private string remove_space(string st)
{
    String final = "";

    char[] b = new char[] { '\r', '\n' };
    String[] lines = st.Split(b, StringSplitOptions.RemoveEmptyEntries);
    foreach (String s in lines)
    {
        if (!String.IsNullOrWhiteSpace(s))
        {
            final += s;
            final += Environment.NewLine;
        }
    }

    return final;
}
The_Black_Smurf
1
private static string RemoveEmptyLines(string text)
{
    var lines = text.Split(new[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries);

    var sb = new StringBuilder(text.Length);

    foreach (var line in lines)
    {
        sb.AppendLine(line);
    }

    return sb.ToString();
}
Evgeny Sobolev
  • AppendLine appends an empty line at the end of the returned string. – thomasgalliker May 25 '19 at 12:31
  • @thomasgalliker, That is the intention. split removes the newline from end of the line, thus you will need to add it back, otherwise all your lines are going to garble into one line! The only issue is `Environment.NewLine` is a string and cannot fit into char array – AaA May 26 '19 at 04:14
0

Based on Evgeny Sobolev's code, I wrote this extension method, which also trims the last (superfluous) line break using TrimEnd(TrimNewLineChars):

public static class StringExtensions
{
    private static readonly char[] TrimNewLineChars = Environment.NewLine.ToCharArray();

    public static string RemoveEmptyLines(this string str)
    {
        if (str == null)
        {
            return null;
        }

        var lines = str.Split(TrimNewLineChars, StringSplitOptions.RemoveEmptyEntries);

        var stringBuilder = new StringBuilder(str.Length);

        foreach (var line in lines)
        {
            stringBuilder.AppendLine(line);
        }

        return stringBuilder.ToString().TrimEnd(TrimNewLineChars);
    }
}
Peter Mortensen
thomasgalliker
  • 1
    Your extension only works if the string in question originated on the same system. If it is transferred between systems, such as from Linux or the web to Windows, it won't work at all. Consider changing TrimNewLineChars to an explicit array. – AaA May 26 '19 at 03:22
  • I don't know what you mean. Can you post a sample string where it won't work? I'll write a unit test with it. Thanks. – thomasgalliker May 27 '19 at 06:23
  • Try it on text files where the [end-of-line sequence](https://en.wikipedia.org/wiki/Newline#Issues_with_different_newline_formats) is CR + LF ([Windows](https://en.wikipedia.org/wiki/Microsoft_Windows)), LF ([Linux](https://en.wikipedia.org/wiki/Linux)), or CR (classic [Mac](https://en.wikipedia.org/wiki/Macintosh), before Mac OS X). [CR](https://en.wikipedia.org/wiki/Carriage_return) = ASCII 13. [LF](https://en.wikipedia.org/wiki/Newline#Representation) = ASCII 10. – Peter Mortensen May 17 '21 at 16:17
  • That is what AaA hinted at. `Environment.NewLine` only works if the file was created with the default line-end sequence of the current system. Most advanced text editors can handle/set/save in the formats (in [Visual Studio Code](https://en.wikipedia.org/wiki/Visual_Studio_Code) it is by the somewhat hidden feature that you can [***click*** on the displayed setting](https://stackoverflow.com/questions/48692741/how-can-i-make-all-line-endings-eols-in-all-files-in-visual-studio-code-unix/48694365#48694365) (e.g., *"LF"*) for a given file in the lower right and ***change*** it right there). – Peter Mortensen May 17 '21 at 18:17
  • Please read the question carefully before voting everyone down. – thomasgalliker May 18 '21 at 17:51
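The line-ending concern raised in these comments could be addressed roughly as follows: split on CR and LF explicitly instead of relying on `Environment.NewLine`. This is a sketch, not the answer author's code. The class and method names are hypothetical, and dropping whitespace-only lines as well is an assumption about what "empty" should mean.

```csharp
using System;
using System.Linq;

public static class LineEndingAgnosticExtensions
{
    // Explicit CR and LF, so input from any platform is split the same way.
    private static readonly char[] LineBreakChars = { '\r', '\n' };

    public static string RemoveEmptyLinesAnyEol(this string str)
    {
        if (str == null)
        {
            return null;
        }

        var lines = str
            .Split(LineBreakChars, StringSplitOptions.RemoveEmptyEntries)
            .Where(line => !string.IsNullOrWhiteSpace(line));

        // Join puts a separator only between lines, so no trailing
        // line break is produced and nothing needs trimming.
        return string.Join(Environment.NewLine, lines);
    }
}
```

The output always uses the current system's line ending, whatever mix of CR, LF, or CRLF the input contained.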
0

I tried the previous answers, but some of them with regex do not work right.

If you use a regex to find the empty lines, you can't use the same match for deleting, because that would also erase the line breaks of lines that are not empty.

You have to use regex groups for this replacement.

Some other answers here that do not use regex can have performance issues.

    private string remove_empty_lines(string text) {
        // Group 1 captures the line break that ends the preceding line; it is kept.
        // The rest matches one or more following lines that are empty or contain
        // only spaces/tabs. The atomic groups (?>...) stop "\r\n" from being
        // split into a separate "\r" and "\n".
        Regex rg_spaces = new Regex(@"((?>\r\n|\r|\n))(?:[^\S\r\n]*(?>\r\n|\r|\n))+");
        // Replace each match with its first group only, then trim both ends.
        return rg_spaces.Replace(text, "$1").Trim();
    }
Peter Mortensen
antoine
0

None of the methods mentioned here helped me all the way, but I found a workaround.

  1. Split text to lines - collection of strings (with or without empty strings, also Trim() each string).

  2. Join these lines back into a multiline string.

     public static IEnumerable<string> SplitToLines(this string inputText, bool removeEmptyLines = true)
     {
         if (inputText == null)
         {
             yield break;
         }
    
         using (StringReader reader = new StringReader(inputText))
         {
             string line;
             while ((line = reader.ReadLine()) != null)
             {
                 // Skip empty or whitespace-only lines when requested.
                 if (removeEmptyLines && string.IsNullOrWhiteSpace(line))
                     continue;

                 yield return line.Trim();
             }
         }
     }
    
     public static string ToMultilineText(this string text)
     {
         var lines = text.SplitToLines();
    
         return string.Join(Environment.NewLine, lines);
     }
    
Peter Mortensen
scarybook
-1

This pattern works perfectly to remove empty lines and lines with only spaces and/or tabs.

s = Regex.Replace(s, @"^\s*(\r\n|\Z)", "", RegexOptions.Multiline);
Ivan Ferrer Villa
-1

I found a simple answer to this problem:

YourradTextBox.Lines = YourradTextBox.Lines.Where(p => p.Length > 0).ToArray();

Adapted from Marco Minerva [MCPD] at Delete Lines from multiline textbox if it's contain certain string - C#

Peter Mortensen
Scooter