Does not work: (\d+\s*|(?<=\d+\s*)-\s*\d\s*)+

But this one works: ((?:\d+\s*|(?<=\d+\s*)-\s*\d\s*)+)

Failed tests (group 1 of the second regex versus group 1 of the first regex):

"12-34" gives "12-34" (correct) versus "4"  (incorrect)
"1-23"  gives "1-23"  (correct) versus "3"  (incorrect)
"12-3"  gives "12-3"  (correct) versus "-3" (incorrect)

"123" and "1234" work fine with both.

Don't test in a console! Use MSSQL and .NET 3.5:

C# DLL

using System.Data.SqlTypes; //SqlInt32, ...
using Microsoft.SqlServer.Server; //SqlFunction, ...
using System.Collections; //IEnumerable
using System.Collections.Generic; //List
using System.Text.RegularExpressions;
internal struct MatchResult {
    /// <summary>Which match or group this is</summary>
    public int ID;
    /// <summary>Where the match or group starts in the input string</summary>
    public int Pos;
    /// <summary>What string matched the pattern</summary>
    public string Match;
}
public class RE {
    [SqlFunction(DataAccess = DataAccessKind.None, IsDeterministic = true, IsPrecise = true, SystemDataAccess = SystemDataAccessKind.None, FillRowMethodName = "FBsRow")]
    public static IEnumerable FBs(string str, string pattern, SqlInt32 opt) {
        if (str == null || pattern == null || opt.IsNull) return null;
        // Groups[0] is the whole match; Groups[1] and up are the capturing groups.
        var gs = Regex.Match(str, pattern, (RegexOptions)opt.Value).Groups;
        int gid = 0;
        List<MatchResult> r = new List<MatchResult>(gs.Count);
        foreach (Group g in gs) r.Add(new MatchResult { ID = gid++, Pos = g.Index, Match = g.Value });
        return r;
    }

    // Called once per returned MatchResult to fill one output row.
    public static void FBsRow(object obj, ref SqlInt32 ID, ref SqlInt32 Pos, ref string FB) {
        MatchResult g = (MatchResult)obj;
        ID = g.ID; Pos = g.Pos; FB = g.Match;
    }
}

MSSQL

go
sp_configure 'clr enabled', 1
RECONFIGURE WITH OVERRIDE
go
IF OBJECT_ID(N'dbo.FBs') IS NOT NULL DROP FUNCTION dbo.FBs
go
if exists(select 1 from sys.assemblies as A where A.name='SQL_CLR') DROP ASSEMBLY SQL_CLR
go
CREATE ASSEMBLY SQL_CLR FROM 'C:\src\SQL_CLR.dll'
go
CREATE FUNCTION dbo.FBs(@str nvarchar(max),@pattern nvarchar(max),@opt int=1)
RETURNS TABLE (ID int,Pos int,FB nvarchar(max)) WITH EXECUTE AS CALLER
AS EXTERNAL NAME SQL_CLR.[RE].FBs
go
;with P(p) as (select * from (values ('(\d+\s*|(?<=\d+\s*)-\s*\d\s*)+'),('((?:\d+\s*|(?<=\d+\s*)-\s*\d\s*)+)')) P(t)),
T(t) as (select * from (values ('12-34'),('12-3'),('1-23'),('1234'),('123')) T(t))
select *,iif(t=FB,'PASS','FAIL') from P cross join T outer apply dbo.FBs(t,p,0) where ID=1
go

Tests in MSSQL

Dmitry Bychenko
SaSha
  • There's no *lie*, I'm far from accusing you, only *incomplete information*. Now that you've *provided the platform* (MSSQL and .NET 3.5) and *put up tests* for us to run, I vote to reopen your question. – Dmitry Bychenko Sep 11 '19 at 11:50
  • @Dmitry it is still a dupe, please consider reclosing with https://stackoverflow.com/questions/43461376/regex-repeating-capturing-group. I could have modified the close reason myself if you had left me a comment. There are a ton of questions regarding repeated capturing groups and their behavior, no need to multiply them. – Wiktor Stribiżew Sep 11 '19 at 12:28

1 Answer

Your question is why, when the regular expression is applied to 12-34, the value of the captured group is 4 with the first regex and 12-34 with the second.

Your regular expressions are structured as follows (the common part is \d+\s*|(?<=\d+\s*)-\s*\d\s*):

First:  (\d+\s*|(?<=\d+\s*)-\s*\d\s*)+
Second: ((?:\d+\s*|(?<=\d+\s*)-\s*\d\s*)+)

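The difference is that in the first one you are repeating a capturing group. Each iteration overwrites the group's value, so the captured value only contains the result of the last iteration: for 12-34 the iterations match 12, then -3, then 4, which leaves 4 in the group.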

In the second one you are capturing a repeated group. This is the correct one to use for the semantics you desire.
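The difference can be reproduced outside SQL Server with a few lines of C#. This is only a minimal sketch: the class name GroupDemo and the helper Group1 are hypothetical names chosen for illustration and are not part of the asker's SQL_CLR assembly. It reads Groups[1].Value, which corresponds to the ID=1 row returned by dbo.FBs.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class GroupDemo
{
    // First pattern: the capturing group itself is repeated.
    public const string Repeated = @"(\d+\s*|(?<=\d+\s*)-\s*\d\s*)+";

    // Second pattern: a non-capturing group is repeated inside one capturing group.
    public const string Wrapped = @"((?:\d+\s*|(?<=\d+\s*)-\s*\d\s*)+)";

    // Returns the value of capturing group 1 for the first match.
    public static string Group1(string input, string pattern)
    {
        return Regex.Match(input, pattern).Groups[1].Value;
    }

    // Prints group 1 for both patterns over the inputs from the question.
    public static void Demo()
    {
        foreach (var s in new[] { "12-34", "1-23", "12-3", "123" })
            Console.WriteLine("{0}: repeated='{1}' wrapped='{2}'",
                s, Group1(s, Repeated), Group1(s, Wrapped));
    }
}
```

Calling GroupDemo.Demo() should show the repeated form returning only the last iteration (4, 3, -3, 123), while the wrapped form returns the full input each time.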

See Repeating a Capturing Group vs. Capturing a Repeated Group for more about this.
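As a side note specific to .NET: the regex engine keeps every iteration of a repeated capturing group in Group.Captures, and Groups[0] always holds the whole match. The sketch below uses hypothetical names (CaptureDemo, Iterations, WholeMatch) to show both; it is an illustration under those assumptions, not a change to the CLR function in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class CaptureDemo
{
    // The first pattern from the question (a repeated capturing group).
    private const string Repeated = @"(\d+\s*|(?<=\d+\s*)-\s*\d\s*)+";

    // Returns every iteration that group 1 captured, in match order.
    public static List<string> Iterations(string input)
    {
        Group g = Regex.Match(input, Repeated).Groups[1];
        var parts = new List<string>();
        foreach (Capture c in g.Captures) parts.Add(c.Value);
        return parts;
    }

    // Returns the whole match (group 0), independent of how group 1 is nested.
    public static string WholeMatch(string input)
    {
        return Regex.Match(input, Repeated).Groups[0].Value;
    }

    // Prints the iterations and the whole match for one input.
    public static void Demo()
    {
        Console.WriteLine(string.Join("|", Iterations("12-34").ToArray()));
        Console.WriteLine(WholeMatch("12-34"));
    }
}
```

For 12-34 the iterations are 12, -3 and 4, and the whole match is 12-34. In terms of the question's test query, the whole match is the row with ID=0 rather than ID=1.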

Martin Smith