
C# string's Split method seems to copy the resulting substrings into an array of strings instead of just reading them in place. Is there a C# equivalent of C++17's string_view that avoids the copying?


For those not familiar with string_view, here is some background information.

From Microsoft's <string_view>:

The string_view family of template specializations provides an efficient way to pass a read-only, exception-safe, non-owning handle to the character data of any string-like objects with the first element of the sequence at position zero. (...)

From Microsoft's C++ Team Blog std::string_view: The Duct Tape of String Types:

string_view solves the “every platform and library has its own string type” problem for parameters. It can bind to any sequence of characters, so you can just write your function as accepting a string view:

void f(wstring_view); // string_view that uses wchar_t's

and call it without caring what stringlike type the calling code is using (and for (char*, length) argument pairs just add {} around them)
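As a rough illustration of the quote above, here is a minimal C++17 sketch. The names `count_vowels` and `demo_calls` are hypothetical and only serve the example; the point is that one `std::string_view` parameter accepts a `std::string`, a string literal, and a braced (char*, length) pair.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper (not from the quoted article): counts vowels in any
// character sequence without taking ownership of it or copying it.
std::size_t count_vowels(std::string_view text) {
    constexpr std::string_view vowels = "aeiouAEIOU";
    std::size_t count = 0;
    for (char c : text) {
        if (vowels.find(c) != std::string_view::npos) {
            ++count;
        }
    }
    return count;
}

// Hypothetical usage: the same function is called with three string-like sources.
void demo_calls() {
    const std::string owned = "hello";
    const char* buffer = "hello, world";
    assert(count_vowels(owned) == 2);            // std::string converts implicitly
    assert(count_vowels("world") == 1);          // string literal
    assert(count_vowels({buffer + 7, 5}) == 1);  // (char*, length) pair in braces
}
```

The braced call views only the five characters starting at offset 7, so no temporary string is created for the substring.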

From StackOverflow's What is string_view?

The purpose of any and all kinds of "string reference" and "array reference" proposals is to avoid copying data which is already owned somewhere else and of which only a non-mutating view is required. The string_view in question is one such proposal; there were earlier ones called string_ref and array_ref, too.

The idea is always to store a pair of pointer-to-first-element and size of some existing data array or string.

Such a view-handle class could be passed around cheaply by value and would offer cheap substringing operations (which can be implemented as simple pointer increments and size adjustments). (...)
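For readers who have not used it, the kind of split the question is after might look roughly like this in C++17. This is only a sketch: `split_views` and `demo_split` are hypothetical names, and the standard library does not ship such a function. Each returned element is a (pointer, size) view into the original buffer, so no characters are copied; the vector that holds the views is still allocated.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: splits `text` on `delimiter` and returns views into the
// original character data. Only the vector of views is allocated.
std::vector<std::string_view> split_views(std::string_view text, char delimiter) {
    std::vector<std::string_view> parts;
    std::size_t start = 0;
    while (true) {
        const std::size_t end = text.find(delimiter, start);
        if (end == std::string_view::npos) {
            parts.push_back(text.substr(start));  // last piece, may be empty
            break;
        }
        parts.push_back(text.substr(start, end - start));
        start = end + 1;  // skip past the delimiter
    }
    return parts;
}

// Hypothetical usage: the views alias `line`, so they are valid only while it lives.
void demo_split() {
    const std::string line = "alpha,beta,gamma";
    const std::vector<std::string_view> parts = split_views(line, ',');
    assert(parts.size() == 3);
    assert(parts[1] == "beta");
    // The second view points into `line` itself rather than into a copy.
    assert(parts[1].data() == line.data() + 6);
}
```

Because the views do not own their data, they must not outlive the string they refer to; that is the trade-off for skipping the copies.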

The following bit, again from std::string_view: The Duct Tape of String Types is unrelated to the question but should be interesting to C# developers:

Today, the most common “lowest common denominator” used to pass string data around is the null-terminated string (or as the standard calls it, the Null-Terminated Character Type Sequence). This has been with us since long before C++, and provides clean “flat C” interoperability. However, char* and its support library are associated with exploitable code, because length information is an in-band property of the data and susceptible to tampering. Moreover, the null used to delimit the length prohibits embedded nulls and causes one of the most common string operations, asking for the length, to be linear in the length of the string.

1 Answer


ReadOnlySpan<T> could work.

Have a look at All About Span: Exploring a New .NET Mainstay

A second variant of Span<T>, called System.ReadOnlySpan<T>, enables read-only access. This type is just like Span<T>, except its indexer takes advantage of a new C# 7.2 feature to return a “ref readonly T” instead of a “ref T,” enabling it to work with immutable data types like System.String. ReadOnlySpan<T> makes it very efficient to slice strings without allocating or copying, as shown here:

string str = "hello, world";
string worldString = str.Substring(startIndex: 7, length: 5); // Allocates
ReadOnlySpan<char> worldSpan =
 str.AsSpan().Slice(start: 7, length: 5); // No allocation
Assert.Equal('w', worldSpan[0]);
worldSpan[0] = 'a'; // Error CS0200: indexer cannot be assigned to

char[]

This is not directly what you're asking but you could organise your data as an array of chars.

tymtam
A couple of lines on what "string_view" does (for those who, like me, are not C++ experts) is what I'm looking for. Reading the docs, it does look similar to Spans in C#, but since I have not used it in C++ I have no good idea of how it is used in practice. Also consider whether the `ReadAllLines` sentence can be edited to make sense in the context of the question (my understanding is that C++ also does not provide "spans"/"views" over multiple strings, so I'm not really sure how an array of strings relates to the string_view type). – Alexei Levenkov Apr 01 '21 at 05:45