Before Span, we used String.Substring to get a part of a string, which allocates a new string and copies the characters to it.
Now we can just slice the Span using the range operator and pass the resulting Span to a method that accepts a ReadOnlySpan<char> parameter, such as Int32.Parse. See Basics for a partial list of Span-aware APIs.
int.Parse("12345".AsSpan()[1..3]) // results in 23

const string text = "abcdefghijklmnopq";
string ConcatSubstring() => text[10..] + "---" + text[..5];
string ConcatAsSpan() => string.Concat(text.AsSpan()[10..], "---", text.AsSpan()[..5]);

BenchmarkDotNet results - note the reduction in allocated memory:
| Method | Mean | Error | StdDev | Gen 0 | Allocated |
|---|---|---|---|---|---|
| Substring | 30.28 ns | 0.633 ns | 0.621 ns | 0.0306 | 128 B |
| AsSpan | 17.30 ns | 0.377 ns | 0.476 ns | 0.0134 | 56 B |
When creating a new string whose size is known in advance, we can use the String.Create method to avoid additional allocations. This method works by allocating a string and providing a writable Span within a delegate. It's safe (meaning the string can't be mutated after creation) since the Span can't escape from the delegate.
Important
Avoid capturing variables in the delegate as they incur an allocation - pass all data using the state parameter. Use the static keyword to enforce this.
static string Reverse(this string s) =>
string.Create(s.Length, s, static (span, str) =>
{
str.AsSpan().CopyTo(span);
span.Reverse();
});

Tip
The method below can also be written using RandomNumberGenerator.GetString(size, "abcdefghijklmnopqrstuvwxyz"), which has the added benefit of being cryptographically secure.
static string GetRandomString(int size, char min = 'a', char max = 'z') =>
string.Create(size, (min, max), static (span, state) =>
{
for (int i = 0; i < span.Length; i++)
{
span[i] = (char)Random.Shared.Next((int)state.min, (int)state.max + 1);
}
});

C# 13 introduces the allows ref struct generic anti-constraint, which allows us to pass a Span<T> to generic methods and types such as the String.Create method. Note that we need a helper ref struct, since ValueTuple is not a ref struct (and unfortunately there's no ref struct record yet).
public static string ReplaceNonAlphanumeric(this ReadOnlySpan<char> s, char replacement)
{
return string.Create(s.Length, new ReplaceNonAlphanumericData(s, replacement), static (span, state) =>
{
for (int i = 0; i < state.S.Length; i++)
{
var c = state.S[i];
span[i] = char.IsLetterOrDigit(c) ? c : state.Replacement;
}
});
}
private readonly ref struct ReplaceNonAlphanumericData(ReadOnlySpan<char> s, char replacement)
{
public readonly char Replacement = replacement;
public readonly ReadOnlySpan<char> S = s;
}

ref structs such as Span can't be used as generic type arguments in .NET 8 (since there was no way to prevent boxing), so if we want to base one string on another, we'll have to use unsafe code - a pointer - in order to pass it into the generic SpanAction<T, TArg> delegate.
Important
Be careful when using unsafe code - it could easily lead to memory corruption. Extensively cover it with tests.
#pragma warning disable CS8500
unsafe static string ReplaceNonAlphanumeric(this ReadOnlySpan<char> s, char replacement)
{
return string.Create(s.Length, (replacement, spanPtr: (IntPtr)(&s)), static (span, state) =>
{
var sourceSpan = *(ReadOnlySpan<char>*)state.spanPtr;
for (int i = 0; i < span.Length; i++)
{
var c = sourceSpan[i];
span[i] = char.IsLetterOrDigit(c) ? c : state.replacement;
}
});
}
#pragma warning restore CS8500

Using ReadOnlySpan<char> as the parameter rather than a string allows us a lot of flexibility - we can stack-allocate the string, and we can transform it without allocations, as we'll see in the next section.
We can use Span<char> to mutate strings while minimizing allocations. The MemoryExtensions class contains extension methods that are similar to System.String methods.
For example, we can use MemoryExtensions.Trim, and use the range operator to get a substring.
" abcde ".AsSpan().Trim()[..3] // results in "abc"

To use a method that transforms the string, such as ToLowerInvariant, we can use String.Create, chaining multiple mutations within the delegate.
static string SubstringToLower(this string s, int length) =>
string.Create(length, (s, length), static (span, state) => state.s.AsSpan()[..state.length].ToLowerInvariant(span));

We can also create an alternative design for the ReplaceNonAlphanumeric method that does not use unsafe code by adding a parameter for the destination span.
static void ReplaceNonAlphanumeric(this ReadOnlySpan<char> source, Span<char> destination, char replacement)
{
for (int i = 0; i < destination.Length; i++)
{
var c = source[i];
destination[i] = char.IsLetterOrDigit(c) ? c : replacement;
}
}

To use this method, we'll need to allocate the destination span. If our final destination is a String we can use String.Create, similarly to how we called ToLowerInvariant.
string.Create(length: 3, " a^b ", static (span, str) =>
str.AsSpan().Trim()[..span.Length].ReplaceNonAlphanumeric(span, '_')); // results in "a_b"

Interpolated string handlers are a C# feature that optimizes string creation by breaking an interpolated string into multiple "append" calls, rather than allocating a string using String.Format and potentially creating multiple interim strings (for example, when formatting a number). This allows us to "hack" the compiler into all sorts of interesting optimizations.
Handlers available in .NET:
- DefaultInterpolatedStringHandler is used in a String.Create overload, and emitted by the compiler for regular interpolated strings.
- AppendInterpolatedStringHandler is used in StringBuilder.Append. No longer is it needed to break an interpolated string into multiple Append calls for efficiency.
- AssertInterpolatedStringHandler and WriteIfInterpolatedStringHandler are used in Debug.Assert and Debug.WriteIf respectively and completely skip writing according to the condition argument.
- TryWriteInterpolatedStringHandler is used by MemoryExtensions.TryWrite and allows us to efficiently write strings into a Span<char>. There's a similar one in Utf8.TryWrite to write into a Span<byte>.
dotNext also includes:
- BufferWriterInterpolatedStringHandler that writes into an IBufferWriter<char>.
- PoolingInterpolatedStringHandler which allows using a Memory<T> pool rather than allocating new buffers.
Tip
We can reuse any of the above handlers to write custom ones.
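As a rough illustration of two of the built-in handlers listed above, the sketch below appends an interpolated string to a StringBuilder and writes one as UTF-8 into a stack-allocated buffer. The class and method names (HandlerDemo, FormatEndpoint, WriteUtf8Length) are made up for this example, and it assumes .NET 8 or later, where Utf8.TryWrite is available.

```csharp
using System;
using System.Text;
using System.Text.Unicode;

Console.WriteLine(HandlerDemo.FormatEndpoint("localhost", 8080)); // localhost:8080
Console.WriteLine(HandlerDemo.WriteUtf8Length(8080));              // 9

// Hypothetical helper class, used only for this illustration.
static class HandlerDemo
{
    // StringBuilder.Append has an overload that takes an interpolated string handler,
    // so each part is appended to the builder without building an interim string.
    public static string FormatEndpoint(string host, int port)
    {
        var builder = new StringBuilder();
        builder.Append($"{host}:{port}");
        return builder.ToString();
    }

    // Utf8.TryWrite formats the interpolated string as UTF-8 directly into the byte buffer.
    // Returns the number of bytes written, or -1 if the buffer was too small.
    public static int WriteUtf8Length(int port)
    {
        Span<byte> buffer = stackalloc byte[32];
        return Utf8.TryWrite(buffer, $"port={port}", out int bytesWritten) ? bytesWritten : -1;
    }
}
```

In both calls the compiler picks the handler overload automatically; the call sites look like ordinary interpolated strings.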
The following method checks if the host & port part of a Uri matches a list of patterns. Rather than using Uri.GetLeftPart which allocates a string, we'll use TryWrite. Because it uses an interpolated string handler, there are no additional allocations (for example, to convert the port number into a string). Regex also works directly with Span<char>.
[SkipLocalsInit]
bool MatchesUriPattern(IEnumerable<Regex> patterns, Uri uri)
{
var maxLength = uri.Scheme.Length + "://".Length + uri.Host.Length + ":".Length + 5;
using SpanOwner<char> hostAndPort = maxLength <= 256 ? new(stackalloc char[256], maxLength) : new(maxLength);
hostAndPort.Span.TryWrite($"{uri.Scheme}://{uri.Host}:{uri.Port}", out var written);
var hostAndPortSpan = hostAndPort.Span[..written];
foreach (var pattern in patterns)
{
if (pattern.IsMatch(hostAndPortSpan))
{
return true;
}
}
return false;
}

Interpolated string handlers have overloads for ReadOnlySpan<char>, which allows us to get substrings without allocating new strings.
Tip
Consider String.Create and String.Concat before using an interpolated string, as they would be more efficient.
string Format(string str, int num)
{
var index = str.IndexOf(':');
return $"{str.AsSpan()[..index]} {num} {str.AsSpan()[(index + 1)..]}";
}

StringBuilder is a reference type, so to avoid allocating new instances we often pool them or store them in reusable thread-static fields. As an alternative, we can use DefaultInterpolatedStringHandler as an append-only value type string builder.
Tip
StringBuilder works more like a linked list of arrays while DefaultInterpolatedStringHandler works more like a dynamic array. This means that StringBuilder might perform better when we can't approximate the final size of the string.
Tip
When possible, it's preferable to use stackalloc to provide the initial buffer as it performs better. If we don't provide an initial buffer, it uses a rented array with a size calculated from the parameters literalLength and formattedCount.
Important
The handler uses ArrayPool<char> internally when it grows out of the initial buffer, so we must call ToStringAndClear to return the rented array rather than ToString.
string BuildString(int count)
{
var initialBuffer = (stackalloc char[256]);
var builder = new DefaultInterpolatedStringHandler(0, 0, CultureInfo.InvariantCulture, initialBuffer);
builder.AppendLiteral("hello");
for (int i = 0; i < count; i++)
{
builder.AppendLiteral(", ");
builder.AppendFormatted(i);
}
return builder.ToStringAndClear();
}

- Text property - returns a ReadOnlySpan<char> of the current content, useful for inspecting the buffer without creating a string.
- Clear() method - resets the handler for reuse, avoiding re-allocation.
var buffer = (stackalloc char[256]);
var handler = new DefaultInterpolatedStringHandler(0, 0, CultureInfo.InvariantCulture, buffer);
handler.AppendLiteral("Hello");
Console.WriteLine(handler.Text.Length); // 5
handler.Clear();
handler.AppendLiteral("World");
var result = handler.ToStringAndClear(); // "World"

MemoryExtensions has overloads for many search methods that accept an IEqualityComparer<T>. This enables custom equality logic for span operations like IndexOf, Contains, Count, and more.
var comparer = StringComparer.OrdinalIgnoreCase;
ReadOnlySpan<string> names = ["Alice", "Bob", "Charlie"];
// Case-insensitive search
names.Contains("alice", comparer); // true
names.IndexOf("BOB", comparer); // 1
names.Count("charlie", comparer); // 1

This also works for StartsWith, EndsWith, SequenceEqual, and the *Any/*Except variants:
ReadOnlySpan<string> values = ["Hello", "World", "hello"];
values.ContainsAny(["HELLO", "WORLD"], StringComparer.OrdinalIgnoreCase); // true
values.IndexOfAnyExcept(["hello"], StringComparer.OrdinalIgnoreCase); // 1 ("World")

To get a hexadecimal string representing the hash of a string, we'll use the SHA256 algorithm (it could be replaced with others available in .NET).
Important
The methods below will produce different hashes.
We can use MemoryMarshal to reinterpret the string's characters as bytes without copying them.
However, depending on the input, this can double the time it takes to hash the string, since every UTF-16 char occupies two bytes.
static string GetSha256(this string s)
{
var hash = (stackalloc byte[32]);
SHA256.HashData(MemoryMarshal.AsBytes(s.AsSpan()), hash);
return Convert.ToHexString(hash);
}

UTF-8 encoding can produce a lower byte count for many strings, which would make the hash function faster. We'll use SpanOwner again to stackalloc or rent an array.
const int StackallocThreshold = 256;
[SkipLocalsInit]
static string GetSha256(this string s)
{
int inputByteCount = Encoding.UTF8.GetByteCount(s);
using SpanOwner<byte> encodedBytes = inputByteCount <= StackallocThreshold ? new(stackalloc byte[StackallocThreshold], inputByteCount) : new(inputByteCount);
int encodedByteCount = Encoding.UTF8.GetBytes(s, encodedBytes.Span);
var hash = (stackalloc byte[32]);
SHA256.HashData(encodedBytes.Span[..encodedByteCount], hash);
return Convert.ToHexString(hash);
}

The MemoryExtensions methods Split and SplitAny allow us to split strings with no allocations. Unlike String.Split, they write the results to a Span<Range>, which means we have to pre-allocate the ranges. If there are more matches than the ranges provide, the last range will contain the remainder of the string.
Tip
We're using collection expressions to create a ReadOnlySpan<string> of separators (of course, it can also be allocated statically once).
var stringToSplit = ";;11==22";
var spanToSplit = stringToSplit.AsSpan();
var ranges = (stackalloc Range[2]);
var count = spanToSplit.SplitAny(ranges, [ "==", ";;" ], StringSplitOptions.RemoveEmptyEntries);
Debug.Assert(count == 2);
var result = (int.Parse(spanToSplit[ranges[0]]), int.Parse(spanToSplit[ranges[1]])); // results in (11, 22)

Another option is to use SpanSplitEnumerator<T>-returning methods, which allow us to lazily iterate over the results without preallocating storage for ranges. While it does not feature StringSplitOptions, it works for any span type, not just char.
var stringToSplit = "11==22"u8;
foreach (var range in stringToSplit.Split((byte)'='))
{
var span = stringToSplit[range];
// process span
}

Finally, the Regex.EnumerateSplits() method provides the full power of the regular expression engine, works with ReadOnlySpan<char>, and also performs the matches lazily.
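A minimal sketch of Regex.EnumerateSplits, assuming .NET 9 or later where the method is available. The SplitDemo class, the SumParts method, the input, and the separator pattern are invented for illustration. Each iteration yields a Range into the original span, so no substrings are allocated.

```csharp
using System;
using System.Text.RegularExpressions;

Console.WriteLine(SplitDemo.SumParts("11 ==  22;;33")); // 66

// Hypothetical helper class, used only for this illustration.
static class SplitDemo
{
    // Splits on "==" or ";;" with optional surrounding whitespace,
    // parsing each piece directly from the input span.
    public static int SumParts(ReadOnlySpan<char> input)
    {
        int sum = 0;
        foreach (Range range in Regex.EnumerateSplits(input, @"\s*(?:==|;;)\s*"))
        {
            sum += int.Parse(input[range]);
        }
        return sum;
    }
}
```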
Methods like ContainsAny and IndexOfAny can benefit from various optimizations, depending on the characters searched. For example, the values may contain only ASCII characters, number five or fewer, or form a contiguous range (e.g. a-z). SearchValues<T> determines the applicable optimization once, when the instance is created.
Many parts of .NET, such as JSON parsing, regular expressions and Uri, have been enhanced with it.
private static readonly SearchValues<char> s_lineEndings = SearchValues.Create("\n\r\f\u0085\u2028\u2029");
int CountLineEndings(ReadOnlySpan<char> s)
{
int count = 0;
int pos;
while ((pos = s.IndexOfAny(s_lineEndings)) >= 0)
{
count++;
s = s.Slice(pos + 1);
}
return count;
}

The Regex.Matches method returns a MatchCollection of Match objects, each a relatively large object consisting of nested Groups and Captures collections, which incurs a lot of allocations. If we need something leaner, we can use Regex.EnumerateMatches, which returns a lazy enumerator of ValueMatch structs. It's amortized allocation-free and accepts ReadOnlySpan<char> as the input.
However, the ValueMatch struct contains much less information - just the Index and Length of the match. Most notably missing are groups. This can sometimes be worked around by using lookahead/lookbehind assertions.
var pattern = @"(?<=text\s*)with";
var builder = new StringBuilder();
for (var i = 0; i < 1_000_000; i++)
{
builder.Append(" sample text with matches ");
}
var inputSpan = builder.ToString().AsSpan();
var count = 0;
try
{
foreach (var match in Regex.EnumerateMatches(inputSpan, pattern, RegexOptions.None, TimeSpan.FromMilliseconds(1)))
{
var matchSpan = inputSpan.Slice(match.Index, match.Length); // equals "with"
count++;
}
}
catch (TimeoutException)
{
// handle timeout
}
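
For comparison, here is a smaller sketch of the same lookbehind pattern on a short input with no timeout, which makes the result easier to check by hand. The MatchDemo class, the CountWith method, and the sample input are invented for this example.

```csharp
using System;
using System.Text.RegularExpressions;

Console.WriteLine(MatchDemo.CountWith("sample text with matches, text   with more")); // 2

// Hypothetical helper class, used only for this illustration.
static class MatchDemo
{
    // Counts occurrences of "with" that are preceded by "text" and optional whitespace.
    public static int CountWith(ReadOnlySpan<char> input)
    {
        int count = 0;
        foreach (ValueMatch match in Regex.EnumerateMatches(input, @"(?<=text\s*)with"))
        {
            // ValueMatch exposes only Index and Length, so we slice the input to read the text.
            ReadOnlySpan<char> matched = input.Slice(match.Index, match.Length);
            if (matched.SequenceEqual("with".AsSpan()))
            {
                count++;
            }
        }
        return count;
    }
}
```

Because the lookbehind is not part of the match itself, each ValueMatch covers only "with", which stands in for the capture group we would otherwise need.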