Daily Knowledge Drop
Coming with C# 11 (being released later this year, coinciding with the .NET 7 release), converting a string literal to a byte[] is becoming easier, faster, and more efficient.
A byte[] is often used when dealing with streams, for example. In the current and prior C# versions, converting a string to a byte[] requires an explicit method call. With C# 11, this conversion is simplified and also gains a large performance boost.
C# 10 and prior
In the current (and prior) versions of C#, when a string literal is required to be converted to a byte[], the System.Text.Encoding.X.GetBytes
method is used (where X is the encoding method, UTF8 specifically in this post):
byte[] bytes = System.Text.Encoding.UTF8.GetBytes("alwaysdeveloping.net");
using var stream = new MemoryStream();
stream.Write(bytes);
While not especially complicated, this does involve an explicit method call to perform the conversion.
C# 11
With C# 11, the same result can be achieved without a method call, by adding the u8 suffix to the string literal:
ReadOnlySpan<byte> spanBytes = "alwaysdeveloping.net"u8;
using var stream = new MemoryStream();
stream.Write(spanBytes);
A ReadOnlySpan<byte> can be passed to any API that has a span overload (such as Stream.Write), but it is not a drop-in replacement for a byte[]. If a byte[] is specifically needed:
ReadOnlySpan<byte> spanBytes = "alwaysdeveloping.net"u8;
byte[] bytes = spanBytes.ToArray();
using var stream = new MemoryStream();
stream.Write(bytes);
The u8 suffix on the string literal tells the compiler to convert the string value into UTF-8 bytes, exposed in this case as a ReadOnlySpan of bytes. Using the ReadOnlySpan directly is more efficient and allocates no additional memory. If a byte[] is specifically required, the ToArray method can be used to copy the ReadOnlySpan into a new byte[], at the cost of one allocation.
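To make the relationship between the two approaches concrete, here is a minimal sketch (assuming a C# 11 / .NET 7 console project with top-level statements) that checks the u8 literal produces the same bytes as Encoding.UTF8.GetBytes, and that ToArray returns an independent copy. The variable names are illustrative only.

```csharp
using System;
using System.Text;

// The u8 suffix makes the literal a ReadOnlySpan<byte> of UTF-8 bytes.
ReadOnlySpan<byte> literal = "alwaysdeveloping.net"u8;

// The same text encoded at runtime, for comparison.
byte[] encoded = Encoding.UTF8.GetBytes("alwaysdeveloping.net");

Console.WriteLine(literal.Length);                 // 20
Console.WriteLine(literal.SequenceEqual(encoded)); // True

// ToArray copies the bytes into a new array, so changing the copy
// does not affect the data behind the literal.
byte[] copy = literal.ToArray();
copy[0] = (byte)'A';
Console.WriteLine((char)literal[0]);               // a
```

The span itself is read-only, which is why a copy is needed as soon as the bytes have to be modified or stored in a field.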
Performance
Below are a couple of simple benchmarks run to compare the performance and memory usage of the old and new methods:
[Benchmark(Baseline = true)]
public void GetBytes()
{
    byte[] bytes = System.Text.Encoding.UTF8.GetBytes("alwaysdeveloping.net");
}

[Benchmark]
public void StringLiteral()
{
    ReadOnlySpan<byte> spanBytes = "alwaysdeveloping.net"u8;
}
Method | Mean | Error | StdDev | Median | Ratio | Gen 0 | Allocated |
---|---|---|---|---|---|---|---|
GetBytes | 19.5843 ns | 0.4163 ns | 0.6956 ns | 19.6017 ns | 1.000 | 0.0076 | 48 B |
StringLiteral | 0.0198 ns | 0.0209 ns | 0.0241 ns | 0.0085 ns | 0.001 | - | - |
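As one can see, the new method is roughly 1,000 times faster (a ratio of 0.001) and allocates no memory, whereas GetBytes allocates 48 B on every call. Note that the measured mean of 0.0198 ns is about the same size as its error (0.0209 ns), so the cost of the new method is effectively too small to measure.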
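The allocation difference can also be observed without a benchmarking library. The following is a rough sketch, not a replacement for the benchmark above: it assumes a C# 11 / .NET 7 console project and uses GC.GetAllocatedBytesForCurrentThread to compare how much memory each approach allocates over a number of iterations. The iteration count and variable names are arbitrary choices for illustration.

```csharp
using System;
using System.Text;

const int Iterations = 1_000;

// Warm up both paths so one-time initialization is not counted.
_ = Encoding.UTF8.GetBytes("alwaysdeveloping.net");
_ = "alwaysdeveloping.net"u8.Length;

long total = 0; // consume the results so the work is not trivially unused

// Runtime encoding: a new byte[] is created on every call.
long before = GC.GetAllocatedBytesForCurrentThread();
for (int i = 0; i < Iterations; i++)
{
    byte[] bytes = Encoding.UTF8.GetBytes("alwaysdeveloping.net");
    total += bytes.Length;
}
long getBytesAllocated = GC.GetAllocatedBytesForCurrentThread() - before;

// u8 literal: the span points at data already embedded in the assembly.
before = GC.GetAllocatedBytesForCurrentThread();
for (int i = 0; i < Iterations; i++)
{
    ReadOnlySpan<byte> span = "alwaysdeveloping.net"u8;
    total += span.Length;
}
long literalAllocated = GC.GetAllocatedBytesForCurrentThread() - before;

Console.WriteLine($"GetBytes:   {getBytesAllocated} bytes over {Iterations} calls");
Console.WriteLine($"u8 literal: {literalAllocated} bytes over {Iterations} calls");
```

Exact numbers depend on the runtime and platform, but the GetBytes loop should report an amount that grows with the iteration count, while the u8 loop should report little or nothing.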
Extend features
In the initial announcement and previews of this feature, the implicit conversion was done without specifying the u8 suffix:
// Early preview syntax only: this no longer compiles in the released C# 11.
byte[] array = "hello";
Span<byte> span = "dog";
ReadOnlySpan<byte> readOnlySpan = "cat";
However, in subsequent previews, the u8 suffix was added to explicitly indicate that the string literal should be converted to UTF-8. Hopefully, future C# language updates will add more encodings, to at least bring this feature on par with using System.Text.Encoding.X.GetBytes.
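As a small illustration of why the encoding matters, the sketch below (assuming a C# 11 / .NET 7 console project) encodes a string containing a non-ASCII character both ways. The u8 literal always produces UTF-8, so any other encoding, such as UTF-16 via Encoding.Unicode, still has to go through the GetBytes route. The sample text is an arbitrary choice.

```csharp
using System;
using System.Text;

// "café": the é (U+00E9) takes two bytes in UTF-8, so 3 + 2 = 5 bytes.
ReadOnlySpan<byte> utf8 = "caf\u00E9"u8;
Console.WriteLine(utf8.Length);  // 5

// There is no literal suffix for other encodings, so UTF-16 still
// needs an explicit call. Four characters at two bytes each = 8 bytes.
byte[] utf16 = Encoding.Unicode.GetBytes("caf\u00E9");
Console.WriteLine(utf16.Length); // 8
```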
Notes
A relatively small update on the surface, but if your application makes heavy use of string literals and encoding, converting to this new feature should gain you a performance boost.
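One place where such a conversion might apply is a byte[] constant built from a string at startup. The sketch below, assuming a C# 11 / .NET 7 console project, shows one possible before and after. The Markers class and its member names are hypothetical and exist only for this example.

```csharp
using System;
using System.IO;
using System.Text;

using var stream = new MemoryStream();
stream.Write(Markers.LegacyHeader); // byte[] converts implicitly to ReadOnlySpan<byte>
stream.Write(Markers.Header);       // span over the literal's embedded data
Console.WriteLine(stream.Length);   // 40

// Hypothetical holder for byte constants, for illustration only.
static class Markers
{
    // Before: encoded once at type initialization and kept on the heap.
    public static readonly byte[] LegacyHeader =
        Encoding.UTF8.GetBytes("alwaysdeveloping.net");

    // After: each access returns a span over bytes stored in the assembly,
    // so no array is allocated.
    public static ReadOnlySpan<byte> Header => "alwaysdeveloping.net"u8;
}
```

The property form is used because a ReadOnlySpan<byte> cannot be stored in a static field. Callers that need a real byte[] would still have to call ToArray and pay for that copy.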
References
Literals - Ignore everything you have seen so far
C# 11 Preview Updates – Raw string literals, UTF-8 and more!
Daily Drop 163: 19-09-2022
At the start of 2022 I set myself the goal of learning one new coding related piece of knowledge a day.
It could be anything - some .NET / C# functionality I wasn't aware of, a design practice, a cool new coding technique, or just something I find interesting. It could be something I knew at one point but had forgotten, or something completely new, which I may or may never actually use.
The Daily Drop is a record of these pieces of knowledge - writing about and summarizing them helps reinforce the information for myself, and potentially helps others learn something new as well.