Introducing C# 11: UTF-8 String Literals

Introduction

.NET encodes strings as UTF-16. On the web, however, the dominant encoding is UTF-8, so producing UTF-8 data from .NET strings has always been a bit tedious. C# 11 makes it easier to create UTF-8 strings, and in this post I will show you how.

An easier way to create UTF-8 strings

Usually, encoding a string into its UTF-8 byte representation is done as follows:
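The original snippet is not available here, so the following is only a minimal sketch of the classic approach, assuming Encoding.UTF8.GetBytes is the call in question. The variable name helloBytes is illustrative.

```csharp
using System;
using System.Text;

// Encode a regular (UTF-16) string into UTF-8 bytes at runtime.
byte[] helloBytes = Encoding.UTF8.GetBytes("Hello");

// The result is an ordinary array, so its contents can still be modified.
helloBytes[0] = (byte)'J';

Console.WriteLine(Encoding.UTF8.GetString(helloBytes)); // prints "Jello"
```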

However, this produces a mutable array of bytes, and it is not very performant because the encoding happens at runtime. C# 11 simplifies this with a lighter syntax, the u8 suffix. The encoding is performed at compile time, which avoids the runtime cost, and as a bonus the result is exposed as a read-only ReadOnlySpan<byte>:
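A minimal sketch of the u8 suffix, assuming a project that targets C# 11 or later. The variable name greeting is illustrative.

```csharp
using System;
using System.Text;

// The u8 suffix asks the compiler to produce the UTF-8 bytes of the literal.
ReadOnlySpan<byte> greeting = "Hello"u8;

Console.WriteLine(greeting.Length);                   // prints 5
Console.WriteLine(Encoding.UTF8.GetString(greeting)); // prints "Hello"

// greeting[0] = (byte)'J'; // would not compile: the span is read-only
```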

If you still want to get an array of bytes, you can do the following:
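As a sketch, calling ToArray() on the literal copies the bytes into a regular array. The variable name greetingArray is illustrative.

```csharp
using System;

// ToArray() copies the UTF-8 data of the literal into a new, mutable byte[].
byte[] greetingArray = "Hello"u8.ToArray();

Console.WriteLine(greetingArray.Length); // prints 5
```

Keep in mind that this allocates a new array on every call, so it gives up part of the benefit of the span-based form.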

Note that although the ReadOnlySpan<byte> or byte[] type is determined at compile time, UTF-8 string literals are not compile-time constants. This means they can't be used as default parameter values in methods, and trying to do so leads to a compilation error:

Error CS1736 Default parameter value for ‘message’ must be a compile-time constant.

Example:
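The original example is not available here, so this is only a sketch of the situation. The failing declaration is left commented out so the rest compiles, and the helper names Decode and DecodeDefault are hypothetical. It also shows one possible workaround, which is to supply the literal from a separate parameterless helper:

```csharp
using System;
using System.Text;

Console.WriteLine(Decode("Hi"u8));  // prints "Hi"
Console.WriteLine(DecodeDefault()); // prints "Hello"

// Does not compile (error CS1736), because a u8 literal is not a constant:
// static string Decode(ReadOnlySpan<byte> message = "Hello"u8) => ...

// Hypothetical helper: decodes UTF-8 bytes back into a .NET string.
static string Decode(ReadOnlySpan<byte> message) => Encoding.UTF8.GetString(message);

// Hypothetical workaround: a parameterless helper supplies the default literal.
static string DecodeDefault() => Decode("Hello"u8);
```

Inside a class, the same idea would typically be written as a method overload rather than a separately named helper.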

Conclusion

Keep in mind that UTF-8 string literals are mostly useful in web scenarios, so you may not use this feature very often. Still, if you do need UTF-8 string literals, I hope this post helps you 🙂

Written by

anthonygiretti

Anthony is a specialist in Web technologies (14 years of experience), in particular Microsoft .NET, and is learning the Azure cloud platform. He has twice received the Microsoft MVP award and is also a certified Microsoft MCSD and Azure Fundamentals holder.