Magic strings
Heads Up!
This article is several years old now, and much has happened since then, so please keep that in mind while reading it.
Why are magic strings so bad?
Magic strings is a term used to cover multiple scenarios where a string is used in place of a more robust method.
Take Model.GetPropertyValue("bodyText"). Here, "bodyText" is the magic string. The text can be used in multiple places (maybe in the meta tags as well as the page body, or on different templates), has to be in this exact format (I can't swap it out with "Can I have the text for the body please?", as polite as that is) and refers to an external entity (the Umbraco database via the Umbraco APIs in this case). These things make it a particularly code-smelly example of a magic string that needs replacing.
Typos are the arch-enemy of magic strings. If you're having a particularly productive, Hollywood-hacker-style afternoon, smacking away at those keys, you'd be excused for mis-hitting the odd one.
They can be particularly difficult to spot too: depending on your font, "bodyIext" could easily go unnoticed. The folks that write our IDEs know this, and will correct us where possible. A string, however, cannot be easily corrected because any value of string is a valid string. As far as the IDE knows, you meant to type "bossyTRex". Typos like these cause runtime errors rather than the easier-to-spot compile-time errors.
Fat-fingered-ness aside, magic strings can exclude those who speak a different first language or people who have difficulty reading and writing. Even if you speak the same language, issues can arise when trying to agree how to spell "colour"!
Or how about a piece of logic that sets this.ValidationMethod = "none"? We then later decide we need to change the validation method, but it's not obvious what the other possible values are. Magic strings aren't self-documenting and so we rely on reference documents or "there's a comment that explains how to use it somewhere" or "didn't I put that in the README file?". This is far from ideal.
And if all that wasn't enough, magic strings make it more difficult to replace a value, test your code or change logic on your test environments compared to production. We can do better!
Is there such a thing as a muggle string?
Are there cases when a string is non-magic? Or perhaps when a magic string is ok? Yes, not all strings are magic strings and not all magic strings are necessarily bad.
If you're typing a string value in code, the blog has boiled it down to 4 questions you need to ask yourself:
Can you easily change a few values at once?
Does it improve readability?
Is it part of configuration?
This won't cover all scenarios, but it's a good place to get started. If any of your answers to these questions is "yes", it's time to look into alternatives.
Did you just casually drop the term "magic numbers" too? I've only just got to grips with magic strings!
Ah, yes. Sorry about that!
Yes, magic numbers are the less talked-about sibling of magic strings. But they can cause the same sorts of issues.
Take this code as an example:
public void PrepareForCheckout() {
    if (Customer.Country == "United Kingdom") {
        FinalValue = Value * 1.175m;
    }
}
It's not immediately obvious what this 1.175 is. You might have picked up that it's the old tax rate in the UK of 17.5%, but it could easily be missed, especially by a non-British developer.
Now that we know this is the VAT rate, we also know it's wrong. The UK changed its VAT rate to 20% a few years back. Correcting this value could be difficult too - we need to replace all instances, but if it's written as value = value + value * (17.5/100) (someone has gone about code clarity in an odd way!), a simple find-and-replace won't suffice.
If, for example, we had a constant of this value somewhere that we reference from everywhere, it would be both clearer what the code was doing and easier to update:
public void PrepareForCheckout() {
    if (Customer.Country == "United Kingdom") {
        FinalValue = Value * UkVatRate;
    }
}
(Now, there are many other tweaks we could make to this code, but I've just solved the one we've been talking about.)
ModelsBuilder to the rescue
The first example I gave in this article was Model.GetPropertyValue("bodyText"). Back in the Umbraco 7 days of 2013, this was identified as a problem and the ModelsBuilder package was developed to solve it. ModelsBuilder is now a part of Umbraco and if you're not using it already, you really should be. It removes the need for magic strings when interacting with IPublishedContent across our Umbraco solutions by generating models for each document type. Dave Woestenborghs wrote an article on it back in 2016 which still holds up well.
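To illustrate the idea, here's a hand-rolled sketch (not ModelsBuilder's real generated output): the magic string is written once inside a typed wrapper, and everything else uses a property the compiler can check. IContentSource, DictionaryContent and ArticlePage are hypothetical names invented for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a content item whose properties are looked up by alias.
public interface IContentSource
{
    object GetPropertyValue(string alias);
}

// Hypothetical in-memory implementation so the sketch runs on its own.
public class DictionaryContent : IContentSource
{
    private readonly Dictionary<string, object> _values;

    public DictionaryContent(Dictionary<string, object> values) => _values = values;

    // Returns null for an unknown alias, so a typo only shows up at runtime.
    public object GetPropertyValue(string alias) =>
        _values.TryGetValue(alias, out var value) ? value : null;
}

// Typed wrapper: the alias appears in exactly one place.
public class ArticlePage
{
    private readonly IContentSource _content;

    public ArticlePage(IContentSource content) => _content = content;

    public string BodyText => _content.GetPropertyValue("bodyText") as string;
}

public static class ModelDemo
{
    public static void Main()
    {
        var content = new DictionaryContent(new Dictionary<string, object>
        {
            ["bodyText"] = "Hello, Umbraco!"
        });

        // A typo in a string compiles fine and quietly returns null.
        Console.WriteLine(content.GetPropertyValue("bodyIext") == null); // True

        // A typo in a property name (page.BodyIext) would not compile at all.
        var page = new ArticlePage(content);
        Console.WriteLine(page.BodyText); // Hello, Umbraco!
    }
}
```

The string still exists, but a typo in it now breaks one obvious place rather than hiding in any number of templates.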
A constant problem
I mentioned creating a constant just now. And that's probably one of the simplest methods of avoiding magic strings. Depending on the size and complexity of your project, it can be done by pulling your strings up into constant declarations in your class.
public class Cart {
    private const string BodyTextKey = "bodyText";
    // The m suffix is required for a decimal literal
    private const decimal UkVatRate = 1.175m;
    //...
}
In .NET, constants (const declarations) are actually pretty efficient - the compiler substitutes the value wherever the constant is used, so they're not allocated any memory like variables are and it's equivalent performance-wise to directly using the string in each location. It does, however, help us out a lot pre-compilation by removing the magic strings.
To make constants reusable across your project you might find it useful to create a Constants.cs file in your project and stick all your relevant strings in there. This can get pretty cluttered on bigger projects, so you may need to categorise your constants further, grouping them into logical classes and namespaces or adding your constants to related services like so:
public static class VatHelper {
    public const decimal UkVatRate = 1.175m;
    //...
}
In fact, with a helper, we could go even further in case we need to apply VAT to other countries in the future:
public static class VatHelper {
    // Members of a static class must themselves be static
    public static decimal GetVatRate(string country) {
        switch (country) {
            case Countries.UnitedKingdom:
                return 1.175m;
            default:
                return 1;
        }
    }
}
//...
public class Cart {
    private const string BodyTextKey = "bodyText";
    //...
    public void PrepareForCheckout() {
        FinalValue = Value * VatHelper.GetVatRate(Customer.Country);
    }
    //...
}
(OK, so in most real-world examples the country and its respective VAT rate would probably both live in a database somewhere, but it works as an example!)
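The Countries class used in the switch isn't shown above. Here's a minimal sketch of how it might look, assuming it's a plain class of string constants (case labels need compile-time constants, which const strings are). The 17.5% rate is kept from the example and is not the current UK rate.

```csharp
using System;

// Assumed shape of the Countries class referenced by the helper:
// const strings can be used as switch case labels.
public static class Countries
{
    public const string UnitedKingdom = "United Kingdom";
}

public static class VatHelper
{
    // Multiplier applied to a net price. 1.175m is the article's
    // illustrative (outdated) 17.5% UK rate; the m suffix makes it a decimal.
    public static decimal GetVatRate(string country)
    {
        switch (country)
        {
            case Countries.UnitedKingdom:
                return 1.175m;
            default:
                return 1m;
        }
    }
}

public static class VatDemo
{
    public static void Main()
    {
        // Price including VAT for a UK customer
        Console.WriteLine(100m * VatHelper.GetVatRate(Countries.UnitedKingdom));
        // Any other country falls through to a multiplier of 1
        Console.WriteLine(100m * VatHelper.GetVatRate("Elsewhere"));
    }
}
```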
Enum-ber of other solutions
Constants are great and all, but they can get a little bit repetitive. Enumerated types (that's enums to you and me) are useful for a finite number of constants that won't change.
Enums are a great solution to the ValidationMethod setting we were talking about earlier. We could create an enum with all the possible values and use this enum in place of our strings:
public enum ValidationMethod {
    None,
    Required,
    EmailAddress,
    PhoneNumber,
    Numeric
}
//...
this.ValidationMethod = ValidationMethod.None;
//...
switch (field.ValidationMethod) {
    case ValidationMethod.Required:
        return !string.IsNullOrEmpty(field.Value);
    // ...
}
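One benefit over strings is that the possible values are discoverable from code. Here's a small sketch, using the standard Enum.GetNames and Enum.TryParse methods, of listing the options and safely converting an incoming string (say, from a form or a CMS field) into the enum. The ParseOrNone helper and its fall-back-to-None behaviour are assumptions made for the example.

```csharp
using System;

public enum ValidationMethod
{
    None,
    Required,
    EmailAddress,
    PhoneNumber,
    Numeric
}

public static class ValidationMethodParser
{
    // Hypothetical helper: converts text to the enum, falling back to None
    // when the text doesn't name one of the defined values.
    public static ValidationMethod ParseOrNone(string text)
    {
        // TryParse also accepts numeric text such as "7", so IsDefined
        // is used to reject numbers that have no named member.
        return Enum.TryParse(text, ignoreCase: true, out ValidationMethod parsed)
               && Enum.IsDefined(typeof(ValidationMethod), parsed)
            ? parsed
            : ValidationMethod.None;
    }
}

public static class ParseDemo
{
    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Enum.GetNames(typeof(ValidationMethod))));
        // None, Required, EmailAddress, PhoneNumber, Numeric

        Console.WriteLine(ValidationMethodParser.ParseOrNone("emailaddress")); // EmailAddress
        Console.WriteLine(ValidationMethodParser.ParseOrNone("bossyTRex"));    // None
    }
}
```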
Enums also work well for statuses:
public enum OrderStatus {
    Draft,
    Paid,
    Pending,
    Shipped,
    Received
}
//...
Order.Status = OrderStatus.Paid;
//...
if (Order.Status == OrderStatus.Paid) {
    ProcessOrder();
}
Magic numbers and enums
Under the covers, enums actually store integers for each value. In the case of our OrderStatus enum, Draft is equivalent to 0, Paid is 1, Pending is 2, etc. You can even explicitly set the number applied to each value. This can be useful if you need to map a readable name to a number, or if you need to add a value to an existing enum without disturbing the others. That means enums can also provide a valuable solution to magic numbers.
public enum OrderStatus {
    // We want to add a new status at the top,
    // but also to maintain the mapping of the original enum
    // so we specify the values
    Cancelled = -1,
    Draft = 0,
    Paid = 1,
    Pending = 2,
    Shipped = 3,
    Received = 4
}
//...
/// Readable definitions for all possible
/// response codes from the ACME API docs
public enum AcmeApiResponseCodes {
    Ok = 200,
    BadRequest = 400,
    Unauthorized = 401,
    InvalidClientId = 490,
    ExpiredClient = 491
}
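Converting between an enum and its underlying number is an explicit cast in each direction. Here's a short sketch with the response codes above. Note that a cast never fails, even for a number with no name, so Enum.IsDefined is used to check first. The Describe helper is hypothetical.

```csharp
using System;

public enum AcmeApiResponseCodes
{
    Ok = 200,
    BadRequest = 400,
    Unauthorized = 401,
    InvalidClientId = 490,
    ExpiredClient = 491
}

public static class ResponseCodeDemo
{
    // Hypothetical helper: gives a readable name to a raw status number.
    public static string Describe(int rawCode)
    {
        return Enum.IsDefined(typeof(AcmeApiResponseCodes), rawCode)
            ? ((AcmeApiResponseCodes)rawCode).ToString()
            : "Unknown (" + rawCode + ")";
    }

    public static void Main()
    {
        Console.WriteLine((int)AcmeApiResponseCodes.InvalidClientId); // 490
        Console.WriteLine(Describe(401)); // Unauthorized
        Console.WriteLine(Describe(418)); // Unknown (418)
    }
}
```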
JSON and enums
As a side effect of the underlying value being an integer, you might notice that by default if you return an enum as a JSON result, you'll end up returning the number rather than the pretty enum.
{
    "orderNumber": 51138461315,
    "status": 2,
    "...": "..."
}
JSON (and JavaScript too, without some workarounds) doesn't support enums. So we have two choices here: return the number, or return the string. If you'd rather convert the enum to and from a string, you've got a few options.
You can set the converter for your individual property with an attribute:
public class Order {
    public long OrderNumber { get; set; }
    [JsonConverter(typeof(JsonStringEnumConverter))]
    public OrderStatus Status { get; set; }
    //...
}
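Alternatively, set the attribute on the enum definition if you want it to be serialized to a string every time you use it. While we're here, you'll notice you can also customise how the enum is rendered as a string with the EnumMember attribute. (Be aware that EnumMember is honoured by Newtonsoft.Json's StringEnumConverter; System.Text.Json's JsonStringEnumConverter has historically ignored it.)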
[JsonConverter(typeof(JsonStringEnumConverter))]
public enum OrderStatus {
    Draft,
    Paid,
    //...
    [EnumMember(Value = "Received by customer")]
    ReceivedByCustomer
}
Or finally, you can set the behaviour globally for all enums by modifying the ConfigureServices method in Startup.cs. We need to add AddMvcAndRazor to get ahold of the MVC config before adjusting the default JSON options.
public void ConfigureServices(IServiceCollection services)
{
    services
        .AddUmbraco(_env, _config)
        // Here's the good stuff
        .AddMvcAndRazor(mvc =>
        {
            mvc.AddJsonOptions(json =>
            {
                json.JsonSerializerOptions.Converters.Add(new JsonStringEnumConverter());
            });
        })
        // That's all, folks!
        .AddBackOffice()
        .AddWebsite()
        .AddComposers()
        .Build();
}
If you're using Newtonsoft.Json (rather than System.Text.Json), the equivalent converter is StringEnumConverter.
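For a quick feel of the difference outside of an MVC app, here's a minimal console sketch using System.Text.Json directly. The Order class mirrors the one above; the Serialize helper is hypothetical and the order number and status are made-up sample values.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

public enum OrderStatus { Draft, Paid, Pending, Shipped, Received }

public class Order
{
    public long OrderNumber { get; set; }
    public OrderStatus Status { get; set; }
}

public static class JsonEnumDemo
{
    // Hypothetical helper: serializes with or without the string enum converter.
    public static string Serialize(Order order, bool enumsAsStrings)
    {
        var options = new JsonSerializerOptions();
        if (enumsAsStrings)
        {
            options.Converters.Add(new JsonStringEnumConverter());
        }
        return JsonSerializer.Serialize(order, options);
    }

    public static void Main()
    {
        var order = new Order { OrderNumber = 51138461315, Status = OrderStatus.Pending };

        Console.WriteLine(Serialize(order, enumsAsStrings: false));
        // {"OrderNumber":51138461315,"Status":2}

        Console.WriteLine(Serialize(order, enumsAsStrings: true));
        // {"OrderNumber":51138461315,"Status":"Pending"}
    }
}
```

Property names come out in PascalCase here because no naming policy is set; ASP.NET Core's MVC defaults use camelCase, which is why the earlier JSON sample shows "status".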
Flagged enums
Our ValidationMethod example from earlier applies here. It may be possible to have a field that needs validating as Required but also EmailAddress. With strings, we could have done this as an array and we can do the same with enums if we want...
// How we might have allowed multiple values with magic strings
field.ValidationMethods = new string[] { "email address", "required" };
// We can do the same when using enums too
field.ValidationMethods = new ValidationMethod[] { ValidationMethod.EmailAddress, ValidationMethod.Required };
But we can also do one better with flagged enums to allow an enum to have multiple values.
[Flags]
public enum ValidationMethod {
    None = 0,
    Required = 1,
    EmailAddress = 2,
    PhoneNumber = 4,
    Numeric = 8
}
//...
this.ValidationMethods = ValidationMethod.EmailAddress | ValidationMethod.Required;
//...
if (this.ValidationMethods.HasFlag(ValidationMethod.Required)) {
    return !string.IsNullOrEmpty(field.Value);
}
In this example we've added the [Flags] attribute to the enum and assigned each member a value - this bit is important. We can then assign multiple values using the bitwise OR operator, the pipe character (|) - it's the same character we use two of in a logical OR (this || that).
To check if an enum has a flag, we can use the HasFlag method on the enum value. I've had to refactor our switch statement from earlier into an if statement for each value we want to check for.
You might notice the integers I've assigned to each value aren't consecutive. They are in order, but using a doubling sequence: 1, 2, 4, 8... etc. Each number is double the previous value. This is important because a flagged enum still only stores one integer!
It works by using bitwise operations (there's that word again) which, simply put, means looking at the individual "bits" (and I mean that as the technical term!) of a binary number and applying an "OR" operation to each column - if any of the bits in that column is 1, it returns a 1, otherwise it returns a 0.
For those curious, let's look at the numbers in the doubling sequence in binary. (Skip ahead if you don't want to think about binary too much!)
0000 (0)
0001 (1)
0010 (2)
0100 (4)
1000 (8)
...etc
You'll notice that, because they're all powers of 2, each column only ever contains a single 1. Therefore, no matter what combination we make of these in a bitwise OR, we can work out which initial enum values went into creating it.
var a = ValidationMethod.EmailAddress | ValidationMethod.Required;
/// That's 1 and 2
///
/// 0001
/// OR 0010
/// -------
/// 0011 (the last two columns have a 1 in the first number OR second number)
var b = ValidationMethod.Required | ValidationMethod.EmailAddress | ValidationMethod.PhoneNumber | ValidationMethod.Numeric;
/// That's 1, 2, 4 and 8
///
/// 0001
/// 0010
/// 0100
/// OR 1000
/// -------
/// 1111
To check whether a value includes Required, which is 0001, we simply need to check for a 1 in the last column - the HasFlag method is actually using the bitwise AND operator (&) to do this:
this.ValidationMethods.HasFlag(ValidationMethod.Required);
// is the same as
(this.ValidationMethods & ValidationMethod.Required) == ValidationMethod.Required;
// is the same as
// Binary literals in C# are prefixed with 0b (C# 7+)
// (the brackets matter: == binds more tightly than & in C#)
(0b0000011 & 0b0000001) == 0b0000001;
/// 0011
/// AND 0001
/// --------
/// 0001 (the last column has a 1 in the first number AND second number)
Because of this behaviour, we can also add named values for common combinations (or for all values) to our enum by adding the combined values together (adding and bitwise OR give the same result in this case because they're all distinct powers of two, but not normally!)
[Flags]
public enum ValidationMethod {
    // Regular items
    None = 0,
    Required = 1,
    EmailAddress = 2,
    PhoneNumber = 4,
    Numeric = 8,
    // Combinations
    RequiredEmail = 3, // 0001 | 0010 = 0011 or cheat by doing 1 + 2 = 3
    All = 15 // 0001 | 0010 | 0100 | 1000 = 1111 or cheat by doing 1 + 2 + 4 + 8 = 15
}
Not that an "All" value makes any sense in this case, but it's good to know!
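Pulling the pieces of this section together, here's a small runnable sketch. Writing the values as bit shifts (1 << 0, 1 << 1, ...) is just an alternative notation for the same doubling sequence, and removing a flag with & ~ isn't covered above but follows from the same bitwise logic.

```csharp
using System;

[Flags]
public enum ValidationMethod
{
    None = 0,
    Required = 1 << 0,     // 0001 = 1
    EmailAddress = 1 << 1, // 0010 = 2
    PhoneNumber = 1 << 2,  // 0100 = 4
    Numeric = 1 << 3       // 1000 = 8
}

public static class FlagsDemo
{
    public static void Main()
    {
        var methods = ValidationMethod.EmailAddress | ValidationMethod.Required;

        Console.WriteLine((int)methods);                               // 3
        Console.WriteLine(methods);                                    // Required, EmailAddress
        Console.WriteLine(methods.HasFlag(ValidationMethod.Required)); // True
        Console.WriteLine(methods.HasFlag(ValidationMethod.Numeric));  // False

        // Remove a flag: AND with the complement of that flag's bits.
        methods &= ~ValidationMethod.Required;
        Console.WriteLine(methods);                                    // EmailAddress
    }
}
```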
Config your way out of it
One of the questions we asked at the beginning was "is it part of configuration?" This is a good question to ask because it may well change how we deal with it. What is configuration? I like to think of it as anything that can alter an application's behaviour. This value will have been flagged (in a specification or by yourself) as something that's likely to change - be that in the future or per environment. A URL to an API endpoint is a good example of a configuration variable:
- it sits outside the control of your application and could conceivably change
- or you may want to point your staging site at a sandbox version of the API.
Now we're in the land of .NET 5, we can even get rid of some of the magic strings we have historically used to pull values out of config by mapping whole configuration sections to C# objects. I've also used a constant to get the configuration section name, to avoid that as a magic string.
public class AcmeConfig
{
    public const string Section = "ACME";
    public string ApiBaseUrl { get; set; }
    //...
}
//...
public class AcmeClient {
    public AcmeClient(IConfiguration configuration) {
        var config = configuration.GetSection(AcmeConfig.Section).Get<AcmeConfig>();
        //...
    }
    //...
}
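To try the binding without a real appsettings.json, here's a minimal sketch that feeds the same section from an in-memory dictionary. It assumes the Microsoft.Extensions.Configuration and Microsoft.Extensions.Configuration.Binder packages are referenced; the ConfigDemo class and the sandbox URL are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.Configuration;

public class AcmeConfig
{
    public const string Section = "ACME";
    public string ApiBaseUrl { get; set; }
}

public static class ConfigDemo
{
    // Binds the "ACME" section onto a typed object, as in the article.
    public static AcmeConfig Load(IConfiguration configuration) =>
        configuration.GetSection(AcmeConfig.Section).Get<AcmeConfig>();

    public static void Main()
    {
        // In-memory stand-in for appsettings.json; keys take the form "Section:Property".
        var configuration = new ConfigurationBuilder()
            .AddInMemoryCollection(new Dictionary<string, string>
            {
                [AcmeConfig.Section + ":" + nameof(AcmeConfig.ApiBaseUrl)] = "https://sandbox.example.com/api/"
            })
            .Build();

        Console.WriteLine(Load(configuration).ApiBaseUrl); // https://sandbox.example.com/api/
    }
}
```

Using nameof for the property keeps even the test key free of hand-typed strings.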
Umbracadabra! Magic-less code
Hopefully, I've been able to explain a little about why avoiding these magic strings might be necessary and how we can improve upon them. Don't go crazy - there's no need to replace every variable in your code, but it's worth thinking about in the future each time you open those double-quotes!
Happy const-ing... and enum-ing... and Model Building!
Joe Glombek
Joe is on Twitter as @JoeGlombek