I am very disappointed in how the official React documentation recommends we build URL strings to query APIs in JavaScript.
I was reading about custom hooks and came across this:
fetch(`/api/cities?country=${country}`)
  .then(response => response.json())
  .then(json => {
    if (!ignore) {
      setCities(json);
    }
  });
This code works correctly in most expected cases, for example if “country” is an ISO 3166-1 country code. These codes are made up strictly of alphanumeric characters, so they survive query string serialization with no changes.
But it’ll fall apart rather quickly if you throw other things at it.
Break bad query constructions.
This is a problem because basic “backtick strings” (correctly known as template literals) aren’t designed for creating strings to serialize and pass to other systems. They work best when you think of them instead as a way to print things.
As an example of how they’re unsuitable for this, consider a form where “country” is a simple input box, and we input “Saint Vincent & the Grenadines”.
This means:
- the query sent to the server is /api/cities?country=Saint Vincent & the Grenadines, and
- the server’s request will have two parameters: “country” and one called “ the Grenadines” (note the leading space!) with an empty value.
“&” and “=” are perfectly valid characters to use in a query string—particularly free-form text searches. But if they’re put together using a standard backtick string, they’ll be misunderstood on the other end as dividing search parameters and their values.
Best case, you get the wrong data. Worst case, the API tries to interpret the extra parameter, and you’re in for some really head-scratching debugging.
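To see the split for yourself, here is a small sketch that builds the URL the naive way and then parses it. It uses the built-in URLSearchParams as a stand-in for the server's query parser, which is an assumption; your framework's parser may differ in the details.

```javascript
// Build the URL the naive way, with a value containing "&".
const country = 'Saint Vincent & the Grenadines';
const url = `/api/cities?country=${country}`;

// Stand-in for the server: parse everything after the "?".
const received = [...new URLSearchParams(url.split('?')[1])];

console.log(received);
// [ [ 'country', 'Saint Vincent ' ], [ ' the Grenadines', '' ] ]
```

Note that the “country” value also arrives truncated, with a trailing space.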
What if it’s not even a string?
In fact, because it’s not TypeScript, we don’t even have any level of assurance that “country” is a string.
Very interesting things can happen if it’s not. For example, arrays will be printed joined with commas, like “a,b,c”. If one of your array elements has a comma in it, it’ll be indistinguishable from the element separators:
const a = ["a,b", "c", "d"]
`${a}` // 'a,b,c,d'
`${a}`.split(','); // [ 'a', 'b', 'c', 'd' ]
(If you do want to pass an array, many query string parsing libraries will treat repeated keys as individual elements of an array. For example, “a=b&a=c” will give you the array “[‘b’, ‘c’]” for “a”.)
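As a rough illustration of the repeated-key approach, the sketch below appends one pair per array element and reads them back with getAll. It uses URLSearchParams on both ends; whether your server-side parser returns repeated keys as an array is something to check for your framework.

```javascript
// Send an array as repeated keys instead of a comma-joined string.
const values = ['a,b', 'c', 'd'];

const params = new URLSearchParams();
for (const value of values) {
  params.append('a', value); // one "a=..." pair per element
}

const query = params.toString();
console.log(query); // a=a%2Cb&a=c&a=d

// Parsing it back keeps the comma inside the first element.
const roundTripped = new URLSearchParams(query).getAll('a');
console.log(roundTripped); // [ 'a,b', 'c', 'd' ]
```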
Let’s do it a little better.
There is a simple, built-in answer. Unfortunately, it does have some caveats.
JavaScript includes URLSearchParams, which does take care of many problems. For example, instead of this:
`/api/cities?country=${country}`
You can write this:
'/api/cities?' +
  new URLSearchParams([
    ['country', country]
  ])
URLSearchParams’ toString method will take care of several things for you, such as escaping parameter values correctly. (It’s called implicitly here.)
If “country” is “Saint Vincent & the Grenadines”, you’ll get country=Saint+Vincent+%26+the+Grenadines. This will come out correctly on the other end as a single country parameter with all characters intact.
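Here is a quick round trip you can run to check this. It parses the query back with URLSearchParams, standing in for whatever parser the server actually uses.

```javascript
const country = 'Saint Vincent & the Grenadines';

// String concatenation calls URLSearchParams' toString() implicitly.
const url = '/api/cities?' + new URLSearchParams([['country', country]]);
console.log(url);
// /api/cities?country=Saint+Vincent+%26+the+Grenadines

// Parsing the query back yields one parameter with the original value.
const parsed = new URLSearchParams(url.split('?')[1]);
console.log([...parsed.keys()]); // [ 'country' ]
console.log(parsed.get('country')); // Saint Vincent & the Grenadines
```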
URLSearchParams does fall down in other cases when parameter values aren’t strings. For example, if one of your parameters isn’t a string but undefined, URLSearchParams will literally put the string “undefined” into your query.
Arrays are also a problem, concatenating values with commas much like backtick strings do. If you might have array values, you’ll have to split those up.
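One way to guard against both problems is a small wrapper. The toSearchParams helper below is hypothetical (it is not a built-in), and its choices are assumptions for illustration: it drops undefined and null values, expands arrays into repeated keys, and throws on anything else that isn't a string.

```javascript
// Hypothetical helper: build URLSearchParams from a plain object,
// skipping missing values and expanding arrays into repeated keys.
function toSearchParams(input) {
  const params = new URLSearchParams();
  for (const [key, value] of Object.entries(input)) {
    if (value === undefined || value === null) continue; // omit entirely
    const items = Array.isArray(value) ? value : [value];
    for (const item of items) {
      if (typeof item !== 'string') {
        throw new TypeError(`Parameter "${key}" must be a string`);
      }
      params.append(key, item);
    }
  }
  return params;
}

// What URLSearchParams does on its own with non-string values:
console.log(new URLSearchParams({ country: undefined }).toString());
// country=undefined
console.log(new URLSearchParams([['tags', ['a', 'b']]]).toString());
// tags=a%2Cb

// With the helper:
console.log(toSearchParams({ country: undefined, tags: ['a', 'b'] }).toString());
// tags=a&tags=b
```

Throwing on non-strings is the strict option; you might prefer to convert numbers or booleans explicitly before calling it.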
I strongly recommend leveraging TypeScript to track types throughout your codebase to ensure you don’t trip over these kinds of issues. Also, use something like Zod validation to ensure things are the types you expect when they come from external sources.
If you do make sure you only feed URLSearchParams strings, you’ll be in good shape. And please remember: string formatting tools are for printing, not for serialization.
Great tip and reminder!
I think you are trying to solve a problem at the wrong end.
On the server side you’ll have to deal with all incoming data anyway.
Of course you will! But if you don’t serialize the query parameters correctly on the client side, you won’t be able to work with them on the server side.
You’ll have lost important information about whether a parameter contains a separator character such as an ampersand, or it is actually another parameter entirely. Serializing queries correctly removes that ambiguity.
Thankfully, most server frameworks already include query parsing, and they expect queries to be serialized as above.
You are soooo right! But it also means it did not understand much!
What you are saying applies to template literals. It may be worth pointing out that tagged templates can do better (because many people think they are basically the same as template literals).
I’m not sure about this one: “Building” (as in OOP builder pattern) *may* be a better (more general?) word than “serialization”. But there is also a risk of people not knowing what that word means.
Re: tagged templates: I did consider this and also looked around to see if anyone had built it already. It’s possible to build a really basic implementation that just escapes the values of each key/value pair, which could cover a number of use cases and could also enforce that values are strings.
That could be valuable, but I wanted to show how to do it with built-in library calls here. Maybe in the future.
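For the curious, a really basic version might look something like the sketch below. The tag name query is made up for illustration, and this only escapes interpolated values with encodeURIComponent and rejects non-strings; it does nothing about keys, arrays, or missing values.

```javascript
// Hypothetical tag: percent-encodes every interpolated value and
// refuses anything that is not a string.
function query(strings, ...values) {
  let result = strings[0];
  values.forEach((value, i) => {
    if (typeof value !== 'string') {
      throw new TypeError(`Interpolated value ${i} is not a string`);
    }
    result += encodeURIComponent(value) + strings[i + 1];
  });
  return result;
}

const country = 'Saint Vincent & the Grenadines';
console.log(query`/api/cities?country=${country}`);
// /api/cities?country=Saint%20Vincent%20%26%20the%20Grenadines
```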
This has nothing to do with template literals (what you are calling “backtick strings”) and it is just a function of sanitizing and joining the fields in question.
`/api/cities?country=${country}`
would work identically to
'/api/cities?country=' + country
and also
`/api/cities?${new URLSearchParams([['country', country]])}`
works the same as
'/api/cities?' + new URLSearchParams([['country', country]])
Also, while URLSearchParams is fine for decoding, encodeURIComponent() is likely the better option for encoding as it doesn’t switch spaces to ‘+’. As an example, URLSearchParams would convert the string ‘a + b + c = d’ into ‘a+%2B+b+%2B+c+%3D+d’ but depending on how that is decoded, you can end up with ‘a+++b+++c+=+d’. encodeURIComponent would output ‘a%20%2B%20b%20%2B%20c%20%3D%20d’ which, while not appearing very succinct, is more universally decoded back to the original string.
Object.entries(parameters)
  .map(([k, v]) => `${encodeURIComponent(k)}=${encodeURIComponent(v)}`)
  .join('&')
is a short method to use encodeURIComponent to encode key and value for all entries in the example parameters object.
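The decoding difference described above can be reproduced with a short sketch. The parameter name q is arbitrary, and URLSearchParams and decodeURIComponent stand in here for a form-aware decoder and a plain percent-decoder respectively.

```javascript
const original = 'a + b + c = d';

const formEncoded = new URLSearchParams({ q: original }).toString();
console.log(formEncoded); // q=a+%2B+b+%2B+c+%3D+d

const percentEncoded = 'q=' + encodeURIComponent(original);
console.log(percentEncoded); // q=a%20%2B%20b%20%2B%20c%20%3D%20d

// A form-aware decoder turns "+" back into a space...
console.log(new URLSearchParams(formEncoded).get('q')); // a + b + c = d
// ...but plain percent-decoding leaves the "+" signs in place.
console.log(decodeURIComponent(formEncoded.slice(2))); // a+++b+++c+=+d

// The %20 form decodes to the original either way.
console.log(new URLSearchParams(percentEncoded).get('q')); // a + b + c = d
console.log(decodeURIComponent(percentEncoded.slice(2))); // a + b + c = d
```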
yes, and putting the code into a small toQueryString method then would also be my favorite solution.
I would not care if it internally uses the encodeURIComponent or new URLSearchParams.
That’s also a valid way to do it, though less convenient. I do appreciate that it’s more robust.
I would argue that a query string deserializer that doesn’t replace plus with space (i.e. preserving unescaped plus signs) has a bug, though. The spec for form data encoding requires it, and it’s been a thing for the decades I’ve been working on the Web.