How to Optimize Common JavaScript Array Patterns

I’m a fan of the mantra, “Make it work, make it right, make it fast.” However, I rarely reach the “make it fast” part, since most of the code I write doesn’t need to be blazingly fast. That’s a sad reality for a software developer who gets satisfaction out of optimizing their code, so whenever there is a performance issue in my code, I get excited: it means I get to scratch that optimization itch. Let’s go over a few common operations on JavaScript arrays and how we can make them blazingly fast.

Notes before reading

  • The goal of this post isn’t to squeeze every ounce of performance out of your code. It’s to provide methods that are easy to read and understand without requiring an in-depth knowledge of data structures and algorithms.
  • Before we jump into optimizing any code, it will be useful for you to read up on maps and sets in JavaScript, since all the examples I’m discussing use them (there’s a quick refresher below).
  • For each operation, we’ll look at three ways of doing it:
    • Native JavaScript array methods (filter, reduce, map, etc.).
    • Lodash.
    • Utilizing maps and sets.
  • Each example will include benchmarks, since optimization is useless unless you measure it.
  • You’ll notice in most of these examples that Lodash is just as fast as, if not faster than, maps and sets. Despite this, I still chose to include the map and set approaches in case you don’t want to add Lodash as a dependency.
  • I’m only showing immutable operations since most of the work I do benefits from immutable operations.

All of the code used to run these benchmarks can be found here.
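If you haven’t worked with maps and sets before, here’s a quick refresher on the handful of Map and Set features these examples rely on (the data here is made up purely for illustration):

// A Set stores unique values; adding a duplicate is a no-op.
const ids = new Set([1, 2, 2, 3]);
ids.has(2); // true
ids.size; // 3 (the duplicate 2 was dropped)

// A Map stores key/value pairs, with unique keys mapping to values.
const usersById = new Map([[1, { id: 1, name: "Ada" }]]);
usersById.get(1); // { id: 1, name: "Ada" }
usersById.has(2); // false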

Removing Duplicates

Native Array Methods

Code

list.filter((item, pos) => {
    return list.indexOf(item) === pos;
});

Benchmarks

Time(ms) Elements in array
0.06370800733566284 10
0.00720900297164917 100
0.24524998664855957 1000
20.85587501525879 10000
2028.1058329939842 100000
202138.53395798802 1000000

Based on the benchmarks, this code is fast enough up until somewhere between 10,000 and 100,000 records, and unacceptable at any point above that. The slowdown is quadratic: filter calls indexOf, an O(n) scan, for every element, so the work grows with the square of the array size. If this code were running in a browser, your website would be frozen for over three minutes at 1,000,000 records.

Lodash (Primitive Values)

Code

_.uniq(list);

Benchmark

Time(ms) Elements in array
0.04329100251197815 10
0.09937500953674316 100
0.060499995946884155 1000
0.49754098057746887 10000
4.50279101729393 100000
46.793334007263184 1000000

Whoa! That’s over 4,000 times faster once you hit 1,000,000 records! Lodash must be doing something right.

Lodash (Non-Primitive Values)

That improvement is great, but it only works with primitive values (strings, numbers, booleans, etc.). What if I want to determine uniqueness based on a property?

Code

_.uniqBy(list, iteratee)
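For example, deduplicating a hypothetical list of user objects by their id could look like either of these (users and id are made up for illustration):

_.uniqBy(users, (user) => user.id);
_.uniqBy(users, "id"); // Lodash's property shorthand does the same thing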

Benchmark

Time(ms) Elements in array
0.13112500309944153 10
0.07079198956489563 100
0.42158299684524536 1000
5.113041996955872 10000
12.49974998831749 100000
73.71970799565315 1000000

It’s a smidge slower but still under 100 milliseconds!

Set (Primitive Values)

Code

[...new Set(list)]

Simple enough? This works because sets only allow unique values.
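For example:

[...new Set([1, 2, 2, 3])]; // [1, 2, 3]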

Benchmark

Time(ms) Elements in array
0.008958995342254639 10
0.005667001008987427 100
0.0382080078125 1000
0.36887499690055847 10000
3.9749999940395355 100000
43.52562499046326 1000000

Just as fast as Lodash! Now you don’t need to install yet another npm package!

Map (Primitive Values)

If we try the above example with non-primitive values, it won’t work as intended: a set compares objects by reference, so two objects with identical contents are still treated as distinct values. This is where we can use a map!

Code

[...new Map(list.map((item) => [extractKey(item), item])).values()];

Maps work similarly to sets in that their keys must be unique. The difference is that each key maps to a value!
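As a sketch of how this looks in practice, extractKey here stands in for a function like (user) => user.id, and users is a made-up list of objects:

const uniqueUsers = [
    ...new Map(users.map((user) => [user.id, user])).values(),
];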

Benchmark

Time(ms) Elements in array
0.026500016450881958 10
0.014999985694885254 100
0.12958300113677979 1000
1.3451250195503235 10000
8.251917004585266 100000
158.00600001215935 1000000

This is about twice as slow as Lodash. However, if you’re trying to avoid unnecessary dependencies, I think it’s still acceptable.

Difference Between Two Lists

Native Array Methods

Code

When I googled “diffing arrays in JavaScript,” the first result that came up on StackOverflow was this:

let difference = arr1.filter(x => !arr2.includes(x));

Let’s benchmark it!

Time(ms) Elements in array
0.01491701602935791 1
0.005333006381988525 10
0.04645800590515137 100
3.2547500133514404 1000
313.62366700172424 10000
31434.29237499833 100000
3210745.023000002 1000000

This is definitely not okay. At 100,000 records it takes over 30 seconds, which is already very slow, and once you hit 1,000,000 it takes almost an hour. Let’s see if the other methods offer any improvement.

Lodash (Primitive Values)

Code

_.difference(arr1, arr2);

Benchmark

Time(ms) Elements in array
0.12604200839996338 1
0.09495800733566284 10
0.26454201340675354 100
1.7619580030441284 1000
11.456708997488022 10000
30.76341700553894 100000
376.1795829832554 1000000

Much better! Even with a million records, we didn’t break a second.

Lodash (Non-Primitive Values)

Code

Again, if we try this method with non-primitive values, it won’t work. Thankfully, Lodash provides a solution.

_.differenceBy(arr1, arr2, iteratee)
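For example, to find the hypothetical users in arr1 that have no matching id in arr2:

_.differenceBy(arr1, arr2, (user) => user.id);
_.differenceBy(arr1, arr2, "id"); // the property shorthand works here too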

Benchmarks

Time(ms) Elements in array
0.24208301305770874 1
0.1150830090045929 10
1.638416975736618 100
1.484584003686905 1000
15.348375022411346 10000
35.60387501120567 100000
590.6338749825954 1000000

It’s about 200 ms slower, but still much faster than using native array methods.

Set (Primitive Values)

Code

const arr2Set = new Set(arr2);
arr1.filter((x) => !arr2Set.has(x));

Benchmark

Time(ms) Elements in array
0.02225002646446228 1
0.008125007152557373 10
0.032958000898361206 100
0.30558401346206665 1000
3.6421670019626617 10000
43.25270900130272 100000
737.2637079954147 1000000

Again, this is slower than Lodash but quicker than native array methods, and the reason is how Set.prototype.has works. A set calculates a hash of each value when storing it and keeps the value under that hash, so checking for membership takes O(1) time, whereas Array.prototype.includes has to scan the array in O(n) time. Pretty cool, right?

Map (Non-Primitive Values)

Code

Now let’s make this work for non-primitive values.

const arr2Map = new Map(arr2.map((x) => [extractKey(x), x]));
arr1.filter((x) => !arr2Map.has(extractKey(x)));
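As a made-up usage example, this filters out the users that appear in a blocked list, matching by id:

const blockedById = new Map(blockedUsers.map((user) => [user.id, user]));
const allowedUsers = users.filter((user) => !blockedById.has(user.id));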

Benchmark

Time(ms) Elements in array
0.04791700839996338 1
0.02158302068710327 10
0.0885000228881836 100
0.517208993434906 1000
4.826333999633789 10000
88.70929199457169 100000
1597.0950419902802 1000000

This is the first optimization that passes the one-second mark. It’s still faster than the native methods, but Lodash is more than twice as fast here, which might make it worth including in your project.

Aggregating Two Lists By A Property

This operation takes two lists that share a common property and returns a list of objects pairing up the matching entries (a small example follows the assumptions below).

Note: For this operation, we are making the following assumptions:

  • Both lists are the same length.
  • There are no entries with duplicate property values in either list.
  • Each entry in each list has a corresponding entry in the other list.
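As a made-up illustration of the input and output, suppose aProperty and bProperty are both id (the key order of the result varies slightly between the methods below):

const listA = [{ id: 1, name: "Ada" }, { id: 2, name: "Grace" }];
const listB = [{ id: 1, role: "admin" }, { id: 2, role: "editor" }];
// Expected result:
// [
//     { b: { id: 1, role: "admin" }, a: { id: 1, name: "Ada" } },
//     { b: { id: 2, role: "editor" }, a: { id: 2, name: "Grace" } },
// ]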

Native JavaScript Methods (map and find)

Code

listB.map((b) => ({
    b: b,
    a: listA.find((a) => a[aProperty] === b[bProperty]),
  }));

Benchmarks

Time(ms) Elements in array
0.021625012159347534 1
0.011750012636184692 10
0.13941702246665955 100
5.005832999944687 1000
208.6930420100689 10000
20707.64387497306 100000
2087215.1352920234 1000000

Using JavaScript’s array methods is unsurprisingly slow again. Seeing a pattern yet?

Native JavaScript Methods (reduce and map)

When doing benchmarks for the map/set version of this operation, I found another more performant way to do this with native array methods.

Code

const listAMapById = listA.reduce((acc, a) => {
    return Object.assign(acc, { [a[aProperty]]: a });
}, {});
listB.map((b) => ({
    b: b,
    a: listAMapById[b[bProperty]],
}));

In this example, we first reduce one of the lists into an object keyed by the shared property, then index into that object to find the match for each entry in the other list.

Benchmarks

Time(ms) Elements in array
0.03525000810623169 1
0.030667006969451904 10
0.15033301711082458 100
1.9047499895095825 1000
7.687875002622604 10000
84.34062498807907 100000
960.1207909882069 1000000

Whoa! Using the native JavaScript array methods wasn’t incredibly slow for once!

Lodash

Code

// Sort both lists by the shared property so matching entries line up by index,
// then let mergeWith pair them up element by element.
_.mergeWith(
    _.sortBy(listA, aProperty),
    _.sortBy(listB, bProperty),
    (a, b) => ({ a, b })
);

I wasn’t able to find a Lodash method that does this out of the box, so I came up with the above. Note that it breaks if any of the assumptions listed earlier aren’t met.

Benchmarks

Time(ms) Elements in array
0.4717079997062683 1
0.24620798230171204 10
0.34333401918411255 100
2.9508340060710907 1000
17.965292006731033 10000
194.1733749806881 100000
6806.113000005484 1000000

In a shocking upset, Lodash doesn’t beat the native JavaScript approach! This is probably because both arrays need to be sorted before they can actually be merged together.

Map

Code

const listAMapByProperty = new Map(listA.map((a) => [a[aProperty], a]));

listB.map((b) => ({
    b,
    a: listAMapByProperty.get(b[bProperty]),
}));

Benchmark

Time(ms) Elements in array
0.02512499690055847 1
0.016208022832870483 10
0.027875006198883057 100
0.23816600441932678 1000
2.2608749866485596 10000
24.74924999475479 100000
576.2636669874191 1000000

Native methods may have beaten Lodash this time, but using a map in this situation seems to be the fastest out of them all.

Takeaways

These are a few things I have taken away from doing these benchmarks.

Using Lodash is the fastest (most of the time).

After I ran these benchmarks, I looked at the source code for the Lodash methods I used. In a lot of cases, it is using sets and maps to get this performance. However, it also has tweaks in place that squeeze out extra performance.

So if performance is important to you and you don’t mind pulling in an npm package, Lodash is probably the way to go when you’re processing data with a lot of entries. However, this isn’t always the case, so be sure to research other methods and run your own benchmarks.

You don’t need Lodash for good performance.

Although Lodash is the fastest, you can get pretty darn close to its speed without it. For more things you can do without Lodash, give this repo a look: You-Dont-Need-Lodash-Underscore.

Maps and sets are awesome!

After running all of these benchmarks, I’m definitely going to start using sets and maps more in my code. Not only are they really fast, but they’re also easy to use and provide a pleasant API to work with.

JavaScript array methods are fast enough for smaller amounts of data.

If you aren’t operating on arrays with more than 10,000 entries, you probably don’t need to optimize. All of the operations I benchmarked took roughly 300ms or less on data sets of that size.
