ACHIEVE NATIVE IOS PERFORMANCE WITH REACT NATIVE

Phase 1: Flux Store Optimizations

Our first idea was to investigate our data flow (it pre-dates Redux, but our architecture is similar) for potential performance bottlenecks. We had many stores handling many actions and suspected that some expensive dispatches were causing the problem. We started by adding basic logging that showed the time spent processing each dispatched action across stores. Next, we limited the logging to dispatches that took longer than 10ms and immediately saw some points of interest.

  • Every action that triggered an update to our user store, which holds the state of all known users, resulted in a 30ms emit.
  • Our message load/cache restore actions, which directly result in processing and rendering our chat messages, took 500ms on an iPhone XS and 1000ms on an iPhone 6s!

While neither of these actions was directly causing stutter or frame drops, they were clearly wasting a lot of CPU and battery.
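The threshold logging described above can be sketched roughly as follows. This is a minimal illustration, not Discord's actual dispatcher: the `Action`, `StoreHandler`, and `timedDispatch` names are hypothetical, and the clock and reporter are injectable only so the sketch is easy to exercise.

```typescript
// Hypothetical shapes for illustration; the real stores and actions are not shown in this post.
type Action = { type: string };
type StoreHandler = { name: string; handle: (action: Action) => void };
type SlowDispatch = { store: string; action: string; ms: number };

const SLOW_DISPATCH_MS = 10; // only report dispatches slower than this

function timedDispatch(
  stores: StoreHandler[],
  action: Action,
  now: () => number = () => Date.now(),
  report: (entry: SlowDispatch) => void = (e) =>
    console.log(`[slow dispatch] ${e.store} <- ${e.action}: ${e.ms}ms`),
): void {
  for (const store of stores) {
    // Time each store's handling of the action separately.
    const start = now();
    store.handle(action);
    const ms = now() - start;
    if (ms > SLOW_DISPATCH_MS) {
      report({ store: store.name, action: action.type, ms });
    }
  }
}
```

Filtering by a threshold keeps the log short enough that the expensive store and action pairs stand out.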

User actions

Using the Chrome profiler, we realized that one of our top-level components was listening to user store updates. This was problematic because, being top level, it rendered many children (many of which were not pure). On a very active account, these costly actions and subsequent renders were happening every second.

By removing the dependency from the top-level component and instead having the child components that directly needed the users listen to the store, we shaved 30ms from those dispatches. Nice!

Message actions

Meanwhile, the message load/cache actions were initially baffling because they had been optimized before and were known to be fast in React. Surely if they were this slow, our desktop users would have noticed. We switched to the profiler and unfortunately found that the actions were only taking 20ms in Chrome, which is not what we were seeing in the simulator or on a device.

What we were dealing with was a difference between V8 (the JavaScript engine in Chrome) and JSC (JavaScriptCore, the engine React Native uses on iOS), which we learned by switching back to basic logging to directly observe the impact on a device. Wrapping the components that listen to these actions in timing code revealed that the bottleneck was in our markdown parsing.

We found that a big cost factor was our escape rule, which was compiling a massive regular expression inline. The fix was easy since the regular expression could be a constant (never compile regexes inline in a tight loop!). This was great because it cut the cost by 30%, and now we were at 350ms on iPhone XS and 700ms on iPhone 6s. That is a 15-30FPS gain on loading messages, and it directly translated into a reduction in TTI (time to interactive), which for us means being able to navigate the UI!
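As a rough sketch of that fix, the two functions below do the same work, but only the first one constructs a new `RegExp` on every call. The pattern is a small stand-in and the function names are hypothetical; the actual escape rule is not shown in this post.

```typescript
// Stand-in pattern: a backslash followed by a punctuation character.
const ESCAPE_SOURCE = "\\\\([^0-9A-Za-z\\s])";

// Anti-pattern: a new RegExp object is built every time this runs.
function unescapeInline(text: string): string {
  return text.replace(new RegExp(ESCAPE_SOURCE, "g"), "$1");
}

// Fix: build the RegExp once at module load and reuse it.
const ESCAPE_REGEX = new RegExp(ESCAPE_SOURCE, "g");

function unescapeHoisted(text: string): string {
  // String.prototype.replace resets lastIndex for global regexes,
  // so sharing one constant across calls is safe here.
  return text.replace(ESCAPE_REGEX, "$1");
}
```

How much this saves depends on the engine and the size of the pattern, so it is worth measuring on a device rather than in Chrome.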

Unfortunately, we were still left with an unacceptable amount of time being spent parsing messages. After further digging, we realized that running our Unicode emoji detection was extremely expensive on JSC but basically free on V8. For the last four years, we had been concatenating hundreds of Unicode symbols into a single regular expression, which apparently is not a good idea. Luckily, someone brave had created a good Unicode emoji range regex that we could use instead. This removed nearly all the remaining cost! We were down to 30ms on iPhone XS and 90ms on iPhone 6s. That was a 90% reduction from where we started, shaved nearly a second from TTI, and generally reduced CPU and battery usage significantly!
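The difference between the two styles of pattern can be illustrated with a tiny example. This is only a sketch under stated assumptions: the four sample symbols and the single Emoticons range (U+1F600 to U+1F64F) are an illustrative subset, not the regex the app adopted, and `containsEmoji` is a hypothetical helper.

```typescript
// Tiny illustrative subset; a production emoji regex covers far more ranges and sequences.
const SAMPLE_EMOJI = ["\u{1F600}", "\u{1F601}", "\u{1F602}", "\u{1F64F}"];

// One alternative per symbol: the pattern grows with every emoji added.
const alternationRegex = new RegExp(SAMPLE_EMOJI.join("|"), "u");

// One character-class range covering the whole Emoticons block.
const rangeRegex = new RegExp("[\\u{1F600}-\\u{1F64F}]", "u");

function containsEmoji(text: string, regex: RegExp): boolean {
  // Safe to reuse: neither regex has the g or y flag, so test() keeps no state.
  return regex.test(text);
}
```

A range-based pattern stays the same size no matter how many symbols fall inside each range, whereas the alternation grows by one branch per symbol.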

This phase netted a lot of big and easy wins! Unfortunately, while all actions were now under a few milliseconds, and the app ran better, our goals were not yet reached. Basic logging was no longer giving actionable information, which meant any remaining CPU cost was coming from React’s commits which run on separate ticks. It was time to figure out how to determine the cost of our component hierarchy.

Phase 2: React Component Optimizations

At first we tried to use the Chrome profiler again, since React 16+ snapshots its commits in the profiler. However, it records way more than just React. We were also not interested in the cost on V8, since we were trying to improve performance on JSC. Luckily, both problems could be solved: late last year the React team released a React Profiler that focuses on render time. Great!

So we fired up the profiler, hit refresh, and then clicked around. A few things quickly jumped out:

  • Our React commits could cost up to 200ms when affecting things at the higher levels of the component tree. On top of this, we were going through up to three commit passes when the app was starting up.
  • When viewing the direct message list, we were looping over the whole list on every render to do sorting (some power users could have up to 1500 channels).
  • We had a lot of common views like <Icon> constantly re-rendering despite having no changes in props or state. While these were cheap to render, they added up.
  • FlatList and SectionList, which powered all of our lists, are very expensive and their virtualization is not yet fully optimized. In large servers, for our channel and member lists, they ended up slowly allocating thousands of views, which then became slow to unmount. Updating our channel list could cost up to 100ms.

Jackpot! There were a lot of actionable items, but not as trivial as the Flux store performance pass.

Commit Passes

Fixing the three commit passes was surprisingly simple. When React Native was released, there was no easy way to get the current dimensions of the screen so we relied on the very first render to trigger an onLayout which we sent to a ScreenStore for others to consume. Luckily, React Native has since improved the Dimensions module by giving it more information and firing events when they change. We made a change so that our stores could initialize with the correct data, avoiding multiple commit passes and even shrinking the view hierarchy. This actually shaved about 150ms off TTI since diffing the tree at the top level was not cheap.

Direct Message Rendering

Fixing the frequent looping over the whole channel list on render was also trivial, and it almost felt silly that we hadn’t fixed this code years ago. We already had a store that maintained a sorted list of the user’s direct messages. This drop-in replacement shaved about 2ms per render, which adds up: when viewing direct messages with a long list of active users, these renders happen quite frequently.
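The idea of sorting once per data change instead of once per render can be sketched like this. The `SortedDMStore` class and the `DirectMessage` shape are hypothetical and only illustrate the pattern; the real store is not shown in this post.

```typescript
// Hypothetical shapes for illustration only.
type DirectMessage = { id: string; lastMessageAt: number };

class SortedDMStore {
  private byId: Record<string, DirectMessage> = {};
  private sorted: DirectMessage[] = [];
  sortCount = 0; // exposed only to make the sketch observable

  // Sorting happens once per data change...
  upsert(dm: DirectMessage): void {
    this.byId[dm.id] = dm;
    this.sorted = Object.keys(this.byId)
      .map((id) => this.byId[id])
      .sort((a, b) => b.lastMessageAt - a.lastMessageAt);
    this.sortCount++;
  }

  // ...so each render just reads the cached array.
  getSorted(): ReadonlyArray<DirectMessage> {
    return this.sorted;
  }
}
```

Because `getSorted` returns the same array until the data changes, a component reading it also gets a stable reference between renders.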

Pure Component

A lot of engineers at Discord used to default to using PureComponent since it avoids renders if props don’t change. Unfortunately, it isn’t that simple.

When creating a component, one should consider how it will be used and whether the props will rarely change, or whether the component should just be a pass-through and give the rendering decision to its children.

Another thing that was not always clear to our engineers is that style and children are props just like any other. If you pass a style as an array or create style objects dynamically, you are breaking PureComponent, and the same goes for children. Unless explicitly controlled, children are new elements on every render, so they never pass an equality check.

With the above in mind, we did a pass over our most commonly used components, removing unnecessary instances of Pure and minimizing our use of dynamic styles. All of this work resulted in about 30ms being shaved from rendering when doing common actions like switching channels.
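The point about style props can be illustrated without React at all. The `shallowEqual` below is a simplified stand-in for the kind of shallow prop comparison PureComponent performs (it uses `!==` rather than `Object.is`), and the prop builders are hypothetical.

```typescript
type Props = Record<string, unknown>;

// Simplified shallow comparison: same keys, and each value identical by reference.
function shallowEqual(a: Props, b: Props): boolean {
  const aKeys = Object.keys(a);
  const bKeys = Object.keys(b);
  if (aKeys.length !== bKeys.length) return false;
  for (const key of aKeys) {
    if (!Object.prototype.hasOwnProperty.call(b, key) || a[key] !== b[key]) {
      return false;
    }
  }
  return true;
}

const base = { padding: 4 };
const active = { opacity: 1 };

// A new array is created on every call, so the style prop never compares equal.
const dynamicProps = (): Props => ({ style: [base, active] });

// The array is created once, so the style prop is reference-equal across renders.
const STABLE_STYLE = [base, active];
const stableProps = (): Props => ({ style: STABLE_STYLE });
```

Comparing two results of `dynamicProps()` fails the check even though the styles are identical, while two results of `stableProps()` pass it.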

Fast List

Lists have always been a headache in React Native. We’ve actually implemented our core chat view natively because lists don’t perform well for many dynamic rows. We always hoped and assumed that React Native lists were good enough for simpler use cases like our channel and member lists. Unfortunately, despite the React Native team rewriting lists and promising better performance, memory usage and virtualization of the built-in FlatList and SectionList did not meet our expectations.

One of the reasons that built-in lists have performance challenges is that they support a lot of features and do aggressive pre-rendering to combat the fact that the underlying <ScrollView> renders on the main thread and JavaScript has to feed it more content as you scroll. What this means is that after a <FlatList> mounts, it eventually renders all of its rows, slowly, frame by frame. Using the Perf Monitor, we could see that large Discord servers mounted nearly 2000 views.

Since this is a known pain point in the React Native community, there are already solutions to this problem. We first spent some time testing react-native-large-list, which was immediately a significant improvement: it brought rendering down to 60ms and only ever rendered about 400 views in large Discord servers. It was not the best to work with because it required restructuring data that was already computed in stores for both iOS and Desktop, but it was a trade-off that we could deal with. Unfortunately, when using the library for much larger lists (e.g. 100,000 rows) like channel members, it would lock up the CPU.

Dramatic representation of a recycler view with a pool of reuse-able views.

Next we found recyclerlistview, which actually worked even better! It had similar results in the number of views it mounted, and it was able to keep a 60 FPS fill rate on scroll. Better still, it brought rendering down to 30ms for server switching and 10ms for channel switching! Unfortunately, despite trying to trick it via many methods, it suffered from one-frame blanks on mount and weird scroll positions when trying to scroll to an item on mount, and its sticky headers implementation conflicted with Animated.

At this point, we were out of open source solutions. So we either had to settle on recyclerlistview or come up with our own solution.

At first, we felt that maybe doing it purely in JavaScript was futile. We spent some time trying to glue together UITableView with React Native, and while we made meaningful progress, it started feeling overly complicated. After stepping back and thinking about what else we could do, it hit us. We had already solved this problem once before on the web! We already had an internal List component that virtualizes its children. There is no way we could just drop it into React Native, right?

We replaced <Scroller> with <ScrollView> and <div> with <View> and dropped it in. It worked! Right out of the box it performed almost as well as recyclerlistview and was able to use nearly the same rendering code as the desktop app. We then spent some additional time adding sticky header support and using techniques similar to those employed by recyclerlistview to recycle views, avoiding allocations on both the React and UIKit sides.
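The core of list virtualization is deciding which rows need to exist for the current scroll position. The sketch below shows that calculation under a simplifying assumption of fixed-height rows; the post does not say how the internal List component measures rows, and `visibleRange` and its `overscan` parameter are hypothetical names.

```typescript
type RowRange = { start: number; end: number }; // inclusive indices; end < start means empty

function visibleRange(
  scrollOffset: number,
  viewportHeight: number,
  rowHeight: number,
  rowCount: number,
  overscan: number = 2, // extra rows kept mounted above and below the viewport
): RowRange {
  if (rowCount <= 0 || rowHeight <= 0) return { start: 0, end: -1 };
  // First and last rows that intersect the viewport.
  const first = Math.floor(scrollOffset / rowHeight);
  const last = Math.ceil((scrollOffset + viewportHeight) / rowHeight) - 1;
  return {
    start: Math.max(0, first - overscan),
    end: Math.min(rowCount - 1, last + overscan),
  };
}
```

For example, with 100,000 rows of 50 points each, a 600 point viewport, and a scroll offset of 5000, this yields rows 98 through 113, so only 16 rows need to be mounted regardless of the list's total length.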

Before and after using our new <FastList> component on an iPhone 6s and an iPhone XS. It yielded significantly less blanking and stuttering on large lists:

The result was a new component we called <FastList>. The team intends to merge these together for cross-platform use and open source it for the community. With that we removed a lot of memory allocations and were down another 70-90ms of render time. We could now scroll the channel members list as fast as we wanted and it kept up admirably.

This phase took a lot more work since it required writing some new libraries. On net, it resulted in over 100ms of render time savings in common app usage patterns. The app was now performing great during active usage!

Phase 3: Main Thread Optimizations

So now that Flux stores were performing well, and components were not spending a lot of time rendering — there remained a few big mysteries:

  • When opening the emoji picker for the first time, the UI froze for up to 2s on an iPhone XS and this didn’t show up anywhere in JavaScript profiling.
  • When coming back from backgrounding the app, switching between a direct message and a server for the first time froze the UI for up to 2s on an iPhone XS.

If that was all happening just after an initial boot it could have been explained away with resources needing to be loaded into memory, but it was also happening after backgrounding… what could it be?

Emoji Picker

We suspected the emoji picker itself was the source of the UI lockup, but we knew that if it was something purely related to resource loading we only would have seen the issue once per app boot.

So we started adding logging to every ancestor component of our emoji picker and found that our chat input component was remounting when switching between private messages and servers! Oddly, the remount didn’t seem expensive after the first time it remounted. More importantly, these components were never supposed to remount by design, so why were they? It turns out they were wrapped in a different root component depending on context, and changing a component type in the tree is just like setting a different key. Merging the two components reduced the 2s lockup down to 500ms, which was a win but not yet fast enough.

Image Loading

While the component was now loading much faster, there was still the mystery of why it remained relatively slow. What was even weirder was that we could not reproduce this on a development build, only on a production build. What was different between the two? One thing we saw was that all app-bundled images were loaded via development servers in dev mode but stored in memory caches for production.

Before diving deeper, we tried a simple test. We deleted all the images from disk and launched the app. It was fast, which meant that something about the image loading was indeed the root cause. So we took console.log’s long-lost cousin NSLog and decorated all the local file loading code for RCTImage with call times. RCTImage loads all local files out of band using [UIImage imageNamed] on the main thread to avoid flickering. Unfortunately, each of those calls was taking 50ms after launching the app or after backgrounding!

[UIImage imageNamed] is the de facto method for loading images on iOS, so why was it slow? It turns out React Native was passing it absolute paths instead of referencing named images in the bundle. So we deleted all the images again, added a few of them to the bundle, and tried again. Now the exact same images were loading in 0.1ms and the app was fast. Unfortunately, this method means we would have to explicitly add all images to an Xcode project, and we couldn’t deploy them over the air.

So we tried another crazy thing. RCTImage has a fallback to read bytes from disk and pass them into UIImage if [UIImage imageNamed] for some reason does not work. So we commented out the [UIImage imageNamed] calls and let it fall through. To our surprise, without any caching, constantly reading these images from disk and decoding them was only taking 0.3ms! We were shocked by the result and this could only mean there is some kind of bug in UIKit when handling memory cache misses for files that actually exist on disk.

Luckily, React Native allows you to create custom image loaders. So we created a native module, LocalAssetImageLoader, which reads files directly from disk, decodes them, and manages its own memory cache. This bypassed the bottleneck we were seeing and gave us blazing fast image loading.

Battery Heat

Finally, we were still noticing that using the app for a bit, even lightly, could result in the battery heating up. While looking at our idle CPU usage, we noticed that our JavaScript thread was now mostly idle at 0–5% CPU of one core (down from up to 30%), but for whatever reason the main UI thread would sit at 10% CPU when looking at a server. After slowly disabling pieces of the UI until the usage went away, we found the culprit was the right drawer. Even while it was closed, it always had a spinning animation running in the background, chewing away at CPU. Death by the smallest cut!

This phase was great because UI lockups were gone, the app was running at 60 FPS, and loading views with a lot of images was fast again. But better yet, since we were previously spending 50ms per image load on init, the changes immensely helped TTI as well — shaving nearly 2s. However, the additional FPS boost now made some degraded gesture interactions fairly obvious — so we kept going.

Phase 4: Perceived Performance Optimizations

Even when everything is running at 60 FPS things can still feel slow, and that’s where perceived performance comes into play. To tackle this, we chose to focus on one of the app’s core experiences: navigation.

Navigation drawers

One of the primary interactions with Discord involves the left and right drawers. The component that powers them was created the week that React Native was released in 2015, and still suffered from the shortcomings of those times.

Drawers are a very gesture-based interaction, and when that interaction requires logic to jump between the main thread and the JavaScript thread, the drawers do not perfectly follow a swipe or finger drag. Because of the dependency between the threads, the interaction can also briefly pause if the JavaScript thread is busy.

The other problem with the drawers was that they depended on React Native’s JS responder system, which works well in most cases but not when gestures and scroll views are both involved. This could create gesture conflicts, such as trying to open the drawer but having the chat view scroll instead. Our initial reaction was to just create very nice drawers using native UIViews and then expose them to React Native to fill with content. However, we decided to browse around for better animation and gesture modules to see if there was a solution for this.

Luckily, we found both from the same author, who has clearly been working to address this problem. We found react-native-gesture-handler, which exposes UIKit’s responder system to React Native, and with it references to react-native-reanimated. What makes them great together is that you can connect them to tie gestures to animations, all on the main thread, while still writing all the code in JavaScript. react-native-reanimated basically exposes an animation logic graph, not that different from the way you compose React components, and is able to send it to the main thread, run it in a loop, and even reference values from other objects on the main thread.

Using this, we were able to tie the <PanGestureHandler> to an animation of how a drawer works and get a drawer that feels buttery smooth, like in any other native app. Best of all, it was done with a lot less code.

There is more to do with perceived performance and hopefully we as a team can continue to make it better. However, through this work we now had an app that uses way less CPU and battery while running at 60 FPS with buttery smooth interactions!

Phase 5: RAM Bundles

Finally, we decided to apply one last optimization in the hope of tying everything together into a final TTI drop. This was something we had had our eyes on for a while: RAM bundles.

The idea behind RAM bundles is simple: instead of loading one gigantic JavaScript blob, why not break it down into modules that can be lazily loaded as needed? Our build system was already using Haul, and conveniently a plugin already existed to enable them. We got it integrated, but unfortunately we learned that our codebase was not all that modular.

Despite this, with some minimal changes it was still able to defer loading some 100 of our 2600 modules, which yielded an 800ms drop in TTI on an iPhone 6.

Going beyond that, we rolled up our sleeves, inspected the app entry point as a whole, and were able to drastically speed up TTI by deferring or omitting several key parts of the code base:

  • Updated our modal, action sheet, and alert APIs to require dynamic imports, thus excluding them from the initial load.
  • Stubbed out several key desktop-only features to avoid loading unused React code (such as the Game Store, :c) on iOS.
  • Always imported lodash as a whole, since we found that importing submodules actually costs more because duplicate code is loaded.
  • Dynamically imported many of our core native modules to avoid loading this code until the feature is actually used.
  • Similarly, employed dynamic imports (inline require) for many of our common feature components to achieve lazy loading.
  • Dynamically imported our entire localization code and the associated moment locale code so that it was deferred until the app had loaded.
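The lazy loading pattern behind several of these items can be sketched with a small helper that defers work until first use. This is only an illustration: `lazy` and `getActionSheet` are hypothetical names, and the loader here is a stand-in object, where an app would use an inline require such as `() => require("./ActionSheet")` (a hypothetical path).

```typescript
// Returns a getter that runs `load` on first use and caches the result.
function lazy<T>(load: () => T): () => T {
  let loaded = false;
  let value: T | undefined;
  return () => {
    if (!loaded) {
      value = load();
      loaded = true;
    }
    return value as T;
  };
}

// Stand-in for an expensive module; nothing is evaluated at startup.
let loadCount = 0;
const getActionSheet = lazy(() => {
  loadCount++;
  return { show: (title: string) => `showing ${title}` };
});
```

Nothing inside the loader runs until `getActionSheet()` is first called, which is what keeps such code out of the startup path.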

After all this, we saw a tremendous speed-up in the app’s initial boot times in addition to the gains we had already achieved. On our benchmark iPhone 6 device, we saw an average loading time decrease of 3500ms. On an iPhone XS, the average time to load was an additional 700ms faster as well!

This final step tied everything together for us — now the app was launching quickly, interacting fluidly, and using much less CPU. Most importantly, the app just felt so much more usable on our iPhone 6 device. It was at this point that we felt our goals had been achieved.

The Results

Overall performance was greatly improved — the UI freezes and stutter that can be seen on the left have been all but eliminated.

When we stepped back from the project we were extremely happy with the final outcome. While we started by just trying to improve the FPS of the app, ultimately our exploration resulted in:

  • Fluid gesture interactions for our core navigational system.
  • A fairly consistent 60 FPS across our supported devices and a very noticeable reduction in battery consumption.
  • A much better development experience since even the app under development mode runs much better than the production app before these changes.
  • An average of two seconds shaved off the initial load time. This was primarily achieved through our efforts enabling RAM bundles and optimizing our code paths to leverage it.

What’s more, this investigation and mitigation was completed within a span of a few weeks. Although there are real pain points and challenges to using React Native, the overall gains significantly outweigh the costs, which motivates us to keep investing in the platform.
