Monorepos and microfrontends — going together like pineapples on pizza?
In my last blog post, I described a fairly arduous process we followed here at HMH to consolidate two of our UI repositories for our Ed platform into one monorepo. Any engineer worth their salt knows that you shouldn’t do any development without knowing why you are doing it, and looking back on it now with the fresh eyes of experience and a few extra grey hairs, I’ve realised that I never really took the time to explain to a wider audience our motivation behind that piece of work. Thankfully, my colleague Cathal Geoghegan stepped in to describe the nitty gritty of our tooling and the results in his blogpost in October 2020, and I’d like to build on that with my own take on how it’s going.
What’s a monorepo and why do we need it?
Like any good engineer, I love an auld Google’n’copy’n’paste — my first result for “what is a monorepo” gives this article by Perforce with this handy summary:
“A monorepo (mono repository) is a single repository that stores all of your code and assets for every project.”
Eagle-eyed readers out there might remember that my colleague Clíona de Róiste wrote a blogpost a couple of years ago about how Ed is following the strangler app pattern. To summarise, because we needed to update our tech stack from Angular 1.x (no longer supported and not very well known anymore by engineers out there in the wild), we started to disassemble Ed and rebuild it in React piece by piece (at the same time as adding new features to keep our customers and product team happy!). This resulted in us having two code repositories, one for the Angular views and one for the React apps that were imported into those views. More often than we expected (of course), this approach required two pull requests to be merged at the same time before a build could be kicked off to bring a new feature to production.
You can imagine the headache that caused — lots of manual coordination, and a terrible onboarding experience for new engineers. I should know — I was one of them! While it helped us solve our tech stack upgrade problems, sped up our deployments, and brought us incrementally further towards more isolated apps, it made the development experience kind of, well, terrible.
So in order to avoid having PRs in multiple places, and promote greater code reusability across our growing UI teams, we decided to implement a monorepo.
Is a monorepo the same as a monolith?
According to another excellent post detailing 11 tools to build a monorepo in 2021:
“Note that a monorepo is NOT a monolith application(!) — it is not built or deployed all at once. It is a group of applications developed separately.”
Even after implementing the strangler app pattern to start breaking Ed out into more manageable pieces, our Ed product was definitely a monolith. It was one big bang application with one big bang deployment, and no matter what pages or areas of functionality you changed in the application, all of the tests for every area ran on every deployment.
And you know what? That wasn’t necessarily a bad thing!
Our UI team used to be much smaller, Ed was just getting started as a product, and having all our features colocated with the same frameworks probably helped the team get it off the ground much more quickly way back when Angular 1.x was the go-to. And with the amount of code coupling that crept in, we found that our deployments were much safer and introduced fewer bugs when we ran all the automated end-to-end tests.
Fast forward several years, and here we are trying to tackle this problem of growth. Ed has grown into a massive platform with a huge range of features and product areas, and our UI engineering teams (not to mention our services and analytics teams!) have multiplied to keep up with the demand from our customers and product management alike. As we scaled in people, we needed to find a way to scale our tech stack so that we didn’t have teams tripping over each other.
So, even though we implemented a monorepo to help us share code and streamline the development experience, it didn’t really solve our monolith problem at all.
What is microfrontend architecture?
In researching this article I’ve been drowning myself in podcasts and all sorts of resources to answer this question. This is an awesome article about microfrontends that explains it much better than I could. In essence, it’s just like the microservices pattern that you might see on the backend, but for front-end development — an architectural mindset you can use to separate your concerns and enable scalability as your UI teams grow.
When you follow this pattern, you have independently deployable apps, owned by disparate teams, that together make up the whole product the end user sees. Ideally you use a feature- or business-orientated mindset to draw the boundaries between apps in your overall application. In the example of Ed, we had inadvertently already started doing this when we were porting parts of our application to React: we had separate React apps for almost every page or set of related pages in Ed, like student assignments and teacher reports.
A really neat podcast about this topic from ThoughtWorks Technology describes this concept of MFE and how it can be used to address all the operational problems we were having in the Ed UI team:
“If you’re not trying to scale an organization to a hundred people working on the same product, if you’re finding that you don’t really have problems with lots of coupling in your monolith code base, and if you’re not trying to move from one tech stack to another, there’s probably not that much benefit in doing it [microfrontend architecture].”
Bearing that in mind, let’s break those problems down a bit specifically for HMH:
- Team autonomy: We have nearly 13 (!) teams working on our Ed UI platform right now. That’s a lot of coordination when we’re all trying to get our features into production. We also need our teams to be able to work independently (as we’re across different timezones too). We also don’t have full stack teams in most cases, which you could argue doesn’t lend itself to true team autonomy either — but that’s probably a whole discussion in itself!
- Code coupling: Remember how I said changes in one app required all tests to be run in the pipeline? This is still a major issue for us, especially back when we still had quite a few shared Angular services providing data and sharing functions between our apps.
- Removing Angular 1.x: Like I alluded to earlier, our new recruits to HMH engineering don’t have much knowledge of the syntax of this flavour of Angular, and it is only in long-term support until December 2021, after which it reaches end of life. So it’s fairly urgent for us to tackle this one.
As always, the caveat is that this is a pattern and not a magic quick fix. But still, sounds pretty promising!
Thinking about MFE, we were already building isolated React apps each representing a single page or view or isolated area of functionality — designing it the way our users think of it. And we’ve recently gotten started with implementing Single SPA on Ed, one of the frameworks the microfrontend guru Luca Mezzalira recommends in the “What are Micro-Frontends?” Smashing Podcast.
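To make the idea of page-sized apps stitched into one product a bit more concrete, here is a minimal TypeScript sketch of route-based composition: a shell keeps a registry of microfrontends and works out which ones should be mounted for a given URL path. It is a toy model in the spirit of what frameworks like Single SPA do, not Single SPA's actual API, and every name in it (MicroFrontend, activeApps, the app names and paths) is made up for illustration.

```typescript
// Toy model of a shell that composes microfrontends by route.
// Hypothetical names throughout; this is NOT the Single SPA API.
interface MicroFrontend {
  name: string;       // the app owned by one team, e.g. one per page or set of pages
  pathPrefix: string; // URL prefix under which the app should be mounted
}

const registry: MicroFrontend[] = [
  { name: "student-assignments", pathPrefix: "/assignments" },
  { name: "teacher-reports", pathPrefix: "/reports" },
  { name: "login", pathPrefix: "/login" },
];

// Returns the names of the apps that should be mounted for a given path.
function activeApps(path: string, apps: MicroFrontend[]): string[] {
  return apps
    .filter(
      (app) =>
        path === app.pathPrefix || path.indexOf(app.pathPrefix + "/") === 0
    )
    .map((app) => app.name);
}

console.log(activeApps("/reports/class-42", registry)); // ["teacher-reports"]
console.log(activeApps("/unknown", registry));          // []
```

The point of the sketch is the boundary: the shell only knows names and routes, so each app behind a prefix could in principle be built and deployed by its own team.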
Right now, we’re somewhere in between a monolithic and microfrontend architecture for Ed. For example, we are currently working on a separately deployable login app for our platform, and there are some 3rd party products that appear to the user to be part of Ed, even though they are deployed separately. But for the bulk of Ed development across our teams, we have one build pipeline. Which isn’t exactly conducive to full autonomy across teams and features yet, but we’re getting there.
Alright, let’s talk about the pineapples on the pizza — how does a monorepo relate to microfrontend architecture?
The beauty of software engineering is that there is no one way to solve any problem. I like to think of using monorepos for microfrontends like putting pineapples on pizza — it’s not right or wrong. You just need to know if your guests will eat it or not, and if not, then it ain’t the right pizza for the party.
In this analogy, “will the guests eat it” is kind of like asking “does the way we use the monorepo enable us to have independent microfrontends”? Lots of people out there seem to think monorepos do enable microfrontend architecture, but only if you do it the right way, and after spending the past 18 months working in one, I tend to agree.
As I mentioned earlier, there are lots of tools out there to help you manage your monorepo and different applications inside it. Currently at HMH we use yarn, lerna, and we recently added depcheck in our continuous integration jobs on all our pull requests, to help us with package management. The idea of a monorepo is to have multiple discrete applications living inside it, so by packaging up our apps into workspaces so that they look like separately published NPM packages, we’re already on our way to a microfrontend architecture.
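As a rough illustration of what "workspaces that look like separately published NPM packages" means, the TypeScript sketch below models two workspace manifests and splits an app's dependencies into those that live inside the monorepo and those that come from the registry. The package names and versions are hypothetical, and the matching logic is a deliberate simplification, not how yarn or lerna actually resolve workspaces.

```typescript
// Simplified model of workspace manifests (a small subset of package.json).
// Package names and versions are hypothetical.
interface Manifest {
  name: string;
  version: string;
  dependencies: { [pkg: string]: string };
}

const workspaces: Manifest[] = [
  {
    name: "@example/design-system",
    version: "1.0.0",
    dependencies: { react: "^17.0.0" },
  },
  {
    name: "@example/teacher-reports",
    version: "1.0.0",
    dependencies: { "@example/design-system": "^1.0.0", react: "^17.0.0" },
  },
];

// Splits a package's dependencies into workspace-local and external ones.
// Simplification: matches on name only; real tools also compare version ranges.
function splitDependencies(
  pkg: Manifest,
  all: Manifest[]
): { local: string[]; external: string[] } {
  const localNames = all.map((m) => m.name);
  const local: string[] = [];
  const external: string[] = [];
  for (const dep of Object.keys(pkg.dependencies)) {
    if (localNames.indexOf(dep) !== -1) {
      local.push(dep);
    } else {
      external.push(dep);
    }
  }
  return { local, external };
}

console.log(splitDependencies(workspaces[1], workspaces));
// { local: ["@example/design-system"], external: ["react"] }
```

Because the app declares the shared package in its manifest like any other dependency, the same declaration would still make sense if that package were one day published and consumed from outside the monorepo.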
We also use the monorepo to share code. We have a fairly well-documented design system (based on Material UI) that our design team use, and a shared library of reusable UI components, each as their own package in the monorepo. This goes a long way in making sure all of our separate UI teams working across Ed are doing the same thing and giving a consistent user experience on the platform.
Our biggest challenges so far
Honestly, I think our challenges were mostly to do with jumping into the monorepo concept without doing our due diligence on what that would mean. We’re not Facebook or Google, we don’t have a team dedicated specifically to tooling and infrastructure for the monorepo (yet!). And the more I dig into these concepts, the more I find experienced engineers out there emphasising the importance of managing your processes and tools when working in a monorepo.
- Merge queue: Now that everyone is building and merging to the same codebase, we had to implement a “Merge Queue” process during busy times to ensure each PR had the latest code from master, and so that we could group PRs together in our deployments. A little bit like this example that uses GitHub Actions for auto-merging, we implemented Bulldozer for auto-merging and brought in a fairly strict communication process around PRs that all of our teams follow. A merge queue isn’t unheard of with monorepos, but it can be a real frustration for engineers who have never worked in such a restrictive environment before.
- CI times: Our continuous integration times blew up the bigger our monorepo got, and in fairness, a lot of that was due to our own lack of proper architectural patterns and workspace guidelines. At the same time as the monorepo, we also started implementing heavier app-level integration tests with React Testing Library to slim down our e2e automation suites [1]. We had to do some serious experimentation with our jest threads and all sorts of Jenkins resourcing to get our PRs down to a somewhat manageable merge timeframe.
- Dependency management: Oh, where to begin with this one. I think I’ll park this one for now to give its own section, so I can describe more fully where we’re at.
I think the bulk of our problems with the monorepo and our attempt at a microfrontend pattern can be described by some quotes from Cam Jackson of the ThoughtWorks podcast I mentioned earlier:
“I think what people struggle with sometimes is balancing this kind of a spectrum of centrally controlling and standardizing everything versus complete chaos where nobody talks to each other at all…”
“At the other end what you sometimes see in I guess more corporate places is the desire to over standardize and over centralize. And so everybody has to do everything exactly the same way. We all have to use exactly the same versions of every single dependency. All of our build pipelines have to look the same and all of a sudden, well, you may as well have just built a monolith...”
The biggest culprit — dependency management
While it’s pretty easy to divide up applications and microfrontends, it’s actually pretty hard to then divide those up into packages and workspaces. It’s closely related to one of the two so-called hard things in software engineering — naming things.
What we found was that bigger generic packages caused too many independent apps to be inadvertently dependent on each other, which made the CI runtimes in our pull requests very long.
Here’s an example of what was happening:
- When we first started out, we stuck all of our reusable React components in a generic `react-components` package.
- All of our apps started using these components, so they imported things like `import SortableTable from 'react-components'`, with `react-components` as a dependency in the package.json.
- Pretty soon, nearly all of our apps were depending on `react-components` for some component or another.
- Now, imagine a PR came in to change one of the components in `react-components`, like that `SortableTable` component.
- Changes in a PR make `lerna` search for packages depending on the packages changed in the PR (so in this case, `react-components`) and run the tests for those dependent packages.
- This meant that, even if you were only implementing changes for a component used in one app, all of the tests were running for completely independent apps in your PR.
- Resulting in a complete waste of time and a lot of frustration!
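The chain of events above boils down to a reverse-dependency lookup. The TypeScript sketch below is a simplified model of that idea and not lerna's actual implementation: given each package's dependencies, it finds every package that directly or transitively depends on a changed one. The app names are invented; react-components is the shared package from the example.

```typescript
// Map from package name to the workspace packages it depends on.
// App names are hypothetical; "react-components" is the shared package above.
type DependencyGraph = { [pkg: string]: string[] };

const graph: DependencyGraph = {
  "react-components": [],
  "student-assignments": ["react-components"],
  "teacher-reports": ["react-components"],
  "login": ["react-components"],
  "standalone-tool": [],
};

// Returns every package that depends, directly or transitively, on `changed`.
function affectedBy(changed: string, deps: DependencyGraph): string[] {
  const affected: string[] = [];
  const queue: string[] = [changed];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const pkg of Object.keys(deps)) {
      const dependsOnCurrent = deps[pkg].indexOf(current) !== -1;
      if (dependsOnCurrent && affected.indexOf(pkg) === -1) {
        affected.push(pkg);
        queue.push(pkg); // anything depending on this package is affected too
      }
    }
  }
  return affected;
}

// A small change to SortableTable inside react-components...
console.log(affectedBy("react-components", graph));
// ...selects the test suites of all three apps that import anything from it.
```

Note that the lookup works at package granularity: it has no way of knowing that only one of those apps actually renders the component that changed.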
To solve this problem, we had to get a crack team of engineers pulled off their current projects for a whole sprint to identify these interdependencies, set the pattern for how to break them up, and to implement jest changes to speed up and stabilise the CI in our PRs. An expensive mistake, and not one that we want to repeat!
So how do we address this?
It kind of comes back to our old school S.O.L.I.D. programming principles.
The most important one to remember in the context of a package in a workspace is the single responsibility principle: “Each software module has one, and only one, reason to change”. Using our example above, `react-components` had way too many reasons to change, because it contained all sorts of different components.
Whereas now, we separate components into their own packages under the `packages/components` workspace, like `sortable-table`, which is depended on by far fewer packages: only the ones that actually use the `SortableTable` component.
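To see why the split helps, here is a small TypeScript comparison under assumed data: the same three hypothetical apps, first importing everything from react-components, then depending only on the granular packages they use. Only sortable-table comes from the text above; date-picker and modal are invented component packages, and appsToTest is an illustrative helper rather than anything from our tooling.

```typescript
// Which packages each app lists as dependencies, before and after the split.
// App names and the date-picker/modal packages are hypothetical.
type AppDependencies = { [app: string]: string[] };

const before: AppDependencies = {
  "student-assignments": ["react-components"],
  "teacher-reports": ["react-components"],
  "login": ["react-components"],
};

const after: AppDependencies = {
  "student-assignments": ["date-picker", "modal"],
  "teacher-reports": ["sortable-table", "date-picker"],
  "login": ["modal"],
};

// Apps whose tests must run when `changedPackage` changes.
function appsToTest(changedPackage: string, apps: AppDependencies): string[] {
  return Object.keys(apps).filter(
    (app) => apps[app].indexOf(changedPackage) !== -1
  );
}

console.log(appsToTest("react-components", before)); // all three apps
console.log(appsToTest("sortable-table", after));    // ["teacher-reports"]
```

The same change to SortableTable goes from triggering every app's tests to triggering one, which is the single-responsibility idea expressed in CI minutes.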
Now to be candid, we still have a problem of too much coupling between our apps. If you really want to have discrete microfrontends that are independently deployable, should you be sharing components like `SortableTable` at all? Or should you let each vertically-sliced page or application use its own version of a component? That’s still a challenge that we are grappling with.
Sounds like an awful lot of packages to maintain — how does an engineer find what they’re looking for?
This is something we are struggling with too (lots of struggles!). We have recognised that as our packages become more numerous and granular, we need to find a clever way to surface their documentation so that engineers can easily find existing functionality when developing new features.
While we don’t have anything more than naming conventions and lots of different workspace types at the moment, there are a few solutions out there like npm-gui that might work for us down the line.
Is there anything else that we’re missing?
There’s one thing we haven’t quite managed to take advantage of in the microfrontends architecture (as my colleague Aleksandra Lorek astutely pointed out when reading the draft of this article), and that’s experimentation: the ability to use different tech, and to mix and match frameworks in one product.
In a few years’ time there will probably be another big effort to migrate away from React. While we have separate apps and a decent pattern for microfrontends, are we sure the way that we have “sliced” our application will allow us to have multiple frameworks or even multiple versions of React on one page without using iframes?
And of course, one of the main pain points we haven’t taken the opportunity to solve (even though we have all the tools in place) is having separate deployments. We can’t really support the idea of autonomous teams when everyone is waiting to hop on the same train to production.
So… pineapples on pizza?
All the problems we’re incrementally addressing at HMH with our Ed UI are just that — only incrementally solvable. There is no “silver bullet” in technology. Right now we’re doing a monorepo, with a view on microfrontend architecture, but tomorrow, we might find other problems that lead us in a different direction. It’s kind of funny actually — as we move towards a proper microfrontend architecture of independent applications, we might even get to a place where we will want to disassemble the monorepo in favour of a multirepo product!
I guess all that’s left to say is, never a dull day in software! Now go have your pizza — you’ve earned it.
P.S. If you’re interested in trying out the monorepo concept for yourself, this is an awesome guide to get you started using Git, NPM, and yarn. As for microfrontends, this guide using React for cats and dogs apps is both cute and very informative!
[1] I threw this in here almost offhand, but I’ve definitely got an upcoming blog post about how we’ve been trying to slim down our automation suites to only cover high level user flows by using app-level integration testing with React Testing Library and Mock Service Worker. It’s super exciting and one of my proudest contributions to HMH over the past year or two, but as it doesn’t have much to do with the monorepo or microfrontends, I’ve reluctantly left it out to describe another day. Stay tuned for that!
Got some ideas on microfrontends or monorepos that you’re dying to put into practice? Come join us!