Scaling the data storage for B2B Software-as-a-Service (SaaS) applications typically falls under two schools of thought: one or more fully-contained accounts per database or specialized multi-tenant sub-systems that handle certain types of data. There are pros and cons to each approach.
B2B apps are often simpler than B2C apps when it comes to data storage. In a B2B app, one account/user doesn’t need to know about another, whereas in a B2C app accounts/users have to have a global context for all other accounts/users (think Facebook Friend requests). Fully-contained accounts can be on individual databases or multi-tenant such that one or more accounts are on the same database delineated by an account ID.
One analogy I like to use is housing for people. A fully-contained account on a private database is a like a single family home. Multiple fully-contained accounts on one database is like one building in an apartment complex where everyone gets their own space but there are greater economies of scale having everyone together.
Challenges for fully-contained accounts in a multi-tenant environment arise when the size of the accounts start changing and growing at different rates. Back to the apartment complex example, one family wants a home gym and decides they want a new bedroom for it, only all the other three bedroom apartments in the complex are full. What does the family do? Well if the family can afford it the tendency is to go to a single family home that is suited uniquely to them.
A single family home is much more expensive to maintain, on a per family basis, compared to a building in an apartment complex. Services like landscaping, security, and amenities like a gym or pool are much more affordable when spread out over dozens or hundreds of people. A single family home can have all those accoutrements but the cost structure is no where near the same.
Now, assume some accounts in the multi-tenant fully-contained database approach grow so large that they need their own dedicated database. With the housing analogy, larger families start moving to single family homes from the building in an apartment complex. Assume the apartment complex didn’t have a pool — the app didn’t have that feature yet. Now, the startup decides a new feature is needed — the equivalent of a pool — and adds it to the application. The pool is manageable in the apartment complex that houses 100 families with the cost of chemical treatments, cleaning, etc spread out over a large number of users. For the single family homes that were created by the accounts that got too large, each now has a pool by the nature of SaaS applications having a single code base for all customers, and those pools all need to be managed. Going from pool to pool, even with automated tools and help, becomes less efficient and more costly to do chemical treatments, cleaning, etc.
Specialized storage and application sub-systems are designed to solve this problem. Think of a simple survey application. The info for the accounts, users, billing, survey questions, and more are pretty simple and not storage intensive. What is storage intensive is keeping track of all the survey responses and survey respondents. That information is going to grow exponentially compared to the core account data. It deserves a dedicated sub-system.
Now, back to the housing analogy. Instead of families desiring a home gym moving to single family homes, the apartment complex introduces new buildings in the same complex, only the different buildings serve specific functions like a gym with an indoor pool, parking deck, and storage units for for each family. The individual features and storage needs get dedicated attention such that the regular gym with pool can grow into a 100,000 square foot facility with Olympic-size pool, and all the families can stay put in the apartment building that’s part of the same complex. Areas that change disproportionately can get the necessary attention without affecting other aspects of the application.
When starting out, fully-contained accounts is easier to implement and doesn’t require much overhead, making it the right choice for young B2B SaaS apps. If data grows linearly and everything is tightly related, keep it simple with fully-contained accounts. As the application grows, and certain types of data grow disproportionately faster than others, specialized sub-systems scale better and in a more cost-effective manner, making them the right choice for mature, data-intensive B2B SaaS apps.
What else? What are your thoughts on these two schools of thought for scaling B2B SaaS app data?