How to build a Software as a Service AKA SaaS

Intro

So I recently built a software as a service, AKA a SaaS. At least technically it's a SaaS because you can purchase a monthly or yearly subscription. But for slightly more, the customer can also purchase lifetime access, which allows the business to get a quicker ROI.

Tech Stack

The front end is a single page application built using Angular. I used Angular mainly because it's what I usually use for work anyway, but I think it turned out to be a really good decision because most of the complexity of this app is definitely on the front end. There's a lot of logic, a lot of state management, a lot of responsiveness, dynamic updates, and all that kind of stuff. And Angular does a really good job in enforcing a lot of best practices.

First of all, it uses TypeScript by default, which is just so much better than vanilla JavaScript. My first programming languages were C and C++, so I really prefer static typing. Angular definitely has somewhat of a learning curve. The most painful part is definitely RxJS, which to put it simply is a much more complicated but more powerful alternative to Promises. For styling, I went with a pretty lightweight CSS framework called Bulma. I think it mostly met my needs. It kept the bundle size small, which means that the initial page load for users is gonna be pretty fast.

But I actually wrote most of the CSS myself because I wanted the app to have at least a unique feel, even if it wasn't great. I somewhat regret that, because I really could have saved a lot of time by going with some kind of Angular component library like Angular Material or the Ant Design components for Angular. Then again, the site would have felt a lot more cookie cutter. It's a trade-off and I don't entirely regret it.

For hosting the front end, I used a content delivery network. Since it's a single page application, it's just a bunch of static assets: JavaScript, CSS, and HTML. Content delivery networks basically store a bunch of static assets all around the world so that they're close to the end user. So if you make a request from the US, your request won't have to travel very far. The same is true if you're making it from the other side of the world, like in India.

Backend with Firebase

For the backend, I mostly used a bunch of Firebase services. Firebase is a cloud service that uses a bunch of Google Cloud services under the hood. It also integrates with other Google Cloud services pretty well, but I have to say, Firebase was not as smooth of an experience as I was expecting. A lot of things do work pretty well. The best one was Firebase authentication. It's really, really easy, like literally one or two lines of code to add an authentication provider like Google or GitHub, which I did for the site. And it basically manages everything, including the authentication cookies and tokens and all that stuff. And I think authentication is so important that it's not a bad idea to have a provider handling that kind of stuff for you. So that was good. Now moving on to the stuff that wasn't so good.

So first of all, I had to implement a pretty basic REST API. I just have a few endpoints to handle stuff like creating the user the first time somebody signs in, creating a payment or subscription on the server, and of course, updating the user after they subscribe and marking them as a pro member.
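To make the "create the user the first time somebody signs in" endpoint concrete, here is a minimal sketch of the logic behind it. It uses an in-memory map in place of the real database, and the `User` shape and the `ensureUser` and `markPro` names are hypothetical, not the actual NeetCode code. The point is only that the create step should be safe to call on every sign-in.

```typescript
// Hypothetical user record; the real schema is not shown in the source.
interface User {
  uid: string;
  isPro: boolean;
  createdAt: number;
}

// In-memory stand-in for the database, for illustration only.
const users = new Map<string, User>();

// Create the user on first sign-in; on later sign-ins return the existing record.
function ensureUser(uid: string, now: number = Date.now()): User {
  const existing = users.get(uid);
  if (existing !== undefined) return existing;
  const created: User = { uid, isPro: false, createdAt: now };
  users.set(uid, created);
  return created;
}

// Mark the user as a pro member after a successful purchase.
function markPro(uid: string): User {
  const user = ensureUser(uid);
  user.isPro = true;
  return user;
}
```

Calling `ensureUser` twice with the same id returns the same record, so a repeated sign-in never resets a user's pro status.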

Drawbacks of Serverless Functions

So I went with Firebase Functions, which basically use Google Cloud Functions. And while it does pretty much just work, there were some issues that I had, specifically when it comes to latency. Out of the three major providers, AWS, Azure, and Google Cloud, Google Cloud is known for having the worst cold start times. That basically means if a function has not had any traffic in, let's say, 10 to 15 minutes, that function will no longer have provisioned resources. And so the first time that function is called after 10 to 15 minutes, there will be a cold start delay, which, looking at the metrics dashboard, can get pretty high. You can see here that in some cases, the latency went as high as seven seconds. Here's a chart of the execution time. These spikes are mainly due to cold starts. That means that there were no active instances at that time, so in order to scale up, the function took some time. Now, according to these metrics, 99% of people will have latency of less than 0.04 seconds, which is really good, but there were some users who had much higher latency. Looking at just the one hour time chart, you can see that probably one person among a few dozen had a latency of five seconds. You can work around this by provisioning a minimum number of instances. That means that the Google Cloud infrastructure will always keep at least a certain number of instances provisioned so that you can avoid cold start times.
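To see why a great 99th percentile can sit next to a multi-second worst case, here is a small sketch that computes percentiles from latency samples using the nearest-rank method. The sample data is made up for illustration (199 warm requests at 40 ms and one 5-second cold start); it is not taken from the dashboard.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical data: 199 warm requests at 40 ms plus one 5000 ms cold start.
const samplesMs: number[] = [...Array(199).fill(40), 5000];

const p99 = percentile(samplesMs, 99);    // 40
const worst = percentile(samplesMs, 100); // 5000
```

With this data the p99 is still 40 ms, because a single cold start out of 200 requests falls outside the 99th percentile. The user who hit it still waited five seconds.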

That's the theory. I actually do have one minimum instance provisioned, but you can see here that the active instances still end up dipping to zero sometimes, and then there is a significant cold start. In this case, it's not super high. It's one to two seconds, but it's kind of annoying because you have to pay extra to keep minimum instances provisioned. It's not free. So you're paying for something and not quite getting what you pay for, and I'm not the only one who thinks this. If you go look on forums, most people say that minimum instances just cost you a bunch of extra money with minimal latency improvements, and it kind of defeats the whole purpose of Cloud Functions. They're meant to scale up and scale down relative to the traffic that you're getting so that you don't end up paying for what you don't need. But honestly, if I could do it over again, I would not have gone with Firebase Functions.

I chose it because it was the simplest thing to do, and I thought it would handle my use case, and it mostly does, but at some point in the near future, I'm definitely gonna migrate to Cloud Run. I ended up using TypeScript for the Firebase Functions because the TypeScript SDK has the most support. So I'll probably create a REST API with Express, containerize it with Docker, and end up deploying it to Cloud Run. Since I'm paying for minimum instances anyway, it makes more sense to use a service like Cloud Run, which can actually handle a lot more concurrent requests.

Each Firebase Function instance can only handle one request at a time, but a Cloud Run instance can handle a lot more. I'll have at least one minimum instance provisioned, so I'll basically eliminate all my cold starts. It's been pretty rare that I've had more than 10 concurrent requests, even when my SaaS launched. But if I really need to, I can always provision more. I think Cloud Run just makes a lot of sense. I don't even have to worry about vendor lock-in because I can just take my Docker container and deploy it anywhere I want. I could even just use a single VM, and that would probably handle as much traffic as I need anyway.
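The arithmetic behind that comparison can be sketched in a few lines. The per-instance concurrency of 80 used for Cloud Run below is an assumption for illustration (it is a configurable setting), and the helper name `instancesNeeded` is made up.

```typescript
// How many instances are needed to serve a given number of simultaneous
// requests when each instance can handle a fixed number at once.
function instancesNeeded(concurrentRequests: number, perInstanceConcurrency: number): number {
  if (perInstanceConcurrency < 1) throw new Error("concurrency must be at least 1");
  return Math.ceil(concurrentRequests / perInstanceConcurrency);
}

const peakConcurrentRequests = 10; // the rare peak mentioned above

// One request per instance: every concurrent request needs its own instance.
const functionInstances = instancesNeeded(peakConcurrentRequests, 1); // 10

// Assumed concurrency of 80 per container: one warm instance covers the peak.
const cloudRunInstances = instancesNeeded(peakConcurrentRequests, 80); // 1
```

Under these assumptions a single always-on container covers the whole peak, while the one-request-per-instance model would need up to ten instances, each a potential cold start.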

The Database

Moving on to the database, I used Firestore mainly because it was the most obvious choice, and while it does satisfy my requirements, I did run into some pretty significant issues, to be honest. Firestore is a NoSQL database, which stores a collection of documents. It's similar to MongoDB or DynamoDB, and I didn't really need relational data. I'm not really storing much anyway, just storing some basic user info, what problems they've completed, whether they've purchased Pro or not.

So a NoSQL database works perfectly fine. I don't really need complex queries or joins and things like that. Now, Firebase is essentially a backend as a service. So with Firestore, you can actually read and write directly from the browser, which is pretty interesting, but can also be dangerous. So you can implement some security rules to make sure that a user can't do something that they're not supposed to. For example, someone could in theory manipulate the JavaScript to grant themselves a Pro membership without paying. And to prevent this, I've implemented Firestore security rules. That part works perfectly fine.
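Real Firestore security rules are written in Firestore's own rules language, not TypeScript. As a rough model of the kind of check they express, here is a sketch of the decision in TypeScript. The `UserDoc` fields and the `canClientUpdate` name are hypothetical; the actual rules used on the site are not shown in the source.

```typescript
// Hypothetical shape of a user document.
interface UserDoc {
  completedProblems: string[];
  isPro: boolean;
}

// Decide whether a write coming directly from the browser should be allowed.
function canClientUpdate(
  authUid: string | null, // uid of the signed-in caller, or null if signed out
  docOwnerUid: string,    // uid that owns the document being written
  before: UserDoc,        // document as currently stored
  after: UserDoc          // document as it would be after the write
): boolean {
  if (authUid === null) return false;             // must be signed in
  if (authUid !== docOwnerUid) return false;      // can only edit your own document
  if (before.isPro !== after.isPro) return false; // clients may never change pro status
  return true;
}
```

In this model only trusted server code, which bypasses the client-side check, can flip `isPro`, so editing the JavaScript in the browser does not help an attacker.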

The one issue I did have is actually reading Firestore from the browser. A very small percentage of users reported to me that they were experiencing issues with NeetCode. They were able to call the Firebase Functions completely fine, but some functionality was broken for them, such as checking which problems they've already completed from the NeetCode 150 list. I implemented that by writing code in the Angular app to read directly from Firestore. That caused some issues because for some reason, and I'm not completely sure why, it fails for some users. I don't know exactly what causes it. I've read online that it can be caused by a poor internet connection, a VPN, a proxy, a corporate network, anything, but this is actually a known issue. There's an issue on the Firebase JavaScript SDK that was opened over three years ago. And you can see I'm not the only one who's had this issue. Many other people have had it and it still hasn't been fixed. There are actually still comments on this thread from even a couple of days ago where some people are frustrated. It sounds like they're making progress with getting this fixed, but I've lost my trust in this functionality, and I'm going to update the code to not read Firestore from the client at all and instead go through the Firebase Functions, because I've not had any issues with that.

Accepting Payments with Stripe

Now, one of the most important and confusing things is accepting payments from users. There's a ton of services and APIs you can use. I ended up choosing Stripe just because it's known for having a really good developer experience. And I think overall it did. For my use case, I needed to implement a yearly subscription, but also a one-time purchase, which gives users lifetime access. Maybe it's just me, but it wasn't as smooth of an experience as I was expecting. I did find some holes in the documentation. I first went with the Stripe card element just because it's more established and I thought it would have been fine, but I was using the orders API, which apparently does not support the card element. I had to go with the payment element. That required rewriting code. I had to do this a few times for a few different reasons. I wish it had been made more clear when I was reading the documentation. There's so many different payment objects. There's subscriptions, payment intents, invoices, orders, so many things. And I think the differences between these are not explained in the documentation as well as they could be.

But I do have to say Stripe has really good support if you email their support team or even go on their Discord, which is mainly for developers asking coding-related questions. They have engineers dedicated to answering questions. It was really helpful for me when I encountered issues or things that weren't documented, and usually an engineer was able to help me out. Stripe heavily relies on webhooks. So basically when you load the NeetCode page and then you fill in your credit card details and you try to make a payment, that information actually does not go to any of my servers. It goes straight to the Stripe servers, and then Stripe makes a request directly to one of my endpoints. I use the status in that request to check if the payment was successful, and then I grant you access to either a yearly subscription or lifetime access, depending on which one it was. This is called a webhook. It essentially pushes that information straight to my server so I can handle it accordingly and then update my database with the relevant information.

Now you might be wondering why doesn't Stripe just send that response information back to the browser and then from the browser, I can directly make a request to my server and handle pretty much the same situation. In theory, that would work, but think about a case where the payment information goes through, but maybe your browser crashes right after that so you never received the response and therefore you can't send a request to my server. In this case, your credit card would end up getting charged, but you would not be granted access so this would be an issue. I think it's pretty rare that this would happen, but this is the major problem that webhooks can prevent.
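Here is a minimal sketch of what the server side of that flow might look like. It does not use the Stripe SDK: the `PaymentEvent` shape, the `userId` and `plan` fields, and the in-memory stores are all hypothetical simplifications, and a real handler should also verify the webhook signature before trusting the payload. The sketch ignores repeated deliveries of the same event, since a webhook can be sent more than once.

```typescript
type Plan = "yearly" | "lifetime";

// Simplified, hypothetical event shape (not Stripe's actual payload).
interface PaymentEvent {
  id: string;     // unique event id, used to detect redelivery
  type: string;   // for example "payment_intent.succeeded"
  userId: string; // assumed to have been attached when the payment was created
  plan: Plan;
}

// In-memory stand-ins for the database, for illustration only.
const processedEventIds = new Set<string>();
const proAccess = new Map<string, Plan>();

// Returns the HTTP status code the endpoint would respond with.
function handleWebhook(event: PaymentEvent): number {
  // Already handled: acknowledge again without repeating the side effects.
  if (processedEventIds.has(event.id)) return 200;

  // Only a successful payment grants access; other event types are ignored.
  if (event.type === "payment_intent.succeeded") {
    proAccess.set(event.userId, event.plan);
  }
  processedEventIds.add(event.id);
  return 200;
}
```

Because the grant happens in the server-to-server call, it no longer matters whether the browser is still open when the payment completes.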

Why create the platform from scratch

Now implementing all this stuff and going through all these technical details and issues was a lot of work. You might be wondering why didn't I just go with an existing course platform like Teachable, which pretty much has the same functionality. Users can sign up, they can make a payment and then get access to a list of courses. I could have technically done that and maybe if I was smart, that's what I would have done, but I also think it would have given me a lot less control over the experience. A lot of course platforms don't let you authenticate with Google or GitHub. You actually have to sign up with your email and password. I think most people nowadays prefer signing up with an existing provider like Google because it just takes one click. Also, I wouldn't have been able to control the styling as precisely and a lot of the features. And I have a lot more features planned which definitely would not have been possible with a Teachable type platform.

So overall, I think the complexity was worth it in the long run. Also, I learned a lot and I plan on teaching a lot of the stuff I learned creating this app in future courses.
