Designing and Building a Video Conference Meeting Room
By Cody Riddar | March 28, 2023
Outlining the Challenge
One of my first tasks as a Software Engineer at 3Together was to build the user interface for the SwellChat meeting room. If you’ve had the opportunity to join a SwellChat meeting, you might look at the meeting room and think it was pretty simple.
Not so! But before I get into the details, let me illustrate the deceptive nature of the task by sharing an anecdotal exercise I had to do in college.
I had to pseudocode an ATM transaction.
Pseudocode is basically like writing the business logic of a software application or feature in plain language, without real code. Its purpose is to identify where the challenges and unanswered questions might be, and to help estimate the level of effort involved, before actually doing the work of coding it up.
The point of the ATM exercise was to show inexperienced students like myself that while some of the technology we interact with regularly may seem simple, it actually isn’t.
I don’t remember my initial answer all these years later, but it may have gone something like this:
- (User inserts ATM card)
- Show Prompt for PIN
- User enters PIN
- Validate PIN
- Prompt for Usage
- Check balance
- Withdraw cash
- (User selects withdraw cash)
- Prompt for quantity, in multiples of $20
- (User enters quantity of money)
- Validate they have that quantity
- Dispense funds
- Return card
- Reset for next user
I won’t bore you to sleep breaking out those 12 steps into the probable 100+ steps that actually exist, but I’ll break down the first couple of processes to illustrate the point:
- (User inserts ATM card)
  - Validate card was inserted correctly
    - If not, return card and prompt user to correct it
  - Validate card is of type Debit, Visa, or Mastercard
    - If not, return card and alert user
  - Validate card is from an accepted banking institution
    - If not, return card and alert user
  - Validate card is not expired
    - If not, return card and alert user
- Show Prompt for PIN
- User enters PIN
- Validate PIN
  - Is PIN 4 digits?
    - If not, decline authentication
  - Is card from a different banking institution?
    - Yes: Call other banking institution
      - Did bank answer?
        - Yes
          - Validate PIN is correct
            - Is PIN Valid?
              - Yes: Continue
              - No: Return card and alert user
        - No
          - Alert user that their banking institution is currently unavailable
          - Return card
    - No:
      - Validate PIN is correct
        - Is PIN Valid?
          - Yes: Continue
          - No: Return card and alert user
- And so on…
Back to the Meeting Room
Let’s apply this exercise to a video conference meeting room. On the surface, if you were asked to pseudocode one you might write:
- (Attendee enters meeting)
- Draw rectangle on screen
- Attach audio and video
- Continue drawing rectangles as people enter
- Is screen filled?
- Yes: Add another “page” for new rectangles
- No: Keep adding rectangles until it is
Or something like that. But that ignores a few things that are probably pretty important:
- Could I fit more rectangles if I resize them?
- If I resize them, how small is too small?
- Make sure the action bar (containing camera, mic toggles, settings, and exit) is not layered on top of any attendees’ video.
- Make sure the list of attendees on the right also doesn’t overlay any attendees’ video.
- Resize or show more rectangles if the user hides the attendee list
- Provide a display mode that shows the active speaker in one big rectangle and the user in a smaller one, allowing the user to switch between modes.
- When somebody screen-shares, show the shared screen in one big rectangle, and the active speaker in a smaller one. Automatically restore the previous mode when the sharing stops.
- Resize and reorder the rectangles if somebody leaves
- And most importantly, and the part that ended up being the biggest challenge: as the number of rectangles changes from people coming and going, do not be visually disruptive.
The Approach
So let’s get into the approach I took to build the SwellChat Meeting Room.
The first mode I tackled was the gallery, where you try to show as many people as will fit and still be easily viewable. Active Speaker and Screenshare modes weren’t very complex, so I won’t get into them here. Handling the view switching was probably harder than displaying either mode.
Mostly Math
Gallery mode was mostly math. First, I started by identifying the constants:
- What was the rectangle ratio going to be? 1:1? 4:3? 16:9?
This was important so I could keep rectangles from cutting off parts of the video as the number of rectangles changed. We landed on 16:9 because that was what the AWS Chime SDK was giving us by default, and it’s a pretty common format.
Without getting too deep into the Chime SDK, just know that when you start a SwellChat meeting, we create a meeting in SwellChat but use the Chime SDK to orchestrate all of the media connections (mic, camera, speaker), and to make sure when anybody joins, that you’re all in the same meeting. Chime hands back “events” like “somebody joined” or “somebody left” or “somebody turned the camera off” to SwellChat and then we write logic in code to handle all of them visually.
- What was the minimum size for a rectangle?
Here I went with a minimum width of 300 pixels. The main reason was that some older phones have a width of 320 pixels and we wanted to support some of them. Give 10 pixels to a potential scrollbar and a little for padding, and that leaves 300. Any wider and the user would have to scroll left and right to see another attendee’s entire video. Not a great experience.
So, given we had a minimum width of 300, what would the minimum height be? Remember that 16:9 ratio? So 300 times 9 divided by 16 is 168.75.
- How much “gutter” do we want? Gutter is the space between rectangles, so you can visually tell them apart. I settled on 10 pixels.
But what that really means is that for every rectangle across or down, minus 1, I would need to account for 10 extra pixels. The minus 1 is because if you had two attendees, there would only be one gutter between them. The number of gutters to accommodate horizontally or vertically is always one less than the number of rectangles in that direction.
You’ll see later that although this is logical, I throw it away in exchange for performance.
- How much padding do we want around all the rectangles, as a whole? It doesn’t look good when your content is immediately adjacent to the edge of your browser, or other visual elements on the page. Whitespace is good.
I went with 10 pixels again here, for consistency with the gutter. That way as you’re filling up the screen with rectangles, the space around each is the same: 10 pixels.
The constant here is not 10 pixels, though. It’s 20. We have 10 pixels of padding around the entire meeting room, so 10 each at the top and bottom, and 10 each on the left and right. I would need to subtract 20 from the available height and width, which are not constant. They’re highly, highly variable. But we’ll get to that.
- How much space does the action bar take up? Including its own padding at the bottom of the screen?
That’s 80 pixels.
- How much space does the attendee list on the right take up when it’s open?
That’s 270 pixels.
How Much Space Do We Really Have?
Now that the constants were identified, it was time to put the math in action to begin building our rectangles.
It all starts with “how much space do we really have”. You can query the browser for its own height and width in pixels. So we have the starting vertical and the horizontal space to begin calculating. (I later learned that you can actually just query an individual element called a “<div>” for its height and width, eliminating the need to account for the attendee list, but I’ll keep it in for this exercise).
To calculate the available space for video rectangles, it goes something like this:
Calculate Available Width
- Start with the browser width
- Subtract the attendee list (if open)
- Subtract the padding
Calculate Available Height
- Start with browser height
- Subtract the action bar (it couldn’t be hidden back then. It can now)
- Subtract the padding
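The two calculations above can be sketched in a few lines of TypeScript. This is an illustration rather than the actual SwellChat code: the pixel values come from the constants listed earlier, while the names (availableSpace, Size, and the constant names) are made up for the example.

```typescript
// Layout constants from the article, all in pixels.
const PADDING = 10;              // whitespace around the whole grid, per side
const ACTION_BAR_HEIGHT = 80;    // action bar, including its own padding
const ATTENDEE_LIST_WIDTH = 270; // attendee list on the right, when open

interface Size {
  width: number;
  height: number;
}

// Hypothetical helper: space left for video rectangles in a given viewport.
function availableSpace(viewport: Size, attendeeListOpen: boolean): Size {
  const listWidth = attendeeListOpen ? ATTENDEE_LIST_WIDTH : 0;
  return {
    // Padding applies on both sides, hence 2 * PADDING.
    width: viewport.width - listWidth - 2 * PADDING,
    height: viewport.height - ACTION_BAR_HEIGHT - 2 * PADDING,
  };
}

// A 1280 x 720 viewport with the attendee list open:
// width  = 1280 - 270 - 20 = 990
// height = 720 - 80 - 20   = 620
const space = availableSpace({ width: 1280, height: 720 }, true);
```

In a browser, the viewport size would typically come from window.innerWidth and window.innerHeight.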
Calculate the Maximums
Next we calculate the potential maximums for one page. How many rectangles across can fit? How many rectangles up and down can fit?
- Across: Divide available width by minimum width (300) + gutter (10). I realize there’s going to be one extra gutter (gutters = rectangles – 1) but was ok with that in order to avoid a performance hit from looping that would be necessary to leave out the first gutter until I hit the maximum available width. Maybe there was a more clever way to calculate this that would leave out the extra gutter without a loop?
- Top to Bottom: Divide available height by minimum height (168.75) + gutter (10).
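On the question of leaving out the extra gutter without a loop: n rectangles need n × size + (n − 1) × gutter pixels, so adding one gutter to the available space before dividing gives the exact count. The sketch below compares the two; the function names are hypothetical and this is not the code that shipped.

```typescript
const MIN_WIDTH = 300;                   // minimum rectangle width, px
const MIN_HEIGHT = (MIN_WIDTH * 9) / 16; // 168.75 px at 16:9
const GUTTER = 10;                       // space between rectangles, px

// The approach described above: charge every rectangle one gutter.
function maxFitSimple(available: number, minSize: number, gutter: number): number {
  return Math.floor(available / (minSize + gutter));
}

// Exact variant: n rectangles need n * minSize + (n - 1) * gutter pixels,
// which fits when n <= (available + gutter) / (minSize + gutter).
function maxFitExact(available: number, minSize: number, gutter: number): number {
  return Math.floor((available + gutter) / (minSize + gutter));
}

// With 610 px available, two 300 px rectangles plus one 10 px gutter fit exactly.
const simple = maxFitSimple(610, MIN_WIDTH, GUTTER); // 1
const exact = maxFitExact(610, MIN_WIDTH, GUTTER);   // 2

// Rows work the same way, using the minimum height instead.
const maxRows = maxFitExact(620, MIN_HEIGHT, GUTTER); // 3
```

The two versions only disagree when the leftover space is within one gutter of fitting another rectangle. A real implementation would probably also clamp the result to at least 1 so that a very small window still shows one rectangle.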
Do We Need Another Page?
- Divide the number of attendees by the maximum rectangles in a row times the maximum rectangles in a column, then round up. That’s how many pages we’ll need.
E.g. If we could support 3 rows and 4 columns (12 per page) and had 20 attendees, we’d need 2 pages. 20 divided by 12 is about 1.67, which rounds up to 2.
- Note, if we needed another page, we’d always draw the rectangles the minimum size to fit the most on the first page.
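As a quick sketch, the page count is a ceiling division. The helper name pagesNeeded is hypothetical.

```typescript
// Hypothetical helper: pages needed for a given attendee count.
function pagesNeeded(attendees: number, maxRows: number, maxCols: number): number {
  const perPage = maxRows * maxCols;
  // Always show at least one page, even for an empty room.
  return Math.max(1, Math.ceil(attendees / perPage));
}

// 3 rows x 4 columns = 12 per page; 20 attendees need 2 pages.
const pages = pagesNeeded(20, 3, 4);
```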
Or Can Everybody Fit on One Page?
- If the number of attendees was no more than the maximum rows times columns, everybody fits on the first page.
Then how big can the rectangles be, in order to maximize space?
- First we’ll do a perfect-square check. Sounds odd, but if we build our grid 2×2, 3×3, 4×4 as much as possible, the room will maximize space and be more visually appealing.
Think about it this way, if I had 2 attendees in the room and a 3rd joined, and I had a maximum available columns of 4, I could easily add the 3rd person to the first row. But I’d have to scale down the rectangles of all attendees to make it fit. And then do that again for the 4th attendee. Then start a new row on the 5th if I had a maximum rows of 2 or greater.
However, if I had a maximum columns of 4 and could support multiple rows, I could place the 3rd person on the next row without having to scale down the first two attendees. And a 4th. When the 5th attendee joins, we start building towards a 3×3 grid, if the maximums allow for it. And so on.
- From the first step, we know how many rows and columns we’re going to build, sticking with square grids (2×2, 3×3, and so on) until the first maximum, rows or columns, is reached. Then adding extra rows or columns based on availability. We’ll call these my target rows and columns.
So if my screen supports 4 columns and 3 rows, after the 9th attendee, I’ll add a 4th column, keeping 3 rows. On the 13th attendee, I add a 2nd page.
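Here is one possible reading of that rule in TypeScript. It is a sketch under assumptions, not the original implementation: the function name targetGrid is made up, and it assumes everyone already fits on one page (attendees no greater than maxRows × maxCols).

```typescript
interface Grid {
  rows: number;
  cols: number;
}

// Grow toward a square grid, then widen or deepen once a maximum is reached.
function targetGrid(attendees: number, maxRows: number, maxCols: number): Grid {
  const n = Math.max(1, attendees);
  // Side of the smallest square that holds everyone:
  // 2 for 2-4 people, 3 for 5-9, 4 for 10-16, and so on.
  const side = Math.ceil(Math.sqrt(n));
  let cols = Math.min(side, maxCols);
  // Use only as many rows as are needed for this column count.
  const rows = Math.min(Math.ceil(n / cols), maxRows);
  // If the row limit was hit first, widen the grid instead.
  if (rows * cols < n) {
    cols = Math.min(Math.ceil(n / rows), maxCols);
  }
  return { rows, cols };
}

// A screen that supports 3 rows and 4 columns:
const three = targetGrid(3, 3, 4); // { rows: 2, cols: 2 }
const nine = targetGrid(9, 3, 4);  // { rows: 3, cols: 3 }
const ten = targetGrid(10, 3, 4);  // { rows: 3, cols: 4 }
```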
- But we’re not done yet. We still need to know the size of each rectangle.
And we know each rectangle is going to have a 16:9 ratio.
Here I start by trying to maximize by rows. Divide available height by the target number of rows I can support. This is my “test height.” I use the test height to get the “test width” by multiplying by 16 and dividing by 9. Then I multiply my “test width” by the target number of columns. Would they all fit? If so, we have our rectangle sizes.
If not, I then know that I’ll be maximizing using width. I divide available width by the target number of columns, and know that the heights will fit all my target rows because I’ll end up with a smaller height than the earlier “test height.”
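The height-first, width-second test can be sketched like this. The names are hypothetical, and gutters are left out to mirror the description above; a fuller version would subtract them from the available space first.

```typescript
interface Size {
  width: number;
  height: number;
}

// Hypothetical helper: largest 16:9 rectangle for a rows x cols grid.
function tileSize(available: Size, rows: number, cols: number): Size {
  // Try to fill the height first.
  const testHeight = available.height / rows;
  const testWidth = (testHeight * 16) / 9;
  if (testWidth * cols <= available.width) {
    return { width: testWidth, height: testHeight };
  }
  // Otherwise fill the width. The resulting height is smaller than
  // testHeight, so all of the target rows still fit.
  const width = available.width / cols;
  return { width, height: (width * 9) / 16 };
}

// 990 x 620 available, 2 x 2 grid: height-first would need about 1102 px
// of width, so the width decides: 495 x 278.4375.
const tile = tileSize({ width: 990, height: 620 }, 2, 2);
```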
Below is a screenshot of the constants and variables in our meeting room calculations.
Let’s Build It
Now I build the screen, drawing rectangles by row and column according to the calculated sizes, then assigning attendees to them in the order they arrived into the meeting, and then attaching the audio and video to each rectangle.
Whenever a new attendee enters or one leaves, that entire process happens again. Same when you switch modes to Active Speaker and back.
Which leads to…
The Disruptive Part
Remember earlier when I mentioned that the hardest part was drawing these rectangles without being disruptive visually? And remember how any time an attendee enters or leaves the entire grid system redraws itself?
In our meeting room load tests we immediately noticed the deficiencies of this. When an attendee entered or left, the screen would redraw, reattaching the video of every attendee after their rectangle was recreated and ready. This led to a whole bunch of seizure-inducing flashing video rectangles.
That wouldn’t work at all.
The hard part in solving the problem wasn’t implementing it though. It was finding the actual solution. After many attempts at being clever, and many failures while testing each attempt with my coworkers (including one super embarrassing demo to two of our newest software engineers just days before they were starting), the solution ended up being what I call “a one-liner.” If you’re familiar with Angular and ngFor array looping, you may already know the solution. However, at the time, this was a big learning moment for me.
This little gem, ngFor’s trackBy option, tells Angular to track the items in an array (attendees) being drawn on screen by a key that you define. On a screen redraw from a change to the array (change detection), if the tracked item still exists during the redraw, Angular “preserves its state,” which is programmer lingo for “the existing version is reused rather than destroying and creating a new rectangle.”
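That description matches the trackBy option of ngFor, which takes a function of the form (index, item) => key. The sketch below shows such a function plus a simplified stand-in for the reuse behavior. The Attendee shape, the id field, and the reconcile helper are assumptions for illustration; they are neither SwellChat’s code nor Angular’s internals.

```typescript
interface Attendee {
  id: string;   // hypothetical stable identifier for an attendee
  name: string;
}

// Shape of an Angular trackBy function: (index, item) => key.
function trackByAttendeeId(_index: number, attendee: Attendee): string {
  return attendee.id;
}

// In a component template it would be wired up along these lines:
//   <div *ngFor="let attendee of attendees; trackBy: trackByAttendeeId"> ... </div>

// Simplified illustration of keyed reuse: tiles whose key survives a
// change are kept instead of being recreated.
interface Tile {
  key: string;
  createdInPass: number;
}

function reconcile(previous: Map<string, Tile>, attendees: Attendee[], pass: number): Map<string, Tile> {
  const next = new Map<string, Tile>();
  attendees.forEach((attendee, index) => {
    const key = trackByAttendeeId(index, attendee);
    next.set(key, previous.get(key) || { key, createdInPass: pass });
  });
  return next;
}

const ana: Attendee = { id: "a", name: "Ana" };
const ben: Attendee = { id: "b", name: "Ben" };
const cy: Attendee = { id: "c", name: "Cy" };

const firstPass = reconcile(new Map<string, Tile>(), [ana, ben], 1);
const secondPass = reconcile(firstPass, [ana, ben, cy], 2);
// Ana's and Ben's tiles are the same objects as before; only Cy's is new.
```

Without a tracking function, ngFor identifies items by object identity, so an attendee list that gets rebuilt with fresh objects on every event makes every rectangle look new.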
Magic is real. Thank you Angular.
So What’s Next?
In Software, Nothing is Ever Done
In software, nothing is ever really done. Technology changes and evolves, and it’s part of the job to stay on top of that and keep innovating the product. As new customers arrive with new feedback, new features are born out of needs and wants.
This is even truer for a startup. For a new product, the first version is usually built emphasizing speed. Time to market is more important than scalability because what’s the point of building out an ecosystem meant to handle 100k users before you’ve found out if you can even sell your product to one?
So it’s always known during development of version one that version two is probably either a wholesale rewrite or leads off with a technical debt payoff called a “refactor.” It’s typically not wise to expand upon a hack, even if it was justifiable at the time in order to get to market.
Our SwellChat meeting room is no exception. I don’t consider the underlying code to be rushed or of poor quality, so a refactor probably isn’t required for our plans. A rewrite is probably more in the cards. The meeting room simply wasn’t written in a way that can easily support some of the future features we have identified, such as drag and drop of attendee rectangles around the screen, putting the rectangles in a different order, or choosing how many rectangles to show on each page.
But I look forward to it, because another factor that has evolved over the last few years is my own skill and that of the team. Specifically, skills involving something called CSS Grid. Version one of the meeting room relied on a UI feature called the Angular Material Grid List. It’s capable of some great things, like keeping a rectangle in 16:9 format without any special code, or growing and shrinking just by telling it how many rows and columns you’d like.
However, it’s not very lightweight (i.e. it’s slow), and customizing the individual parts is challenging due to Angular Encapsulation (Translation: Like trying to paint a ball that is inside a box, while the paint brush is outside the box). So the grid “tile” sometimes ignores you when you tell it something like “make that thing blue.” Then, when you force it to “behave,” other parts of your app can inadvertently turn blue too.
CSS Grid is super lightweight, very powerful and performant, requires less code, and since it’s just CSS and not a “boxed in” Angular Material component, you have full control over how it looks without risk to the rest of the application.
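To give a feel for how little is involved, here is a hypothetical TypeScript helper that builds the handful of CSS declarations a grid-based gallery might need. It is only a sketch of the idea, not a plan for the rewrite; the function name and structure are invented for the example.

```typescript
type StyleMap = Record<string, string>;

// Hypothetical helper: CSS declarations for a gallery with a fixed
// number of equal-width columns and 16:9 tiles.
function galleryStyles(cols: number, gutterPx: number): { container: StyleMap; tile: StyleMap } {
  return {
    container: {
      display: "grid",
      "grid-template-columns": `repeat(${cols}, 1fr)`,
      gap: `${gutterPx}px`,
    },
    tile: {
      // The browser keeps each tile at 16:9 as the columns resize.
      "aspect-ratio": "16 / 9",
    },
  };
}

const styles = galleryStyles(4, 10);
// styles.container["grid-template-columns"] is "repeat(4, 1fr)"
```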
We’ll probably announce when we rewrite the meeting room because it will come with some cool new features. However, even if you miss that email, I think you’d notice just by how much faster it will appear.
— Cody Riddar, Full-Stack Engineer, 3Together