Please, take your pick: is it when your colleague suddenly disappears from the virtual room, like they’ve been dropped through an electronic trap door? Is it when the lag gets so bad that mouths move without speaking, and voices seem to be coming from the void, like David Lynch just started directing your meeting? Is it the choppiness that turns normal human conversation into a guessing game? Or, most horrifying of all, is it your face, omnipresent in a little box in the corner, impossible to take your eyes off of, revealing your downward gaze, so that everyone knows you’re staring at yourself?
There’s no wrong answer, and the list of right ones goes on and on. Despite so many technological improvements since it first became a feature of the modern workplace, video conferencing still seems to be stuck in the Stone Age of telecommunication. The technology is plagued by fundamental problems that directly affect the core experience: poor communication quality, constant lag and dropped connections, and a flawed user experience and interface. Even though massive, resource-rich corporations like Google and Microsoft have tried to tackle the problem, it’s still one of the most frustrating and problematic aspects of the professional environment. So, why?
Turns out, there are a number of reasons, some purely technical, others deeply ingrained in how human beings go about their daily lives. Let’s start with the technical, since it’s less likely to cause traumatic flashbacks to the time of your last video conference, when you couldn’t stop staring at the pimple on your forehead and your boss sounded like a robot in a hurricane.
Plenty of start-ups have responded to the call of improving video conferencing, but one in particular, Highfive, is the MVP of the SEO game. If you Google “Why is video conferencing so bad,” you’ll find a handful of blog posts by the company trying to explain that very problem, from “4 Reasons People Hate Video Conferencing” to “10 Reasons Why Video Conferencing Still Sucks.” (According to the company, video conferencing in 2018 isn’t so bad—if you use Highfive.) When I spoke to Jeremy Roy, Highfive’s CTO and co-founder, he explained that the problem stems from a couple of conceptually simple but mechanically complicated reasons.
First of all, the Internet is perfect for trading cat GIFs and sending emails you regret immediately afterwards, but it isn’t built for the demands of video conferencing. “The Internet is a very powerful tool, I love the Internet, but it was never designed for real-time audio and video,” Roy says. “In fact, in a lot of ways, it was designed for the opposite of that. The Internet is this incredibly complex use of machinery that is meant to, for the most part, be able to get little packets of information from one place to another place that could be really far away and have a lot of complex mapping between them, and get that packet there reliably, within a reasonable amount of time. Reasonable, when you’re talking about surfing the web in the 90s, was whole seconds. But in audio and video, that’s no good at all.”
An example he gives is watching CNN and seeing Anderson Cooper speak to some correspondent out in the field whose answers come five seconds after the question is asked. That delay is a part of the technology, but while you can live with it in the specific and limited arena of an on-air news segment involving a host and a reporter, it’s no good when you’re having a regular conversation, much less a conversation between more than two people, at which point it becomes absolutely untenable. (Cue traumatic flashbacks to five people all simultaneously saying, “I think… no, you go ahead… well, I think…”)
The most basic demand of real-time audio and video is that it has to circumvent that delay. This task is made even more complicated, though, by the way that computer networking works. While you might be paying for a certain connection through your ISP, that does nothing to guarantee that your speed will actually match that number, and when you create a video connection, there’s no strict regulation of how much bandwidth you have on either side—it can fluctuate wildly, unlike, say, a landline phone, which has a fixed amount of bandwidth.
“Nobody tells you this information that you critically need, which is how much can I use, and the consequence of going over the number without being told what the number is much, much worse than if you had just slowed down and used less, because you will congest this link,” Roy says. “Imagine a ten-lane freeway merging down into two lanes: you thought you had ten lanes, so you shoved ten lanes worth of cars through it, but unknown to you, a mile ahead, there’s a two-lane congestion point. Much better if you knew that would be to just send two lanes of cars so they’d stay nice and smooth and keep going.”
This has little to no impact on browsing, since a 400-millisecond delay just means you have to wait slightly longer for Twitter to make you mad, but the equivalent on a video call is that the video freezes, or the audio cuts out, or the call disconnects entirely. Even though connections as a whole have gotten faster and better over the years, they’re still not consistently fast enough for effective video conferencing, which leaves it up to the software to manage that problem.
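For the curious, Roy’s freeway analogy can be made concrete with a toy simulation. What follows is only a rough sketch, not anything Highfive or any real video conferencing product runs: the function name, the packet counts, and the queue size are all made up for illustration. It models a sender pushing packets into a link that can only carry a fixed number per second, with a small waiting area (the queue) in front of the bottleneck.

```python
def simulate_bottleneck(send_rate, capacity, seconds, queue_limit=50):
    """Toy model of a sender pushing packets through a bottleneck link.

    All values are hypothetical, chosen only to illustrate the analogy:
      send_rate   - packets the sender pushes each second ("lanes of cars")
      capacity    - packets the bottleneck can carry each second
      queue_limit - packets that can wait before new arrivals are dropped

    Returns (delivered, dropped, worst_delay_seconds).
    """
    queue = 0
    delivered = 0
    dropped = 0
    worst_delay = 0.0

    for _ in range(seconds):
        # New packets arrive at the bottleneck.
        queue += send_rate

        # Anything beyond the waiting area is lost.
        if queue > queue_limit:
            dropped += queue - queue_limit
            queue = queue_limit

        # The link carries as much as it can this second.
        sent = min(queue, capacity)
        queue -= sent
        delivered += sent

        # Time the last waiting packet needs before it gets through.
        worst_delay = max(worst_delay, queue / capacity)

    return delivered, dropped, worst_delay


if __name__ == "__main__":
    # Ten "lanes" of traffic into a two-lane bottleneck.
    print(simulate_bottleneck(send_rate=10, capacity=2, seconds=10))
    # Two "lanes" of traffic into the same bottleneck.
    print(simulate_bottleneck(send_rate=2, capacity=2, seconds=10))
```

In this toy model, both senders get the same number of packets through, because the bottleneck sets the ceiling either way. The difference is that the greedy sender also piles up a long backlog and loses packets along the way, which on a call would show up as lag and dropouts, while the sender that matches the bottleneck sees neither.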
Competing with IRL Meetings
The second problem is that, while video conferencing might seem like a straightforward experience—get a bunch of individual audio and video connections and display them in the same digital space—the result is inherently disappointing, because it pales in comparison to what we’re used to: both professionally-edited footage and that most analog of experiences, real life.
Roy says that video conferencing software is essentially tasked by our minds, which are spoiled by the smoothness of TV and film, with trying to do the equivalent in real time of what highly specialized professionals do when they edit and display video. And because it’s meant to connect us with our colleagues or clients, it needs to compare with interacting in person as well, which is a very tall order. Our entire lives are based on person-to-person contact and communication, and the qualities inherent to this experience—comprehensibility, immediacy, and intimacy—are ones that we come to expect as a baseline.
According to Dr. Milton Chen, a researcher in the space and the CEO of telemedicine company VSee, “First impressions count, and unfortunately, delays and distortions caused by technology glitches are often perceived as flaws in the person rather than flaws in the technology.” This can stem from problems as simple as split-second audio delays, lips and voices going out of sync, and the difficulty of making eye contact due to the inconsistent positioning of webcams (not to mention that strange, all-too-human temptation to stare at yourself). One study cited by Chen contains the terrifying revelation that “whether the camera angle makes you look taller or shorter can also impact your power and influence in negotiations.”
That means video conferencing isn’t just affecting our own personal experiences: it’s having an impact on how other people experience us, and how we experience other people. Its flaws become a social dilemma. And because video conferencing is meant to mirror a familiar IRL engagement, subtle aspects of the software and the UI can lead to odd, unforeseen parallels.
As Chris Ward, the co-founder of Locus video conferencing, puts it, “The experience does not carry over any of the spatial cues we experience in the physical world. … Some systems move around videos to highlight and enlarge the speaker. While it is nice to see who is talking, this is disorienting. Imagine a physical meeting, where anytime a new person speaks, they are teleported to the front of the room and grow 100% taller. When the next person speaks, they are teleported back to their seat. … To make it even worse, imagine seeing people around the room moving their mouths, but hearing all of their voices come from a single speaker in the wall. Conversations would be impossible to follow!”
Everything Else Just Works
It doesn’t help that so many of our other technological experiences have gotten so good. When you talk to someone about the flaws of video conferencing, the question is rarely, “Why is video conferencing so bad?” It’s, “Why is video conferencing still so bad in 2018?” Compared to the almost seamless experiences of checking and sending email, sharing pictures over Instagram and Snapchat, broadcasting to our followers on Twitter or Facebook or LinkedIn, consuming video through YouTube, swiping through Tinder, or any number of other processes on our browsers or phones that go off without a hitch (and that allow us to do things we couldn’t do before), video calls make our lives harder rather than easier. Usually, a video call happens for a reason, that reason being that the participants in the call are spatially distant, but it mostly leaves us with the feeling that we wish we could’ve just gathered in person, or spoken over the phone. (This piece by The Next Web’s Matthew Hughes is a good example of the high standards that we’re holding this technology to nowadays.)
“You’re trying to replace a very everyday activity, you’re trying to provide a technical alternative to walking over to somebody’s desk if you were right next to them, and that’s a very routine motion for people,” Roy says. “It’s not the kind of activity you do where you’re expecting to be delighted: you have to do it, it needs to get done, and so, anything that puts an obstacle or puts friction in the way of routine things knocks you off of your game a lot more than when you’re doing something that is unusual, because when it’s unusual you expect to put a little bit of energy into it.”
Your Face, In Your Face
Lastly, there’s one special, and incredibly peculiar, feature of video conferencing: your face. When you have a conversation in real life, you’re rarely confronted by a confused, sweaty, distracted image of yourself as you try to engage with other people, and you’re rarely forced to be a spectator to your behavior. In video conferencing, however, that seems to be the norm, and the result, at least for me, is that, no matter how hard I try not to, I spend the entire call staring at my own face. Chen writes that, beyond the awkwardness, this has a further negative effect, an effect that might explain why we all seem to hate the technology so much: “One study even found that some people got cases of ‘video aversion’ or ‘video anxiety’ when they saw themselves in a video conference. It caused high negative feelings which were sometimes transferred to the service providing the conference.”
If all of these factors combine to make video conferencing a negative rather than a positive aspect of our work lives, Roy thinks that, in the future, this will shift as we use technology to actually enhance the experience, not just facilitate it. Think information being displayed on screen, or Alexa listening to and supplementing the conversation with graphs and data. Video conferencing can never be better than talking to people in real life, so it would be better off trying to be different, incorporating the unique experiences provided by technology rather than just trying to ape an IRL conversation.
And hey, if that creeps you out, then you can always take solace in the one advantage that video conferencing has and will always have over going into the office, no matter how often it skips, drops, and disappoints: at least you don’t have to wear pants.
Kevin Lincoln is a writer who lives in Los Angeles.