AI can run your work meetings now

Headroom is one of several apps advertising AI as the solution for your messy virtual/video meetings.

Julian Green was explaining the big problem with meetings when our meeting started to glitch. The pixels of his face rearranged themselves. A sentence came out as hiccups. Then he sputtered, froze, and ghosted.

Green and I had been chatting on Headroom, a new video conferencing platform he and cofounder Andrew Rabinovich launched this fall. The glitch, they assured me, was not caused by their software, but by Green’s Wi-Fi connection. “I think the rest of my street is on homeschool,” he said, a problem that Headroom was not built to solve. It was built instead for other issues: the tedium of taking notes, the coworkers who drone on and on, and the difficulty in keeping everyone engaged. As we spoke, software tapped out a real-time transcription in a window next to our faces. It kept a running tally of how many words each person had said (Rabinovich dominated). Once our meeting was over, Headroom’s software would synthesize the concepts from the transcript; identify key topics, dates, ideas, and action items; and, finally, spit out a record that could be searched at a later time. It would even try to measure how much each participant was paying attention.
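The running word tally is the simplest of these features to picture. A minimal sketch, assuming the transcript arrives as (speaker, utterance) pairs; the function name, the data layout, and the placeholder speakers are hypothetical and say nothing about how Headroom actually implements it:

```python
from collections import Counter


def tally_words(transcript):
    """Count words per speaker from (speaker, utterance) pairs.

    Hypothetical helper for illustration only, not Headroom's code.
    Words are split on whitespace, so punctuation stays attached.
    """
    counts = Counter()
    for speaker, utterance in transcript:
        counts[speaker] += len(utterance.split())
    return counts


# Made-up transcript with placeholder speakers.
transcript = [
    ("Speaker A", "The agenda today has three items"),
    ("Speaker B", "Before we start can I add one more thing to the list"),
    ("Speaker A", "Sure"),
]

totals = tally_words(transcript)
for speaker, words in totals.most_common():
    print(f"{speaker}: {words} words")
```

A real system would have to work out who is speaking and transcribe the audio first; the counting itself is the easy part.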

Meetings have become the necessary evil of the modern workplace, spanning an elaborate taxonomy: daily stand-ups, sit-downs, all-hands, one-on-ones, brown-bags, status checks, brainstorms, debriefs, design reviews. But as time spent in these corporate conclaves goes up, work seems to suffer. Researchers have found that meetings correlate with a decline in workplace happiness, productivity, and even company market share. And in a year when so many office interactions have gone digital, the usual tedium of meeting culture is compounded by the fits and starts of teleconferencing.

Recently, a new wave of startups has emerged to optimize those meetings with, what else, technology. Macro (“give your meeting superpowers”) makes a collaborative interface for Zoom. Mmhmm offers interactive backgrounds and slide-share tools for presenters. Fireflies, an AI transcription tool, integrates with popular video conferencing platforms to create a searchable record of each meeting. And Sidekick (“make your remote team feel close again”) sells a dedicated tablet for video calls.

The idea behind Headroom, which was conceived pre-pandemic, is to improve on both the in-person and virtual problems with meetings, using AI. (Rabinovich used to head AI at Magic Leap.) The use of video conferencing was already on the rise before 2020; this year it exploded, and Green and Rabinovich are betting that the format is here to stay as more companies grow accustomed to having remote employees. Over the last nine months, though, many people have learned firsthand that virtual meetings bring new challenges, like interpreting body language from other people on-screen or figuring out if anyone is actually listening.

“One of the hard things in a videoconference is when someone is speaking and I want to tell them that I like it,” says Green. In person, he says, “you might head nod or make a small aha.” But on a video chat, the speaker might not see if they’re presenting slides, or if the meeting is crowded with too many squares, or if everyone who’s making verbal cues is on mute. “You can’t tell if it’s crickets or if people are loving it.”

Headroom aims to tackle the social distance of virtual meetings in a few ways. First, it uses computer vision to translate approving gestures into digital icons, amplifying each thumbs up or head nod with little emojis that the speaker can see. Those emojis also get added to the official transcript, which is automatically generated by software to spare someone the task of taking notes. Green and Rabinovich say this type of monitoring is made clear to all participants at the start of every meeting, and teams can opt out of features if they choose.

More unusually, Headroom’s software uses emotion recognition to take the temperature of the room periodically, and to gauge how much attention participants are paying to whoever’s speaking. Those metrics are displayed in a window on-screen, designed mostly to give the speaker real-time feedback that can sometimes disappear in the virtual context. “If five minutes ago everyone was super into what I’m saying and now they’re not, maybe I should think about shutting up,” says Green.

Emotion recognition is still a nascent field of AI. “The goal is to basically try to map the facial expressions as captured by facial landmarks: the rise of the eyebrow, the shape of the mouth, the opening of the pupils,” says Rabinovich. Each of these facial movements can be represented as data, which in theory can then be translated into an emotion: happy, sad, bored, confused. In practice, the process is rarely so straightforward. Emotion recognition software has a history of mislabeling people of color; one program, used by airport security, overestimated how often Black men showed negative emotions, like “anger.” Affective computing also fails to take cultural cues into context, like whether someone is averting their eyes out of respect, shame, or shyness.

For Headroom’s purposes, Rabinovich argues that these inaccuracies aren’t as important. “We care less if you’re happy or super happy, so long that we’re able to tell if you’re involved,” says Rabinovich. But Alice Xiang, the head of fairness, transparency, and accountability research at the Partnership on AI, says even basic facial recognition still has problems, like failing to detect when Asian individuals have their eyes open, because the systems are often trained on white faces. “If you have smaller eyes, or hooded eyes, it might be the case that the facial recognition concludes you are constantly looking down or closing your eyes when you’re not,” says Xiang. These sorts of disparities can have real-world consequences as facial recognition software gains more widespread use in the workplace. Headroom is not the first to bring such software into the office. HireVue, a recruiting technology firm, recently introduced emotion recognition software that suggests a job candidate’s “employability,” based on factors like facial movements and speaking voice.

Constance Hadley, a researcher at Boston University’s Questrom School of Business, says that gathering data on people’s behavior during meetings can reveal what is and isn’t working within that setup, which could be useful for employers and employees alike. But when people know their behavior is being monitored, it can change how they act in unintended ways. “If the monitoring is used to understand patterns as they exist, that’s great,” says Hadley. “But if it’s used to incentivize certain types of behavior, then it can end up triggering dysfunctional behavior.” In Hadley’s classes, when students know that 25 percent of the grade is participation, they raise their hands more often, but they don’t necessarily say more interesting things. When Green and Rabinovich demonstrated their software to me, I found myself raising my eyebrows, widening my eyes, and grinning maniacally to change my levels of perceived emotion.

In Hadley’s estimation, when meetings are conducted is just as important as how. Poorly scheduled meetings can rob workers of the time to do their own tasks, and a deluge of meetings can make people feel like they’re wasting time while drowning in work. Naturally, there are software solutions to this, too. Clockwise, an AI time management platform launched in 2019, uses an algorithm to optimize the timing of meetings. “Time has become a shared asset inside a company, not a personal asset,” says Matt Martin, the founder of Clockwise. “People are balancing all these different threads of communication, the velocity has gone up, the demands of collaboration are more intense. And yet, the core of all of that, there’s not a tool for anyone to express, ‘This is the time I need to actually get my work done. Do not distract me!’”

Clockwise syncs with someone’s Google calendar to analyze how they’re spending their time, and how they could do so more optimally. The software adds protective time blocks based on an individual’s stated preferences. It might reserve a chunk of “do not disturb” time for getting work done in the afternoons. (It also automatically blocks off time for lunch. “As silly as that sounds, it makes a big difference,” says Martin.) And by analyzing multiple calendars within the same workforce or team, the software can automatically move meetings like a “team sync” or a “weekly 1×1” into time slots that work for everyone. The software optimizes for creating more uninterrupted blocks of time, when workers can get into “deep work” without distraction.
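The core of the multi-calendar step can be sketched as an interval problem: pool everyone's busy time, merge the overlaps, and keep the gaps long enough to hold the meeting. This is a simplified illustration under stated assumptions, not Clockwise's algorithm. The function name is hypothetical, times are minutes since midnight, and all busy intervals are assumed to fall inside the working day.

```python
def common_free_slots(calendars, day_start, day_end, duration):
    """Return (start, end) windows in which every calendar is free.

    calendars: list of calendars, each a list of (start, end) busy
    intervals in minutes since midnight. Illustrative sketch only.
    """
    # Pool every person's busy intervals and sort them by start time.
    busy = sorted(iv for cal in calendars for iv in cal)

    # Merge overlapping or touching intervals into a combined busy list.
    merged = []
    for start, end in busy:
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])

    # Walk the gaps between busy blocks, keeping those that are long enough.
    free = []
    cursor = day_start
    for start, end in merged:
        if start - cursor >= duration:
            free.append((cursor, start))
        cursor = max(cursor, end)
    if day_end - cursor >= duration:
        free.append((cursor, day_end))
    return free


# Two made-up calendars for a 9:00 to 17:00 day.
alice = [(540, 600), (720, 780)]   # 9:00-10:00, 12:00-13:00
bob = [(570, 660), (840, 900)]     # 9:30-11:00, 14:00-15:00

slots = common_free_slots([alice, bob], day_start=540, day_end=1020, duration=60)
print(slots)
```

A scheduler that optimizes for "deep work" would go further, scoring each candidate slot by how much uninterrupted time it leaves on either side. This sketch only finds the windows in which everyone is free.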

Clockwise just closed an $18 million funding round and says it’s gaining traction in Silicon Valley. So far, it has 200,000 users, most of whom work for companies like Uber, Netflix, and Twitter; about half of its users are engineers. Headroom is similarly courting clients in the tech industry, where Green and Rabinovich feel they best understand the problems with meetings. But it’s not hard to imagine similar software creeping beyond the Silicon Valley bubble. Green, who has school-age children, has been exasperated by parts of their remote learning experience. There are two dozen students in their classes, and the teacher can’t see all of them at once. “If the teacher is presenting slides, they actually can see none of them,” he says. “They don’t even see if the kids have their hands up to ask a question.”

Indeed, the pains of teleconferencing aren’t limited to offices. As more and more interaction is mediated by screens, more software tools will surely try to optimize the experience. Other problems, like laggy Wi-Fi, will be someone else’s to solve.

This story first appeared on wired.com

https://arstechnica.com/?p=1725545