Experts debate the ethics of LinkedIn’s algorithm experiments on 20M users

This month, LinkedIn researchers revealed in Science that the company spent five years quietly researching more than 20 million users. By tweaking the professional networking platform’s algorithm, researchers were trying to determine through A/B testing whether users end up with more job opportunities when they connect with known acquaintances or complete strangers.

To weigh the strength of connections between users as weak or strong, acquaintance or stranger, the researchers analyzed factors like the number of messages they sent back and forth or the number of mutual friends they shared, gauging how these factors changed over time after connecting on the social media platform. The researchers’ discovery confirmed what they describe in the study as “one of the most influential social theories of the past century” about job mobility: The weaker the ties users have, the better the job mobility. While LinkedIn says these results will lead to changes in the algorithm to recommend more relevant connections to job searchers as “People You May Know” (PYMK) moving forward, The New York Times reported that ethics experts said the study “raised questions about industry transparency and research oversight.”

Among experts’ biggest concerns was that none of those millions of users LinkedIn analyzed were directly informed they were participating in the study—which “could have affected some people’s livelihoods,” NYT’s report suggested.

Michael Zimmer, an associate professor of computer science and the director of the Center for Data, Ethics, and Society at Marquette University, told NYT that “the findings suggest that some users had better access to job opportunities or a meaningful difference in access to job opportunities.”

LinkedIn clarifies A/B testing concerns

A LinkedIn spokesperson told Ars that the company disputes this characterization of its research, saying that nobody was disadvantaged by the experiments. The spokesperson added that since NYT published its report, the company has been fielding questions due to “a lot of inaccurate representation of the methodology” of its study.

The study’s co-author and LinkedIn data scientist, Karthik Rajkumar, told Ars that reports like NYT’s conflate “the A/B testing and the observation nature of the data,” making it “feel more like experimentation on people, which is inaccurate.”

Rajkumar said the study came about because LinkedIn noticed the algorithm was already recommending a larger number of connections with weaker ties to some users and a larger number of stronger ties to others. “Our A/B testing of PYMK was for the purpose of improving relevance of connection recommendations, and not to study job outcomes,” Rajkumar told Ars. Instead, his team’s objective was to find out “which connections matter most to access and secure jobs.”

Although the name “A/B testing” suggests a comparison of two options, the researchers did not simply test a pair of algorithms, one generating weak ties and the other strong ties. Rather, the study experimented with seven different “treatment variants” of the algorithm, and different variants yielded different results, such as users forming fewer weak ties, creating more ties, creating fewer ties, or making the same number of weak or strong ties. Two variants, for example, caused users to form more ties in general, including more weak ties, while another variant led users to form fewer ties in general, including fewer weak ties. One variant led to more ties, but only strong ties.

“We don’t randomly vary the proportion of weak and strong contacts suggested by PYMK,” a LinkedIn spokesperson told Ars. “We strive to make better recommendations to people, and some algorithms happen to recommend more weak ties than others. Because some people end up getting the better algorithms a week or two earlier than others during the test period, this creates enough variation in the data for us to apply the observational causal methods to analyze them. No one is being experimented on to observe job outcomes.”

https://arstechnica.com/?p=1884675