
What LinkedIn learned leveraging LLMs for its billion users

Feature
Apr 25, 2024 | 9 mins
Generative AI | Software Development | Technology Industry

The social media giant turned to generative AI to improve its member services. Here’s an inside look at what its engineers learned about leveraging LLMs for business results.


With more than 1 billion users globally, LinkedIn is continuously bumping against the limits of what is technically feasible in the enterprise today. Few companies operate at the scale that LinkedIn does or have access to similar troves of data.

For the business- and employment-focused social media platform, connecting qualified candidates with potential employers to help fill job openings is core business. So too is ensuring that post feeds across the platform are relevant to the members who consume them. And at LinkedIn scale, those matchmaking processes have always relied on technology.

During the summer of 2023, at the height of the first wave of interest in generative AI, LinkedIn began to wonder whether matching candidates with employers and making feeds more useful would be better served with the help of large language models (LLMs).

So the social media giant launched a generative AI journey and is now reporting the results of its experience leveraging Microsoft’s Azure OpenAI Service. CIOs in every vertical can take a tip or two from the lessons LinkedIn learned along the way.

Fits and starts

As most CIOs have experienced, embracing emerging technologies comes with its share of experimentation and setbacks. For LinkedIn, this was no different, as its road to LLM insights was anything but smooth, said LinkedIn’s Juan Bottaro, a principal software engineer and tech lead.

The initial deliverables “felt lacking,” Bottaro said. “Not enough dots were being connected.”

Those first waves of hype around generative AI didn’t help.

“LLM was the new thing and it felt like it could solve everything,” Bottaro said. “We didn’t start with a very clear idea of what an LLM could do.”

For example, an early version of the revised job-matching effort was rather, for lack of a better word, rude. Or at least overly blunt.

“Clicking ‘Assess my fit for this job’ and getting ‘You are a terrible fit’ isn’t very useful,” Bottaro said. “We wanted [responses] to be factual but also empathetic. Some members may be contemplating a career change into fields where they currently do not have a strong fit and need help understanding what are the gaps and the next steps.”

So, among LinkedIn’s first lessons was the importance of tuning LLMs to audience expectations — and helping the LLMs understand how to be perhaps not human, but at least humane, in their responses.

A question of speed

Even though LinkedIn has more than a billion members, most of the job search features that would rely on LinkedIn’s LLM work were to be targeted initially to Premium members, a far smaller group. (LinkedIn declined to comment on how many Premium members it has.) 

When operating at that scale, speed can be essential, especially for something as nuanced as matching candidates with relevant job openings. Here, LinkedIn believed an LLM would help, as an oft-touted benefit of LLMs is their speed, enabling them to complete complex steps rapidly. That was not the case with LinkedIn’s deployment, Bottaro said.

“I wouldn’t characterize LLMs as fast. I wouldn’t characterize speed as an advantage,” he said.

But speed can be defined in multiple ways. Even though operationally the LLM may not have been as fast as hoped, Bottaro said the acceleration of the overall deployment process was astounding. “The superpower of this new technology is that you can create prototypes very fast, somewhere between two and three months. That was not possible before this technology,” he said. 

Asked how long the various facets of the project would have taken without an LLM, Bottaro said some couldn’t have been accomplished at all, while for other elements, “maybe it would have taken several years.”

As an example, Bottaro referenced the part of the system designed to understand intent. Without an LLM, that might have taken two to three months, he said, but the LLM mastered it “in less than a week.”

Cost considerations

One aspect that Bottaro dubbed “a hurdle” was the cost. Again, cost can mean different things in different phases of a project, as LinkedIn’s experience shows.

“The amount that we spent to be able to develop was negligible,” Bottaro said. But when it came to delivering data to LinkedIn’s customers, costs exploded.

“Even if it was just going to a couple million members,” said Bottaro, potentially hinting at the number of Premium users, the pricing soared. That is because LLM pricing — at least the licensing deal that LinkedIn struck with Microsoft, its LLM provider and parent company — was based on usage, specifically input and output tokens.

One AI vendor CEO, Tarun Thummala, explains in a LinkedIn post unrelated to this project that LLM input and output tokens are roughly equivalent to 0.75 of a word. LLM vendors typically sell tokens by the thousands or millions. Azure OpenAI, which LinkedIn uses, charges $30 for every 1 million 8K GPT-4 input tokens and $60 for every 1 million 8K GPT-4 output tokens out of its East US region, for example.
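To make that usage-based pricing concrete, here is a minimal back-of-the-envelope sketch using the GPT-4 8K rates cited above. The request volume and per-call token counts are illustrative assumptions, not LinkedIn figures.

```python
# Cost sketch using the Azure OpenAI GPT-4 8K prices cited above
# ($30 per 1M input tokens, $60 per 1M output tokens, East US).
# Request volume and token counts are illustrative assumptions only.

INPUT_PRICE_PER_M = 30.00    # USD per 1 million input tokens
OUTPUT_PRICE_PER_M = 60.00   # USD per 1 million output tokens

def monthly_cost(requests: int, input_tokens: int, output_tokens: int) -> float:
    """Estimate monthly spend for a given number of LLM calls."""
    input_cost = requests * input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = requests * output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost

# Example: 2 million members each triggering one "assess my fit" call a month,
# with ~2,000 input tokens (profile + job post) and ~500 output tokens per call.
print(f"${monthly_cost(2_000_000, 2_000, 500):,.0f} per month")  # -> $180,000 per month
```

At a couple million requests a month, even modest per-call token counts quickly push spend into six figures, which is why development costs looked negligible next to serving costs.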

Evaluation challenges

Another functionality goal LinkedIn had for its project was automatic evaluation. LLMs are notoriously challenging to assess in terms of accuracy, relevancy, safety, and other concerns. Leading organizations, and LLM makers, have been attempting to automate some of this work, but according to LinkedIn, such capabilities are “still a work in progress.”

Without automated evaluation, LinkedIn reports that “engineers are left eye-balling results and testing on a limited set of examples” and must wait more than a day to see metrics.

The company is building model-based evaluators to help estimate key LLM metrics, such as overall quality score, hallucination rate, coherence, and responsible AI violations. Doing so will enable faster experimentation, the company’s engineers say; they have had some success with hallucination detection but haven’t yet finished work in that area.
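For readers unfamiliar with the approach, a model-based evaluator typically means asking a second “judge” model to score generated responses against a rubric. The sketch below illustrates that general pattern only; the rubric, model name, and scoring scale are assumptions, and this is not LinkedIn’s evaluator.

```python
# Minimal sketch of an LLM-as-judge evaluator (illustrative, not LinkedIn's).
# Requires the `openai` package and an OPENAI_API_KEY in the environment.
import json
from openai import OpenAI

client = OpenAI()

RUBRIC = (
    "You are an evaluator. Given a member question, the source facts, and a "
    "generated answer, return JSON with two fields: "
    "'quality' (1-5, overall usefulness) and "
    "'hallucination' (true if the answer states anything unsupported by the facts)."
)

def evaluate(question: str, facts: str, answer: str) -> dict:
    """Ask a judge model to score one generated response."""
    response = client.chat.completions.create(
        model="gpt-4",  # assumed judge model
        messages=[
            {"role": "system", "content": RUBRIC},
            {"role": "user", "content": f"Question: {question}\nFacts: {facts}\nAnswer: {answer}"},
        ],
        temperature=0,
    )
    # The rubric asks for JSON; a production evaluator would validate this output.
    return json.loads(response.choices[0].message.content)
```

Running scores like these over every experiment is what replaces the “eye-balling” and day-long waits the engineers describe.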

Data quality

Part of the struggle LinkedIn experienced with its job match effort boils down to a data quality issue from both sides: employers and potential employees.

LLMs can work only with what they are given, and sometimes job posts are not precise or comprehensive about the skills an employer seeks. On the flip side, some job applicants post poorly phrased histories that do not effectively reflect their wealth of experience in problem-solving, for example.

Here, Bottaro sees potential for LLMs to help employers and potential employees alike. By improving what employers and LinkedIn users write, both sides could benefit, as the company’s job-matching LLM could work far more effectively when its data inputs are higher quality.

User experience

When dealing with such a large member base, accuracy and relevancy metrics can “give a false sense of comfort,” Bottaro said. For example, if the LLM “gets it right 90% of the time, that means that one out of ten would have a horrible experience,” he said.

Making this deployment more difficult is the extreme level of nuance and judgment involved in delivering useful, helpful, and accurate answers.

“How do you agree what is good and bad? We spent a lot of time with linguists to make guidelines about how to deliver a thorough representation. We did a lot of user studies, too,” Bottaro said. “How do you train the person to write the right response? How do you define the task, specify how the response should look? The product may be trying to be constructive or helpful. It is not trying to assume too much because that is where hallucinations start. The consistency of the responses is something we are super proud of.”

Operating in real-time

LinkedIn’s sheer scale presents another challenge for job matches. With a billion members, a job ad can have hundreds — if not thousands — of responses within a few minutes of posting. Many job applicants won’t bother to apply if they see that hundreds have already applied. That puts the onus on the LLM to very quickly find members that are matches before less qualified applicants submit their material. After that, it’s still a matter of that member seeing the notification and reacting to it quickly enough.

On the employer side, the struggle is finding the applicants that are the best fit — not necessarily those who respond the quickest. The reluctance of some companies to post salary ranges further complicates both efforts as the best qualified applicants may not be interested in a role depending on its compensation. That is a problem that an LLM can’t fix.

APIs and RAG

LinkedIn’s massive trove of data includes a lot of unique information about individuals, employers, skills, and coursework, which its LLMs have not been trained on. As a result, its LLMs are unable to use these assets, as they are currently stored and delivered, for any of their reasoning or response-generating activities, according to LinkedIn’s engineers.

Here, retrieval augmented generation (RAG) is a typical work-around. By setting up a pipeline of internal APIs, enterprises can ‘augment’ LLM prompts with additional context to better guide — and guardrail — the LLM’s response. Much of LinkedIn’s data is exposed via RPC APIs, something the company’s engineers say is “convenient for humans to invoke programmatically,” but “is not very LLM friendly.”
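The RAG pattern itself is straightforward: fetch relevant records from an internal service, then prepend them to the prompt so the model answers from data it was never trained on. The sketch below shows that shape only; the endpoint and function names are hypothetical placeholders, not LinkedIn’s actual APIs.

```python
# Minimal RAG sketch: retrieve context from an internal API, then build an
# augmented prompt. Endpoint and function names are hypothetical placeholders.
import requests

def fetch_member_profile(member_id: str) -> dict:
    """Hypothetical internal API call that returns a member's profile as JSON."""
    resp = requests.get(f"https://internal.example.com/profiles/{member_id}", timeout=5)
    resp.raise_for_status()
    return resp.json()

def build_augmented_prompt(member_id: str, job_description: str) -> str:
    """Combine retrieved profile data with the task instructions."""
    profile = fetch_member_profile(member_id)
    return (
        "Using only the profile and job description below, assess the member's "
        "fit for the role. Be factual and empathetic; note gaps and next steps.\n\n"
        f"Profile: {profile}\n\nJob description: {job_description}"
    )
```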

To solve this issue, LinkedIn’s engineers “wrapped skills” around its APIs to give them an “LLM friendly description of what the API does and when to use it,” as well as configuration details, input and output schema, and all the necessary logic to map the LLM version of each API to its underlying (actual) RPC one, LinkedIn said.

“Skills like this enable the LLM to do various things relevant to our product like view profiles, search articles/people/jobs/companies and even query internal analytics systems,” the company’s engineers wrote in a statement. “The same technique is also used for calling non-LinkedIn APIs like Bing search and news.”
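In practice, a “skill” of this kind pairs an LLM-friendly description and input/output schema with a dispatcher that maps the model’s call back to the underlying RPC client. The sketch below illustrates that idea under assumed names and schemas; it is not LinkedIn’s actual skill definition.

```python
# Illustrative "skill" wrapper: an LLM-friendly description plus schemas, and a
# dispatcher mapping the LLM's call to an underlying (stubbed) RPC client.
# All names and schemas are assumptions, not LinkedIn's definitions.

SEARCH_JOBS_SKILL = {
    "name": "search_jobs",
    "description": "Search current job postings. Use when the member asks about open roles.",
    "input_schema": {
        "type": "object",
        "properties": {
            "keywords": {"type": "string", "description": "Role or skill keywords"},
            "location": {"type": "string", "description": "City or region (optional)"},
        },
        "required": ["keywords"],
    },
    "output_schema": {"type": "array", "items": {"type": "object"}},
}

class JobsRpcClient:
    """Stand-in for the real RPC service client behind the skill."""
    def search(self, query: str, location: str | None = None) -> list[dict]:
        # A real client would call the RPC service; return stub data here.
        return [{"title": f"Example role matching '{query}'", "location": location or "Remote"}]

jobs_rpc_client = JobsRpcClient()

def call_skill(name: str, arguments: dict) -> object:
    """Map the LLM-facing skill call to the underlying RPC client."""
    if name == "search_jobs":
        return jobs_rpc_client.search(
            query=arguments["keywords"],
            location=arguments.get("location"),
        )
    raise ValueError(f"Unknown skill: {name}")
```

The schema and description are what the model sees; the dispatcher is the “necessary logic” that translates its calls into the actual RPC invocations.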

Contributing Columnist

Evan Schuman has covered IT issues for a lot longer than he'll ever admit. The founding editor of retail technology site StorefrontBacktalk, he's been a columnist for CBSNews.com, RetailWeek, Computerworld and eWeek and his byline has appeared in titles ranging from BusinessWeek, VentureBeat and Fortune to The New York Times, USA Today, Reuters, The Philadelphia Inquirer, The Baltimore Sun, The Detroit News and The Atlanta Journal-Constitution. Evan can be reached at eschuman@thecontentfirm.com and he can be followed at twitter.com/eschuman. Look for his blog twice a week.

The opinions expressed in this blog are those of Evan Schuman and do not necessarily represent those of IDG Communications, Inc., its parent, subsidiary or affiliated companies.
