How LinkedIn Suggests People You May Know
You open LinkedIn. In the right sidebar, you see: People You May Know. And there they are — your old college classmate, your previous coworker, someone who works in the same industry across town. How does LinkedIn know?
Let us pull back the curtain. This is one of the most fascinating recommendation systems in tech, and it drives more than 50% of all new connections on LinkedIn.
Source: This explanation draws from LinkedIn Engineering's article on PYMK (People You May Know) and their published research papers at WWW'13 and KDD'14.
It Starts With a Graph
LinkedIn models everything as a graph. Think of it like a giant web:
- Each person is a dot (node)
- Each connection is a line (edge)
- Companies, schools, and skills are also dots connected to you
If you are connected to Alice, and Alice is connected to Bob, then there is a path from you to Alice to Bob. That makes Bob a second-degree connection. And that is where the magic begins.
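The path from you to Alice to Bob can be made concrete with a small sketch. It assumes an in-memory adjacency map and hypothetical helper names (`build_graph`, `second_degree`). It only illustrates the idea and does not show how LIquid stores or queries data.

```python
from collections import defaultdict

def build_graph(edges):
    """Build an undirected adjacency map from (person, person) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def second_degree(graph, member):
    """Return the people exactly two hops away from `member`."""
    first = graph.get(member, set())
    candidates = set()
    for friend in first:
        candidates |= graph.get(friend, set())
    # Drop the member and anyone who is already a direct connection.
    return candidates - first - {member}

# Hypothetical sample data.
edges = [
    ("you", "alice"),
    ("alice", "bob"),
    ("you", "carol"),
    ("carol", "bob"),
    ("bob", "dave"),
]
graph = build_graph(edges)
print(sorted(second_degree(graph, "you")))  # ['bob']
```

Dave is three hops away here, so he does not appear until you connect with Bob.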
LinkedIn built a custom graph database called LIquid that powers this. It handles over 270 billion relationships and processes 2 million queries per second. That is serious engineering.
The Core Algorithm: Friends-of-Friends
The simplest way LinkedIn finds suggestions is triangle closing.
If you know Alice, and Alice knows Bob, then maybe you know Bob too. LinkedIn scores each possible connection using hundreds of signals:
| Signal | What It Means |
|--------|---------------|
| Mutual connections | The more you share, the higher the score |
| Same company | Did you work at the same place? |
| Same school | Classmates or alumni |
| Same industry | Related roles are 5x more likely to be suggested |
| Location | People nearby are more relevant |
| Age difference | You are more likely to know people your own age |
| Profile views | If you view someone and they view you back, that is a strong signal |
LinkedIn feeds all these signals into a logistic regression model — a machine learning model that predicts how likely it is that Person A knows Person B.
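As a rough illustration of how a logistic regression turns signals into a probability, the sketch below applies a sigmoid to a weighted sum of features. The feature names, weights, and bias are invented for the example. A real model would learn its weights from data and use far more signals.

```python
import math

# Hypothetical weights, chosen by hand for illustration only.
WEIGHTS = {
    "mutual_connections": 0.6,
    "same_company": 1.2,
    "same_school": 0.8,
    "same_industry": 0.5,
}
BIAS = -4.0

def connection_probability(features):
    """Logistic regression: sigmoid of the bias plus a weighted sum of features."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

strong = {"mutual_connections": 5, "same_company": 1, "same_school": 0, "same_industry": 1}
weak = {"mutual_connections": 1, "same_company": 0, "same_school": 0, "same_industry": 0}

print(connection_probability(strong) > connection_probability(weak))  # True
```

The output is always between 0 and 1, which makes it easy to rank candidates by how likely the two people are to know each other.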
Key insight: Not all overlaps are equal. Working together at a 10-person startup for 5 years is a much stronger signal than working at the same megacorp for 2 months. LinkedIn published a research paper on this at WWW'13.
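To see why the startup case outranks the megacorp case, here is a deliberately simple toy score. The formula (months together divided by organization size) and the function name `overlap_strength` are assumptions made for this example. This is not the model from the WWW'13 paper.

```python
def overlap_strength(months_together, org_size):
    """Toy score: grows with time spent together, shrinks with organization size."""
    if months_together <= 0 or org_size < 2:
        return 0.0
    return months_together / org_size

# Five years at a 10-person startup vs. two months at a 100,000-person company.
startup = overlap_strength(months_together=60, org_size=10)
megacorp = overlap_strength(months_together=2, org_size=100_000)

print(startup > megacorp)  # True
```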
The ML Layer: Learning From Your Behavior
Here is where it gets smart. LinkedIn watches what you do with suggestions:
- If you connect, the model learns that this suggestion was good and shows more like it.
- If you ignore it, the model learns that this was not relevant and lowers its score.
- If you click "I don't know this person", the model learns that this was wrong and stops showing it.
This is called impression discounting (published at KDD'14). Every suggestion you see and ignore gets its score reduced, so it gradually drops down the list. Because suggestions are recomputed daily, the ranking keeps adjusting to real user behavior.
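A minimal sketch of the discounting idea follows. It assumes a fixed exponential decay per ignored impression and hypothetical names (`discounted_score`, `rerank`). The KDD'14 work derives its discount from observed behavior, so treat the constant 0.8 as a placeholder.

```python
def discounted_score(base_score, ignored_impressions, decay=0.8):
    """Shrink the score once for every time the suggestion was shown and ignored."""
    return base_score * decay ** ignored_impressions

def rerank(candidates, impressions, dismissed=frozenset()):
    """Order candidates by discounted score, dropping explicitly dismissed ones."""
    ranked = []
    for member, score in candidates.items():
        if member in dismissed:
            continue  # "I don't know this person" removes the suggestion
        ranked.append((discounted_score(score, impressions.get(member, 0)), member))
    ranked.sort(reverse=True)
    return [member for _, member in ranked]

# Hypothetical scores and interaction history.
candidates = {"bob": 0.9, "dave": 0.7, "erin": 0.6}
impressions = {"bob": 3}   # Bob was shown three times and ignored each time
dismissed = {"erin"}       # Erin was marked as unknown

print(rerank(candidates, impressions, dismissed))  # ['dave', 'bob']
```

Bob starts with the highest score, but three ignored impressions push him below Dave.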
Scale: Processing Terabytes Every Day
At LinkedIn's scale (1+ billion members), PYMK is a massive engineering challenge:
- Daily data processed: hundreds of terabytes
- Potential connections evaluated: hundreds of billions
- Refresh cycle: every single day
- Data pipeline: Apache Kafka for streaming, Azkaban for workflow management, and Voldemort for fast data serving
The system first pre-computes suggestions in an offline batch, then serves them instantly when you open LinkedIn. If you need real-time suggestions after connecting with someone new, the online system queries LIquid directly.
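The split between precomputed and real-time suggestions can be sketched as a small service that serves batch results when they are fresh and falls back to a live graph lookup otherwise. The class `SuggestionService`, its methods, and the staleness rule are all assumptions for illustration. A dict stands in for the key-value store and an adjacency map stands in for the graph database.

```python
from collections import defaultdict

class SuggestionService:
    """Serve batch results when fresh, otherwise run a live graph query."""

    def __init__(self, precomputed, edges):
        self.precomputed = dict(precomputed)  # stand-in for the offline store
        self.graph = defaultdict(set)         # stand-in for the online graph
        self.stale = set()                    # members whose batch results are outdated
        for a, b in edges:
            self.graph[a].add(b)
            self.graph[b].add(a)

    def connect(self, a, b):
        """Record a new connection and mark both members for live queries."""
        self.graph[a].add(b)
        self.graph[b].add(a)
        self.stale.update((a, b))

    def _live_query(self, member):
        """Compute second-degree connections directly from the graph."""
        first = self.graph[member]
        found = set()
        for friend in first:
            found |= self.graph[friend]
        return sorted(found - first - {member})

    def suggestions(self, member):
        if member in self.stale or member not in self.precomputed:
            return self._live_query(member)
        return self.precomputed[member]

service = SuggestionService(
    precomputed={"you": ["bob"]},
    edges=[("you", "alice"), ("alice", "bob"), ("bob", "dave")],
)
print(service.suggestions("you"))  # ['bob'] from the batch results
service.connect("you", "bob")
print(service.suggestions("you"))  # ['dave'] from the live query
```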
Key Takeaways
- Graphs are everywhere — LinkedIn's entire product runs on a graph database.
- Mutual connections are king — triangle closing is the number one signal.
- Every action trains the model — your clicks, ignores, and connections teach the algorithm.
- Scale changes everything — processing hundreds of billions of potential connections daily requires custom infrastructure.
Next time you see "People You May Know", remember that there is a graph database processing billions of relationships, a machine learning model learning from your every click, and an engineering team processing terabytes of data daily to make that one little box appear.