Challenge 1: The Delivery Pipe

  • User devices have a persistent connection to the Realtime Platform servers.
  • The servers use server-sent events (SSE) to stream data quickly over this connection; clients consume the stream via the EventSource interface.

A persistent connection is essentially an HTTP long poll, i.e. a regular HTTP connection that the server keeps open instead of closing it after a response.
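
To make this concrete, here is a minimal Scala sketch of the SSE wire format that flows over the long-lived HTTP response and that the browser's EventSource interface parses on the client. The LikeEvent type, the event name, and the JSON shape are illustrative assumptions, not LinkedIn's actual schema.

    // Per the SSE spec, each event is a block of "field: value" lines
    // terminated by a blank line; EventSource parses these on the client.
    final case class LikeEvent(videoId: String, likerId: String)

    def toSse(e: LikeEvent): String =
      s"""event: like
         |data: {"videoId":"${e.videoId}","likerId":"${e.likerId}"}
         |
         |""".stripMargin

    // The server keeps the HTTP response open and writes one such chunk per
    // event; the client receives it via addEventListener("like", ...).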

Final architecture

  1. A viewer likes a video and sends an HTTP request to the Likes backend, which stores the like in a database.
  2. The backend forwards the like to a random Dispatcher node via an HTTP request.
  3. The Dispatcher looks up the key-value store to find out which Frontend nodes are subscribed to likes for that video.
  4. The Dispatcher sends the like to the corresponding Frontend nodes.
  5. The Frontend nodes push the like to the client devices subscribed to that video (a sketch of this fan-out path follows the list).
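
The fan-out in steps 2-5 could look roughly like the sketch below. SubscriptionStore, FrontendClient and Dispatcher are illustrative stand-ins for the real components, not LinkedIn's actual interfaces.

    // Sketch of the dispatch path: Likes backend -> Dispatcher -> Frontend nodes.
    trait SubscriptionStore {
      // videoId -> Frontend nodes that have at least one viewer of that video
      def frontendNodesFor(videoId: String): Set[String]
    }

    trait FrontendClient {
      // deliver an event to one Frontend node over HTTP
      def publish(frontendNode: String, videoId: String, payload: String): Unit
    }

    class Dispatcher(store: SubscriptionStore, frontends: FrontendClient) {
      // Called for every like the Likes backend forwards to this Dispatcher.
      def onLike(videoId: String, payload: String): Unit =
        store.frontendNodesFor(videoId).foreach { node =>
          frontends.publish(node, videoId, payload)
        }
    }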

Challenge 2: Connection Management

  • Each connection is managed by an Akka actor.
  • Actors are so lightweight that there can be millions of them on a single system. Moreover, all of them can be served by a small number of threads, proportional to the number of cores. This is possible because a thread is assigned to an actor only when it has work to do.
  • Actors are managed by an Akka supervisor actor that sends them events (likes, comments etc.) which need to be forwarded to user devices.

Akka is a toolkit for building highly concurrent, distributed, and resilient message-driven apps.
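
Below is a minimal sketch of the actor-per-connection idea using classic Akka actors; the Publish and Register messages and the writeToConnection callback are assumptions for illustration, not LinkedIn's code.

    import akka.actor.{Actor, ActorRef}

    final case class Publish(payload: String)
    final case class Register(connection: ActorRef)

    // One lightweight actor per persistent connection. It is only assigned a
    // thread when a message arrives, which is why millions can coexist per node.
    class ConnectionActor(writeToConnection: String => Unit) extends Actor {
      def receive: Receive = {
        case Publish(payload) => writeToConnection(payload) // push onto this client's SSE stream
      }
    }

    // The supervisor keeps track of its connection actors and forwards each
    // incoming event (like, comment, ...) to them.
    class ConnectionSupervisor extends Actor {
      private var connections = Set.empty[ActorRef]

      def receive: Receive = {
        case Register(conn) => connections += conn
        case event: Publish => connections.foreach(_ ! event)
      }
    }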

Performance and scale

  • Each Frontend node handles 100k persistent connections. The limit is this low because each server also does substantial work processing multiple types of data (likes, comments, instant messaging, etc.).
  • Each Dispatcher can publish 5k events per second to the Frontend nodes.
  • End-to-end publish latency is 75ms at p90, from the moment a like is received until it is sent out to a viewer. The path involves only two lookups in the subscription tables plus a series of network calls.
  • The system is completely horizontally scalable. You can add more Dispatchers and Frontend nodes to handle more viewers and more events.

The Realtime Platform

LinkedIn has built the Realtime Platform to distribute multiple types of data in real-time such as:

  • Likes, comments and viewer count for Live Videos
  • Typing indicators and Read receipts for Instant Messaging
  • Presence, i.e. the green online indicators

Their goal is to increase user engagement by enabling dynamic, instant experiences between users, such as likes, comments, polls, and discussions.

Related posts

  • How LinkedIn displays Presence indicators in real-time: https://engineering.linkedin.com/blog/2018/01/now-you-see-me--now-you-dont--linkedins-real-time-presence-platf
  • How LinkedIn measures end-to-end latency across systems: https://engineering.linkedin.com/blog/2018/04/samza-aeon--latency-insights-for-asynchronous-one-way-flows
  • How LinkedIn scaled one server to handle hundreds of thousands of persistent connections: https://engineering.linkedin.com/blog/2016/10/instant-messaging-at-linkedin--scaling-to-hundreds-of-thousands-

Challenge 5: 100 Likes/second

  • Scale horizontally again to handle more events -> Add multiple Dispatchers and move the subscription table into a key-value store so it's accessible to all Dispatchers.
  • Dispatchers are independent from Frontend nodes and don't have persistent connections between them.
  • Any Frontend node can subscribe to any Dispatcher.
  • Any Dispatcher can publish events to any Frontend node (a sketch of the shared subscription table follows this list).
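
One way to picture the shared subscription table is as a key-value mapping from a video ID to the set of Frontend nodes with viewers of that video. The KeyValueStore trait and key format below are assumptions for illustration.

    // The table lives in an external key-value store, so every Dispatcher
    // reads and writes the same subscriptions.
    trait KeyValueStore {
      def addToSet(key: String, member: String): Unit
      def getSet(key: String): Set[String]
    }

    class SharedSubscriptions(kv: KeyValueStore) {
      // Frontend nodes register interest in a video here (via any Dispatcher).
      def subscribe(videoId: String, frontendNode: String): Unit =
        kv.addToSet(s"subscribers:$videoId", frontendNode)

      // Any Dispatcher can answer "which Frontend nodes need this event?"
      def frontendNodesFor(videoId: String): Set[String] =
        kv.getSet(s"subscribers:$videoId")
    }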

Challenge 4: 10K Concurrent Viewers

  • Scale horizontally to handle more concurrent viewers -> Add multiple Frontend nodes and coordinate them using a Dispatcher node.
  • Similar to a Frontend node, the Dispatcher keeps a subscriptions table that records which Frontend nodes should receive which events.
  • This table is populated when Frontend nodes send subscription requests to tell the Dispatcher which live videos they're interested in (i.e. which live videos their connections are subscribed to); a sketch follows this list.
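
A rough sketch of how a Frontend node might issue those subscription requests, assuming it only tells the Dispatcher about a video when the first of its local connections starts watching it (DispatcherClient and nodeId are illustrative names):

    trait DispatcherClient {
      // subscription request: "send me events for this video"
      def subscribe(videoId: String, frontendNodeId: String): Unit
    }

    class FrontendSubscriber(dispatcher: DispatcherClient, nodeId: String) {
      private var localViewers = Map.empty[String, Int] // videoId -> local connection count

      def onViewerSubscribed(videoId: String): Unit = {
        val count = localViewers.getOrElse(videoId, 0)
        if (count == 0)
          dispatcher.subscribe(videoId, nodeId) // first local viewer: register with the Dispatcher
        localViewers += videoId -> (count + 1)
      }
    }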

Interactive Live Videos

Having a lot of people interact on live videos poses many technical challenges, mainly because viewers generate a large volume of interactions that must be delivered quickly.

To get a sense of the scale, the top live streams in the world gathered millions of concurrent users:

  1. Cricket World Cup Semifinal 2019 - 25M concurrent viewers
  2. British Royal Wedding 2018 - 18M concurrent viewers

Challenge 3: Multiple Live Videos

  • Clients subscribe to events for a particular live video, i.e. they tell the server which live video they are watching.
  • The Frontend server stores all subscriptions in an in-memory table.
  • Every time a new event is published, the supervisor actor looks up the in-memory table to determine which actors need to receive the event (see the sketch after this list).
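
A minimal sketch of such an in-memory subscriptions table, keyed by video ID and holding the connection actors watching that video (the exact structure is an assumption based on the description above):

    import akka.actor.ActorRef

    class InMemorySubscriptions {
      // videoId -> connection actors whose clients are watching that video
      private var table = Map.empty[String, Set[ActorRef]]

      def subscribe(videoId: String, connection: ActorRef): Unit =
        table += videoId -> (table.getOrElse(videoId, Set.empty) + connection)

      // Used by the supervisor actor for every incoming event: only the actors
      // subscribed to this video receive it.
      def actorsFor(videoId: String): Set[ActorRef] =
        table.getOrElse(videoId, Set.empty)
    }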

Bonus Challenge: Multiple data-centers

Expanding to other regions requires two steps:

  1. Replicate the setup in each data-center.
  2. Have the Dispatcher broadcast events to its peer Dispatchers in the other data-centers (a sketch of this cross-data-center fan-out follows).
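
A hedged sketch of that cross-data-center broadcast, assuming each Dispatcher forwards locally originated events to one peer Dispatcher per remote data-center and never re-forwards events that arrived from a peer (to avoid loops); the names are illustrative:

    trait PeerDispatcher {
      def publish(videoId: String, payload: String): Unit
    }

    class CrossDcDispatcher(localFanout: (String, String) => Unit,
                            peers: Seq[PeerDispatcher]) {
      // fromPeer = true when the event was relayed by another data-center's Dispatcher.
      def onEvent(videoId: String, payload: String, fromPeer: Boolean): Unit = {
        localFanout(videoId, payload) // deliver to local Frontend nodes as usual
        if (!fromPeer)
          peers.foreach(_.publish(videoId, payload)) // broadcast to peer data-centers once
      }
    }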
