Tech Talk: Distributed Snapshots

This video describes the Chandy-Lamport algorithm, which can be used to take a consistent snapshot of the global state of a distributed system. I’ll also describe the simplified special case of it that’s used in Jet.

About this course

Fault tolerance can be a reason to choose a distributed system even when a single machine could handle the expected load: a distributed system can tolerate the failure of some of its parts, while a system running on a single machine cannot. How can a stream-processing engine guarantee exactly-once semantics? I’ll describe the Chandy-Lamport algorithm, which can be used to take a consistent snapshot of the global state of a distributed system. I’ll also describe the simplified special case of it that’s used in Jet.

Moderator: Vladimir Schreiner
Presenter: Viliam Durina
