(#021) Coded Data Rebalancing - Abhinav Vaishya

Date & Time: 23-01-2021, 22:15 IST

Abstract

In distributed storage systems, data is usually replicated across several nodes so that it is reliably maintained and readily available to multiple clients. In such systems, nodes can fail or be added, which changes the replication factor. This phenomenon is called Data Skew. The goal of data rebalancing is to correct this Data Skew and reinstate the replication factor. Doing so requires communication between nodes and thus incurs a cost. Coded Communication (communication of linear combinations of data symbols) has the potential to reduce this communication load by a multiplicative factor. In this talk, we will also see how rebalancing can preserve the essential structure of how the data is stored in the system.
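As a toy illustration of why sending linear combinations can reduce communication, consider the minimal sketch below. It is not the scheme from the paper; the segment contents and node roles are hypothetical, and it assumes a broadcast link where a single transmission reaches both receivers. One node holds segment A and needs B, another holds B and needs A, so a single XOR of the two segments serves both.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length segments (a linear combination over GF(2))."""
    if len(a) != len(b):
        raise ValueError("segments must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))

# Hypothetical data segments of equal size.
seg_a = b"AAAA"
seg_b = b"BBBB"

# The sender holds both segments and broadcasts one coded symbol.
coded = xor_bytes(seg_a, seg_b)

# Node 1 already stores seg_a and needs seg_b.
recovered_at_node1 = xor_bytes(coded, seg_a)

# Node 2 already stores seg_b and needs seg_a.
recovered_at_node2 = xor_bytes(coded, seg_b)

# Communication load in bytes: two uncoded transmissions vs one coded broadcast.
uncoded_load = len(seg_a) + len(seg_b)
coded_load = len(coded)
```

In this toy setting the coded broadcast carries half the bytes of the uncoded one, because each receiver cancels the segment it already stores.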

Prerequisites

None

Resources

Paper - Coded Data Rebalancing: Fundamental Limits and Constructions

Talk Slides