Processing math: 45%

Rylan Schaeffer

Logo
Resume
Research
Learning
Blog
Teaching
Jokes
Kernel Papers


Chebyshev Inequality

Let X be a real valued random variable. α>0,

P[(XE[X])2α]V[X]α

Alternatively, c>0

P[(XE[X])2cV[X]]1c
Proof: Apply Markov’s inequality to the standard deviation $$ X - \mathbb{E}[X] $$:
P[XE[X]α]P[(XE[X])2α2]E[|XE[X]|]α

Comparison with other inequalities

Chebyshev’s is great when you have a sum of pairwise independent things because then V[nXn]=nV[Xn]

Examples

Pairwise Independent Hashing

Let p be prime and consider a hash family H={hab:aZkp,bZp} where hab:ZkpZp given by h_{a, b} = a \cdot x + b \mod p. Fix x_1, ..., x_N \in \mathbb{Z}_p^k and pick y \in \mathbb{Z}_p. What is the probability, over h \in H drawn uniformly at random, that h(x_i) = y for more than (1/p + \epsilon)N values of i? In other words, assuming we choose elements from the hash family uniformly at random, what’s the probability that multiple elements map to the same value y?

Define X_i = \mathbb{I}\{h(x_i) = y\} and X = \sum X_i. Note that \mathbb{E}[X] = \frac{N}{P}. Then:

\mathbb{P}[X \geq (1/p + \epsilon)N] \leq \frac{\mathbb{V}[X]}{\epsilon^2 n^2} = \frac{\sum (1/p)(1 - 1/p)}{\epsilon^2 n^2}