Average distance of two random k-mers

#math #distance

Intuitively, I've always thought that the average distance between two random k-mers, seen as integers in $[0, 4^k)$, is equal to $4^k / 2$, but it turns out that this intuition is wrong.

To prove it, let's compute the average of $f : (x, y) \mapsto |x - y|$ over $[0, u]^2$ for a fixed $u$: $$\begin{aligned} & \frac{1}{u^2} \int_0^u \int_0^u |x - y| \,\mathrm{d}y \,\mathrm{d}x \\ &= \frac{1}{u^2} \int_0^u \left(\int_0^x (x - y) \,\mathrm{d}y + \int_x^u (y - x) \,\mathrm{d}y\right) \mathrm{d}x \\ &= \frac{1}{u^2} \int_0^u \left(\frac{x^2}{2} + \frac{(u - x)^2}{2}\right) \mathrm{d}x \\ &= \frac{1}{u^2} \int_0^u \left(x^2 - ux + \frac{u^2}{2}\right) \mathrm{d}x \\ &= \frac{1}{u^2} \left(\frac{u^3}{3} - \frac{u^3}{2} + \frac{u^3}{2}\right) \\ &= \frac{u}{3} \end{aligned}$$

In particular, for $u = 4^k$ this gives an average distance of $4^k / 3$.
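As a quick sanity check, here is a small Python sketch that computes the exact discrete average for small $k$. It assumes that a k-mer is encoded as an integer in $\{0, \dots, 4^k - 1\}$ and that the distance is the absolute difference of the two encodings; the function name `mean_abs_difference` is made up for this note.

```python
from fractions import Fraction


def mean_abs_difference(n: int) -> Fraction:
    """Exact mean of |x - y| for x, y uniform and independent in {0, ..., n - 1}."""
    # A difference d > 0 is realised by (n - d) pairs with x > y,
    # and by the same number of pairs with x < y.
    total = sum(2 * d * (n - d) for d in range(1, n))
    return Fraction(total, n * n)


# Assumption: k-mers are encoded as integers in [0, 4^k).
for k in range(1, 7):
    n = 4 ** k
    exact = mean_abs_difference(n)
    # Compare the exact discrete mean with the continuous estimate n / 3.
    print(k, float(exact), n / 3)
```

The discrete mean works out to $(n^2 - 1) / (3n)$ with $n = 4^k$, so it sits slightly below $4^k / 3$ and the gap shrinks quickly as $k$ grows.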

Thanks Lucas for pointing it out!