Causality preliminary
Published:
This is an reading notes from an causality introduction website
Yule-Simpson’s Paradox
- Two subjects, with each passing rate for male is less than that for female, but overall, passing rate for Male is greater than that for female
- Mathematically, a/b < c/d, and a’/b’ < c’/d’, but (a + a’)/(b+b’) > (c + c’)/(d + d’), it is trivial
- But statistically, it does make sense: a third (underlying, unobservale) variable could twist/change relations of two varialbes. It means that observation data might result in wrong conclusions.
Rubin Causal Model, More Precise
- i-th individual, Zi = 1/0 means treating/intervening or not, Yi(1/0) means potentional outcome for intervening or not
- individual causal effect, Yi(1) - Yi(0), but only Yi = Zi * Yi(1) - (1 - Zi) * Yi(0) could be observed in practice
- make Z random, which means Z and {Y(1), Y(0)} are independent, then Average Causal Effect (ACE): ACE(Z->Y) = E[Yi(1) - Yi(0)] = E[Yi(1)] - E[Yi(0)] = E[Yi(1)|Zi=1] - E[Yi(0)|Zi=0] = E[Yi|Zi=1] - E[Yi|Zi=0]
- When sample is finite, {Yi(1), Yi(0)} might be unkown but constant, so Yi is random only because Z is made random
Fisher Randomization Test
- sharp null, H0: Yi(1) = Yi(0), for all i=1,…,n
- Z = (Z1, …, Zn), Y = (Y1, …, Yn), m = sum_i Zi means number of individuals intervened
randomization assignment, p(Z Y) = 1 / C(m from n)
Neyman Repeated Sampling Procedure
- under finite sample, how to estimate ACE tao = 1/n sum_i (Yi(1) - Yi(0))
- one unbiased estimator, 1/m sum_i (ZiYi) - 1/(n-m) sum_i (1 - Zi)Yi
Ignorability
- Z means intervening or not, Y means potential outcome, X means covariant
- a strong ignorability, Z and {Yi(1), Yi(0)} are independent
- a general ignorability, Z and {Yi(1), Yi(0)} are independent conditioned on X???
- the effect of some causation (could be verified), the causation of some effect (could be assumed or explored)
ACE = E[Y(1) - Y(0)] = E[Y(1)] - E[Y(0)] ?=? E[E[Y(1) X]] - E[E[Y(0) X]] = E[E[Y(1) X, Z=1]] - E[E[Y(0) X, Z=0]] = E[E[Y X, Z=1]] - E[E[Y X, Z=0]] propensity score, e(X) = p(Z=1 X)
Causal Diagram
- Directed Acyclic Graph (DAG), parents, children
- Do operator, means intervention: In DAG, do(Xi)=xi means cut off all edges pointing to Xi and let Xi = xi as a constant usually, p(…|do(Xi)=xi) != p(…|Xi=xi)
RCM <=> PCM, p(Y do(Z) = z) = p(Y(z))