Explanations in Neural Networks
Interpretability (of a DNN) is the ability to provide explanations in terms understandable to a human (F. Doshi-Velez & B. Kim, 2017).
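- An ideal, complete explanation would be expressed in rigorous mathematical symbols and logic rules.
- Explainable boundary (or explainable depth): how far down an explanation reaches. Deeper explainability usually means higher reliability, which matters most in high-stakes settings such as medical AI.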
- Taxonomy: Active (Pre-) vs. Passive (Post-); Global vs. Local; form of explanation: Rules (trees), Hidden Semantics (), Attribution (weights for features), Examples (similar instances, prototypes).
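
To make the "Attribution (weights for features)" entry concrete, here is a minimal sketch of one passive, local attribution method, gradient × input, on a toy network. The network, its weights, and the function names (`forward`, `gradient_times_input`) are all assumptions made up for illustration. They do not come from the cited paper or from any library.

```python
def forward(x, W1, W2):
    """Toy bias-free ReLU network: f(x) = W2 . relu(W1 x).

    Returns the scalar output and the hidden pre-activations.
    """
    pre = [sum(w * xi for w, xi in zip(row, x)) for row in W1]
    hidden = [max(p, 0.0) for p in pre]
    output = sum(w * h for w, h in zip(W2, hidden))
    return output, pre


def gradient_times_input(x, W1, W2):
    """Attribute the output to each input feature as (df/dx_j) * x_j."""
    _, pre = forward(x, W1, W2)
    # df/dx_j = sum_i W2[i] * 1[pre_i > 0] * W1[i][j]
    grad = [
        sum(W2[i] * (1.0 if pre[i] > 0 else 0.0) * W1[i][j]
            for i in range(len(W1)))
        for j in range(len(x))
    ]
    return [g * xj for g, xj in zip(grad, x)]


# Hypothetical weights and input, chosen only for illustration.
W1 = [[1.0, -1.0, 0.5],
      [0.5, 2.0, -1.0]]
W2 = [2.0, -1.0]
x = [1.0, 0.5, 2.0]

output, _ = forward(x, W1, W2)
attributions = gradient_times_input(x, W1, W2)
print(output)        # 3.0
print(attributions)  # [2.0, -1.0, 2.0]
```

In this example the attributions sum exactly to the output, but only because the toy network has no biases. In general, gradient × input carries no such guarantee. The explanation is local (it describes one input `x`) and passive (it is computed after the fact, without changing the network).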