Explanations in Neural Networks

Interpretability (of a DNN) is the ability to provide explanations in terms that are understandable to a human (F. Doshi-Velez & B. Kim, 2017).

  • An ideal, complete explanation would be stated in rigorous mathematical symbols and logic rules.
  • Explainable boundary, or explainable depth: how far down an explanation reaches. Deeper explainability usually means higher reliability (e.g., in medical AI).
  • Taxonomy: Active (pre-) vs. Passive (post-); Global vs. Local; explanation types: Rules (trees), Hidden Semantics, Attribution (weights for features), Examples (similar instances, prototypes).
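
To make the "Attribution (weights for features)" entry concrete, here is a minimal sketch of a passive (post-hoc), local attribution: gradient times input, with the gradient estimated by central differences. The toy network, its weights, and the names `model` and `gradient_times_input` are all hypothetical and chosen only for illustration; this is one of several possible attribution methods, not a reference implementation.

```python
import math

# Hypothetical toy network: 3 inputs -> 2 tanh hidden units -> 1 output.
# The third input has zero weight everywhere, so it cannot affect the output.
W1 = [[0.5, -1.0, 0.0],
      [1.5, 0.5, 0.0]]
B1 = [0.0, 0.1]
W2 = [1.0, -2.0]
B2 = 0.0


def model(x):
    """Forward pass of the toy network for one input vector x."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W1, B1)
    ]
    return sum(w * h for w, h in zip(W2, hidden)) + B2


def gradient_times_input(f, x, eps=1e-5):
    """Local attribution: (approximate d f / d x_i) * x_i for each feature i.

    The gradient is estimated with central differences, so f is treated as a
    black box; no access to the network internals is needed.
    """
    scores = []
    for i, xi in enumerate(x):
        up = list(x)
        up[i] = xi + eps
        down = list(x)
        down[i] = xi - eps
        grad = (f(up) - f(down)) / (2 * eps)
        scores.append(grad * xi)
    return scores


x = [1.0, 2.0, 3.0]
attributions = gradient_times_input(model, x)
for i, score in enumerate(attributions):
    print(f"feature {i}: attribution {score:+.4f}")
```

The scores explain a single prediction (local) of an already-trained model (passive). The unused third feature receives a score of zero, which is the kind of sanity check an attribution method should pass.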