Long Story Short
I like this definition, clean, consise and to the point with zero marketing fluff.
“A class of parametrized non-linear representations encoding appropriate domain knowledge (invariance and stationarity) that can be (massively) optimized efficiently using stochastic gradient descent”