
Wednesday, February 5, 2020

Software Commodification

Something that people outside the software industry don't fully grasp is the increasing pace of commodification.

You had a PhD in CS 20 years ago? Congrats, all your knowledge is now a library.

You were an excellent physicist 10 years ago? Congrats, we can import and run everything you knew without even understanding half of it.

You studied Artificial Intelligence in depth 15 years ago? A 17-year-old with a weekend of Keras can provide better solutions than you. Everything you know is becoming obsolete at an increasing rate.
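To make the point concrete, here is roughly what that weekend buys you: a complete, working image classifier in about a dozen lines of Keras. This is a minimal sketch; the dataset, architecture, and hyperparameters are illustrative choices, not a recipe.

```python
# Minimal sketch: a complete MNIST classifier in a dozen lines of Keras.
# Dataset, architecture, and hyperparameters are illustrative, not tuned.
from tensorflow import keras

# MNIST ships with Keras; scale pixel values to [0, 1].
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5)
print(model.evaluate(x_test, y_test))  # [loss, accuracy] on held-out data
```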

As software people we are used to packaging up our knowledge and re-learning new skills (tools, theories, frameworks) every couple of years, because the old ones have been completely commoditized.

It is an exciting but tiresome journey that shouldn't be required of everyone. However, as everything becomes software, most people may not have a choice.

Personally, the increasing pace of commodification allows me to cut through the noise and add depth to my knowledge. Knowing that the low-hanging fruit has already been picked will clear the hype out of the Artificial Intelligence and Machine Learning space.


Saturday, April 21, 2018

The "No Free Lunch" Theorem

The "No Free Lunch" theorem was first published by  David Wolpert and William Macready in their 1996 paper "No Free Lunch Theorems for Optimization".

In computational complexity and optimization, the No Free Lunch theorem states that for certain types of mathematical problems, the computational cost of finding a solution, averaged over all problems in the class, is the same for every solution method. No method therefore offers a shortcut.

A model is a simplified version of the observations. The simplifications are meant to discard the superfluous details that are unlikely to generalize to new instances. However, to decide what to keep, you must make assumptions. For example, a linear model makes the assumption that the data is fundamentally linear and that the distance between the instances and the straight line is just noise, which can safely be ignored.
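As a minimal sketch of that assumption (using NumPy on synthetic data), fitting a straight line amounts to keeping the slope and intercept and discarding the residuals as noise:

```python
# Sketch of the "linear plus noise" assumption on synthetic data:
# keep the line, discard the residuals.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=1.5, size=x.shape)  # linear signal + noise

# Least-squares fit of a degree-1 polynomial (a straight line).
slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (slope * x + intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, "
      f"noise std={residuals.std():.2f}")  # the detail we chose to ignore
```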

David Wolpert demonstrated that if you make absolutely no assumption about the data, then there is no reason to prefer one model over any other. This is called the "No Free Lunch Theorem" (NFL).

NFL states that no model is a priori guaranteed to work better than any other. The only way to know for sure which model is best is to evaluate them all. Since this is not possible, in practice you make some reasonable assumptions about the data and evaluate only a few reasonable models, as in the sketch below.
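A hedged sketch of that workflow, assuming scikit-learn and an illustrative built-in dataset: pick a handful of reasonable models and let cross-validation arbitrate, since NFL says none of them wins a priori.

```python
# Sketch: since no model wins a priori (NFL), evaluate a few reasonable
# candidates with cross-validation. Dataset and models are illustrative.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

candidates = {
    "logistic regression": LogisticRegression(max_iter=5000),
    "k-nearest neighbors": KNeighborsClassifier(),
    "decision tree": DecisionTreeClassifier(random_state=0),
}
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```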




Thursday, February 8, 2018

Critique of "Deep Learning: A Critical Appraisal"

Deep Learning: A Critical Appraisal 

Gary Marcus argues that deep learning is:
1. Shallow: it has limited capacity for transfer.
2. Data-hungry: it requires millions of examples to generalize sufficiently.
3. Not transparent enough: it is treated as a black box.

I'm not an academic, but I've been reading research papers and I've seen a huge effort on all three fronts (kudos to https://blog.acolyer.org/).

There are new architectures and layers that require far less data and can be used for several unrelated tasks.
There are also many approaches to opening the black box, based on anything from MDL to information theory, along with statistics for interpreting the weights, layers, and results.
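As a sketch of the first direction, here is the standard transfer-learning recipe in Keras: freeze a backbone pretrained on ImageNet and train only a small head, so the new task needs far less data. The backbone, input shape, and class count below are my own illustrative assumptions.

```python
# Sketch of transfer learning: reuse pretrained features, train a small head.
# Backbone choice, input shape, and the 5-class target task are illustrative.
from tensorflow import keras

# Frozen ImageNet-pretrained feature extractor: the transferred knowledge.
base = keras.applications.MobileNetV2(input_shape=(160, 160, 3),
                                      include_top=False,
                                      weights="imagenet",
                                      pooling="avg")
base.trainable = False

# Only this small classification head is trained from scratch.
model = keras.Sequential([
    base,
    keras.layers.Dense(5, activation="softmax"),  # hypothetical 5-class task
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(task_images, task_labels, epochs=10)  # hypothetical small dataset
```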

It's not all doom and gloom, but the huge milestone jumps like the ones we had in the last 5 years in most AI/ML tasks are probably in the past. What we will see is a culling of a lot of bad tech and hype, and the quiet rise of Differentiable Neural Computing.



For a high-level understanding of deep learning, click here

Monday, July 3, 2017

Udacity AI nanodegree

I enrolled in Udacity's AI nanodegree 2 months ago and I just learned I was accepted.
I thought it would be a good refresher and maybe fill in some knowledge gaps I have.
The reviews on the net are pretty good, so I'm pretty sure it will be a great experience, especially since AI legends like Peter Norvig will be doing the teaching.

The curriculum consists of five parts:

  1. Foundations of AI: In this term, you'll learn the foundations of AI with Sebastian Thrun, Peter Norvig, and Thad Starner. We'll cover Game-Playing, Search, Optimization, Probabilistic AI, and Hidden Markov Models.
  2. Deep Learning and Applications: In this term, you'll learn the cutting-edge advancements of AI and Deep Learning. You'll get the chance to apply Deep Learning to a variety of different topics including Computer Vision, Speech, and Natural Language Processing. We'll cover Convolutional Neural Networks, Recurrent Neural Networks, and other advanced models.
  3. Computer Vision: In this module, you will learn how to build intelligent systems that can see and understand the world using Computer Vision. You'll learn fundamental techniques for tasks like Object Recognition, Face Detection, and Video Analysis, and integrate classic methods with more modern Convolutional Neural Networks.
  4. Natural Language Processing: In this module, you will build end-to-end Natural Language Processing pipelines, from text processing to feature extraction and modeling for tasks such as Sentiment Analysis, Spam Detection, and Machine Translation. You'll also learn how to design Recurrent Neural Networks for challenging NLP applications (a minimal pipeline sketch follows this list).
  5. Voice User Interfaces: This module will help you get started in the exciting and fast-growing area of designing Voice User Interfaces! You'll learn how to build Conversational Agents that make products and services more natural to interact with. You will also dive deeper into the core challenge of Speech Recognition, applying Recurrent Neural Networks to solve it.
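For a taste of what such an NLP pipeline looks like, here is a minimal sketch with scikit-learn; the toy texts and labels are made up for illustration.

```python
# Minimal sketch of an end-to-end NLP pipeline: text processing and feature
# extraction (TF-IDF) feeding a classifier. Toy data, purely illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["great product, loved it", "terrible, waste of money",
         "absolutely fantastic", "awful experience, never again"]
labels = [1, 0, 1, 0]  # 1 = positive sentiment, 0 = negative

pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression())
pipeline.fit(texts, labels)
print(pipeline.predict(["what a fantastic product"]))  # expect [1]
```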
In my projects so far I've mostly tackled Computer Vision and Predictive Analytics problems, so it would be a nice change to dive into NLP and Voice processing.
I hope I can fit it into my busy schedule, and I'll try to write some posts describing the experience for any future students.

Wednesday, June 14, 2017

Classic Machine Learning Literature

I'm often asked by software engineers what to read to get into the Machine Learning world.
For that purpose I've compiled a list of Machine Learning and Applied Mathematics books that I've used to gain a deeper understanding.

Machine Learning

We start off with the classic but very dated "Machine Learning" by Tom M. Mitchell.
This was the first book I read on the subject. Low on math, high on intuition, it is a decent introductory book. You can easily implement most of the algorithms described and get a fair understanding of what's going on. For your first couple of years in the business you may use it as a basic reference, but after that you will need the math-heavy books.

Pattern Classification

We continue with my personal favorite, "Pattern Classification", 2nd edition, by Richard O. Duda, Peter E. Hart, and David G. Stork. This impressive book is heavy on applied math, low on proofs, and very readable. It serves beginners as well as experienced machine learning engineers, and it builds very good intuition and understanding. The graphs and figures help a lot. I still use it as a reference on some issues.






Pattern Recognition and Machine Learning

A natural extension of "Pattern Classification" is the excellent "Pattern Recognition and Machine Learning" by Christopher Bishop. Somewhat heavy on the math, it provides a clear path to understanding, but it is not for newbies: you should come to this book with some experience. It is still very relevant, with a great introduction to matrix calculus and probability theory.


Probabilistic Graphical Models

Going deeper, I refer to "Probabilistic Graphical Models" by Daphne Koller and Nir Friedman. This is a subdomain of Machine Learning and it is not for the faint of heart. This massive book is hard, and I mean eyes-glazing, concentrate-and-get-a-headache hard. If you manage to get through it you will have a greater understanding than most mortals. If, however, you are like me, you are just going to sample some of the parts and leave the rest for the PhDs.

Deep Learning

A newer book that has gained classic status very fast is "Deep Learning" by Ian Goodfellow, Yoshua Bengio, and Aaron Courville. I found it very approachable, and it left me with a better understanding of deep learning. Very light on math, it concentrates on intuition and best practices rather than proofs. Highly recommended for all DL practitioners.
For a high-level understanding of deep learning, click here




Back to basics books

Numerical Recipes

Most books rely heavily on linear algebra, probability theory, and algorithm "primitives". If you really want to know what's under the hood, you should check this one out.
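For a flavor of what "under the hood" means, here is a hedged sketch: ordinary least squares written out via the normal equations, checked against NumPy's library routine. The data is synthetic, and real solvers prefer QR or SVD for ill-conditioned problems, which is exactly this book's territory.

```python
# Sketch: least squares "by hand" via the normal equations, checked against
# NumPy's library routine. Synthetic data, illustrative only.
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 3))                      # design matrix
x_true = np.array([1.0, -2.0, 0.5])
b = A @ x_true + rng.normal(scale=0.1, size=100)   # noisy observations

# Normal equations: solve (A^T A) x = A^T b. Fine here, but ill-conditioned
# problems want QR or SVD instead.
x_normal = np.linalg.solve(A.T @ A, A.T @ b)
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x_normal, x_lstsq))  # True: both recover ~x_true
```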




Statistical Digital Signal Processing and Modeling

Before the Machine Learning and AI hype, there was simply DSP.


Artificial Intelligence: A Modern Approach

A general-purpose AI book. Lots of good content, ideas, algorithms, and thought processes, if a bit dated. I used the second edition; apparently the latest one is a bit better.

Matrix Computations

If you really, really want to reinvent the wheel, and by wheel I mean the super-fast BLAS primitives usually found in LAPACK and its variants, look no further than here.
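To see why, here is a hedged toy comparison: a textbook triple-loop matrix multiply against NumPy's BLAS-backed operator. Matrix sizes are illustrative; the gap only widens as they grow.

```python
# Toy benchmark: textbook O(n^3) triple loop vs. NumPy's BLAS-backed matmul.
# Sizes are illustrative; the gap grows with dimension.
import time
import numpy as np

def naive_matmul(A, B):
    """Triple loop with no blocking, vectorization, or cache awareness."""
    n, k = A.shape
    k2, m = B.shape
    assert k == k2
    C = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            for p in range(k):
                C[i, j] += A[i, p] * B[p, j]
    return C

A = np.random.rand(150, 150)
B = np.random.rand(150, 150)

t0 = time.perf_counter(); C1 = naive_matmul(A, B); t1 = time.perf_counter()
C2 = A @ B; t2 = time.perf_counter()
print(f"naive: {t1 - t0:.3f}s  BLAS: {t2 - t1:.6f}s  "
      f"match: {np.allclose(C1, C2)}")
```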