Wednesday, September 19, 2018

Feature for Crystal Gazing


At the dawn of the age of artificial intelligence, predicting the future through data science has become second nature for most decision makers. We may feel elated about the breakthroughs in data science, but the ground still needs to be prepared before we can see the results through.

When we want to peep into the future, our launching station is past observations. Let’s pause: what’s an observation? Isn’t it simply an instance of attribute values? Let’s simplify. [Sales Executive] How were my car sales yesterday? Fine: 20 cars (10 sedans, 6 hatchbacks and 4 SUVs), color mix (white 10, gray 7 and black 3). Is that enough to predict today’s or, let’s say, tomorrow’s sales?

Most of us will say no. So what we need are the influencers of the predicted value. From gut feel, a sales manager will name a subset of attributes that may help: the day (is it a holiday or a working day), the time of year, whether a promotion is going on, how the overall economy is doing, and so on.

These attributes (influencers) are the features a data scientist searches for in order to predict the value of a certain event. Getting these features into the right format is the first step in the journey of prediction.
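To make "the right format" concrete, here is a minimal sketch of turning one day's observation into a numeric feature row. The `DailyObservation` class, its field names, and `to_feature_row` are hypothetical names made up for illustration, and the promotion flag and `economy_index` value are invented; only the 20 cars sold comes from the example above.

```python
from dataclasses import dataclass
from datetime import date
from typing import List

# Hypothetical record: one day's sales plus the influencers a sales
# manager might name. Field names are illustrative assumptions.
@dataclass
class DailyObservation:
    day: date
    is_holiday: bool
    promotion_running: bool
    economy_index: float  # assumed external indicator, made up here
    cars_sold: int        # the value we want to predict

def to_feature_row(obs: DailyObservation) -> List[float]:
    """Turn one observation into a numeric feature row."""
    return [
        1.0 if obs.is_holiday else 0.0,
        1.0 if obs.promotion_running else 0.0,
        float(obs.day.month),      # rough stand-in for "time of the year"
        float(obs.day.weekday()),  # Monday = 0 ... Sunday = 6
        obs.economy_index,
    ]

# Yesterday's 20 cars; the other values are invented for the sketch.
yesterday = DailyObservation(date(2018, 9, 18), False, True, 0.6, 20)
features = to_feature_row(yesterday)
target = yesterday.cars_sold
```

The point of the sketch is only that every influencer ends up as a number in a fixed position, so that many days can be stacked into a table an algorithm can read.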

Then comes the support strength of the features in predicting the value. Here statistics may help in shaping our features (scaling the features, correlation, visual exploration; the list goes on...). Experimenting is at the core of honing these features for a near-perfect prediction. It’s all about features...
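Two of the shaping steps named above, scaling and correlation, can be sketched with nothing but the standard library. This is an illustration under assumptions: the helper names `min_max_scale` and `pearson` are my own, min-max scaling is just one of several common scaling choices, and the numbers are made up rather than real sales data.

```python
from math import sqrt

def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up illustrative data for six days.
promotion = [0, 0, 1, 1, 0, 1]          # was a promotion running?
discount = [0, 0, 500, 1000, 0, 750]    # discount offered, any currency
cars_sold = [12, 14, 20, 22, 13, 21]

scaled_discount = min_max_scale(discount)
r = pearson(promotion, cars_sold)
```

A correlation close to +1 or -1 hints that a feature moves with the sales figure; a value near 0 suggests it may add little, though experimenting remains the real test.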

Let’s bring in the black boxes. Yes, you heard it right: “algorithms”. We score and evaluate an algorithm’s results using a training set and a test set. In simple words, we use past data to train the algorithm and build the model. To check the model’s strength in predicting, we hold back a subset of the past data, which acts as the test data set and gives us the feedback to choose the best predicting model. And the magic ensues. [Let the future come to us...]
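The hold-back idea can be sketched in a few lines. Everything here is an assumption for illustration: the data is made up, the two candidate "algorithms" (a plain average and a straight-line fit) are deliberately simple stand-ins, and mean absolute error is just one possible score.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and hold back a fraction as the test set."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def fit_mean(train):
    """Baseline model: always predict the average of the training targets."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y

def fit_line(train):
    """Least-squares straight line through the training points."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

def mean_absolute_error(model, rows):
    """Average size of the prediction error on the given rows."""
    return sum(abs(model(x) - y) for x, y in rows) / len(rows)

# Made-up past data: (discount in percent, cars sold that day).
history = [(0, 10), (2, 13), (4, 15), (5, 18),
           (6, 19), (8, 23), (10, 25), (12, 30)]

train, test = train_test_split(history)
candidates = {"mean baseline": fit_mean(train),
              "straight line": fit_line(train)}
# Score each model only on data it never saw during training.
scores = {name: mean_absolute_error(m, test) for name, m in candidates.items()}
best = min(scores, key=scores.get)
```

Whichever candidate ends up in `best` is simply the one with the lowest error on the held-back days; with so little data, a different split could pick a different winner.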