We are at the dawn of the age of
artificial intelligence, when predicting the future through data science has
become second nature for most decision makers. We may feel elated about the
breakthroughs in data science, but the ground still needs to be prepared
to see the results through.
When we want to peep into the
future, our launching station is past observations. Let’s pause: what is an
observation? Isn’t it simply an instance of attribute values? Let’s simplify
with an example. [Sales Executive] How were my car sales yesterday? Fine: 20
cars sold (10 sedans, 6 hatchbacks and 4 SUVs), color mix (white 10, gray 7
and black 3). Is that enough to predict today’s or, let’s say, tomorrow’s sales?
Most of us will say no. So what we need are the influencers of the
predicted value. From gut feel, a sales manager will name a subset of
attributes that may help: the day (is it a holiday or a working day), the
time of the year, whether a promotion is going on, how the overall economy
is doing, and so on.
These attributes (influencers) are the
features a data scientist will search for to predict the value of a certain
event. Getting these features into the right format is the first step in the
journey of prediction.
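As a minimal sketch of what "the right format" can mean, the Python below turns one day's observation into a row of numbers plus the value to predict. The class name, the field names and the economy_index field are hypothetical, chosen only for illustration. Only the total of 20 cars comes from the example above; every other value is made up.

```python
from dataclasses import dataclass


@dataclass
class DailyObservation:
    """One day's record: the influencers plus the value we want to predict."""
    is_holiday: bool       # holiday or working day
    month: int             # time of the year, 1-12
    promotion_on: bool     # is a promotion running?
    economy_index: float   # hypothetical stand-in for "how the economy is doing"
    cars_sold: int         # the value to predict


def to_feature_row(obs):
    """Convert an observation into a numeric feature list and a target."""
    features = [
        1.0 if obs.is_holiday else 0.0,
        float(obs.month),
        1.0 if obs.promotion_on else 0.0,
        obs.economy_index,
    ]
    return features, obs.cars_sold


# cars_sold=20 is taken from the sales example; the rest is invented.
yesterday = DailyObservation(
    is_holiday=False, month=6, promotion_on=True, economy_index=0.8, cars_sold=20
)
features, target = to_feature_row(yesterday)
print(features, target)
```

Turning yes/no answers into 1.0 and 0.0 is one simple choice among several; the point is only that every influencer ends up as a number an algorithm can use.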
Then comes the strength of the
features in supporting the prediction. Here statistics may help in shaping our
features (scaling the features, correlation, visual exploration; the list goes
on). Experimenting is at the core of honing these features for a near-perfect
prediction. It’s all about features...
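Two of the shaping steps named above, scaling and correlation, can be sketched in a few lines of plain Python. This is an illustration under assumptions, not a recipe: min-max scaling is only one of several scaling methods, the helper names are hypothetical, and the six days of data are made up.

```python
from statistics import mean


def min_max_scale(values):
    """Rescale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Made-up data for six days: was a promotion on, and how many cars were sold.
promotion = [0, 0, 1, 1, 0, 1]
cars_sold = [12, 14, 20, 22, 13, 19]

scaled_sales = min_max_scale(cars_sold)
r = pearson(promotion, cars_sold)
print(scaled_sales)
print(round(r, 2))
```

A correlation close to 1 or -1 hints that the feature moves together with sales and is worth keeping in the experiment; a value near 0 suggests it may add little on its own.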
Let’s bring in the black boxes - yes,
you heard it right, “algorithms”. We score and evaluate an algorithm’s results
using a training set and a test set. In simple words, we use past data to train
the algorithm and build the model. To check the model’s strength in predicting,
we hold back a subset of past data, which acts as the test data set and gives
us the feedback to choose the best predicting model. And the magic ensues. [Let
the future come to us...]
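The hold-back idea can be sketched as follows, assuming the past data is a list of (discount percent, cars sold) pairs in date order. The data, the two toy models and all function names are made up for illustration. Holding back the most recent rows is one option; a random split is another common choice.

```python
from statistics import mean


def hold_back(rows, test_fraction=0.25):
    """Keep the most recent fraction of rows aside as the test set."""
    n_test = max(1, int(len(rows) * test_fraction))
    return rows[:-n_test], rows[-n_test:]


def fit_mean_model(train):
    """Baseline: always predict the average of the training targets."""
    avg = mean(y for _, y in train)
    return lambda x: avg


def fit_line_model(train):
    """Least-squares straight line through the (x, y) pairs."""
    mx = mean(x for x, _ in train)
    my = mean(y for _, y in train)
    sxx = sum((x - mx) ** 2 for x, _ in train)
    sxy = sum((x - mx) * (y - my) for x, y in train)
    slope = sxy / sxx if sxx else 0.0
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def mean_absolute_error(model, rows):
    """Average size of the prediction error on the given rows."""
    return mean(abs(model(x) - y) for x, y in rows)


# Made-up past data in date order: (discount percent, cars sold)
past = [(0, 11), (10, 18), (5, 14), (20, 24), (15, 20), (0, 12),
        (10, 17), (5, 15), (20, 23), (15, 21), (0, 10), (20, 25)]

train, test = hold_back(past)
models = {"mean": fit_mean_model(train), "line": fit_line_model(train)}

# Score every model on data it has never seen, then keep the best one.
scores = {name: mean_absolute_error(m, test) for name, m in models.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

The models only ever see the training rows, so the test error is honest feedback. On this invented data the straight line scores the lower error, which is the signal we would use to prefer it over the baseline.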