Tuesday, October 28, 2014

Operational Business Decision - A Process Of Response to Exception

Business processes have evolved over time and have been streamlined to almost military precision. Years of continuous improvement / Kaizen have made our businesses efficient. But businesses exist in a dynamic environment: the ecosystem is in constant flux, and new challenges surface every now and then.

Business decisions are the rudders that keep the ship on course towards value maximization. Exceptions are the drift in that course. The decision process therefore starts with identifying these exceptions, and the prime focus now is on reducing the lead time in reporting them to decision makers: the closer the information dissemination is to the occurrence of the exception, the better.

The kind of business information needed depends on the decision impact horizon, from long term (strategic) to short term (operational). Transactional information feeds the short-term decision process, while accumulated transactions form the basis for long-term decision processes.

Pattern deviation detection is the future of exception handling in business. Machine learning is the way forward: organizations can apply statistical algorithms to transactional data and detect pattern deviations as close as possible to the event occurrence, i.e. real-time intelligence.
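As a minimal sketch of such a detector, assuming transaction amounts arrive one at a time and that a simple rolling z-score is an acceptable stand-in for a fuller statistical model (the threshold, window size and sample amounts below are invented for illustration):

from collections import deque
from statistics import mean, stdev

class DeviationDetector:
    """Flags transactions that deviate from the recent pattern (rolling z-score)."""

    def __init__(self, window=100, threshold=3.0):
        self.window = deque(maxlen=window)  # rolling history of recent values
        self.threshold = threshold          # how many std-devs counts as an exception

    def observe(self, value):
        """Return True if the new value is an exception against the rolling window."""
        is_exception = False
        if len(self.window) >= 10:          # need some history to estimate the pattern
            mu, sigma = mean(self.window), stdev(self.window)
            if sigma > 0 and abs(value - mu) / sigma > self.threshold:
                is_exception = True
        self.window.append(value)
        return is_exception

# Illustrative usage: feed transaction amounts as they occur and report exceptions.
detector = DeviationDetector()
for amount in [102, 98, 105, 97, 101, 103, 96, 100, 104, 99, 98, 5000, 101]:
    if detector.observe(amount):
        print("Exception: transaction of", amount, "deviates from the recent pattern")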

To correct the course, business decisions are taken. At this point we are only halfway to the goal: the impact of each decision needs to be measured, and that feedback may warrant further action.

Tuesday, June 17, 2014

Information Density In Data

The Century of Data has just begun, and we are all overwhelmed by the flow and intensity of data reaching us every day. This is an incredible feat the human race has achieved in a very short span.

We had to coin new jargon, Big Data, to represent the processing of massive amounts of data. At the current rate of acceleration in data accumulation, we will soon be handling super big data.

The challenge now is finding the right, relevant information in the data, like a needle in a haystack. Businesses pour millions of dollars into extracting information from data. Classifying data sources by information density is the next logical step towards bringing sanity to that extraction process.

Factors of information density: Information Density ∝ (Inference Quotient × Structural Quotient) / (Data Volume × Data Speed)

Inference Quotient: the degree to which direct inferences can be drawn from the data to reach the relevant information

Data Volume: the greater the volume, the lower the density

Data Speed: the greater the speed, the lower the density

Structural Quotient: the degree of structuredness (the more structured, the higher the density)
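A minimal sketch of scoring sources by information density, assuming the ratio form above and arbitrary example scores (the source names and numbers are invented for illustration):

def information_density(inference_quotient, structural_quotient, data_volume, data_speed):
    """Toy information-density score: grows with inference and structure,
    shrinks with volume and speed (all inputs on arbitrary positive scales)."""
    return (inference_quotient * structural_quotient) / (data_volume * data_speed)

# Illustrative comparison of two hypothetical sources.
erp_transactions = information_density(inference_quotient=8, structural_quotient=9,
                                        data_volume=2, data_speed=3)
social_media_feed = information_density(inference_quotient=3, structural_quotient=2,
                                         data_volume=9, data_speed=9)
print(erp_transactions, social_media_feed)  # the ERP source ranks far denser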

Thursday, April 24, 2014

Anatomy Of Intelligence

200 million years ago, dinosaurs ruled this planet; today it is the human species. What changed? Is intelligence the differentiating factor? Not exactly: the vital component in the mix is communication, and that is implicit in intelligence.

It all starts with observation: recording events and their results, then drawing conclusions and improvising as more events of a similar nature are encountered. In a nutshell: recognize the pattern, match, and act. The real difference comes when communication kicks in and amplifies the whole process to a level where innumerable permutations and combinations are at the disposal of mankind.

The underlying current is streams of data processed into applied information and made available to the masses. The same analogy holds within the porous boundaries of business houses. A great business organization therefore has one great characteristic: smooth, uninterrupted communication channels through which relevant information flows across the whole organization.

Where does innovation fit into the picture? Lateral thinkers break the patterns to innovate; that is so unique to the human species. Intelligence then becomes the vehicle for innovation to reach every individual in our civilization.

Monday, February 17, 2014

Blueprint Data Streams

Organizations wish to stay on top of the data generated in-house and by the environment around their ecosystem. But there is a distance to be covered before that wish comes true.

So where do we start? The question to ask is "What do we have?". Getting the answer to this question right is half the battle. Data streams run throughout the organization, but most of them live in the private spaces of individuals or, at best, within close-knit teams, because it is those individuals' and teams' time and effort that goes into developing and enhancing them.

The action plan starts with a blueprint of these data streams. This is no small undertaking, but it needs to be done to stay on top of the data. Suffice to say that data is the oxygen of the being called [organization], and knowing the data/oxygen [life force] pathways is of paramount importance.

A two-pronged strategy is proposed to develop the blueprint of data streams (a small sketch follows):

Source Identification ... & Flow Map ...
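As a minimal sketch of what the two prongs could produce, assuming a simple in-memory representation (the stream names, owners and locations below are hypothetical):

# Prong 1: source identification -- catalogue each stream with an owner and location.
sources = {
    "crm_orders":     {"owner": "sales ops", "location": "CRM database"},
    "web_clicks":     {"owner": "marketing", "location": "web server logs"},
    "finance_ledger": {"owner": "finance",   "location": "ERP system"},
    "sales_mart":     {"owner": "BI team",   "location": "data warehouse"},
}

# Prong 2: flow map -- directed edges showing where each stream feeds into.
flows = [
    ("crm_orders", "sales_mart"),
    ("web_clicks", "sales_mart"),
    ("finance_ledger", "sales_mart"),
]

def downstream_of(source):
    """List every stream fed directly by the given source."""
    return [dst for src, dst in flows if src == source]

print(downstream_of("crm_orders"))  # ['sales_mart']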


The Blueprint of Data Streams is a living document: as time passes, the deltas in the organization's data infrastructure and flows need to be updated into it.

Friday, December 6, 2013

Emulating Human Learning

It all starts with the perceptive nerves. Signals are interpreted, conclusions are drawn, actions are taken. Let's simplify: assume a new situation is perceived. Step one is to compare it with past experiences; if there is no exact match, look for a close match, then an approximate match. That is what experience is all about.
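As a minimal sketch of that compare-with-past-experiences step, assuming situations can be reduced to simple numeric features (the feature values and actions below are invented), a nearest-neighbour lookup captures the close/approximate match idea:

import math

# Past experiences: (situation features, action that worked). Values are illustrative.
experiences = [
    ((0.9, 0.1), "flee"),
    ((0.2, 0.8), "approach"),
    ((0.5, 0.5), "wait"),
]

def closest_experience(situation):
    """Return the past experience most similar to the new situation (Euclidean distance)."""
    return min(experiences, key=lambda exp: math.dist(situation, exp[0]))

features, action = closest_experience((0.8, 0.2))    # new situation, no exact match
print("Closest past experience suggests:", action)   # approximate match -> 'flee'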

The information age is ripe for emulating human learning: big data, the availability of immense parallel processing grids, and algorithms. Algorithms, that's right - termed machine learning. It is not an exact science, but then we humans are not about perfection either; we are intelligent enough to correlate and approximate.

Regression and classification [supervised] and clustering [unsupervised] are the main methods of learning. Supervised learning draws from historical data, while unsupervised methods can chart completely unknown territories. So the next decade may see the advent of machines providing second opinions to human judgment.
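A minimal sketch of both flavours, assuming scikit-learn is available (the toy observations and labels are invented):

from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

# Historical, labelled observations -> supervised classification.
X_history = [[1, 1], [2, 1], [8, 9], [9, 8]]
y_history = ["low risk", "low risk", "high risk", "high risk"]
classifier = LogisticRegression().fit(X_history, y_history)
print(classifier.predict([[2, 2]]))   # learns from the past: 'low risk'

# The same observations without labels -> unsupervised clustering.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_history)
print(clusters)                       # groups emerge without any historical labels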


As I write these thoughts, business houses are already deploying technologies to make this a reality. Don't be surprised if, next time, your teller machine recognizes your mood and offers you options accordingly. Just as we humans do...

Thursday, October 31, 2013

Sample to Population - The Big Data Leap

The capacity of technology to churn through huge amounts of data has ushered in a new era in statistics.

Let's revisit the definition of sampling: "sampling is concerned with the selection of a subset of individuals from within a statistical population to estimate characteristics of the whole population". 'Subset' is the term that expresses the statistician's workaround for getting as close as possible to the population spread. As we know, getting the sample right is of paramount importance.

But what if we could get the whole population's data? That is now a clear possibility with the advent of Big Data: with massive parallel processing of the Map-Reduce framework on a distributed file system, it is all possible. To top it off, the likes of Revolution R and Mahout are already there.
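As a minimal sketch of that map-reduce pattern in plain Python, assuming the whole population has been split into partitions across nodes (the partition values are invented), computing the population mean without sampling:

from functools import reduce

# Hypothetical partitions of the *whole* population, spread across a cluster.
partitions = [
    [23, 45, 12, 67],      # data on node 1
    [34, 56, 78],          # data on node 2
    [90, 11, 22, 44, 55],  # data on node 3
]

# Map: each node summarises its own partition as (sum, count).
mapped = [(sum(p), len(p)) for p in partitions]

# Reduce: combine the partial summaries into a single (sum, count) pair.
total_sum, total_count = reduce(lambda a, b: (a[0] + b[0], a[1] + b[1]), mapped)

print(total_sum / total_count)  # the population mean, no sampling required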

Statistics as we know it today is on a journey towards accepting the population, rather than a sample, as the base data for analysis. Better estimates and predictions are on their way...

Tuesday, September 3, 2013

Bus Parallel Processing Architecture - Cloud Ready

The Business Intelligence conceptual design paves the way for BPPA (Bus Parallel Processing Architecture). The bus architecture is the epicentre of a well-knit dimensional model: business processes and sub-processes are translated into star schemas, with dimensions drawing on master data and transactions finding their way into the facts. The knitting is taken care of through conformed/shared dimensions.
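As a minimal sketch of two star schemas knitted by a conformed dimension, assuming pandas is available (the table and column names are invented for illustration):

import pandas as pd

# Conformed dimension shared by both star schemas.
dim_customer = pd.DataFrame({"customer_id": [1, 2],
                             "customer_name": ["Acme", "Globex"]})

# Two business processes, each with its own fact table.
fact_sales   = pd.DataFrame({"customer_id": [1, 2, 1], "sales_amount": [100, 250, 75]})
fact_returns = pd.DataFrame({"customer_id": [2],       "return_amount": [40]})

# Each star joins its fact to the same conformed dimension, so results can be
# compared across processes on identical customer attributes.
sales_by_customer = fact_sales.merge(dim_customer, on="customer_id") \
                              .groupby("customer_name")["sales_amount"].sum()
returns_by_customer = fact_returns.merge(dim_customer, on="customer_id") \
                                  .groupby("customer_name")["return_amount"].sum()
print(sales_by_customer)
print(returns_by_customer)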

The logical layer takes the bus architecture onto the implementation turf, developing the database ER model that leads to the creation of database tables and brings the star schemas into existence in a relational database. The physical layer provides the infrastructure for the DW, i.e. servers, networks, etc.

BPPA works on the principle of dimensional redundancy across servers (virtual servers). Each fact moves onto a dedicated box, allowing the enterprise DW to scale out and enabling a cloud move for the DW. Conformed/shared dimensions are held redundantly on each respective star schema. The cost, or overhead, of this approach is maintaining redundant copies of the dimensions; the return is the performance gain and the ability to port the scaled-out solution to multiple small servers and eventually to the cloud.

To top it off, fact partitioning can be applied, as most database vendors now provide it out of the box, thus reducing the IO footprint of fact access.