Unravelling the mysteries of data analysis

Unlike in mathematics, where everything is certain, in statistics inferences and conclusions are always associated with a certain degree of uncertainty. When the observed data are huge, which is mostly the case, it becomes impractical to analyze all of them given time and resource constraints. Interestingly, it is often not necessary to analyze all the data either, because almost the same conclusions can be reached from much less data. For these practical reasons, the data to be analyzed usually pertain to a sample drawn from a population. So there is no such thing as a 100% correct inference or conclusion in statistics. This is not to say that everything about the conclusion is uncertain. The point is that conclusions can’t be exactly 100% accurate; some degree of error must be tolerated. It is perfectly reasonable to expect 99%, or even 99.99%, confidence attached to conclusions. A confidence level of 95% or more is widely acceptable. There are situations, however, where a much higher degree of confidence may be necessary; in pharmaceutical trials, for example, a confidence level of 99.99% or higher may be necessary.
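To make the idea of a conclusion "with a stated degree of confidence" concrete, here is a minimal sketch in Python. It assumes a synthetic population of order values (invented purely for illustration), draws a sample of 500, and computes a normal-approximation confidence interval for the mean. The function name `mean_confidence_interval`, the population parameters, and the sample size are all assumptions, not part of any particular reporting method.

```python
import random
import statistics
from statistics import NormalDist


def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the mean of a sample."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    # Standard error of the mean, using the sample standard deviation.
    std_err = statistics.stdev(sample) / n ** 0.5
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return sample_mean - z * std_err, sample_mean + z * std_err


# Hypothetical population: 100,000 synthetic order values (assumed data).
rng = random.Random(42)
population = [rng.gauss(200.0, 50.0) for _ in range(100_000)]

# Analyze a sample of 500 instead of the whole population.
sample = rng.sample(population, 500)

ci_95 = mean_confidence_interval(sample, 0.95)
ci_99 = mean_confidence_interval(sample, 0.99)
print("95% interval:", ci_95)
print("99% interval:", ci_99)
```

The 99% interval is always wider than the 95% interval computed from the same sample: demanding more confidence means accepting a less precise statement, unless the sample size is increased.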
 
To be of practical use, information, inferences, and conclusions should carry a high degree of confidence and validity, because major decisions may have to be taken based on them. Obviously, the lower the degree of confidence associated with the conclusions, the greater the risk of a wrong decision, resulting in wasted time and resources. InferStat considers this its core principle. All our reports contain information, conclusions, and inferences with a desired or specified degree of confidence.
 
The data to be analyzed pertain to certain variables. From a business viewpoint, these variables may be broadly classified into three categories, as follows.
 
- Result-variables (for example, sales, profit, cost, cycle time, etc.)
- Enabling-variables (for example, raw material consumption, utility consumption, etc.)
- Indicator-variables (for example, regions, market segment, product-type, etc.)
 
Depending upon the needs of clients and the availability of data, the analysis will involve one or more variables belonging to one or more of the three categories mentioned above. It will most likely also involve examining the correlation or dependence of variables on each other, especially the dependence of results on enabling and indicator variables. Often, there is also a need to analyze the data in relation to certain standards or specifications. Such analyses are of immense value, as they bring out key information on performance (good or bad) and the reasons for it. Thus, statistical analysis brings out crucial information whose value always ranges from good to great. In general, the output of statistical analysis brings to the surface numerous improvement opportunities which, when realized, can add significantly to the top and bottom lines of the enterprise.
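As a rough illustration of the three kinds of analysis described above, the following Python sketch uses a handful of invented records. It computes the correlation between an enabling variable and a result variable, averages the result by an indicator variable, and counts the records within a specification limit. The field names (`region`, `raw_material`, `cost`), the data values, and the limit `COST_SPEC_LIMIT` are hypothetical and chosen only for this example.

```python
from collections import defaultdict
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Invented records: indicator (region), enabling (raw_material), result (cost).
records = [
    {"region": "North", "raw_material": 10.0, "cost": 105.0},
    {"region": "North", "raw_material": 12.0, "cost": 122.0},
    {"region": "North", "raw_material": 14.0, "cost": 148.0},
    {"region": "South", "raw_material": 11.0, "cost": 118.0},
    {"region": "South", "raw_material": 13.0, "cost": 131.0},
    {"region": "South", "raw_material": 15.0, "cost": 153.0},
]

# 1. Dependence of the result variable on the enabling variable.
r = pearson(
    [rec["raw_material"] for rec in records],
    [rec["cost"] for rec in records],
)

# 2. Result variable broken down by the indicator variable.
costs_by_region = defaultdict(list)
for rec in records:
    costs_by_region[rec["region"]].append(rec["cost"])
mean_cost_by_region = {region: mean(costs) for region, costs in costs_by_region.items()}

# 3. Comparison against a specification (hypothetical upper limit on cost).
COST_SPEC_LIMIT = 150.0
within_spec = sum(1 for rec in records if rec["cost"] <= COST_SPEC_LIMIT)

print("correlation:", round(r, 3))
print("mean cost by region:", mean_cost_by_region)
print("records within spec:", within_spec, "of", len(records))
```

With real data, a strong correlation would point to the enabling variable as a candidate lever for improvement, while the per-region averages and the specification count show where performance is good or bad.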