Large-scale “blind” classification results

Figure 1
Figure 1: Classification accuracy, representing the chance of correct classification given information about an arbitrary company in a given year.

The goal of this sub-project was to assess how much can be learned from the complete data without using any domain knowledge. To that end, we considered the task of predicting the failure or success of a company in a given year from the 53 parameters. We deliberately did not account for the fact that many companies appear in several years. Each prediction was based only on the last eight quarters' worth of data.

For this task we selected eight state-of-the-art prediction approaches and evaluated each of them independently. Our results are summarized in Figure 1. The height of each bar denotes the chance of correctly predicting the failure of a given company. This chance of correct prediction is a more robust measure than the plain accuracy of prediction on a dataset. For example, this particular dataset has far fewer examples of failures (about 4000) than of successes (about 18000). To achieve a high prediction accuracy (82%), it is enough to always predict success, the most frequent class. Our metric accounts for this imbalance and reports how likely a company is to be correctly classified regardless of its true outcome.
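
The text does not name the metric used. One common measure with the property described above is balanced accuracy, the mean of the per-class recall. The sketch below assumes that interpretation (it is not necessarily the exact metric behind Figure 1) and uses the approximate class counts quoted in the text to show why a trivial "always success" classifier looks good on plain accuracy but sits at chance level on the balanced measure. The function names are illustrative.

```python
# Approximate class counts quoted in the text.
N_FAIL, N_SUCCESS = 4000, 18000


def accuracy(y_true, y_pred):
    """Fraction of all examples that are classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally.

    Assumption: this stands in for the report's unnamed metric.
    """
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


y_true = [0] * N_FAIL + [1] * N_SUCCESS  # 0 = failure, 1 = success
y_pred = [1] * len(y_true)               # trivial classifier: always "success"

acc = accuracy(y_true, y_pred)
bal = balanced_accuracy(y_true, y_pred)
print(f"accuracy = {acc:.3f}, balanced accuracy = {bal:.3f}")
```

With these counts the trivial classifier reaches roughly 82% plain accuracy but only 50% on the balanced measure, which is the chance level for two classes.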

Notable points:

  • One of the methods achieves a stable correct-prediction rate of 60%
  • Most of the methods perform at chance level. Given that these are state-of-the-art approaches, such performance highlights the difficulty of the problem
  • This is a robust cross-validated result that did not arise by chance

To gain further insight into the data, we constructed a flat map of it. Each company (per year) is represented by 53 parameters, and it is impossible to plot all of them simultaneously in a single graph: we can only use one pair of parameters at a time for the x and y axes. This approach, however, would lead to 1378 plots (one for each of the 53 × 52 / 2 distinct pairs), which is uninformative. Instead, we used our algorithms that place companies on a 2D plane relative to each other based on all available information: not just two arbitrarily chosen metrics but all 53 of them. Figure 3 shows a map of all 1027 companies. Arguably, the map reveals some hidden structure in the data that is potentially informative. Unfortunately, the failing companies do not form any discriminative substructures in the map.

Figure 3
Figure 3: A flat map of the companies
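
The count of pairwise plots and the idea of a 2D map built from all 53 parameters can be sketched as follows. The mapping algorithm used for Figure 3 is not described in the text, so this sketch substitutes plain PCA as a stand-in. The data matrix is a random placeholder with the dimensions mentioned above, not the real company data, and `pca_2d` is a hypothetical helper name.

```python
import math

import numpy as np

N_PARAMS = 53

# Number of distinct (x, y) parameter pairs, i.e. "53 choose 2".
n_pair_plots = math.comb(N_PARAMS, 2)
print(n_pair_plots)  # 1378


def pca_2d(X):
    """Project the rows of X onto the two directions of largest variance.

    Assumption: PCA is only a stand-in for the unspecified mapping
    algorithm mentioned in the text.
    """
    Xc = X - X.mean(axis=0)
    std = Xc.std(axis=0)
    std[std == 0] = 1.0          # avoid division by zero for constant columns
    Xs = Xc / std                # put all parameters on a comparable scale
    _, _, Vt = np.linalg.svd(Xs, full_matrices=False)
    return Xs @ Vt[:2].T         # coordinates on the first two components


# Synthetic placeholder data: 1027 rows, 53 parameters each.
rng = np.random.default_rng(0)
X = rng.normal(size=(1027, N_PARAMS))

coords = pca_2d(X)
print(coords.shape)  # (1027, 2)
```

Each row of `coords` is one point on the plane. Coloring the points by failure or success would be one way to check whether failing companies cluster together.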