Focusing on the features

Figure 2: Classification accuracy, i.e. the chance of correct classification, using only 23 features.

In the above, we have assumed that all 53 parameters describing each company are equally informative. This is a natural assumption, since the parameters are usually chosen based on expert knowledge and each is supposed to be informative.

However, for this particular task we have found that using only the following 23 of the 53 parameters increases predictive power, to a mean accuracy of 68%: EBIT, EqyIssue, CFFinan, EqyRepurchase, CapEx, IntExp, RetainEarn, AccRecv, RDExp, STDefRev, CommonEqy, DebtLTIssue, DebtRepay, SGAExp, Inventories, Revenue, AccPay, STDebt, TotalCash, Cash, CFOper, PretaxIncome, CurrAssets. The improvement is summarized in Figure 2. We can reduce the feature set further while sacrificing only a little predictive performance (back to 60%). The following 11 features provide that performance: TotalLiab, EqyIssue, CFDivPay, Cash, TotalDebt, RetainEarn, CapEx, CurrAssets, MinorityInt, SGAExp, IncomeBefXO.
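To make the two feature sets concrete, here is a minimal Python sketch of restricting company records to a named subset of parameters and comparing the two lists. The feature names are copied from the text. The record layout (one dictionary per company, keyed by parameter name), the helper `select_features`, and the placeholder values are assumptions made for illustration; the classifier itself is not shown.

```python
# Feature names copied from the text; everything else here is illustrative.
FEATURES_23 = [
    "EBIT", "EqyIssue", "CFFinan", "EqyRepurchase", "CapEx", "IntExp",
    "RetainEarn", "AccRecv", "RDExp", "STDefRev", "CommonEqy", "DebtLTIssue",
    "DebtRepay", "SGAExp", "Inventories", "Revenue", "AccPay", "STDebt",
    "TotalCash", "Cash", "CFOper", "PretaxIncome", "CurrAssets",
]
FEATURES_11 = [
    "TotalLiab", "EqyIssue", "CFDivPay", "Cash", "TotalDebt", "RetainEarn",
    "CapEx", "CurrAssets", "MinorityInt", "SGAExp", "IncomeBefXO",
]


def select_features(companies, names):
    """Return one row per company holding only the named parameters, in order.

    Hypothetical helper: assumes each company is a dict keyed by parameter name.
    """
    return [[company[name] for name in names] for company in companies]


# Hypothetical record with placeholder values; a real one would carry all 53 parameters.
example_company = {name: 0.0 for name in set(FEATURES_23) | set(FEATURES_11)}
rows = select_features([example_company], FEATURES_23)

# Compare the two feature lists by name.
shared = sorted(set(FEATURES_23) & set(FEATURES_11))
only_small = sorted(set(FEATURES_11) - set(FEATURES_23))
print(len(rows[0]), shared, only_small)
```

Comparing the lists this way shows that the 11-feature set is not a strict subset of the 23-feature set: six names appear in both, while TotalLiab, CFDivPay, TotalDebt, MinorityInt and IncomeBefXO appear only in the smaller list.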