var textForPages = ["Identifying Corporate Performance Factors Based on Feature Selection in Statistical Pattern Recognition METHODS, APPLICATION, INTERPRETATION Pavel Pudil, Ladislav Blažek, Ondřej Částek, Petr Somol, Jana Pokorná, Maria Králová Faculty of Management, University of Economics, Prague \& Faculty of Economics and Administration, Masaryk University 2014 ISBN 978-80-210-7557-3 DOI: 10.5817/CZ.MUNI.M210-7557-2014","",
"Castek_1_Castek - kopie 23.1.2015 7:35 Stránka 1 Identifying Corporate Performance Factors Based on Feature Selection in Statistical Pattern Recognition METHODS, APPLICATION, INTERPRETATION","Castek_1_Castek - kopie 23.1.2015 7:35 Stránka 2 2","Castek_1_Castek - kopie 23.1.2015 7:35 Stránka 3 3 Identifying Corporate Performance Factors Based on Feature Selection in Statistical Pattern Recognition METHODS, APPLICATION, INTERPRETATION Pavel Pudil, Ladislav Blažek, Ondřej Částek, Petr Somol, Jana Pokorná, Maria Králová FACULTY OF ECONOMICS AND ADMINISTRATION MASARYK UNIVERSITY BRNO 2014","Castek_1_Castek - kopie 23.1.2015 7:35 Stránka 4 Team of authors: prof. Ing. Pavel Pudil, DrSc. (3, 4) prof. Ing. Ladislav Blažek, CSc. (1, 6) Ing. Ondřej Částek, Ph.D. (1, 2, 5) RNDr. Petr Somol, Ph.D. (3, 4) Ing. Jana Pokorná (2) Mgr. Maria Králová, Ph.D. (5) Translation by Mgr. et Mgr. Bc. Milan Boháček, M.A. Corrections by Steven Del Riley, BA Cover designed by Petr Somol Reviewers: prof. Ing. Michal Haindl, DrSc. doc. Ing. Robert Zich, Ph.D. This work has been supported by the Grant Agency of the Czech Republic – project No. 
P403/12/1557 CC BY 4.0 Creative Commons Attribution 4.0 © 2014 Masarykova univerzita © 2014 Pavel Pudil, Ladislav Blažek, Ondřej Částek, Petr Somol, Jana Pokorná, Maria Králová ISBN 978-80-210-7557-3 ISBN 978-80-210-7672-3 (online : pdf) DOI: 10.5817/CZ.MUNI.M210-7557-2014","Castek_1_Castek - kopie 23.1.2015 7:35 Stránka 5 5 Contents Introduction 9 1 Formulation of objectives and methodological approach 11 1.1 Summary of previous research activities 11 1.2 Methodology of current research 16 2 Competitiveness and its measurement 21 2.1 The term competitiveness 21 2.2 Approaches to measuring competitiveness 22 Financial performance 23 2.3 Financial performance indicators used 24 2.4 Period of performance measurement 25 2.5 The development of performance measurement methodology 26 2.5.1 Cluster analysis 27 2.5.2 Hyperbola 28 2.5.3 Summation 30 2.5.4 Quintiles 31 2.6 Assessing the appropriateness of methods to measure financial 32 performance 33 2.6.1 Experiment settings 33 2.6.2 Experiment output 36 2.7 Description of the methodology used to measure performance 39 3 Feature Selection Methods in Statistical Pattern Recognition 39 3.1 Introduction 40 3.1.1 Common Research Issues in Machine Learning and Management 41","Castek_1_Castek - kopie 23.1.2015 7:35 Stránka 6 6 CONTENTS 3.2 Dimensionality Reduction 41 DR Categorization According to Nature of the Resulting Features 42 DR Categorization According to the Aim 42 3.3 Feature Subset Selection 44 3.3.1 FS Categorization With Respect to Optimality 44 3.3.2 FS Categorization With Respect to Selection Criteria 45 3.3.3 FS Categorization With Respect to Problem Knowledge 46 3.4 Sub-optimal Search Methods 47 3.4.1 Best Individual Features 48 3.4.2 Sequential Search Methods and their Evolution 48 Floating search methods 50 Oscillating search method 50 3.4.3 Non-sequential and alternative methods 52 3.4.4 Pitfalls of feature subset evaluation – experimental comparison of criterion functions 53 3.4.5 Summary of recent 
sub-optimal feature selection methods 54 3.4.6 Dependency-Aware Feature Selection (DAF) 55 3.5 Performance Estimation Problem 58 3.6 Problem of Feature Selection Overfitting and Stability 59 3.6.1 Problem of Feature Selection Stability 61 3.7 Summary 61 4 Testing approaches and methods based on learning methods for identifying factors of competitiveness 63 4.1 Introduction 63 4.2 Feature selection based evaluation of competitiveness factors 67 4.2.1 Feature Selection Methodology 68 4.2.2 Evaluating Stability of Feature Selection Methods 69 4.3 Introducing the modified feature selection methodology 71 Non-Parametric Model 71 Handling Missing Values and Non-Numeric Values 71 4.4 Pattern classification approach 72 4.5 Regression approach and pseudo-kernel regression model 72 4.6 Experiments and results 75 4.6.1 Regression-based analysis results 75 4.6.2 Classification-based analysis results 78","Castek_1_Castek - kopie 23.1.2015 7:35 Stránka 7 CONTENTS 7 4.7 Comparing Regression-based and Classification-based analysis results 79 4.8 Improved Model for Attribute Selection on High-Dimensional Economic Data 80 4.8.1 Improvements of the regression model 80 Varying the distance function 80 Kernel width multiplication constant 83 4.8.2 Optimized model performance on 37- and 74-dim data 85 4.9 Conclusions 88 5 Identifying factors of competitiveness using bivariate analyses and linear regression analyses 91 5.1 General characteristics 92 Effect of company size on the financial performance 92 Effect of industry on the financial performance 94 Interaction of company size and industry 94 5.2 Internal competitiveness factors of a company 98 Multidimensional model 99 5.3 External competitiveness factors of a company 100 Multidimensional model 102 5.4 Stakeholder orientation and characteristics of an organizational structure 103 Multidimensional model 106 5.5 Owners 106 Multidimensional model 110 5.6 Employees 112 Multidimensional model 
114 5.7 Customers 116 Multidimensional model 118 5.8 Suppliers 120 Multidimensional model 122 5.9 Corporate social responsibility 124 Multidimensional model 125 5.10 Comprehensive model 126 6 Interpretation of the results achieved 129 6.1 Overall characteristics of the sample 131 6.2 Identification of typical combinations of factor values leading to certain types of financial performance 143","Castek_1_Castek - kopie 23.1.2015 7:35 Stránka 8 8 CONTENTS Group A as a whole 143 Group A 1 145 Group A 2 146 Group A 3 148 Group A 4 149 Group A 5 150 6.3 Summary 152 7 Conclusion 155 8 Bibliography 159 List of Figures 167 List of Tables 168 List of Graphs 170","Castek_1_Castek 23.1.2015 7:37 Stránka 9 9 Introduction The issue of finding competitiveness factors of companies represents a very topical and interesting subject in terms of both practice and theory, with many publications dealing with this issue. However, these works do not present a uniform way of thinking, as they vary in their approach to the issue, terminology used, applications of the methods, solution justifications, and even in the credibility of the results achieved and their interpretation. This raises such questions as (1) how to express corporate competitiveness, (2) how to measure the effect of potential factors of competitiveness in a multidimensional space under the conditions typical for business research, and finally, (3) what are the competitiveness factors and how can we interpret their effects? This publication presents the results of research focusing precisely on this area. It deals with the issue of applying selected statistical methods to identify the competitiveness factors of companies so as to respect the synergistic effect of their influence. The theoretical and methodological part of this monograph describes the approach to the way of expressing the financial performance of companies, and especially approaches to the identification of factors that affect the financial performance. 
Experiments with an application of these approaches are described and analysed using empirical data of companies from manufacturing and construction industries, including interpretation of the results achieved. The research was conducted in the period of 2012–2014 under a project of the Grant Agency of the Czech Republic entitled “Developing Methods for Identifying and Evaluating Factors That Critically Affect Corporate Performance”¹ as a joint research project of the University of Economics in Prague, Faculty of Management in Jindřichův Hradec, and Masaryk University, Faculty of Economics and Administration in Brno. ¹ This work has been supported by the Grant Agency of the Czech Republic – project No. P403/12/1557.","Castek_1_Castek 23.1.2015 7:37 Stránka 10 10","Castek_1_Castek 23.1.2015 7:37 Stránka 11 11 1 Formulation of objectives and methodological approach The presented work deals with analysing and formulating a methodology for the search for competitiveness factors of companies, including its verification and interpretation on an extensive sample of empirical data. This work builds on several years of research in the given field, conducted by the research team of the Centre for the Competitiveness of the Czech Economy established at the Faculty of Economics and Administration at Masaryk University. The results of this work are summarized in the monographs by Blažek et al. (2007), Blažek et al. (2008) and Blažek et al. (2009). In order to better understand the focus and purpose of the presented work, we consider it necessary to briefly summarize these research activities. 1.1 Summary of previous research activities The main focus of the research was on the competitiveness of companies, namely the search for the factors that affect it. In other words: finding the reasons why some businesses are competitive and others are not. 
In addition, the research identified certain types of businesses that were competitive and, conversely, other types that were not. The researchers pursued the idea that these factors can be identified on the basis of a sophisticated analysis of empirical data and that it is also possible to identify typical configurations of values of these factors which impact the level of competitiveness that individual businesses achieve. In other words: the same or similar level of competitiveness can be achieved in different ways. Two companies reaching the same level of competitiveness may differ from each other very significantly. When elaborating this issue, it was of course necessary to proceed to the operationalization of the key concepts. This concerned mainly the concept of “competitiveness”, which proved to be too vague, elusive and difficult to measure for the sake of further analysis. Therefore, it was replaced with the term “economic success”.","Castek_1_Castek 23.1.2015 7:37 Stránka 12 12 1 FORMULATION OF OBJECTIVES AND METHODOLOGICAL APPROACH The population was defined as follows: – companies based in the Czech Republic (territorial aspect), – companies in all industries (industry aspect), – companies with 50 or more employees (size aspect), – joint-stock companies and private limited companies (aspect of a legal form). The empirical survey was conducted in two stages. The subject of the first one included those industries that represent key areas of the business sector of the Czech economy, i.e. the manufacturing and construction industries. In the second stage, other sectors were analysed. Since the number of companies that fit the definition above is too low for a statistical analysis in some of these industries, the size limit was reduced to 10 workers in the second stage. The population specified this way included about 8,000 companies at the time of the survey. 
There were two primary sources of information for solving the task in question: – data from the questionnaires, capturing the selected characteristics of companies, – data from public databases (Albertina Data), capturing the selected financial indicators. Those companies that were in bankruptcy or liquidation at the time of the survey, as well as those businesses whose data were missing in the database, were excluded from the population. The population that was subsequently used included 4,915 companies. This set was considered the population for the purpose of the research. However, it was obviously impossible to subject such a large set to a questionnaire survey. There were two main reasons: – the overwhelming effort, – the unwillingness of most businesses to participate in the questionnaire survey. Companies from the population were asked to participate in the questionnaire survey in such a way that self-selection determined by the willingness or unwillingness of the surveyed companies could be restricted or eliminated by a quota selection. In accordance with the definition of the population, the quota variables included territory (regions), industry, size, and legal form of business. The percentage representation in the population and the sample showed very close values on average. This indicates that the sample can be considered representative. The questionnaire survey was conducted in collaboration with Augur Consulting. The methodology of the task in question is shown in Figure 1-1. 1. Questionnaire data As previously mentioned, the questionnaire survey was conducted in two stages. In the first sub-stage (2007), which applied to companies from the manufacturing and construction industries, 432 companies participated in the empirical survey. 
In the second sub-stage (2009), which applied to companies from other industries, the survey included 267 companies.","Castek_1_Castek 23.1.2015 7:37 Stránka 13 1.1 SUMMARY OF PREVIOUS RESEARCH ACTIVITIES 13 Figure 1-1: The methodology of previous research activities 1. Questionnaire data 2. Accounting data 3. Primary analysis 4. Clustering based on economic success 5. Secondary analysis 6. Identification of factors of economic success 7. Clustering based on factors of economic success 8. Economic interpretation 2. Accounting data In the case of the first sub-stage, data from the financial statements for the years 2002 to 2006 were used; in the second sub-stage, data from the financial statements for the years 2002 to 2007 were used. 3. Primary analysis The primary analysis involved evaluating frequencies of responses to the questions of the questionnaire; the questionnaires produced almost 700 variables. The results were presented in the form of tables and graphs, including comments briefly interpreting the presented numerical and graphical information. The evaluation was performed for the sample as a whole as well as for the sub-samples, categorized by industry, company size, and legal form of business. A number of important pieces of information concerning the businesses were obtained in this way (Blažek et al., 2007, p. 41–287; Blažek et al., 2009, p. 29–262). However, the main output of the primary analysis was the processing of the variable set, entering the secondary analysis. 4. Clustering based on economic success Economic success was evaluated on the basis of two indicators, i.e. return on assets and asset growth indicators. In the first sub-stage, the sample of 432 companies","Castek_1_Castek 23.1.2015 7:37 Stránka 14 14 1 FORMULATION OF OBJECTIVES AND METHODOLOGICAL APPROACH was structured based on the application of a cluster analysis into three groups (A, B and C), from the most efficient to the least efficient (Blažek et al., 2008, p. 53–66). 
In the second sub-stage, this methodology was largely modified. The sample of 267 companies was also structured into three groups (A, B and C); however, in this case the basis was not a cluster analysis but the so-called “coefficient of economic success” (Blažek et al., 2009, p. 53–66). 5. Secondary analysis The purpose of this analysis was to transform the set of variables from the primary analysis into a set of variables as potential factors of economic success, suitable as an input into the methods of multidimensional statistical pattern recognition. Using the application of common statistical methods, the relationship between different variables and economic success of companies was analysed. There was a significant reduction in the number of variables; variables with a low variation and variables with a high proportion of missing values were excluded. In many cases, it was necessary to do a recoding into new variables, suitable for the subsequent mathematical and statistical procedures and interpretation of the obtained results (Blažek et al., 2008, p. 31–39 and p. 67–88, Blažek et al., 2009, p. 275–277). 6. Identification of factors of economic success In order to ensure the exact mastering of this statistically demanding task, selected methods of statistical pattern recognition were used for the multidimensional analysis. Specifically, these methods included Individual Best Search (IBS), Sequential Floating Search (SFS) and Sequential Floating Forward Search (SFFS). The outcome of the application was to identify which variables tested by these methods are the factors that influence decisively the economic success of the companies analysed; moreover, the outcome did not seek to identify the factors that influence the economic activity separately, i.e. each factor individually, but integrally, i.e. interrelated (Blažek et al., 2008, p. 89–93, Blažek et al., p. 275–280). 7. 
Clustering based on factors of economic success After identifying the factors of economic success, it was necessary to find such typical configurations of values of these factors that are produced by certain types of economic success achieved by companies. Clusters based on factors of economic success were created using the application of the cluster analysis method within each of the groups A, B and C, created on the basis of the economic success of the businesses. (Blažek et al., 2008, p. 93 and 94). 8. Economic interpretation Groups of companies, established on the basis of such clustering, were characterized with values of economic-success factors, and supplemented with several other variables with regard to increasing the rate of illustrativeness. First, the","Castek_1_Castek 23.1.2015 7:37 Stránka 15 1.1 SUMMARY OF PREVIOUS RESEARCH ACTIVITIES 15 interpretation of characteristic features at the level of the entire sample was made, using the comparison of factor values and selected variables between the groups of companies A, B, and C, grouped according to their economic success. Subsequently, a similar interpretation was conducted at a more detailed level – within the group of the economically most successful companies (group A); it included the comparison of values of the same factors and variables as in the previous case, but between groups of companies created by secondary clustering, i.e. by factors of economic success (Blažek et al., 2008, p. 95–120). After some time it can be stated that the research described above represents a challenging and somewhat unique event, both in terms of the methodological approach and the extent of empirical investigation. Despite the obvious benefits, however, many of the problems that occurred during the solution and the complexity of some of the results achieved remained unresolved. This primarily concerns the methodological level. 
When searching for the best way to apply approaches based on pattern recognition when identifying the factors of economic success, a number of experiments were performed. Although some useful experience was obtained thanks to these experiments, the results obtained did not lead to unambiguous conclusions. The researchers failed to overcome the negative phenomenon of excessive sensitivity of outputs to relatively small changes in input variables as well as a relatively strong influence of the choice of a particular evaluation algorithm and its parameters on the obtained results. The quality of the results, both in the primary and secondary analyses and in identifying the factors of economic success of businesses, was also undoubtedly limited by the character of the industries in question. In the first stage, the investigation focused on manufacturing and construction industries, i.e. industries producing material goods, industries with relatively strong internal homogeneity, exclusive entrepreneurial orientation, and consequently with many companies whose legal form was private limited company or joint-stock company. However, in the case of the second stage, the situation was quite different: it was basically a conglomerate of nine industries, many of which showed diametric differences with respect to one another. The main factor that fundamentally undermined the homogeneity of the sample was the fact that only some of these industries can be considered entrepreneurial, while in others entrepreneurship is practically non-existent. This is closely connected with the representation of private limited companies and joint-stock companies. The representation of the respondents in each industry was therefore very uneven. Three of the nine industries analysed included fewer than 10 respondents. 
The effort for the widest possible coverage unfortunately resulted in comparing, or evaluating jointly, industries that are difficult to compare by nature, or even incomparable.","Castek_1_Castek 23.1.2015 7:37 Stránka 16 16 1 FORMULATION OF OBJECTIVES AND METHODOLOGICAL APPROACH 1.2 Methodology of current research These facts became a challenge for further research, which was conducted within the project of the Grant Agency of the Czech Republic entitled “Approaches to identification of corporate performance factors with emphasis on methods of feature selection in statistical pattern recognition.” The outcomes of this research are presented in this publication. The objective of the research conducted within this project is formulated as follows: To develop and verify a methodology for identifying and evaluating factors which have a significant effect on the performance of companies. The solution involved these successive steps: a) Search of relevant literature, summary of theoretical knowledge. – Approaches to expressing competitiveness, financial performance measurement methods. – Methods of feature selection in statistical pattern recognition. b) Testing on the task in question – Testing methods for factor identification. Selecting the most appropriate method, improving and adjusting it. Factor identification. – Regression analysis: explaining financial performance on the basis of identified factors. c) Economic interpretation and overall evaluation – Identification and economic interpretation of typical combinations of factor values leading to certain types of financial performance. Step a) formulates theoretical grounds and creates a methodological background for the solution of the given task. It is contained in chapter 2. Competitiveness and its measurement, and in chapter 3. Feature Selection Methods in Statistical Pattern Recognition. Steps b) and c) concerning testing and interpretation were conducted following step a). 
These research activities and their results are contained in chapters 4. Testing approaches and methods based on learning methods for identifying factors of competitiveness, 5. Identifying factors of competitiveness using bivariate analyses and linear regression analyses, and 6. Interpretation of the results achieved. We will use the comments to Figure 1-2 as the first introduction to the solution of the task in question including its relation to previous research activities. 1. Questionnaire data With respect to the above-mentioned facts, the research used data from the questionnaire survey conducted in 2007, which applied to companies from the manufacturing and construction industries. Within these two major industries, 432 companies participated in the empirical survey. Businesses from other industries were not included.","Castek_1_Castek 23.1.2015 7:37 Stránka 17 1.2 METHODOLOGY OF CURRENT RESEARCH 17 Figure 1-2: Methodology of current research activities 1. Questionnaire data 2. Accounting data 3. Primary analysis 4. Clustering based on financial performance 5. Secondary analysis 6. Bivariate analysis and linear regression 7. Statistical pattern-recognition 8. Clustering based on factors of financial performance 9. Economic interpretation 2. Accounting data The data used came from the financial statements for the years 2004 to 2010. The relationship between cause and effect was substantially respected in this way, where the cause refers to the company characteristics identified through a questionnaire survey, and the result is the development of a company’s financial indicators. Another important reason for choosing this period was the effort to capture the development of these indicators during the crisis when the competitiveness of companies was tested in very difficult conditions. 3. Primary analysis The starting point was the primary analysis from the previous research. 
With regard to the application of subsequent procedures and methods, the available data were further modified. 4. Clustering based on financial performance The term “economic success” was replaced with the term “financial performance” with regard to better operationalization. The indicators of return on assets and asset growth were used again, but the method of their calculation was changed substantially (see Chapter 2).","Castek_1_Castek 23.1.2015 7:37 Stránka 18 18 1 FORMULATION OF OBJECTIVES AND METHODOLOGICAL APPROACH 5. Secondary analysis Similarly to the primary analysis, the secondary analysis was also based on the previous research. However, the reduction of the number of variables entering the experiments of statistical pattern recognition did not have to be so dramatic because the newly applied DAF algorithm is not so sensitive to the ratio of the number of input variables to the number of available observations (i.e. companies). Therefore, the number of variables from the questionnaire was reduced on the basis of value variability and the number of missing values to 74 variables instead of 37 variables used in the SFFS algorithm. 6. Bivariate analysis and linear regression Usual methods of bivariate analysis, i.e. t-tests, ANOVA and correlation and association analyses were used in parallel with the experiments of statistical pattern recognition. In addition, partial linear regression models were also formulated. These analyses provided a number of findings on the relationships between the aforementioned 74 variables and financial performance (see Chapter 5). When comparing the power of the relationship found by these procedures and the error of estimation of a nonlinear model based on the variables selected by the DAF method, the advantage of using the techniques of statistical pattern recognition is obvious. 
This section thus serves as a direct comparison of the results of both approaches as well as assistance for economic interpretation (see the final step). 7. Statistical pattern-recognition Unlike in the previous research, a different approach of multidimensional analysis was applied in this fundamental research activity. Specifically, it was the DAF algorithm, which shows a much higher stability of results for the given task than the SFFS algorithm which was originally used. Moreover, the non-linear regression model was employed as a better classifier than the original k-NN classifier; the DAF algorithm was also improved in several aspects, which resulted in an increase in the accuracy of the result. 8. Clustering based on factors of financial performance The procedure was similar to the previous research; however, the difference consisted in clustering that was conducted using different factors of financial performance and within different groups of companies, created based on their financial performance (see Chapter 6). 9. Economic interpretation The researchers again proceeded similarly as in the previous research, but with different variables and a different grouping of companies according to their financial performance. This time, the companies were divided into quartiles. They were labelled as groups A, B, C and D, where group A included businesses with the highest financial","Castek_1_Castek 23.1.2015 7:37 Stránka 19 1.2 METHODOLOGY OF CURRENT RESEARCH 19 performance and group D businesses with the lowest financial performance. First, an interpretation was performed at the level of the entire selected sample, using the comparison of factor values and selected variables between the A, B, C and D company groups. 
Next, a similar interpretation was performed at a more detailed level – within the group of financially most powerful companies, in order to translate the mainly quantitative representation into one of a mainly qualitative nature (see Chapter 6). This book also summarizes and builds on some already published outputs of this project, particularly the following ones. In the work-in-progress paper, Pudil et al. (2012) compared the performance and stability of the SFFS and DAF algorithms combined with the k-NN classifier. The result was that the DAF algorithm achieves a slightly higher accuracy but significantly higher stability. Částek et al. (2013) proved the necessity of using advanced multivariate methods by comparing different approaches on the same data set and paper. Pokorná and Částek (2013) reviewed methods of corporate performance measurement and experimentally verified the chosen approach, a longitudinal combination of ROA and Growth of Assets. Pudil et al. (2013) finished the comparison of SFFS and DAF; this time these algorithms were combined with a non-linear regression instead of the k-NN classifier. Again, this approach improved both the accuracy and stability of the results and again the DAF algorithm was found to perform better on both criteria. The final step was taken in the paper by Somol et al. (2014). Here, the distance function and kernel width used were optimized and accuracy rose once again with no loss of stability. The use of the DAF algorithm together with a multivariate regression model for predicting financial performance measured by adding up ROA and Growth of Assets became the basis for selecting the factors of corporate performance. 
Besides that, the reader will also find a procedure for interpreting such factors in this book, including the actual interpretation of the effect of these factors in Czech construction and manufacturing companies.","Castek_1_Castek 23.1.2015 7:37 Stránka 20","Castek_1_Castek 23.1.2015 7:37 Stránka 21 21 2 Competitiveness and its measurement The term competitiveness is widely used but not a uniformly understood one. Authors of various studies often create their own definitions, which imply different approaches to measuring competitiveness. Therefore, in this chapter we focus on the term competitiveness, explain the possibilities of its measurement, and finally, justify the choice of the methodology used to measure competitiveness in this research. 2.1 The term competitiveness Competitiveness can be viewed from a macroeconomic or microeconomic perspective. The macroeconomic view deals with competitiveness of countries and their ability to produce products and services (through businesses) that are successful in international markets. The European Union understands competitiveness as the ability of an economy to increase productivity, as the only way to achieve sustainable growth in per capita income, which subsequently leads to the growth of living standards (European Commission, 2011). The World Economic Forum puts greater emphasis on the economy as a set of institutions and policies in understanding competitiveness (Schwab, 2012), through which the level of productivity of enterprises and thus the entire country is influenced. The microeconomic perspective of competitiveness, which we follow in this research, concerns the competitiveness of smaller units, namely businesses. The Organization for Economic Co-operation and Development (OECD) defines competitiveness of a business as its ability to compete, increase profits and growth. 
Approaches of other authors (Jirásek, 2000; Blažek, 2009) can be summarized in a widely accepted definition of competitiveness of a business as the ability of a business to operate on the market in the long term and sustainably in competition with other businesses. This view of competitiveness was used by the Research Centre for the Competitiveness of the Czech Economy (Blažek et al., 2009), which this research partly follows. Factors leading to a higher ability of a business to stay in the market are the subject of this research.","Castek_1_Castek 23.1.2015 7:37 Stránka 22 22 2 COMPETITIVENESS AND ITS MEASUREMENT 2.2 Approaches to measuring competitiveness Higher competitiveness of a company is reflected in the long-term growth of its profits or increasing its market share and the consequent growth of the company as such. However, there is no single correct approach to measuring competitiveness, as it is always necessary to particularly consider the purpose and possibilities of this measurement. The research is therefore based on the assumption that in the long run, a company’s competitiveness affects the success or failure of a company in competition (Blažek et al., 2009). From this perspective, the success of a company can be identified with its performance for the purpose of its measuring. Thus, business performance is conditioned by its competitiveness and it should be true that if a company is competitive, it is also efficient (Suchánek, 2005). As a result, we transformed the problem of identifying factors of competitiveness of companies into the form of corporate performance factor identification. Business performance in general represents the degree of achieving the set objectives. Based on an organization’s objectives, we identify financial performance, operational performance and overall effectiveness (Hult et al., 2008). The widest approach is undoubtedly measuring a company’s overall effectiveness. 
This includes an analysis of achieving the long-term strategy of a company in all areas of its activity. However, measuring it is very problematic and often made more difficult by considerable subjectivity, as it includes variables such as reputation, perceived performance, achievement of goals, or survival. Analysing the overall effectiveness of a company requires extensive knowledge about the organization and its field, and it can be applied in qualitative research. Operational performance focuses on the non-financial dimension of performance, such as measuring the quality of products and processes in a company, the cycle time and productivity. The results are especially useful for business management in supporting the improvement of business processes or even benchmarking. The prevailing approach to performance measurement in quantitative research is the design of financial performance indicators (Hult et al., 2008), measuring the economic situation of a company. Financial analysis indicators based on accounting data and information from financial statements are widely used. Indicators of a company’s market position offer an additional view. The advantage of this approach is especially the easy accessibility of financial statements that are publicly accessible in the Register of Companies or other databases. Indicators use “hard” data from financial statements; therefore, they are not affected by the evaluator’s subjective view. Based on the purpose of the research, a historical view of a company, as well as possible oversimplification or disregarding industry specificities can be seen as a disadvantage. 
A possible disadvantage of long-term studies is the risk of errors arising from changes in the accounting procedures of the company.","Castek_1_Castek 23.1.2015 7:37 Stránka 23 2.2 APPROACHES TO MEASURING COMPETITIVENESS 23 23 The quantitative nature of this research with a large number of monitored companies allows us to measure performance solely from the financial perspective. Understanding the specific circumstances of each company would not be practically feasible, whereas data from the financial statements are available from databases such as Albertina CZ. Due to the possibility of comparing the results of this research with other studies and also for practical reasons, we decided to measure business performance using financial indicators. Financial performance The most commonly used indicators are the indicators of financial analysis, which can be obtained from the financial statements of a company. This includes indicators of profitability, frequently used by researchers, as well as indicators of debt, liquidity and activity. Their popularity is due to the good availability of data, simplicity of their design, easy interpretability and wide applicability. Ratio indicators allow us to compare companies of different sizes and from different industries, or to compare the results of a company with the recommended values. Another advantage is that the data for calculating these indicators are subject to internal and external checks and they are not influenced by “market sentiment” (Krivogorsky \& Grudnitski, 2010, p. 182). On the other hand, they may be negatively affected by differences in accounting procedures (concerning the comparison between countries) or even manipulations such as undervaluing assets (Sanchez-Ballesta \& García-Meca, 2007). Another group is represented by market measures. Shareholder-value measures are based on the perception of a company as a tool whose purpose is to increase the value of the owners’ investment. 
These indicators are based on the value of a company’s shares on the market. This value, however, may be affected by external factors other than business performance. These indicators are not useful for quantitative research in the Czech Republic since there are only a few dozen listed companies on the Stock Exchange. The category of market measures also includes indicators of market share, which, however, are not commonly accessible. Hybrid indicators try to capture business performance from many perspectives, combining multiple indicators from both the previous groups. They include solvency and bankruptcy models that predict the future financial health of a company. Another well-known indicator in this field is Economic Value Added. Hybrid indicators often use estimates and work with interest rates of government bonds. Their design is often complex, making them more difficult to interpret. The most common indicators used in economic studies include variously designed profitability measures (most frequently Return on Assets) and indicators based on revenues (Hult et al., 2008).","2.3 Financial performance indicators used When choosing the appropriate indicator to measure financial performance, we adopted the same assumption as the investigation by the Research Centre for the Competitiveness of the Czech Economy: in the long run, higher business performance is reflected in profitability or in growth in market share and the consequent growth of the company as such. The experienced authors of the most successful performance measurement system worldwide, the Balanced Scorecard method, comment: “Financial objectives typically relate to profitability – measured, for example, by operating income and return on investment. Basically, financial strategies are simple; companies can make more money by (1) selling more, and (2) spending less. Everything else is background music. 
...Thus, the company's financial performance gets improved through two basic approaches – revenue growth and productivity.” (Kaplan and Norton, 2004, p. 36). The growth strategy usually consists of entering new markets and developing new products that require the introduction of additional production and service processes with which the company has not had substantial experience yet. An effort to standardize and streamline existing processes is the essence of the strategy to increase productivity, which is applied to activities usually performed for a longer period, in a familiar environment, and for well-known customers. Only excellent companies are able to implement both strategies at the same time, and thus achieve exceptionally high financial performance (Blažek et al., 2011). If we want to capture both possible strategies for achieving business performance, i.e. the growth strategy and the strategy of increasing productivity, it is necessary to use at least two indicators. However, a larger number of indicators used increases the time and cost of obtaining and processing data, and the amount of missing data increases as well. In addition, the combination of all indicators becomes more complex and more difficult to interpret. Therefore, it was decided to use two indicators, each of which will represent one of the two mentioned strategies. We decided to measure the strategy of increasing productivity, or profitability, using the indicator Return on Assets (ROA), which expresses the profitability of all resources in the company, regardless of their origin. This approach is in line with the stakeholder view of the company, used in the investigation of potential factors of performance; moreover, its result is not influenced by the type of capital structure of the company. ROA is one of the essential and most frequently used indicators of profitability (Richard et al., 2009). 
It was designed as follows: ROA_t = NOPBT_t / TA_t, where: NOPBT – net operating profit before taxes, TA – total assets, t – year.","The calculation uses operating profit, which is not affected by supplementary activities of the company and expresses its success on the market. Relating this value to the amount of assets gives the indicator an essential relative form, which allows the comparison of different-sized businesses. To capture the degree of implementing the growth strategy, we used the Assets Growth (AG) indicator. A rising value of total assets, and thus the involvement of a greater amount of resources, is caused by higher production of the company. During long-term growth in demand for its products, the company has to invest in expanding its production capacity, which increases the overall amount of assets. The second option is to use the indicator of sales growth directly, but it was not possible to calculate this indicator unambiguously from the database used (the database determines total sales by different procedures, confusing total sales with revenues from the sale of goods, or products and services, with total revenues, or with the performance indicator). The Assets Growth indicator was designed as follows: AG_t = (TA_t - TA_(t-1)) / TA_(t-1), where: TA – total assets, t – year. 2.4 Period of performance measurement We understand competitiveness as the ability of a company to operate on the market in the long term, sustainably in competition with other businesses. For this reason, it was necessary to measure business performance over a longer period of time. Using a time series also helps to reduce random fluctuations as well as intentional optimizations (such as creating reserves, etc.) in the performance of individual companies. From the perspective of corporate economics, a long period is understood as a period longer than 3 or 5 years. 
Even Krištof (2006) recommends following a five-year and longer series of indicators, with Kirby (2005) requiring an even longer period of about ten years. This requirement is based on the environment of North America, where a decade covers two terms of an executive director of a company on average. However, such research time may be too long in the current turbulent times. Factors applicable before this period may no longer be valid with such a considerable time interval. This is probably the reason why there are not many studies using a longer period than five years (e.g. Artiach (2010) – a five-year period, Abor (2007) – a six-year period, Hansen (1989) – a five-year period). On the other hand, many studies follow performance on the basis of only one-year results, e.g. Andrews (2010), Bottazzi (2008), Kessler (2007) and many others.","Castek_1_Castek 23.1.2015 7:37 Stránka 26 26 2 COMPETITIVENESS AND ITS MEASUREMENT Another fundamental fact of our research aimed at finding competitiveness factors of a company is the existence of a time lag between the potential causes leading to increased performance and its actual rise. For this reason, it is first necessary to determine the possible factors of high performance and measure this performance subsequently. This ideally means determining the values of competitiveness factors in year 0 and then measure performance in years 1 to 10; moreover, we should allow for a delay in the publication of financial statements of the company. Nevertheless, it is not possible to postpone the analysis for so long due to the above-mentioned utilization of research findings in the current turbulent environment. Thus, it was necessary to examine a sufficiently long period of time, but at the same time not to delay the analysis too long after data collection. The questionnaire survey, which investigated the potential performance factors, was conducted in 2007. 
To analyse the performance, compromise was made: the time series of seven consecutive years for the period 2004–2010 was examined. The data for calculating financial performance for later years were not available for a sufficient number of companies at the time of the analysis, as entering financial data into the Albertina database and the Register of Companies is conducted gradually and with delay. Seven years is a sufficiently long period of time for inferring the long-term business performance. Even older data obtained from the extension of the time series backwards were no longer desirable due to a too long interval from the questionnaire survey. 2.5 The development of performance measurement methodology Selecting two different indicators of financial performance led to the question of whether to analyse a company’s performance separately by ROA and separately by the Assets Growth, or whether to try to aggregate both the results into a single summary benchmark. The first approach is more common as it is used by approximately two thirds of economic studies, while aggregation of multiple indicators into one is used by approximately one third of studies (Richard et al., 2009). Given the purpose of the research, it was desirable for further analysis to group or sort the businesses within a single variable based on their financial performance. Only one performance indicator was required by the intended use of some special statistical methods. For this reason, we decided to merge the two indicators into a single one. Great attention was paid to finding the most appropriate way of analysing financial performance from the beginning of the previous research conducted by the Research Centre for the Competitiveness of the Czech Economy because it significantly determines the results of subsequent experiments. 
During the research, more possible solutions to the problem outlined were found and the model was revised and adjusted several times so that the selected indicator would reflect the real financial","Castek_1_Castek 23.1.2015 7:37 Stránka 27 2.5 THE DEVELOPMENT OF PERFORMANCE MEASUREMENT METHODOLOGY 27 performance of the company as well as possible. Previous methods of ROA and Assets Growth aggregation into a single indicator are described in detail in the individual monographs summarizing the research conducted by the Research Centre for the Competitiveness of the Czech Economy (Blažek et al., 2008, 2009, 2011), and their comparisons are included in Pokorná and Částek (2013). Therefore, they will be only briefly described here. 2.5.1 Cluster analysis In the first phase of finding methods for analysing business performance, Šiška (2008) first had to deal with the fact that finding the causes for higher business performance in the questionnaire survey, which was conducted in 2007, followed the time period for which it was possible to analyse the financial performance, i.e. 2002-2006. Although we can assume certain sluggishness in established business processes, it was appropriate to take into account the temporal distance of financial data in analysing business performance. Therefore, in order to reinforce the importance of financial data from the years closer to the empirical survey, he applied the weight of 1-2-3-4-5 to both analysed indicators from 2002–2006. Thus, the weight was greatest for the latest data from 2006 and smallest for indicators from 2002. Besides weighing the data by years, they were also standardized using the z-score to eliminate the effects of different scales of both indicators. The first selected tool for analysing the financial performance was cluster analysis. 
Using 27 iterations of non-hierarchical k-means clustering, Šiška revealed 13 relatively homogeneous clusters associating a different number of businesses with similar values and development of ROA and Assets Growth over the five-year period analysed (graph 2-1). Graph 2-1: Clusters of cluster analysis Source: Šiška, 2008","Castek_1_Castek 23.1.2015 7:37 Stránka 28 28 2 COMPETITIVENESS AND ITS MEASUREMENT Distribution of companies in this number of groups made it considerably difficult to interpret the results and represented a problematic use in the subsequent statistical analysis. The clusters formed a categorical variable, which prevented a further statistical analysis using procedures requiring at least ordinal or cardinal variables. Therefore, this method was modified to cluster grouping based on the quadrant in which the clusters were located. This led to the creation of AA, AB, BA, BB and C groups (containing all clusters in areas marked with the letter C). Five categories of this variable proved more useful, but they still were not ordinal. With the need of at least an ordinal expression of financial performance, we used only businesses located in areas AA, BB and C. This made the dependent variable become a higher level variable (from categorical to ordinal); but it meant that the experiments could not include all businesses – businesses in the areas AB and BA were omitted. For these reasons, this method was not used further. 2.5.2 Hyperbola In the next stage we were looking for a method to evaluate business performance, which would make it possible to put businesses into a small number of cardinal groups for the sake of interpretation – ideally only into two groups: companies economically successful and economically unsuccessful. When designing new methodology, Šiška assumed that the growth strategy and strategy of increasing productivity were mutually exclusive (Kaplan, Norton, 2004). 
This assumption was expressed in this methodology as an inverse proportion between the indicators used of ROA and Assets Growth. The product of the values of the two indicators represented a single-digit indicator of financial performance of a company, referred to as the coefficient of financial performance. When the value of one or two input variables was negative, i.e. a company was becoming smaller and/or depreciating its assets, the company was rated as economically unsuccessful regardless of the outcome of the mathematical product of both indicators. Ordering companies based on their performance was done using the coefficient of financial performance for the remaining businesses. A company can achieve the same financial performance for different strategies, e.g. if it weakens its focus on productivity and tries to get more customers instead, and expects that an increase in customers will lead to assets growth. We obtained the financial performance curve (graph 2-2) by connecting the points with the same values of the coefficient of financial performance but different combinations of ROA and Assets Growth values. Companies with a higher coefficient of financial performance are located on the higher financial performance curve. Companies located above the curve, which connected businesses with the financial performance coefficient of a median value, were identified as financially efficient; companies under the curve were marked as less financially efficient. Companies with a negative value of one or both indicators of financial performance,","Castek_1_Castek 23.1.2015 7:37 Stránka 29 2.5 THE DEVELOPMENT OF PERFORMANCE MEASUREMENT METHODOLOGY 29 i.e. businesses from the second, third and fourth quadrant of graph 2-2, were marked as clearly inefficient. The name of the method – Hyperbola without weights – was derived from the geometric shape of the financial performance curves. 
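The classification rule described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the input layout (a dictionary of company name to average ROA and Assets Growth in percent), the function name and the treatment of ties with the median (counted as less efficient) are assumptions made only for illustration.

```python
from statistics import median

def classify_by_hyperbola(companies):
    # companies: dict name -> (roa, ag), both in percent (hypothetical layout)
    # Coefficient of financial performance = ROA * Assets Growth,
    # computed only for companies with no negative indicator.
    coef = {name: roa * ag
            for name, (roa, ag) in companies.items()
            if roa >= 0 and ag >= 0}
    # The median coefficient defines the dividing hyperbola.
    threshold = median(coef.values())
    labels = {}
    for name in companies:
        if name not in coef:
            labels[name] = 'clearly inefficient'   # negative ROA and/or AG
        elif coef[name] > threshold:
            labels[name] = 'efficient'             # above the median curve
        else:
            labels[name] = 'less efficient'        # on or below the curve (assumption)
    return coef, labels

# Illustrative values only, not data from the study
sample = {'A': (5.0, 5.0), 'B': (15.0, 1.0), 'C': (8.0, -2.0), 'D': (2.0, 3.0)}
coef, labels = classify_by_hyperbola(sample)
```

In this sketch company C is set aside before the median is computed, which mirrors the statement that ordering by the coefficient was done only for the remaining businesses.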
Graph 2-2: Hyperbola Source: Blažek et al., 2009 Similarly to the previous method working with clusters, the indicator values used for calculating the financial performance coefficient were originally assigned weights reflecting the significance of each year, uniformly decreasing toward the past. This variant is referred to as Hyperbola with weights. In further research the authors stopped using the weights because their application caused large distortion (see experiments in the next chapter); thus, indicators for each year were assigned the same weight. This variant is referred to as Hyperbola without weights. Based on previous experience, Šiška (Blažek et al., 2009) tried in the next stage to take into account the variability vs. stability of company results during the period analysed. He assumed that 1) a company with long-term stable values of financial indicators can probably face competition better than a company with the same average performance, which, however, varies strongly over time, and 2) all stakeholders consider a stable company, i.e. a low-risk company with balanced indicators, more attractive. Therefore, he purged risk from the average values of both financial indicators using the standard deviation. The calculation based on the method called Risk-purged hyperbola for a six-year ROA average was as follows (and similarly for Assets Growth):","When searching for ways to refine the results obtained, the authors also considered and tested the possibility of rating business performance separately in each industry. The results of this method, called Sector-hyperbola, however, were not more accurate than those of the previous methods (see section 2.6). 
The method variants with a hyperbola meet the criteria of a cardinal variable and a small number of the final business groups based on their performance; however, the explanation of the correlation between the two indicators, which was expressed as their product in the calculation of the financial performance coefficient, appeared problematic. This method of calculation provided a substantial undesirable distortion of business analysis, favouring those companies that pursued both strategies simultaneously (e.g. an average company with ROA 5% times Assets Growth 5% = financial performance coefficient of 25) compared to businesses that focused only on one of them although they achieved high success in this one strategy (e.g. ROA 15% times Assets Growth 1% = financial performance coefficient of 15). A fundamental problem was also assessing companies with zero Assets Growth and high profitability, and vice versa, as inefficient due to multiplication of a large number by zero. For these reasons, the method was subsequently used only as a dichotomous variable that separated businesses to those that were found above the hyperbola (successful) and those that were found below the hyperbola (unsuccessful). To highlight the differences between these two categories, borderline businesses, located in close proximity to the hyperbola, were omitted for some experiments. Typically, these were 5% of companies above and 5% of companies below the hyperbola. In graph 2-2 these borderline businesses are identified as Group AB. This made it possible to overcome the aforementioned drawbacks; however, the variable could not be used as a cardinal variable. 2.5.3 Summation In further search for an appropriate methodology, we took into account the shortcomings of the previous methods. 
The output was still expected to be a continuous variable or a discrete ordinal variable with fine resolution, which could subsequently be divided into a small number of company groups based on financial performance. When working with two indicators it was again necessary to convert the values of ROA and Assets Growth to the same scale. We used the method of standardization, which preserves the information value of the resulting z-score compared to normalization. In the design of a single-digit indicator, we put emphasis on the equivalence of the growth strategy and the strategy of increasing productivity, including any possible combination of these two strategies in achieving performance. Therefore, the standardized indicators of ROA and Assets Growth are of the same weight and they can be simply added up without the risk of distortion. This method eliminates the","Castek_1_Castek 23.1.2015 7:37 Stránka 31 2.5 THE DEVELOPMENT OF PERFORMANCE MEASUREMENT METHODOLOGY 31 aforesaid problems associated with the hyperbola method. The resulting financial performance coefficient is a one-dimensional variable, which can be graphically represented with a curve. In the following text, this method is referred to as Summation. A variant of the summation method is the Exponent method, which we considered and in which we also tried to highlight the differences between companies using a power relationship between the variables: where a>1. Variable a that influences the degree of variance of the financial performance coefficient was arbitrarily set as a=2 in further experiments. 2.5.4 Quintiles Another method was designed by J. Šiška and M. Králová (2012). They operate with a graphical representation of indicators on the two main axes, similarly to the hyperbola or cluster method (Fig. 2-3). They divided the resulting space of the upper right quadrant in particular into several segments and assigned each segment points expressing a company’s performance. 
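The Summation method of section 2.5.3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each company is represented by one average ROA and one average Assets Growth value, the population standard deviation is used for the z-score, and the function names are hypothetical. The Exponent variant is not shown because its formula is not reproduced in the text.

```python
from statistics import mean, pstdev

def z_scores(values):
    # Standardize to zero mean and unit standard deviation
    # (population standard deviation is an assumption of this sketch).
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def summation_score(roa, ag):
    # Both standardized indicators carry the same weight and are simply added.
    return [zr + zg for zr, zg in zip(z_scores(roa), z_scores(ag))]

# Illustrative values only (percent), one entry per company
roa = [5.0, 15.0, 8.0, 2.0]
ag = [5.0, 1.0, -2.0, 3.0]
scores = summation_score(roa, ag)
best = scores.index(max(scores))
```

Unlike the product used in the hyperbola method, the sum does not penalize a company that excels in one strategy while staying near zero in the other.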
Figure 2-3: Quintiles method Source: Pokorná, Částek, 2013, modified according to J. Šiška and Králová, 2012","We adjusted the proposal to our needs by changing the boundaries between the segments. The lowest score was assigned to companies in the bottom 5% of the sample for both indicators; the highest score was assigned to companies in the top 5% for both indicators. We divided the remaining space between these two segments by quintiles. This method satisfies the ordinal variable requirement, although its significant disadvantage is the subjective determination of the boundaries between the individual segments. Due to the large number of categories we can consider the variable quasi-interval. 2.6 Assessing the appropriateness of methods to measure financial performance As seen in the previous section, there are several options for dealing with the selected indicators so that the result is a one-dimensional variable. Unfortunately, from a theoretical point of view it is not possible to justify one of these options as better than the others. Therefore, we decided to use experiments based on the following assumptions: 1. there is a statistically measurable correlation between the analysed factors of competitiveness and financial performance, 2. if the factors and the method used to measure the correlation (association, tightness, determination) remain unchanged, then the value of the correlation (association, tightness, determination) coefficient obtained for different representations of financial performance indicates how appropriate the selected expression of financial performance is. Should the aforementioned be true, it is necessary to choose a suitable method of measuring correlations. It is important that such a method in particular: 1. is not affected by the shape of the relationship (e.g. 
by linearity as correlation), because this shape can change if the expression of financial performance is changed, 2. does not assume an a priori model (e.g. structural modelling), which would need to be changed whenever the expression of financial performance changes, 3. is able to measure the impact of several independent variables with possible interdependencies. For the above reasons, it is possible to exclude the majority of the commonly used methods from application: correlation, linear regression, structural modelling, decision trees, and partially also methods of multiple regression and multiple correlation. On the contrary, these shortcomings are not present in the selected algorithm of sequential floating forward selection (SFFS) in combination with the k-Nearest Neighbours classification method, presented in the next chapters. It should be pointed out that the SFFS algorithm reduces the set of so-called features (symptoms), in our case the variables describing businesses, so as to preserve the most","informative features, i.e. the variables contributing most to separating companies into classes. We understand classes as categories of financial performance. This process therefore requires that the variable analysed is categorical, which was convenient in our situation: all types of financial performance evaluation proposed in the previous section could be converted to a categorical variable, whereas variables with a lower level of measurement could not be converted, for example, to a cardinal variable if required. For methods that produced cardinal variables, we separated the companies into two classes, i.e. above average and below average, while omitting about 10% of borderline businesses in order to distinguish the more competitive ones from the less competitive ones more clearly. 
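The conversion of a cardinal performance score into two classes with the borderline companies left out can be sketched as follows. This is a hedged illustration, not the procedure actually used: the split at the median rank (rather than at the arithmetic mean), the symmetric band around it, and the function name and parameter are assumptions.

```python
def split_two_classes(scores, omit_fraction=0.10):
    # scores: dict name -> cardinal performance score (hypothetical layout)
    ranked = sorted(scores, key=scores.get)       # worst to best
    n = len(ranked)
    half_band = round(n * omit_fraction / 2)      # about 5% on each side
    mid = n // 2
    labels = {name: 'below' for name in ranked[:mid - half_band]}
    labels.update({name: 'above' for name in ranked[mid + half_band:]})
    return labels                                 # borderline companies are absent

# Illustrative data: 20 companies with scores 0.0 .. 19.0
scores = {'c%02d' % i: float(i) for i in range(20)}
labels = split_two_classes(scores)
```

With 20 companies and a 10% band, the two companies closest to the split are dropped and 18 remain, nine in each class.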
2.6.1 Experiment settings To evaluate the informativeness of the tested variable sets, we used the k-Nearest Neighbours classification method, which assigns samples (companies) to individual classes (categories of importance) according to their k nearest neighbours. The number of nearest neighbours k is therefore an adjustable parameter of the experiments. Since it is not possible to decide clearly what value of k should be used, the experiments were carried out separately with k set to 1, 3 and 5. Even values are not usually used in order to avoid ties (Blažek et al., 2008, pp. 48–49). On the other hand, the number of companies in the individual categories of importance is too low for a value greater than 5. After each experiment, the authors chose as the most suitable the value of k that yielded the most informative variable set. The same procedure was used in the search for competitiveness factors (Blažek et al., 2008, p. 89). In the following text, the individual values of k will be referred to as methods 1NN, 3NN and 5NN. Each company was described with 37 variables representing potential competitiveness factors, referred to as “features” (here also “symptoms”) in the terminology of statistical pattern recognition. More information about working with these variables before the application of statistical methods and about the SFFS algorithm can be found in the article on the benefits of this algorithm in the search for competitiveness factors (Špalek, Částek, 2010). 2.6.2 Experiment output The table below shows the descriptive statistics of informativeness values obtained using different methods of analysing financial performance. The methods are ranked on the basis of the average informativeness values achieved. It is clear that the performances produced by the first three methods differ within three per thousand, while the fourth method lags behind the best one by almost two percentage points. 
Hence, the most accurate method seems to be summation, followed by hyperbola without weights and the exponent method closely trailing.","Figure 2-4: Descriptive statistics of informativeness values produced by different methods
Fin. perf. measurement method | Minimum | Maximum | Mean | Std. Deviation
Summation | 0.739 | 0.782 | 0.765 | 0.022
Hyperbola without weights | 0.747 | 0.787 | 0.762 | 0.022
Exponent | 0.751 | 0.773 | 0.761 | 0.011
Quintiles | 0.745 | 0.751 | 0.749 | 0.003
Hyperbola with weights | 0.730 | 0.755 | 0.744 | 0.012
Sector hyperbola | 0.709 | 0.725 | 0.717 | 0.008
Risk-purged hyperbola | 0.701 | 0.715 | 0.706 | 0.008
However, besides informativeness, we can use other criteria to assess the efficiency of methods. One of them can be the rate of fluctuation for selected variables. In the next table, we can see how the SFFS algorithm output can be summarized. It lists the values of informativeness, the number of variables in the set of variables with the highest informativeness, and the codes of the variables contained in this most informative set for classifiers 1NN, 3NN and 5NN.
Figure 2-5: Summary of the SFFS algorithm output for the exponent method (properties of the most informative variable subset)
Classifier | Informativeness | Number of variables | Variables
1NN | 0.750685 | 9 | 3 7 9 14 18 31 32 33 34
3NN | 0.772603 | 13 | 4 5 12 14 19 24 25 26 27 31 34 35 36
5NN | 0.758904 | 4 | 8 9 14 17
Note: bold – variables selected once; italics – variables selected twice; underlined – variables selected always.","Ideally, not only should the value of informativeness be high, but different classifiers should also agree on the same variables. However, in the case of the exponent method we see that all three classifiers agreed on only one variable (number 14); three variables were selected twice (9, 31, 34), and seventeen variables were selected only once. 
To compare it with the other methods, we can design the following analysis of variable fluctuations in the most informative set: where number of items is the sum of the selected variables; so for the above mentioned case the number of different variables is 1 (number 14) + 3 (numbers 9, 31 and 34) + 17 (other variables) divided by the Number of items, i.e. 9 + 13 + 4, and divided by three (because every time the experiment was conducted for 1NN, 3NN and 5NN). The formula would look as follows: This means that the result equals 1.52; in fact, the result can range within the interval from zero to two, with zero being the lowest fluctuation (every kNN selects the same variables) and 2 meaning the maximum fluctuation (each kNN selects completely different variables). Fluctuation in this case is actually the equivalent of generalizability: if it is low, we can expect the same result with higher probability in multiple repetitions with a random variable. The table below shows the average data on the most informative variable sets.
Figure 2-6: Fluctuation analysis (average values)
Fin. perf. measurement method | Informativeness | Fluctuation | Number of variables in the most informative subset
Summation | 0.765 | 1.25 | 4.00
Hyperbola without weights | 0.762 | 1.54 | 8.67
Exponent | 0.761 | 1.52 | 8.67
Quintiles | 0.749 | 1.18 | 16.00
Hyperbola with weights | 0.744 | 1.02 | 15.33
Sector hyperbola | 0.717 | 1.36 | 4.67
Risk-purged hyperbola | 0.706 | 0.80 | 3.33","Let us assume, as in the case of informativeness, that if the experiment setting is the same and differs only in the method used to divide the companies, lower fluctuation of selected variables means that the applied method divides the businesses better. Then we can better distinguish among the three best methods assessed for analysing financial performance, whose informativeness values are very similar. 
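Since the fluctuation formula itself is not reproduced in the text, the following sketch uses one normalisation that is consistent with the verbal description and with the stated range from zero to two: the number of different variables divided by the average subset size, minus one. This is an assumption, not the authors' confirmed formula, and the function name `fluctuation` is hypothetical. It reproduces the tabulated 1.25 for summation (9 different variables out of 12 items) and 1.54 for the hyperbola without weights (22 out of 26), but for the exponent subsets as printed in Figure 2-5 it gives about 1.42 rather than the tabulated 1.52, so the original computation may differ in detail.

```python
def fluctuation(subsets):
    '''Fluctuation of selected variables across classifiers.

    Assumed normalisation: (number of different variables divided by the
    average subset size) minus 1. The value is 0 when all subsets coincide
    and len(subsets) - 1 when the subsets are pairwise disjoint.
    '''
    distinct = len(set().union(*subsets))
    items = sum(len(s) for s in subsets)
    return distinct / (items / len(subsets)) - 1

# Variable codes of the exponent method as listed in Figure 2-5.
exponent = [
    {3, 7, 9, 14, 18, 31, 32, 33, 34},                      # 1NN
    {4, 5, 12, 14, 19, 24, 25, 26, 27, 31, 34, 35, 36},     # 3NN
    {8, 9, 14, 17},                                         # 5NN
]
exponent_fluctuation = fluctuation(exponent)
```

With three classifiers the upper bound is 2, which matches the interval given in the text.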
The method of summation, which had the highest informativeness value, is foremost among these three methods even from the perspective of the fluctuation criterion. The difference against the hyperbola without weights and the exponent method is now much stronger than it was for informativeness. The choice of the summation method can be further confirmed by the average number of selected variables in the most informative set, which is less than a half for the summation method compared to the hyperbola without weights and exponent methods. For summation, the kNN classifiers identified a set of six variables one time and a set of three variables two times (i.e. 9 different variables) as most informative, while for the hyperbola without weights it was six, sixteen and four variables (i.e. 22 different variables), and for the exponent method it was nine, thirteen and four variables (i.e. 21 different variables). In the latter two cases, more than half of the input variables were selected as most informative. Given the results of the experiments and other advantages of the summation method (e.g. problem-free work with negative values compared to methods based on a hyperbola), the summation method was selected for further utilization. 2.7 Description of the methodology used to measure performance To analyse the performance, the ROA and Assets Growth indicators were used for a 7-year period from 2004 to 2010. Both the indicators were calculated from data obtained from the Albertina CZ database, which contains data from the financial statements of companies registered in the Czech Republic. Yet, there were missing data in the database that had to be found in other publicly available data sources (such as the Register of Companies, digitized volumes of the Commercial Bulletin, or the Albertina database). Even after this manual completion of the database, some data were still missing. 
As for the ROA indicator, we accepted a maximum of two missing figures from the seven-year time series; because ROA values develop in a volatile way over time, the missing figures were not replaced with the average values for the company or for all companies in a given year. Instead, the indicator value was simply not calculated for the year with the missing data. As a result, the overall ROA value was averaged over the number of years for which it was possible to calculate the indicator (i.e. over a minimum of 5 years in the case of 2 missing figures). As for the Assets Growth indicator, we evaluated those companies whose Assets Growth indicator could be calculated for at least 5 years from the","analysed period of 2004–2010. In practice, however, this criterion did not change the number of analysed businesses: it was possible to calculate Assets Growth for all businesses for which it was possible to calculate the ROA. The analysis of financial performance was conducted for 411 companies out of 432; in the case of the remaining 21 companies, too many figures were missing. Some companies from the basic sample showed an increase in Assets Growth by as much as tens of thousands of per cent for the observed seven-year period. This increase, however, cannot be considered a long-term success of a company in its growth strategy, since according to the data from the Register of Companies these were mostly newly established businesses, or companies which were involved in a merger in the given year, etc. To prevent such extreme values from distorting the results of the financial analysis, the data on Assets Growth were adjusted so that values above 1000% were disregarded and replaced with an empty value. Such companies did not occur in the selected sample. 
As for the businesses from the sample that lacked (a maximum of 2) figures representing the amount of total assets at the end of the period analysed, these values were replaced with a value that equalled the value from the previous year increased by the average Assets Growth of the basic sample in a given year. Analogously, the missing values for the beginning of the period were replaced with the total assets value in the following year, reduced by the average Assets Growth in a given year. This adjustment affected 47 companies from the selected sample. In the case of the ROA indicator, the annual values above 200% and below -200% were replaced with an empty value. For the purpose of further data processing, it was necessary to use the same scale for the values of both financial indicators. Standardization was conducted using the following formula: $z_i = \frac{x_i - \mu_i}{\sigma_i}$, where i = 1, 2, ..., 7 indexes the years for ROA, the mean value $\mu_i$ in the formula is the median of the whole population calculated for each year separately from the Albertina CZ database, and the standard deviation $\sigma_i$ is the deviation of the whole population for each year separately. By standardizing each year separately, we eliminated the influence of different economic developments in individual years. The resulting value of the ROA indicator for one company is then the arithmetic average of seven standardized values. The Assets Growth indicator for 7 years represents the standardized geometric mean² for the seven-year period of 2004–2010. The geometric mean is usually used to calculate the average growth rate. 
² The seven-year average of Assets Growth is based on the values of total assets for 2003 and 2010, as the initial value of total assets for 2004 equals the final value of total assets for 2003.","The geometric average of Assets Growth: TA – Total Assets Standardization was conducted similarly to ROA; the difference was that only one value per company (geometric Assets Growth 2004–2010) was standardized. Using standardization retains the explanatory power of the z-score: the z-score equals zero for an average company. In the case of the normal distribution, the standardized value of ROA (or Assets Growth) for 68.26% of companies lies in the range from -1 to 1. Only 13.59% of companies have a standardized ROA value between 1.00 and 2.00, while companies whose standardized ROA value is above 2.00 belong to the exceptionally performing businesses; only about 2.28% of companies are equally good or better. Converting the values of both indicators to a single scale using standardization makes it possible to compare the extent of implementing the growth strategy and the strategy of increasing productivity, which are regarded as equivalent. Using the above-mentioned method of summation, i.e. a simple summing up of both the standardized indicators, we obtained the desired one-dimensional coefficient of financial performance, which indicates the performance of a company regardless of the strategy chosen. This methodology marks companies whose financial performance coefficient is greater than 0 as above average; this means that the higher the coefficient is, the more efficient a company is (and vice versa). 
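The standardization and summation steps can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the geometric growth is assumed to be the n-th root of the ratio of final to initial total assets minus one (the original formula is not reproduced in the text), the yearly centre and spread values are taken as given inputs, and all function names and numbers are hypothetical.

```python
from statistics import mean

def z_score(value, centre, spread):
    '''Standardize one value against the population centre and spread.'''
    return (value - centre) / spread

def standardized_roa(roa_by_year, centre_by_year, spread_by_year):
    '''Average of yearly standardized ROA values; missing years (None) are skipped.'''
    zs = [z_score(v, centre_by_year[y], spread_by_year[y])
          for y, v in roa_by_year.items() if v is not None]
    return mean(zs)

def geometric_growth(ta_start, ta_end, years=7):
    '''Average yearly growth of total assets (assumed form: root of the ratio, minus 1).'''
    return (ta_end / ta_start) ** (1 / years) - 1

def performance(roa_z, growth_z):
    '''Summation method: the sum of both standardized indicators.'''
    return roa_z + growth_z

# Hypothetical company with one missing ROA figure.
roa = {2004: 0.10, 2005: None, 2006: 0.06}
centre = {2004: 0.04, 2005: 0.05, 2006: 0.04}
spread = {2004: 0.03, 2005: 0.03, 2006: 0.02}
roa_z = standardized_roa(roa, centre, spread)

# Total assets grow from 1.0 to 128.0 over seven years, i.e. they double yearly.
growth = geometric_growth(1.0, 128.0)
growth_z = z_score(growth, 0.5, 0.25)   # hypothetical population values
coefficient = performance(roa_z, growth_z)
```

Because the z-score is invariant to a constant shift, subtracting one in `geometric_growth` does not change the standardized value; it only makes the raw figure readable as a growth rate.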
Businesses that notably increase in size (Assets Growth > 0) while showing average profitability (z-score for ROA = 0) will be marked as above-average as well; this is also true for companies whose productivity is substantially increasing (ROA > 0) while showing an average growth (z-score for Assets Growth = 0), or which are above-average in both strategies at the same time (z-score for both indicators > 0). A company that is becoming smaller in the long term may not yet be included among inefficient businesses if the standardized rate of its profitability is higher than the standardized rate of its (negative) growth (i.e. a situation where the absolute value of the negative z-score for Assets Growth is smaller than the absolute value of the positive z-score for ROA).","3 Feature Selection Methods in Statistical Pattern Recognition 3.1 Introduction With the ever-increasing specialisation and diversification of scientific disciplines, it is common that similar problems are being tackled in other branches of science, usually without an awareness of the respective research and application communities. This is precisely the case here: methods developed in the relatively distant field of statistical pattern recognition (SPR) can be used to solve the earlier defined problem of identifying factors of corporate competitiveness, a problem from the field of business and management. Statistical pattern recognition is a discipline comprising analytical and adaptive methods for processing large datasets, selecting useful information with the aim of reducing the data dimensionality, and finally classifying these data. Pattern recognition is closely connected with machine learning and is therefore considered to belong to the field of artificial intelligence. One of the fundamental problems of statistical pattern recognition is representing patterns in a reduced number of dimensions, which means a dimensionality reduction. 
The methods of feature selection are used in statistical pattern recognition to solve this task. As the methods of statistical pattern recognition represent the essential part of our methodology used for solving the research problems specified in the project, we consider it important to devote a separate chapter to their fundamentals. A broad class of decision-making problems can be solved by the learning approach. This can be a feasible alternative when neither an analytical solution exists nor a mathematical model can be constructed. In these cases the required knowledge can be gained from past data, which form the so-called “learning” or “training set”. Then, the formal apparatus of statistical pattern recognition can be used to learn the decision-making. The first and essential step of statistical pattern recognition is to solve the problem of variable (feature) selection or, more generally, of dimensionality reduction, which can be accomplished by either a linear or a nonlinear mapping from the measurement space to a lower dimensional feature space.","We will examine some of the most popular tools under various settings to point out several pitfalls often omitted in current literature. In the following we will prefer the term feature selection to variable selection in accordance with statistical pattern recognition conventions. 3.1.1 Common Research Issues in Machine Learning and Management Though managers, economists and researchers or practitioners from other fields have different priorities in research issues, issues common to both fields exist. One such issue is the problem of selecting only that information which is necessary (and if possible also sufficient) for decision-making. We strongly believe that statistical pattern recognition is the discipline capable of providing a common methodology. 
A typical problem which managers often encounter is the problem of too many potential inputs into their respective decision-making problems. This phenomenon has been extensively studied in mathematics and in artificial intelligence. The “curse of dimensionality” problem, as coined by the famous American mathematician Richard Bellman, can perhaps be found in all the fields of science and application areas including economics and management. Without going into the details of this phenomenon, it can be stated that in order to make reliable decisions (or more exactly, to learn to make them based on past experience and the available data) the need for the amount of data dramatically grows with the number of inputs. Mathematically it means that the sample size required grows exponentially with the data dimensionality. This problem is very relevant particularly to the field of management and economics, as the process of managerial or economic data acquisition is usually both time- consuming and costly. Consequently, the data sets acquired are usually too small with respect to their dimensionality. Though managers consider their respective problems to be of a somewhat different nature (perhaps understandably from their professional point of view), from the point of view of mathematics, the same problem exists formally. It is just a question of different terminology and abstraction needed to find a unified look at the problem. Let us just give an example what we mean by considering which sets of inputs managers use for their decision-making: economic indices, financial data, time series, prediction estimates, etc. For a mathematician, however, these sets can be looked upon as a set of variables, forming the input vector (in pattern recognition which deals with this problem the term feature vector is used). In the majority of practical cases, the dimensionality (the number of inputs) of the original input space can be rather high. 
It is just a natural consequence of the well-known fact that in the design phase of any system for supporting decision-making, it is extremely difficult or practically impossible to evaluate directly the “usefulness” of particular input variables. Managers certainly face this problem many times when they have to make a decision. For instance, a manager could have a large number of economic variables","to potentially consider for performing a multiple regression analysis (too many potential regressors). Yet, owing to the often very complex relations and dependencies (sometimes rather strong) among all the respective inputs, there are economic variables (let us speak generally just about variables) which can be left out from decision-making without a great loss of information content. The theory of information, a special mathematical discipline, defines this as the existence of “redundancy” in the set of variables. However, even if managers were aware of this phenomenon, the task of finding the redundant variables for complex problems with many potential inputs would be beyond human capabilities. The reasons for trying to reduce the set of our inputs into the decision-making process by eliminating redundancies have both practical and theoretical foundations. The practical ones perhaps do not need to be discussed – the reduced cost of the data acquisition is sufficient to substantiate the reduction. On the theoretical front we should like to mention a famous theorem by another American mathematician, S. Watanabe. In founding the mathematical theory of cognition, he formulated “the ugly duckling theorem”, whose name paraphrases the world-famous fairy tale by the Danish author Hans Christian Andersen. 
Roughly stated, it says that no cognition is possible unless our perceptions (input variables) are weighted and, consequently, many of them are given a null weight and thus eliminated from the cognition process. Each of the fields considered (managerial and clinical decision-making) has its own specificity, and accordingly, different ways of treating the problem. On the other hand, with the ever-increasing specialization and diversification of scientific disciplines, it is not uncommon that similar problems are being tackled in other branches of science, usually without the awareness of the respective research and application communities. Yet the results and methods from one scientific discipline can be applied not only to solve problems in another quite different discipline, but they can also often enrich its methodology. It is our belief that novel methods developed recently in the field of statistical pattern recognition to solve the problem of feature selection can enrich the methodology of selecting the most useful information for decision-making problems in management. Conversely, solving these problems can enrich the methodology of statistical pattern recognition and feature selection. 3.2 Dimensionality Reduction We shall use the term “pattern” to denote the D-dimensional data vector $x \in X \subseteq \mathbb{R}^D$ of measurements, the components of which are the measurements of the features of the entity or object. We also refer to x as the feature vector. Let $Y = \{f_1, \dots, f_{|Y|}\}$ be the set of $D = |Y|$ features, where $|\cdot|$ denotes the size (cardinality). The features are the variables specified by the investigator. Following the statistical approach to pattern recognition, we assume that a pattern x is to be classified into one of a finite set of C different classes","$\Omega = \{\omega_1, \omega_2, \dots, \omega_C\}$. 
A pattern x belonging to class $\omega_i$ is viewed as an observation of a random vector X drawn randomly according to the known class-conditional probability density function $p(x|\omega_i)$ and the respective a priori probability $P(\omega_i)$. One of the fundamental problems in statistical pattern recognition is representing patterns in a reduced number of dimensions. In most practical cases, the pattern descriptor space dimensionality is rather high. This follows from the fact that it is too difficult or impossible to evaluate directly the “usefulness” of a particular input in the design phase. Thus, it is important to initially include all the “reasonable” descriptors the designer can think of and reduce the set later on. Obviously, information missing in the original measurement set cannot be substituted later. Dimensionality reduction (DR in the following) is an important step in data pre-processing in pattern recognition and machine learning applications. In general, it can be shown that such tasks as classification or approximation of the data represented by so-called “feature vectors” can often be carried out more accurately in the reduced space than in the original space, as demonstrated by the so-called “Peaking Phenomenon” (see Fig. 3-7). The aim of dimensionality reduction is to find a set of new d features based on the input set of D features (if possible d<, and defining the distance between nominal values as 0 in case of equality and 1 otherwise. As a search strategy, we chose a) Sequential Forward Floating Search (SFFS) (cf. Pudil et al., 1994), generally known for its good optimization performance but also","good search speed, and b) Dependency-Aware Feature Ranking (DAF) (cf. Somol et al., 2011) known to be very robust against over-fitting. 
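The floating search idea behind SFFS can be sketched as follows. This is a simplified illustration written after the general description in Pudil et al. (1994), not the authors' implementation: the function name `sffs`, the toy criterion table and the stopping rule are assumptions. After every forward step, features are conditionally removed as long as the removal yields a better subset of the smaller size than any found so far.

```python
def sffs(n_features, criterion, target_size):
    '''Sequential forward floating selection (simplified sketch).

    criterion(subset) returns a score to maximize; subset is a frozenset
    of feature indices. Returns a dict mapping each visited subset size
    to (score, subset) for the best subset of that size found.
    '''
    selected = frozenset()
    best = {}
    while len(selected) < target_size:
        # Forward step: add the feature that gives the highest score.
        candidates = [selected | {f} for f in range(n_features) if f not in selected]
        selected = max(candidates, key=criterion)
        score = criterion(selected)
        if len(selected) not in best or score > best[len(selected)][0]:
            best[len(selected)] = (score, selected)
        # Floating steps: drop a feature while this beats the best subset
        # of the smaller size found so far.
        while len(selected) > 2:
            reduced = max((selected - {f} for f in selected), key=criterion)
            score = criterion(reduced)
            if score <= best[len(reduced)][0]:
                break
            selected = reduced
            best[len(reduced)] = (score, reduced)
    return best

# Toy criterion: feature 0 is the best single feature, but the best pair is {1, 2}.
toy_scores = {
    frozenset({0}): 0.60, frozenset({1}): 0.50, frozenset({2}): 0.50,
    frozenset({0, 1}): 0.65, frozenset({0, 2}): 0.65, frozenset({1, 2}): 0.90,
    frozenset({0, 1, 2}): 0.92,
}
result = sffs(3, toy_scores.__getitem__, 3)
```

In the toy example a purely forward search stays with a pair containing feature 0, whereas the backward step taken after reaching three features recovers the better pair {1, 2}. In practice the criterion would be an estimated classifier accuracy, such as that of k-NN.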
To compare the stability of results yielded by the two chosen FS methods, we repeated each experiment 20 times for each of the methods, each time using a different randomly chosen 80% part of the available data. In each run, a subset of features was selected, possibly different from those of the other runs. The similarity of the various results is evaluated using various stability measures. Here we evaluate ANHI, ATI and CWr (for an explanation and details cf. Somol and Novovičová, 2010) in order to get a broader picture (no single measure is able to capture all aspects of the result observed). The goal is to reveal possible principal differences between the two FS methods employed with respect to solving our particular FS problem. The graphs in Figure 4-17 summarize the findings. It can be clearly seen that the stability of SFFS is considerably worse than that of DAF, regardless of the size of the subset found and the k parameter of the employed FS criterion. In particular, the low CWr values in the case of SFFS suggest that SFFS tends to over-fit too much. Figure 4-17: Comparing SFFS and DAF stability and performance when selecting 5 or 15 features out of 37, with k-NN for k=1, 2, 3. Diamonds show k-NN accuracy achieved; other lines show stability Based on the aforementioned observation, we decided to select features using the DAF method. The goal of this second experiment is to identify which features are better than the others, possibly yielding information about how much better they are. The criterion is again the estimated accuracy of the k-NN classifier, now for k=5, as higher k values proved better to further prevent over-fitting (cf. 
Theodoridis, Koutroumbas, 2006).","4.3 Introducing the modified feature selection methodology In order to perform a feature selection process (analysis of the dependency between particular competitiveness factors and overall CFP over the available training data) using the chosen feature selection method in our particular case, we first need to impose a suitable model on the data that would describe the underlying data structure as accurately as possible. A good model can then be used to evaluate the quality of candidate subsets of characteristics in the process of feature selection. In the following, we consider two principally different approaches – classification and regression. The difference is in the measure of accuracy to be optimized. In the case of classification, we aim to identify such a subset of competitiveness factors, for which the model proves most accurate in distinguishing among the company groups A, B, and C (cf. Section 4.4). In the case of regression, we avoid the grouping and aim to minimize the prediction error of a dependent variable expressing the overall CFP. Non-Parametric Model From the vast battery of existing models (cf. Theodoridis and Koutroumbas, 2006) our choices are limited due to two specifics of our problem: the data is incomplete (roughly 5% of values are missing for various companies and various characteristics) and some of the available characteristics are non-numeric, i.e., their values are impossible to order. Another concern is the sample-size vs. dimensionality ratio, which in this case prevents application of models requiring large numbers of samples (mixture models or other multi-dimensional models, cf. Theodoridis and Koutroumbas, 2006, where the curse of dimensionality would quickly lead to over-training). 
As reported in (Pudil et al., 2012), a suitable approach is the application of the k-Nearest Neighbour idea (Cover, Hart, 1967). k-NN is a non-parametric classifier that imposes a non-linear model making direct use of existing samples in the training data set. Its known disadvantage – the necessity to permanently store the complete data set – is not a disadvantage in our case due to the limited data size and off-line nature of our search process. On the other hand its advantage – a good model fit and accuracy in the case of a limited data size – makes it a good choice for our purpose. The only additional tool needed is a suitably defined distance function capable of expressing the distance between any two samples (companies) in the 36-dimensional space of our data set. The commonly used distance is the Euclidean distance. A modification of the standard Euclidean distance is the basis of how we handle missing values and non-numeric values. Handling Missing Values and Non-Numeric Values The accuracy of the model is affected not only by its fundamental principle but also by its possible parameters or other setup details. In our case, the handling of missing","values and non-numeric values proves important. Both of these concerns are reflected in the definition of the distance function used within the applied models. We considered two approaches to missing value handling, as expressed by the definition of pseudo-Euclidean distance:
– “standard” one-time substitution of each unknown value by the mean value over all known values for the respective feature in the case of numeric features, or substitution by the most frequent value in the case of non-numeric values
– “pessimistic” handling of unknown values when computing distance; missing values are always interpreted as if the distance was maximum with respect to the respective feature. 
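The two ways of handling missing values listed above can be sketched in a single distance function. This is a minimal illustration, not the authors' definition: it assumes that numeric values are scaled to the interval [0, 1], that non-numeric values contribute 0 when equal and 1 otherwise, and that missing values are marked with `None`; the names `feature_distance`, `pseudo_euclidean` and `fills` are hypothetical.

```python
import math

def feature_distance(a, b, numeric, pessimistic=True, fill=None):
    '''Distance contribution of one feature, in [0, 1]; None marks a missing value.'''
    if a is None or b is None:
        if pessimistic:
            return 1.0                      # treat as the maximum distance
        a = fill if a is None else a        # 'standard' substitution
        b = fill if b is None else b
    if numeric:
        return abs(a - b)                   # numeric values assumed scaled to [0, 1]
    return 0.0 if a == b else 1.0           # non-numeric: equal or not

def pseudo_euclidean(x, y, numeric_mask, fills, pessimistic=True):
    '''Euclidean-style distance over mixed features with missing values.

    fills holds, per feature, the mean (numeric) or the most frequent
    value (non-numeric) used by the 'standard' substitution.
    '''
    total = 0.0
    for a, b, is_num, fill in zip(x, y, numeric_mask, fills):
        d = feature_distance(a, b, is_num, pessimistic, fill)
        total += d * d
    return math.sqrt(total)

# Two hypothetical companies: numeric, non-numeric and numeric feature.
x = [0.2, 'A', None]
y = [0.5, 'B', 0.4]
mask = [True, False, True]
fills = [0.5, 'A', 0.6]
d_pessimistic = pseudo_euclidean(x, y, mask, fills, pessimistic=True)
d_standard = pseudo_euclidean(x, y, mask, fills, pessimistic=False)
```

The pessimistic variant never produces a smaller distance than the substitution variant, because a missing value always contributes the maximum of 1.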
4.4 Pattern classification approach In the classification approach, the DAF feature selection method (cf. Chapter 2) is applied to maximize the estimated prediction accuracy of the k-NN classifier. In the course of the search, the DAF method generates candidate feature subsets, evaluates each of them and collects the results to obtain a final feature ranking. Each candidate subset is evaluated as follows. The k-NN classifier is used to classify each single known company to one of the groups A, B, C purely based on the characteristics in the current subset. The result of classification is in each case compared to the known assignment of the respective company to one of the classes. Eventually, the percentage of companies classified correctly (predicted to belong to the same group where they actually belong) is used as the measure of subset quality. 4.5 Regression approach and pseudo-kernel regression model Considering the CFP as the dependent variable, we tried to predict this dependent value by means of a special non-linear regression model with other company characteristics as independent variables. The error of the regression model with the given independent variables (company characteristics) as regressors, i.e. the prediction error, serves in our approach as a criterion of how good these independent variables are for approximating the real CFP of companies in the data set. The subset of characteristics (features in pattern classification terminology, regressors in mathematical statistics terminology) that provides the lowest regression error can then be regarded as the set of key factors of corporate competitiveness. The only additional tool needed is a suitably defined distance function capable of expressing the distance between any two samples (companies) in the 37-dimensional (or 74-dimensional) space of our data set. 
The commonly used distance","is the Euclidean distance. A modification of the standard Euclidean distance is the basis of our handling of missing values and non-numeric values. Applying regression instead of classification, as used in Pudil et al. (2012), has the potential benefit of preventing the impact of possible inaccuracies introduced when pre-processing data (at the moment of initial assignment of companies to classes). The regression model we have used is based on the “Kernel” regressor, where the dependent value is predicted as a linear combination over dependent values of existing samples, where more distant samples get a lower weight. “Kernel” in this context is the weighting function dependent on the distance between the point in question and existing samples (companies). More exactly, the proposed pseudo-kernel regression model is analogous to Parzen kernel models with Gaussian kernels, with the main difference being that we assume only one-dimensional kernels. The unknown dependent value $y_k$ is predicted as the weighted average of dependent variables of known companies. The one-dimensional Gaussian kernel in our case serves as the weight in the definition of distance between two companies, $w(x_i, x_j)$. The Gaussian kernel thus helps to progressively reduce the influence of more distant companies and emphasizes the importance of close companies when predicting the dependent variable $y_k$ for the k-th company. The pseudo-kernel regression model predicts the unknown competitiveness value $y_k$ from neighbouring competitiveness values as follows: The weighted distance between two companies is computed as follows: where $L_p$ denotes the $L_p$-distance defined as: $L_p(x_i, x_j) = \left( \sum_{d=1}^{D} |x_{id} - x_{jd}|^p \right)^{1/p}$ Unless stated otherwise, we assume p=2, i.e., we primarily use the Euclidean distance. 
Note that σ is estimated from the training set as the standard deviation of pairwise distances of all known companies. The parameter m is to be user-specified, or optimized in the pre-processing stage. Note that a higher value of m leads to a more smoothed-out model; a too high m would lead to a collapse of the model, which would then always predict the same average value of competitiveness. A lower value of m leads to a more detailed model that generalizes less. A too low m would collapse the","model to an analogy of 1-Nearest Neighbour. Let us mention that both the m and p parameters can be optimized to minimize the model prediction error (see Section 4.5). Note that the above formulation of the pseudo-kernel regression model is still based on the assumption of companies being represented by numerical vectors without missing values. The resolution of this problem consists in redefining the $L_p$ distance. The problem of categorical variables can be diminished by assuming all numerical values to be normalized to [0,1]. In the case of categorical variables, the contribution of a variable to the overall distance would then simply be 0 in case of equality or 1 in case of inequality of the variable’s values. The problem of missing values can then be resolved by restricting the computation of $L_p$ to only such a subspace on which all values are known (provided the result is properly weighted). The other option is to interpret the missing value in one of two ways: either to indicate pessimistically that the distance is 1 (maximal), or to use the average distance computed over all cases where the variable is not missing. To explain it less mathematically – we gradually explore increasing subsets of features (companies’ characteristics) and use them as regressors for a special regression model. 
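The role of σ and m can be illustrated with a small sketch of a Gaussian-weighted average over pairwise distances. The exact weight formula of the pseudo-kernel model is not reproduced in the text, so the form used here, a Gaussian of width m·σ applied to the distance, is an assumption, as are the function name `kernel_predict` and the sample data. The sketch assumes that the pairwise distances are not all equal (σ > 0) and that at least one weight is non-zero.

```python
import math
from statistics import pstdev

def kernel_predict(k, dist, y, m=1.0):
    '''Predict y[k] as a Gaussian-weighted average of the other companies.

    dist: symmetric matrix of pairwise distances between companies.
    m: user-specified factor scaling the kernel width (assumed form: m * sigma).
    '''
    n = len(y)
    pairs = [dist[i][j] for i in range(n) for j in range(i + 1, n)]
    sigma = pstdev(pairs)                 # spread of all pairwise distances
    width = m * sigma
    # The company being predicted gets zero weight (leave-one-out).
    weights = [0.0 if i == k else math.exp(-dist[k][i] ** 2 / (2 * width ** 2))
               for i in range(n)]
    return sum(w * yi for w, yi in zip(weights, y)) / sum(weights)

# Four hypothetical companies placed on a line, with their CFP values.
positions = [0.0, 1.0, 2.0, 10.0]
y = [1.0, 2.0, 3.0, 10.0]
dist = [[abs(a - b) for b in positions] for a in positions]

detailed = kernel_predict(1, dist, y, m=0.05)   # only the nearest companies count
smoothed = kernel_predict(1, dist, y, m=1000)   # close to the plain average
```

With a small m the prediction for the second company is driven by its two equally distant nearest neighbours, while a very large m yields nearly the plain average of all other companies, which mirrors the two collapse cases described above.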
For a given subset of features, CFP for each company (serving as the dependent variable) is predicted by means of our regression model (see Figure 4-18). Now, let us consider a particular company. When trying to predict its CFP, we calculate it as a weighted sum of all the CFP’s of respective companies. The weight reflects the distance of the point representing our particular company from the point representing the respective company. To summarize the explanation presented above, we consider two regression models following an analogous idea as discussed in Sect. 4.3: 1. 1-NN, where the dependent value for any point in the 36-dimensional space is predicted to be equal to that of such an existing sample that is the nearest to the point in question. 2. “Kernel” regressor, where the dependent value is predicted as a linear combination over dependent values of existing samples, where more distant samples get a lower weight. This model enables prediction based solely on the information about the distance between points in space; no coordinate information is used. This facilitates the definition of custom distance functions to accommodate features of various, even non-numeric types. The principle of the model is illustrated in Figure 4-18. This model has been shown capable of achieving a prediction error lower than 3% in the paper by Pudil et al. (2013), despite the difficult modelled data set. The idea of “kernel” regression can be viewed as analogous to the idea of k-NN classifier (cf. Sect. 3.3.1). Similar to k-NN, our kernel regressor imposes a non- parametric non-linear model that directly uses information from the available data. When compared to the original kernel regression idea (cf. 
Nadaraya, 1964; Simonoff,","1996) we apply a slight modification. In order to accommodate the solutions to problems described in Sect. 3.3.2, we apply only 1-dimensional Gaussian kernels on the space of sample distances. Kernel width is then optimized, starting initially from the estimated average distance between any two samples in the available data. In the regression approach, the DAF feature selection method (cf. Chapter 3, Sect. 4.6) is applied to minimize the estimated error of predicted dependent values for each known company. The feature selection procedure is analogous to that described in Section 3.4. Figure 4-18: Given existing samples S_1, .., S_3 and distance function d(), the y_n value of sample S_n is predicted as the sum over w_i * y_i for i=1, .., 3, where w_i reflects the d(S_n, S_i) distance through a 1-D Gaussian kernel centered in S_i. 4.6 Experiments and results We primarily considered four different regression models in our key experiments, obtained by combining two types of regression models (1-Nearest Neighbour and Kernel Regressor) and two types of missing value substitution (substitution by mean value, and pessimistic treatment of each missing value as an indicator of maximum distance). Feature ranking by DAF (Somol et al., 2011) has been computed in each of the four cases so as to minimize the average model error. The DAF method produced a weight for each feature, mirroring roughly the average feature ability to improve the criterion value on addition to a subset, evaluated over a large number and variety of contexts (various subsets). Additionally, we performed classification-based experiments for comparison purposes. 4.6.1 Regression-based analysis results The best-achieved models of the four types as described in Section 3.5 are illustrated in Figure 4-19.","
Figure 4-19: Errors of four regression models – each dot represents a company, positioned according to its Assets Growth (GA) and Return on Assets (ROA) values, a larger dot diameter depicts a higher regression error; black and grey colours depict its positive or negative value, respectively: a) 1-Nearest Neighbour with missing values substituted by mean values, b) Kernel Regressor with missing values substituted by mean values, c) 1-Nearest Neighbour with missing values treated pessimistically, d) (best) Kernel Regressor with missing values treated pessimistically. The single best model identified in our survey – Kernel Regressor with pessimistic missing value substitution – achieved the average error of e = 0.0268903 and determination coefficient Dc = 0.781963. The fact that a 1-dimensional kernel instead of a 36-dimensional kernel is used proved advantageous as it prevented over-fitting issues. Kernel regression thus proved more accurate in our experiments than simple 1-NN regression. The highest accuracy was achieved with the set of all features. Removal of any feature led to a degradation of results. The result of feature ranking is thus of interest","to compare the importance of various company characteristics though it has not led to any decrease of dimensionality. The DAF method (cf. Somol et al., 2011) produced for each feature a weight, mirroring roughly the average feature ability to improve the criterion value on addition to a subset, evaluated over a large number and variety of contexts (various subsets). Figure 4-20 shows the weights obtained in the regression case, ordered in descending order. Note that the DAF weights obtained provide only the information about the relative quality of features when compared to other features. It can be seen that there are roughly up to 8 features that tend to improve criterion value considerably more than the others. 
Figure 4-20: Importance of single company characteristics according to the best-achieved regression model. The graph represents growing subsets of features, features added according to the highest DAF1 coefficients. Note that model accuracy markedly improves after adding the first 8 features, and then after adding roughly the next 7 features. The best 8 factors (regression based) in decreasing order: Region – DAF1 coefficient 1.11868 (regressor error on single feature not computable due to missing values); Ratio of Technical and Administrative Staff – DAF1 coefficient 1.10539, regressor error on 2 best features 0.104015; Strategy – DAF1 coefficient 0.782508, regressor error on 3 best features 0.101098; Legal Form (legal form of the company) – DAF1 coefficient 0.710418, regressor error on 4 best features 0.0983053; Ownership Type (5 types of ownership) – DAF1 coefficient 0.652801, regressor error on 5 best features 0.0920556;","FDI (domestic owner, foreign, both) – DAF1 coefficient 0.483909, regressor error on 6 best features 0.085747; Ratio of Exports – DAF1 coefficient 0.462479, regressor error on 7 best features 0.0698314; Owners in Top Management (yes/no) – DAF1 coefficient 0.408192, regressor error on 8 best features 0.0653676. Regarded as factors of competitiveness within the framework of this pilot study, 5 of these 8 features (denoted in italics) correspond with the previous results of Spalek and Castek (2010) based on different analyses of similar data. 4.6.2 Classification-based analysis results We performed a series of k-NN classifier based experiments for the values of k=1, 2, 3, 5, and two types of missing value substitution (substitution by mean value, and pessimistic treatment of each missing value as an indicator of maximum distance). 
The single best classifier identified in our survey – 1-NN classifier with pessimistic missing value substitution – achieved the estimated classification accuracy of 0.873737 for the subset of 32 features out of the complete set of 36. With the complete set, the accuracy is 0.866162. However, as illustrated in Figure 4-21, roughly half of all features are enough to achieve classification accuracy, only negligibly worse than the best achieved. k-NN classifiers for k>1 tend to smooth out the decision boundary, i.e., reduce the influence of outliers at the cost of reducing sensitivity to detail. This effect proved disadvantageous in our case. The DAF method (cf. Somol et al., 2011) produced for each feature a weight, mirroring roughly the average feature ability to improve criterion value on addition to a subset, evaluated over a large number and variety of contexts (various subsets). Figure 4-21 shows the weights in classification case, ordered in descending order; note that the DAF weights provide only information about the relative quality of features when compared to other features. It can be seen that there are roughly up to 8 features that tend to improve the criterion value considerably more than the others. The best 8 factors (classification based) in decreasing order: Ratio of Technical and Administrative Staff – DAF1 coefficient 1.1404, 1-NN accuracy on the single feature 0.472222 Legal Form (legal form of the company) – DAF1 coefficient 0.574625, 1-NN accuracy on best 2 features 0.472222 ISO 14000-certificate holding (yes/no) – DAF1 coefficient 0.42274, 1-NN accuracy on best 3 features 0.472222 Assets Level – DAF1 coefficient 0.284363, 1-NN accuracy on best 4 features 0.494949","Castek_1_Castek 23.1.2015 7:37 Stránka 79 4.6 EXPERIMENTS AND RESULTS 79 Figure 4-21: Importance of single company characteristics according to best-achieved classification using 1-NN classifier and 3-class data. 
The graph represents growing subsets of features, features added according to highest DAF1 coefficients. Note that model accuracy markedly improves after adding the first 8 features, and then after adding roughly the next 10 features. Owners in Top Management (yes/no) – DAF1 coefficient 0.242903, 1-NN accuracy on best 5 features 0.494949; Software Applications – SCM module (yes/no) – DAF1 coefficient 0.206311, 1-NN accuracy on best 6 features 0.525253; Ethical Code Adoption (yes/no) – DAF1 coefficient 0.177655, 1-NN accuracy on best 7 features 0.563131; Share of Performance-Related Pay – DAF1 coefficient 0.164487, 1-NN accuracy on best 8 features 0.739899. Regarded as factors of competitiveness within the framework of this pilot study, 4 of these 8 features (denoted in italics) correspond with the previous results of Spalek and Castek (2010) based on different analyses of similar data. 4.7 Comparing Regression-based and Classification-based analysis results When comparing the results achieved by both the approaches discussed, we should bear in mind that our goal is not prediction but identification of “informative” factors. Thus, the error in predicting the CFP or in classifying a company is not critically important; what matters more is which factors are assigned to the group of key factors influencing the CFP and thus corporate competitiveness. In this context, we should not be too surprised that the regression approach and the classification one yield somewhat different results. We should keep in mind that the tasks solved by the","respective approaches are not absolutely the same. The way of defining the CFP, implicitly hidden in these tasks, is different. In the regression approach, the CFP is defined by a combination of GA and ROA and attains continuous values. 
On the other hand, in the classification approach, the CFP is discretized by means of clustering the companies into distinct classes. Clearly, the classification approach does not utilize “fine” information about the value of CFP as the regression one does. On the other hand, as a generally accepted unique definition of CFP does not exist, clustering its values into distinct classes can also be a reasonable solution. This clustering certainly represents a simplification, representing a certain loss of information but at the same time having the potential benefit of giving more comprehensible information to the observer: the financial performance of a given company (and thus its competitiveness) is a good, average or bad one. In any case, by comparing these different approaches and analysing the results, we can get a more robust insight into the task. 4.8 Improved Model for Attribute Selection on High-Dimensional Economic Data This section presents a newly improved procedure for estimating the model parameters, allowing us to achieve a significantly higher precision in feature subspaces. In other words, it improves the results when searching for a small set of factors of competitiveness. The performance of this improved procedure is illustrated in experiments with both the old multidimensional data (D=37) used in previous experiments, and with new data of higher dimensionality (D=74), which enabled us to reveal a not previously considered fact related to the estimates of the multiplication constant of the pseudo-kernel model. 4.8.1 Improvements of the regression model Theoretical considerations verified in the course of experiments proved that the prediction accuracy of the pseudo-kernel non-linear regression model employed can be improved under certain conditions. These improvements are in possible adjustments to the distance function used and in optimizing the kernel width. They will be discussed in the following. 
Varying the distance function In the paper by Pudil et al. (2013), we assumed the standard L_2 (Euclidean) distance to be used when evaluating distances between samples. Let us recall that for k = 1, 2, … the higher-order distance L_k in D-dimensional space is defined as L_k(x, y) = (Σ_{d=1..D} |x_d − y_d|^k)^(1/k).","In Figure 4-22 we illustrate that for the specific problem of corporate competitiveness, the L_2 distance is not necessarily the best option. Figure 4-22: Comparing Euclidean distance to alternative distances. Higher-order distance functions improve the accuracy of the kernel-based regression model on subspaces roughly of up to D/2 where D is the total number of features. Figure 4-23: Comparing accuracy of a pseudo-kernel regression model with a globally optimized multiplication constant m to a model with a locally optimized m. Note that a local optimization of m (optimization on a subspace instead of on the full space) improves accuracy when a small number of features is used.","Note that higher-order distances of up to L_7 proved capable of improving the model accuracy when a small number of features is used. We could not improve the best-achieved prediction accuracy, which remains roughly the same when all 74 features are used. However, if the number of features considered is restricted to 25, the improvement in accuracy is clearly visible. The impacts of varying the distance function are presented in the following Table 4-4. 
The “DAF” columns contain the values of DAF coefficients of corresponding features where these features are ordered in the decreasing order of their DAF coefficients.
Table 4-4: Comparing feature orderings yielded by the model using Euclidean distance and L_7 distance (74-dimensional data)
DAF (Euclidean) | Feature (Euclidean) | Error on subset (Euclidean) | DAF (L_7) | Feature (L_7) | Error on subset (L_7)
0.00378051 | Span of Control | 0.0884389 | 0.00869157 | Strategy | 0.0885776
0.00377241 | Ratio of Women | 0.0924165 | 0.00868319 | Motivation of Workers | 0.0872776
0.00369803 | Ratio of Technical and Administrative Staff | 0.0897509 | 0.0085613 | Motivation of Top Management | 0.0867333
0.00363094 | Motivation of Workers | 0.073188 | 0.00740714 | FDI | 0.0856955
0.00350262 | Ratio of Workers | 0.0769846 | 0.00706415 | Ratio of Exports | 0.0783415
0.00350031 | Motivation of Top Management | 0.0730957 | 0.00692042 | Span of Control | 0.0632577
0.00330279 | Span of Control | 0.0730181 | 0.00635151 | Ratio of Women | 0.0551528
0.00326192 | Ratio of Exports | 0.0674859 | 0.00625303 | Ratio of Technical and Administrative Staff | 0.0527764
0.00324774 | Company Size (number of employees) | 0.0695187 | 0.0062338 | Ratio of Workers | 0.0539102
0.00318828 | Ratio of Graduates | 0.0668979 | 0.00591669 | Reasons for Staff Turnover | 0.0488559
0.00310982 | Strategy | 0.0563008 | 0.00566361 | Number of management levels per employee | 0.0476115
0.00269444 | Intensity of Motivation Means Use | 0.0572132 | 0.00551711 | Ratio of Graduates | 0.0471201
0.00255556 | Specificity of Supplies | 0.0521837 | 0.00540293 | Foreign Supplies | 0.0458394","
The “Error on subset” means the error of the pseudo-kernel regression model that is based on all the features in the preceding rows including the current feature. The first three columns represent the values for the Euclidean (L_2) distance while the last three columns show the results for the L_7 distance used. The higher the DAF coefficient (explained in the previous chapter) of a feature is, the more informative (thus more important) the feature is. In the regression approach, the DAF feature selection method is applied to minimize the estimated error of predicted dependent values for each known company. Kernel width is then optimized, starting initially from the estimated average distance between any two samples in the available data. Table 4-4 shows the impact of the improved model on the results of feature selection. The ordering of selected features changes notably, but generally those features identified as important using Euclidean distance (L_2) are also identified as important when using L_7 distance, with most of the difference being in relatively limited order shifts. L_7 distance outperforms L_2 in very small subsets (up to 5 features). After that, L_2 is distinctively better than L_7. This can be observed in up to roughly 25 features, after which the error on subsets becomes very similar (these results are not shown due to the paper length limit). The DAF coefficient is, on the other hand, much higher while using L_7 distance from the very beginning. Kernel width multiplication constant In the paper by Pudil et al. (2013), the default width of all kernels was set to the estimated standard deviation of the distances between all pairs of samples in the available data set. In Figure 4-18 we suggest that kernel width can be adjusted by adding the multiplication constant m. Note that for values m>1 the model gets smoother, less prone to over-fitting, but also less sensitive to detail. In contrast, values m<1 lead to a more detailed model capable of capturing more detail but at the same time with degraded generalization ability. In the above-specified paper the constant m was optimized on a full set of features and then fixed globally at the beginning of the training process. 
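The role of the multiplication constant m, and the idea of tuning it for a particular feature subset rather than once for the full feature set, can be sketched as below. The source does not state how m was optimized; the grid search over a leave-one-out error, the Gaussian weight form, and the recomputation of sigma on the subset are all assumptions of this sketch, and the data are invented.

```python
import math
from itertools import combinations


def dist(a, b, feats):
    # Euclidean distance restricted to the feature indices in feats.
    return math.sqrt(sum((a[f] - b[f]) ** 2 for f in feats))


def loo_error(xs, ys, feats, m):
    # Mean absolute leave-one-out error of a Gaussian-weighted average.
    dists = [dist(a, b, feats) for a, b in combinations(xs, 2)]
    mean = sum(dists) / len(dists)
    sigma = math.sqrt(sum((d - mean) ** 2 for d in dists) / len(dists))
    width = m * sigma
    err = 0.0
    for k, xk in enumerate(xs):
        num = 0.0
        den = 0.0
        for i, xi in enumerate(xs):
            if i == k:
                continue  # the predicted sample is left out
            w = math.exp(-dist(xk, xi, feats) ** 2 / (2.0 * width ** 2))
            num += w * ys[i]
            den += w
        # Fall back to the global mean if every weight underflows.
        pred = num / den if den > 0.0 else sum(ys) / len(ys)
        err += abs(pred - ys[k])
    return err / len(xs)


def best_m(xs, ys, feats, grid):
    # Pick the multiplier with the lowest leave-one-out error on this subset.
    return min(grid, key=lambda m: loo_error(xs, ys, feats, m))


# Hypothetical data: the target depends on feature 0 only.
xs = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.8, 0.5], [0.9, 0.2], [1.0, 0.7]]
ys = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
grid = [0.25, 0.5, 1.0, 2.0, 4.0]
m_local = best_m(xs, ys, [0], grid)    # tuned on the one-feature subspace
m_full = best_m(xs, ys, [0, 1], grid)  # tuned on the full feature set
```

Comparing the error obtained with m_local against the error obtained when m_full is reused on the subspace gives a small-scale analogue of a global versus local comparison.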
In Figure 4-23 (and also Table 4-5) we show that it is possible to improve the accuracy of the model when only a smaller subset of features is used (or searched for), if the constant is optimized specifically for the model on the considered subspace. The reason for the positive effect of m optimization on subspaces is simple; the average distance between points in space increases with increasing dimensionality. Applying a sub-space model with an m constant optimized on a full feature set leads to a model that is smoother than expected, thus reducing its sensitivity to detail. Note that marked improvement by a local optimization of m can be expected only if the difference between the total number of features and the number of features in the considered subspace is large. Only then does the inherent difference in distances come","into play. This effect is observable for subsets of roughly up to 25 features with our 74-dimensional data. The impacts of optimizing the kernel width are presented in the following Table 4-5. The “DAF” columns and the “Error on subset” have the same meaning as in Table 4-4.
Table 4-5: Comparing feature orderings yielded by a model using a globally versus locally optimized kernel width multiplier constant. Euclidean distance is used in both cases. 74-dimensional data; the globally optimized constant is m=3.06.
DAF | Feature | Error on subset containing all features up to the current (globally optimized m) | Error on subset containing all features up to the current (locally optimized m) | Locally optimized m value
0.00378051 | Span of Control | 0.0884389 | 0.0882787 | 5
0.00377241 | Ratio of Women | 0.0924165 | 0.0878285 | 5
0.00369803 | Ratio of Technical and Administrative Staff | 0.0897509 | 0.0834008 | 1.14
0.00363094 | Motivation of Workers | 0.073188 | 0.0705265 | 3.67
0.00350262 | Ratio of Workers | 0.0769846 | 0.0714226 | 4.22
0.00350031 | Motivation of Top Management | 0.0730957 | 0.0685935 | 3.98
0.00330279 | Number of management levels per employee | 0.0730181 | 0.0650337 | 4.34
0.00326192 | Ratio of Exports | 0.0674859 | 0.0608841 | 4.06
0.00324774 | Company Size (number of employees) | 0.0695187 | 0.0600489 | 4.4
0.00318828 | Ratio of Graduates | 0.0668979 | 0.0600023 | 4.37
0.00310982 | Strategy | 0.0563008 | 0.0533632 | 4.08
0.00269444 | Intensity of Motivation Means Use | 0.0572132 | 0.0532684 | 4.23
0.00255556 | Specificity of Supplies | 0.0521837 | 0.0506469 | 3.99","
The first three columns represent the values for the kernel width multiplier constant optimized globally while the last three columns show the results for the same multiplier constant optimized locally. 4.8.2 Optimized model performance on 37- and 74-dim data We applied the optimization described above to two datasets: the 37-dimensional data used in a paper by Pudil et al. (2013) and to newly acquired 74-dimensional data representing the same domain by a wider set of differently defined features. In Figure 4-24 we provide two comparisons. First, we compare model accuracy on a 37-dimensional dataset to the accuracy on a 74-dimensional dataset. 
Second, we compare the model accuracy for each of the two datasets in the default setting as described in (Pudil et al., 2013) to the model tuned both with respect to the employed distance measure and with respect to the locally adjusted multiplication constant (see Section 4.8.1). It is apparent from Figure 4-24 that the proposed improvements to the pseudo-kernel regression model did not improve the best achievable prediction error, but they did notably improve model accuracy on subspaces. With 37-dimensional data, all results for subsets of up to 30 features proved to be notably better when the improved model was used. With the 74-dimensional data, the most notable improvement took place on subspaces of up to roughly half of the total number of features. Figure 4-24: Comparing the accuracy of the default model to that of the improved model. Comparison provided separately for the 37- and 74-dimensional datasets representing the same domain. Default configuration compared to the best configuration identified in this paper.","This means the improvement is achieved when a smaller set of key factors of corporate competitiveness is to be identified. Also note in Figure 4-24 that the best achievable model setting produces practically equal accuracy if all new 74 features are used or if all old 37 features are used. This can be explained by mutual statistical dependencies of features which become redundant; thus, adding more and more features does not increase the overall information value. The new feature set, however, allows the selection of a small subset that performs better than an equally sized selected subset of the old set. This is apparent in Figure 4-24 for subsets of up to 4 features where the model on 74-dimensional data gives a lower prediction error. 
The impacts of extending the original 37-dimensional dataset into a 74-dimensional dataset are presented in the following Table 4-6, with the columns having the same meaning as in the preceding tables.
Table 4-6: Top 25 features selected using the optimized model on 74-dim and 37-dim data (L_7 distance and localized kernel width multiplication constant in both cases)
DAF (74-dim) | Feature (74-dim) | Error on subset (74-dim) | DAF (37-dim) | Feature (37-dim) | Error on subset (37-dim)
0.00869157 | Strategy | 0.0885253 | 0.0225594 | Ratio of Technical and Administrative Staff | 0.113963
0.00868319 | Motivation of Workers | 0.0849643 | 0.0157939 | Expenses on Employee’s Benefits | 0.106365
0.0085613 | Motivation of Management | 0.0837816 | 0.0141935 | Expenses on Employee’s Education | 0.0997067
0.00740714 | FDI | 0.0788026 | 0.0141366 | Motivation of Top Management | 0.0732506
0.00706415 | Ratio of Exports | 0.0749111 | 0.0134088 | Strategy | 0.0609535
0.00692042 | Span of Control | 0.0632048 | 0.0118486 | FDI | 0.0540098
0.00635151 | Ratio of Women | 0.0551528 | 0.0116693 | Ratio of Graduates | 0.0503014
0.00625303 | Ratio of Technical and Administrative Staff | 0.0527764 | 0.0114994 | Ratio of Exports | 0.0467132
0.0062338 | Ratio of Workers | 0.0539102 | 0.0100634 | Number of management levels per employee | 0.0434888","
0.00591669 | Reasons for Staff Turnover | 0.0488559 | 0.010026 | Specificity of Supplies | 0.0414148
0.00566361 | Number of management levels per employee | 0.0476115 | 0.00971683 | Specificity of Products | 0.0405086
0.00551711 | Ratio of Graduates | 0.0471201 | 0.00963302 | Company Size (number of employees) | 0.0397118
0.00540293 | Foreign Supplies | 0.0458394 | 0.00932846 | Owners in Top Management | 0.0349433
0.005298 | Specificity of Supplies | 0.0431305 | 0.00908389 | Stability of Customers | 0.0356843
0.00515788 | Access to Funding | 0.0415497 | 0.00904028 | Rate of Staff Turnover | 0.0359644
0.00509582 | Company Size (number of employees) | 0.0411758 | 0.00899967 | Supplier Selection Criterion: CSR Compatibility | 0.0345245
0.00498899 | Specificity of Products | 0.0404037 | 0.00886835 | Foreign Supplies | 0.0341523
0.00488013 | Level of Corruption | 0.0390621 | 0.00855376 | Region | 0.0344136
0.00477958 | Maturity of the Company | 0.038239 | 0.00845204 | Legal Form of the Company | 0.0342419
0.00475404 | CSR | 0.0367709 | 0.00844075 | Supplier Selection Criterion: Quality Certificate | 0.0342001
0.00450035 | Importance of Creditors | 0.0356523 | 0.00828733 | OHSAS Certificate Holding | 0.0312972
0.00447213 | Stability of Customers | 0.0343655 | 0.00822393 | CSR | 0.030517
0.00442313 | Assets Level | 0.0337949 | 0.0079642 | Effect of Employee’s Benefits | 0.0299106
0.00442278 | Stability of Suppliers | 0.0339987 | 0.00782862 | Intensity of Motivation Means Use | 0.0304842
0.0042628 | Other Costs | 0.0331099 | 0.00754582 | Ownership Type | 0.0308387","
4.9 Conclusions It can be stated that the results presented here confirm the claim of the usefulness of applying learning approaches in searching for factors of corporate competitiveness, which was stated at ICMLG 2013 in Bangkok (Pudil, 2013). Though the methods of machine learning and feature selection (FS) are related to statistical methods in particular, they are less strict with respect to assumptions about the data analysed. The core idea of how to derive factors of corporate competitiveness is based on selecting those features (variables characterizing companies) which have the highest importance in a special multivariate regression model predicting the CFP and minimizing the error between the predicted and real value of CFP. Therefore, the factors of corporate competitiveness have a straightforward meaning – they are just those features which have the highest impact on corporate financial performance and thus on corporate competitiveness. We have described several modifications to the pseudo-kernel regression model used in Pudil et al. 
(2013) for evaluating the importance of corporate competitiveness factors. The modifications proved capable of notably improving model accuracy if applied on subspaces. In other words, if a small subset of factors is to be evaluated, then the improved model allows selecting factors that provide more accurate information than subsets selected in the paper by Pudil et al. (2013). We have also compared the results on original 37-dimensional data and newly acquired 74-dimensional data obtained on the same domain. Though the effort invested to define more features failed to result in direct improvement of the best achievable accuracy in the full feature space, some newly included features helped to yield better results in very small subspaces. The implication of our findings for the search for the factors of competitiveness is as follows. One cannot expect competitiveness on the corporate level to be a result of one particular cause. The whole body of literature on this topic works with the notion that competitiveness is a complex phenomenon with multiple causes (Beranova, 2008, Andrews et al., 2010 and others). Our results point to this with the error on subsets decreasing to the very last feature included and evaluated. This can be interpreted in the way that each of the features (company characteristics) has its contribution to the prediction of CFP. However, in practice it would be difficult if not next to impossible to make changes to a business according to e. g. 74 characteristics as in our regression model. A smaller set of characteristics is suitable for this, making the improved model more useful as it selects features with better prediction power. This means that these characteristics are most important for a given set of companies and thus can be regarded as factors of their competitiveness. 
To generalize the contribution of our findings, for a researcher it lies in the possibility of processing high-dimensional data in an easier manner and with a better reliability of the result. If applied in the field of competitiveness, then the result points","at characteristics important for the practitioner. The practitioner can therefore adjust his or her own business according to the values of the characteristics found to have the strongest influence on the business performance.","","5 Identifying factors of competitiveness using bivariate analyses and linear regression analyses This chapter explores bivariate associations of the individual variables included in the experiments with the financial performance of companies. In addition, it formulates partial linear regression models. Both are arranged in units corresponding to the division of the original questionnaire survey that followed the logic of the stakeholder view of an enterprise. This means that the corporate characteristics are analysed in groups related to the owners, employees, customers, suppliers, and business environment. However, they are preceded by sections regarding potential internal and external factors of competitiveness and a section on the overall approach to the system of business management. Each of these units represents an independent sub-chapter and a separate partial linear regression model in the following text. These analyses serve as the basis for interpreting the results of the DAF experiments, since these do not provide simple instructions explaining the influence of the predictors examined, as is the case of e.g. linear regression models or t-tests, analyses of variance, or correlation analyses. 
Therefore, we will use independent samples t-tests, an analysis of variance, correlation analysis, and linear regression models in this chapter, using Pearson’s r, Spearman’s r_s and Kendall’s τ_c coefficients. While Pearson’s r is preferably used when the assumptions of this test are met, due to the direct comparability of the results of the correlation analysis with regression models, Spearman’s r_s is used when the sample does not follow a bivariate normal distribution but the relationship between the two variables is still monotonic. Kendall’s τ_c is used in the analyses where the independent variable is an ordinal variable with fewer categories, resulting in tied ranks. Kendall’s τ_c should be used in such situations (deVaus, 2002; Field, 2009). Coefficient η will be used to report the effect size of associations examined by t-test and by ANOVA and is calculated as follows for t-test: η = sqrt(t² / (t² + df)), where t denotes the t-test statistic and df denotes degrees of freedom;","for ANOVA: η = sqrt(SSA / SST), where SSA denotes the sum of squares between groups and SST denotes the total sum of squares. All associations examined are evaluated at the level of significance of α = 10%, thus reducing the chance of making a Type II error. However, the actual p-values observed are reported for each test together with the number of observations involved in the particular test, together with the effect size found by this test. Thus, readers can evaluate our results for themselves. 5.1 General characteristics Before we start examining the influence of the variables concerned, it is also necessary to consider the influence of the general characteristics of enterprises that can affect the particular variables or that can modify or disguise their impact on financial performance, or those which give the appearance of such an action. These are particularly the industry in which the company operates, and the size of the company. 
These two characteristics are usually considered control variables in examining the factors of competitiveness or the financial performance of a company. This is true for a wide range of studies on completely different areas of financial performance factors. We can mention, for example, works by Coles et al. (2012) Structural Models and Endogeneity in Corporate Finance: The Link between Managerial Ownership and Corporate Performance, Youndt et al. (1996) Human Resource Management, Manufacturing Strategy, and Firm Performance, or Powell and Dent-Micallef (1997) Information Technology as Competitive Advantage: The Role of Human, Business and Technology Resources. Effect of company size on the financial performance Company size is usually measured by the number of employees, the volume of total capital or the volume of total sales. In our case we consider total capital or sales to be rather variables describing the response variable, i.e. business performance. Therefore, a good representative of company size is the number of employees. However, in our sample (as well as in the population), the number of employees is significantly skewed","Castek_1_Castek 23.1.2015 7:37 Stránka 93 5.1 GENERAL CHARACTERISTICS 93 with a predominance of values on the left, i.e. in smaller businesses and with significant outliers towards large enterprises. Thus, extreme values (up to 5,600 employees) were Winsorized. In some cases it may be preferable to use categorized company sizes, dividing enterprises into categories of 50 to 99 employees, 100 to 249 employees, and 250 or more employees. As the following table shows, such a division is more uniform. 
Table 5-7: Frequency distribution of a categorized company size
| Category (employees) | Frequency | Percent | Valid Percent | Cumulative Percent |
|---|---|---|---|---|
| 50–99 | 128 | 29.6 | 29.6 | 29.6 |
| 100–249 | 175 | 40.5 | 40.5 | 70.1 |
| >250 | 129 | 29.9 | 29.9 | 100.0 |
| Total | 432 | 100.0 | 100.0 | |

Since the joint distribution of financial performance and company size (measured by the number of employees) strongly violates the assumption of bivariate normality, we use Kendall’s coefficient τ_c to measure the association between size and financial performance. The influence of categorized company size is tested by an analysis of variance and its effect is measured by coefficient η.

Table 5-8: The association between the company size and the financial performance
| Variable | Coefficient | Association | p (two-tailed) |
|---|---|---|---|
| Categorized size | η | 0.122 | 0.046 |
| Number of employees | τ_c | -0.068 | 0.032 |

Both ways of expressing company size are significantly related to financial performance. In both cases the measure of association is low (one might say trivial in the case of the number of employees), and in both cases larger companies show worse financial performance. The average performance in the sample is m = 0.416 for small businesses, m = 0.247 for medium-sized businesses, and m = 0.117 for large enterprises.","Effect of industry on the financial performance
The industry in which a company operates can be expressed in our sample at the level of the first two digits of CZ NACE, or at a less detailed level as manufacturing versus construction. The first division is of course finer, but even with a sample of 408 enterprises there are industries in which the number of observations is very low: in seven categories it is lower than 10. Therefore, most of our analyses use the less detailed division into manufacturing and construction.
However, the table below shows associations and p-values for both the independent-samples t-test (manufacturing/construction categories) and the analysis of variance (CZ NACE categories).

Table 5-9: The association between the industry and the financial performance
| Variable | Coefficient | Association | p (two-tailed) |
|---|---|---|---|
| Industry: Manufacturing/Construction | η | 0.101 | 0.039 |
| Industries by the first two digits of CZ NACE | η | 0.325 | 0.001 |

In both approaches the association with financial performance is statistically significant. For the less detailed classification into manufacturing and construction, the effect of the industry is not very large but still significant. We find a higher performance in construction (m = 0.468) than in manufacturing (m = 0.217). The effect is larger for the finer classification and can be described as at least medium. If we examined the differences between particular categories, we would find statistically significant differences only between certain categories. For example, in “Production of electrical machines and equipment” or “Production of radio, television and communication equipment and devices”, the average financial performance is relatively high, about m = 0.90. On the other hand, in “Production of basic metals and fabricated metal products” and “Production of textiles and textile products”, the average financial performance is relatively low, m = -0.386 and m = -0.267, respectively.
Interaction of company size and industry
The data in our sample clearly show an association between the control variables. In construction, the companies are considerably smaller (m = 173 employees) than in manufacturing (m = 267 employees), which is a statistically significant difference (p = 0.001). Naturally, the question arises whether the zero-order relationships, i.e. size affecting financial performance and industry affecting financial performance, are genuine.
This can be verified by examining the conditional associations and correlations. Let us assume","that the industry affects the size but not vice versa. In this case, we must test whether the association between industry and financial performance is genuine by dividing the sample by categorized company size (if the variable describing industry were an interval variable, we could use partial correlations), and whether the association between size and financial performance is genuine by dividing the sample by industry. The following table shows the former conditional associations.

Table 5-10: The association between the industry and the financial performance, controlling for the size of the company: statistics
| Company size / industry => financial performance | Coefficient | Association | p | Number of observations |
|---|---|---|---|---|
| 50–99 | Nominal η | n.s. | | 120 |
| 100–249 | Nominal η | 0.138 | 0.071 | 171 |
| 250+ | Nominal η | n.s. | | 120 |

The zero-order relationship examined is maintained only in the category of medium-sized businesses, whereas in the remaining categories it is statistically insignificant. For medium-sized businesses it is slightly stronger than it seemed to be in the whole sample (η = 0.138 vs. η = 0.101). The substantive differences in performance across industries and size categories can be assessed in the following table.

Table 5-11: The association between the industry and the financial performance, controlling for the size of the company: differences in CFP
| Company size | Industry (Manufacturing/Construction) | N | Mean CFP | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|---|
| 50–99 | Manufacturing | 92 | 0.338 | 1.044 | 0.109 |
| 50–99 | Construction | 28 | 0.552 | 0.751 | 0.142 |
| 100–249 | Manufacturing | 134 | 0.193 | 1.006 | 0.087 |
| 100–249 | Construction | 37 | 0.531 | 0.991 | 0.163 |
| >250 | Manufacturing | 105 | 0.144 | 0.944 | 0.092 |
| >250 | Construction | 15 | 0.156 | 0.641 | 0.166 |

","The biggest difference in performance between industries was observed for medium-sized companies (0.338), while the difference for small companies (0.214) is roughly two thirds of that, and there is almost no difference for large companies (0.012). Therefore, we cannot say that company size explains the variability of financial performance better than the industry does. More likely, the industry explains financial performance only for medium-sized companies. We can illustrate this using the model in the chart below.

Figure 5-25: The effect of industry on financial performance controlled for the company size

As mentioned above, it is also necessary to examine the validity of the association between company size and financial performance. In the entire sample, this association was rather weak. The following table shows the conditional association between company size and financial performance when the sample is divided by industry.

Table 5-12: The influence of company size on financial performance, controlling for the industry: statistics
| Industry / Company size => financial performance | Coefficient | Correlation | p | Number of observations |
|---|---|---|---|---|
| Manufacturing | Ordinal τ_c | -0.031 | 0.386 | 329 |
| Construction | Ordinal τ_c | -0.122 | 0.089 | 79 |

","Company size, measured by the number of employees, shows a statistically significant relationship with financial performance only in the construction industry. Here, however, it is about twice as strong as in the whole sample.
On the other hand, this relationship is statistically insignificant in the manufacturing industry. Again, we cannot say that the zero-order correlation is false. The third variable, the industry, only modifies the effect of the variable originally examined. The model in the chart below shows the relationships mentioned above. Figure 5-26: The effect of company size on financial performance controlled for the industry When we combine what was mentioned above, our findings then support the model of a direct effect of the company size and industry on financial performance, which is in both cases modified by a third variable. All the statistically significant associations found are shown by the model in the chart below. Figure 5-27: The combined effect of company size and industry on financial performance","Castek_1_Castek 23.1.2015 7:37 Stránka 98 98 5 IDENTIFYING FACTORS OF COMPETITIVENESS USING BIVARIATE ANALYSES... 5.2 Internal competitiveness factors of a company The first part of the questionnaire focused on how respondents subjectively perceive the potential internal factors of competitiveness. In nine closed questions, the respondents rated their company in comparison with its direct competitors, using a scale from 1 – a distinctively lower value – to 5 – a distinctively higher value. 
The following table lists the questions with their average values and their correlation with the financial performance of the company:

Table 5-13: Variables describing the internal environment of a company: bivariate correlations with CFP
| How do you rate, compared to your competitors, your… | Average value | Pearson’s r | p (two-tailed) | N |
|---|---|---|---|---|
| Innovation Activity | 3.37 | 0.157 | 0.002 | 399 |
| Product Adjustment | 3.89 | 0.100 | 0.044 | 404 |
| Product Quality | 3.74 | 0.129 | 0.010 | 403 |
| Labour Costs | 2.94 | 0.089 | 0.076 | 398 |
| Other Costs | 2.95 | -0.081 | 0.110 | 392 |
| Labour Qualification | 3.35 | 0.122 | 0.014 | 398 |
| Customer Care | 3.59 | 0.048 | 0.343 | 400 |
| Access to Funding | 3.33 | 0.150 | 0.000 | 379 |
| Company’s Goodwill | 3.52 | 0.070 | 0.165 | 400 |

Scale from 1 – a distinctively lower value – to 5 – a distinctively higher value.

The p (two-tailed) column in the table above shows p-values of two-tailed tests. However, this part of the research evaluates directional hypotheses, which expect financial performance to increase as the value of the variable increases. The exceptions are Labour Costs and Other Costs, for which a negative correlation was expected, i.e. financial performance increasing as these variables decrease. This expectation was not confirmed for Labour Costs, whose correlation is positive, so the one-tailed p-value is p = 0.962 (1 - p/2). For the other variables, the two-tailed p-value can be divided by two, which makes Other Costs and Company’s Goodwill statistically significant at the level of α = 10% as well.","As for the effect size of the relationships that can be generalized, it ranges from trivial/none (Other Costs, r = -0.081) to low (Access to Funding, r = 0.180). Thus, at the level of simple bivariate correlation, none of these company characteristics can be called an important factor of financial performance.
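The conversion from two-tailed to one-tailed p-values described above can be sketched as follows. This is a minimal illustration; the function name and its arguments are hypothetical, and the input values are the ones reported in Table 5-13.

```python
def one_tailed_p(p_two_tailed, r, expected_sign):
    # expected_sign: +1 if the directional hypothesis expects a positive
    # correlation, -1 if it expects a negative one.
    if expected_sign not in (1, -1):
        raise ValueError('expected_sign must be +1 or -1')
    observed_sign = 1 if r > 0 else (-1 if r < 0 else 0)
    half = p_two_tailed / 2
    # Observed direction matches the hypothesis: p/2; otherwise 1 - p/2.
    return half if observed_sign == expected_sign else 1 - half

# Values taken from Table 5-13.
print(round(one_tailed_p(0.076, 0.089, -1), 4))   # Labour Costs, negative r expected
print(round(one_tailed_p(0.110, -0.081, -1), 4))  # Other Costs, negative r expected
print(round(one_tailed_p(0.165, 0.070, 1), 4))    # Company's Goodwill, positive r expected
```

For Labour Costs the observed sign contradicts the hypothesis, so the sketch returns 1 - p/2; for the other two variables it returns p/2, which falls below α = 10%.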
Multidimensional model
To assess the degree of the joint effect of the aforementioned independent variables on financial performance, we can try to formulate a linear regression model in this form:
Financial performance = β_0 + β_1*Innovation Activity + β_2*Product Adjustment + β_3*Product Quality + β_4*Labour Costs + β_5*Other Costs + β_6*Labour Qualification + β_7*Customer Care + β_8*Access to Funding + β_9*Company’s Goodwill + β_{10}*Industry + β_{11}*Company Size Medium-sized + β_{12}*Company Size Large
The following assumptions were assessed: appropriate variable type, non-zero variance of predictors, no strong multicollinearity, homoscedasticity, independent errors, normally distributed errors, independent values of outcome, and linear relationship between the outcome and predictors. The summary of the model obtained from the forced entry linear regression is shown in the following table.

Table 5-14: Summary of the internal factors regression model
| R | R Square | Adjusted R Square | Std. Error of the Estimate | R Square Change | F Change | df1 | df2 | p |
|---|---|---|---|---|---|---|---|---|
| 0.338 | 0.114 | 0.083 | 0.892 | 0.114 | 3.666 | 12 | 342 | 0.000 |

The value of multiple correlation could be considered moderate, as the model explains about 11% of the variability of financial performance. Still, the model is significant. Estimates of the parameters of this model are listed in the following table. In this case there are fewer variables that statistically significantly affect financial performance at the level of α = 10%. This is true for Innovation Activity, Other Costs, and Access to Funding, as well as for all the control variables. Higher financial performance, if the values of the other variables remain unchanged, is associated with higher Innovation Activity, lower Other Costs, and better Access to Funding.
Furthermore, the construction industry is more profitable than manufacturing (the reference value of zero is represented by the manufacturing industry) and the larger the company is, the lower its financial performance is","(measured by relative indicators of ROA and asset growth). Access to Funding and Company Size are the strongest predictors in this model, followed by Innovation Activity and Industry, being approximately at the same level, while the weakest predictor is the level of Other Costs.

Table 5-15: Coefficients of the internal factors regression model
| Model | B (unstandardized) | Std. Error | Beta (standardized) | t | p |
|---|---|---|---|---|---|
| (Constant) | -0.583 | 0.368 | | -1.587 | 0.113 |
| Innovation Activity | 0.088 | 0.049 | 0.098 | 1.777 | 0.076 |
| Product Adjustment | 0.053 | 0.061 | 0.049 | 0.873 | 0.383 |
| Product Quality | 0.082 | 0.074 | 0.068 | 1.111 | 0.267 |
| Labour Costs | 0.072 | 0.056 | 0.073 | 1.285 | 0.200 |
| Other Costs | -0.120 | 0.065 | -0.102 | -1.852 | 0.065 |
| Labour Qualification | 0.062 | 0.072 | 0.051 | 0.867 | 0.386 |
| Customer Care | -0.054 | 0.059 | -0.053 | -0.916 | 0.360 |
| Access to Funding | 0.163 | 0.049 | 0.187 | 3.308 | 0.001 |
| Company’s Goodwill | -0.061 | 0.059 | -0.064 | -1.037 | 0.300 |
| Industry | 0.213 | 0.125 | 0.091 | 1.705 | 0.089 |
| Company size: Medium-sized vs. Small | -0.275 | 0.116 | -0.145 | -2.377 | 0.018 |
| Company size: Large vs. Small | -0.375 | 0.126 | -0.184 | -2.982 | 0.003 |

5.3 External competitiveness factors of a company
The following set of eight variables focused on how respondents subjectively perceive the potential external competitiveness factors. The respondents used another eight closed questions to rate their company in comparison with its direct competitors, using a scale from 1 – a very low value – to 5 – a very high value.
The following is a list of questions with their average rating and correlation with the financial performance of the company:","Castek_1_Castek 23.1.2015 7:37 Stránka 101 5.3 EXTERNAL COMPETITIVENESS FACTORS OF A COMPANY 101 Table 5-16: Variables describing the external environment of a company: bivariate correlations with CFP Average p How do you rate… Pearson’s r N value (two‐tailed) Competitive Rivalry 4.10 ‐0.133 0.008 404 Power of Buyers 3.87 ‐0.065 0.194 404 Power of Suppliers 3.27 ‐0.096 0.053 404 Interest in Employment 2.99 0.211 0.000 401 Level of Corruption 2.28 ‐0.002 0.964 389 Support from the State 1.67 0.133 0.008 403 Support from Municipality/Local Administration 2.01 0.067 0.181 400 Market Progress 3.60 0.151 0.003 386 Scale from 1 – a very low value – to 5 – a very high value. For Market Progress: 1 – very narrowing to 5 – very widening. The tested hypotheses were not directional for the variables Level of Corruption and Market Progress. However, for all the others the direction of the correlation found is in line as expected. For this reason, it is again possible to divide the p-values by two, and Power of Buyers and Support from Municipality/Local Administration will also become statistically significant correlations at the level of α = 10%. Only Level of Corruption shows no statistically significant correlation with financial performance. We can also note that the factual relationship between both the variables with higher p-values and the financial performance is so weak that we cannot talk about influencing the financial performance by these variables. As for other variables, their effect can be interpreted as very low in the case of Power of Suppliers (r = -0.096), low for Competitive Rivalry (r = -0.133), Support from the State (r = 0.133) and Market Progress (r = 0.151), and low to medium for Interest in Employment (r = 0.211). 
A higher level of Competitive Rivalry or bargaining Power of Buyers and Suppliers hurts the company’s financial performance, while higher Interest in Employment and Support from the State and from Municipality/Local Administration as well as the widening Market Progress help the financial performance. Again, we cannot talk about any important factor of financial performance at the level of a simple bivariate correlation in this area of company characteristics.","Multidimensional model
To assess the degree of joint effect of the aforementioned independent variables on financial performance, we can try to formulate a linear regression model in this form:
Financial performance = β_0 + β_1*Competitive Rivalry + β_2*Power of Buyers + β_3*Power of Suppliers + β_4*Interest in Employment + β_5*Level of Corruption + β_6*Support from the State + β_7*Support from Municipality/Local Administration + β_8*Market Progress + β_9*Industry + β_{10}*Company Size Medium-sized + β_{11}*Company Size Large
The following assumptions were assessed: appropriate variable type, non-zero variance of predictors, no strong multicollinearity, homoscedasticity, independent errors, normally distributed errors, independent values of outcome, and linear relationship between the outcome and predictors. The summary of the model obtained from the forced entry linear regression is shown in the following table.

Table 5-17: Summary of the external factors regression model
| R | R Square | Adjusted R Square | Std. Error of the Estimate | R Square Change | F Change | df1 | df2 | p |
|---|---|---|---|---|---|---|---|---|
| 0.343 | 0.118 | 0.090 | 0.849 | 0.118 | 4.213 | 11 | 347 | 0.000 |

As in the case of the internal factors, we could consider the value of multiple correlation of 0.343 moderate, as the model explains almost 12% of the variability of financial performance. Also, the model is significant.
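The quantities in a model summary such as Table 5-17 are linked by standard formulas, which gives a quick consistency check. The sketch below assumes the usual definitions of adjusted R² and of the overall F statistic for a forced-entry model with an intercept; the function names are hypothetical.

```python
def adjusted_r2(r2, n_predictors, df_residual):
    # n is recovered from the residual degrees of freedom: n = df2 + k + 1.
    n = df_residual + n_predictors + 1
    return 1 - (1 - r2) * (n - 1) / df_residual

def f_statistic(r2, n_predictors, df_residual):
    # Overall F test of the model: (R^2 / k) / ((1 - R^2) / df2).
    return (r2 / n_predictors) / ((1 - r2) / df_residual)

# Inputs taken from Table 5-17: R Square = 0.118, df1 = 11, df2 = 347.
print(round(adjusted_r2(0.118, 11, 347), 3))
print(round(f_statistic(0.118, 11, 347), 2))
```

With the rounded R² from the table, both results land close to the reported adjusted R² of 0.090 and F of 4.213; the small gap in F comes from R² being rounded to three decimals.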
Estimates of the parameters of this model are listed in Table 5-18. In this case, most of the variables statistically significantly affect financial performance. However, this is not true for Power of Buyers, Power of Suppliers, Level of Corruption, and Support from Municipality/Local Administration. As for the control variables, the impact of Industry is statistically insignificant. Higher financial performance, if the values of the other variables remain unchanged, is associated with lower Competitive Rivalry, higher Interest in Employment, higher Support from the State, and widening Market Progress. Increasing Company Size also means a lower financial performance (measured by relative indicators of ROA and asset growth). Relatively strong predictors are Interest in Employment, Competitive Rivalry, and a shift from a small business to a large enterprise. The lowest relative effect can be observed for the bargaining Power of Suppliers, followed by Support from the State. Market Progress and a shift from small","to a medium-sized business have a relatively moderate influence. As mentioned above, however, the explanatory power of the model as a whole is rather low or medium.

Table 5-18: Coefficients of the external factors regression model
| Model | B (unstandardized) | Std. Error | Beta (standardized) | t | p |
|---|---|---|---|---|---|
| (Constant) | 0.152 | 0.421 | | 0.361 | 0.719 |
| Competitive Rivalry | -0.162 | 0.060 | -0.146 | -2.714 | 0.007 |
| Power of Buyers | -0.003 | 0.060 | -0.003 | -0.052 | 0.959 |
| Power of Suppliers | -0.090 | 0.059 | -0.080 | -1.533 | 0.126 |
| Interest in Employment | 0.179 | 0.062 | 0.151 | 2.893 | 0.004 |
| Level of Corruption | 0.012 | 0.042 | 0.016 | 0.290 | 0.772 |
| Support from the State | 0.102 | 0.055 | 0.106 | 1.850 | 0.065 |
| Support from Municipality/Local Administration | -0.027 | 0.052 | -0.030 | -0.513 | 0.608 |
| Market Progress | 0.153 | 0.058 | 0.137 | 2.646 | 0.009 |
| Industry | 0.198 | 0.125 | 0.088 | 1.583 | 0.114 |
| Company size: Medium-sized vs. Small | -0.242 | 0.110 | -0.134 | -2.198 | 0.029 |
| Company size: Large vs. Small | -0.282 | 0.121 | -0.145 | -2.342 | 0.020 |

5.4 Stakeholder orientation and characteristics of an organizational structure
This part of the questionnaire examined a company as a whole; it did not go into the details of individual stakeholder groups, as was the case with the other sections. Instead, it focused on the structure of, or relations between, stakeholder groups in terms of company management. If management were singled out as a separate stakeholder group, these questions would belong to the section investigating just","management. The inclusion of this section was based on both the selected stakeholder approach to analysing an enterprise, and research into stakeholder management and company performance (Berman, 1999) or social and financial performance of a company (Preston, O’Bannon, 1997).

Table 5-19: Contents of the variables describing the stakeholder orientation and characteristics of an organizational structure
| Question | Variable name | Values |
|---|---|---|
| How important is/are for your company… owners? | Importance of owners | 1 – unimportant group / 5 – very important group |
| … employees? | Importance of employees | 1 – unimportant group / 5 – very important group |
| … customers? | Importance of customers | 1 – unimportant group / 5 – very important group |
| … suppliers? | Importance of suppliers | 1 – unimportant group / 5 – very important group |
| … creditors? | Importance of creditors | 1 – unimportant group / 5 – very important group |
| … the state? | Importance of the state | 1 – unimportant group / 5 – very important group |
| … communities? | Importance of communities | 1 – unimportant group / 5 – very important group |
| What is the number of independent branches of your enterprise in the Czech Republic? | Number of Czech Branches | Interval |
| What is the number of independent branches of your enterprise abroad? | Number of foreign Branches | Interval |
| What is the number of relatively autonomous organizational units of your enterprise? | Number of autonomous units | Interval |
| What is the average autonomy of the organizational units within the defined categories? | Autonomy | Interval |
| What is the number of management levels in the main line of management? | Number of management levels | Interval |
| What is the span of control? | Span of control | Interval |
| What is the number of management levels per one employee? | Number of management levels per employee | Interval |

","The variables included here first assess the importance of the defined interest groups (stakeholders) for an enterprise, using a scale from 1 – an unimportant group – to 5 – a very important group. Furthermore, the enterprise is described by the number of independent branches in the Czech Republic, the number of independent branches abroad, the number of relatively autonomous organizational units and their average autonomy, the number of management levels, the span of control, and the number of management levels per one employee.
The autonomy of organizational units was rated on a scale from 1 – low autonomy to 5 – high autonomy in these areas: production program planning, dealings with customers, planning material inputs, dealings with suppliers, workforce development planning, and staff selection and recruitment. The following is a list of variables with their average rating and correlation with the financial performance of the enterprise: Table 5-20: Variables describing stakeholder orientation of an enterprise and characteristics of an organizational structure: bivariate correlations with CFP Average p Variable Kendall’s τ c N value (two‐tailed) Importance of Owners 4.48 0.023 0.510 408 Importance of Employees 4.05 0.043 0.271 410 Importance of Customers 4.68 0.013 0.706 410 Importance of Suppliers 3.79 ‐0.065 0.103 409 Importance of Creditors 2.73 ‐0.054 0.217 396 Importance of the State 2.42 0.092 0.021 408 Importance of Communities 2.58 ‐0.004 0.923 406 Number of Czech Branches 2 1.23 0.027 0.390 400 Number of Foreign Branches 2 0.25 ‐0.005 0.812 389 Number of Autonomous Units 2 1.24 0.025 0.425 386 Autonomy 2 1.20 0.036 0.252 403 Number of Management Levels 2 3.18 ‐0.009 0.808 401 Span of Control 2 13.22 0.017 0.612 396 Number of Management Levels per Employee 2 0.02 0.051 0.116 401 1 Scale from 1 – very low to 5 – very high. 2 Variables with strongly abnormal two-dimensional distribution.","Castek_1_Castek 23.1.2015 7:37 Stránka 106 106 5 IDENTIFYING FACTORS OF COMPETITIVENESS USING BIVARIATE ANALYSES... Due to the distribution of some variables that strongly fail to meet the two- dimensional normality condition, Kendall’s coefficient τ was used to assess the c bivariate correlations. This coefficient is nonparametric, immune to tied ranks (deVaus, 2002), and according to Field (2002) it is a more appropriate estimate of the correlation in population than Spearman’s Rho. 
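For readers who want to see how a τ_c coefficient is obtained, the sketch below computes it directly from concordant and discordant pairs. It assumes that τ_c here means Stuart’s tau-c, as reported by common statistical packages; the text does not state the exact variant, and the function name and sample data are hypothetical.

```python
def kendall_tau_c(x, y):
    # Stuart's tau-c: 2 * m * (P - Q) / (n^2 * (m - 1)), where P and Q are
    # the numbers of concordant and discordant pairs and m is the smaller
    # number of distinct values among the two variables (m must exceed 1).
    n = len(x)
    concordant = 0
    discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
            # Pairs tied on either variable count toward neither P nor Q.
    m = min(len(set(x)), len(set(y)))
    return 2 * m * (concordant - discordant) / (n * n * (m - 1))

# Made-up ordinal ratings (1-5) and performance scores, for illustration only.
ratings = [1, 2, 2, 3, 4, 5, 5]
performance = [0.1, 0.0, 0.3, 0.2, 0.5, 0.4, 0.9]
print(round(kendall_tau_c(ratings, performance), 3))
```

The correction factor m / (m - 1) is what adapts the coefficient to tables in which one variable has only a few categories, which is the situation described above.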
Out of all the fourteen variables tested, only Importance of the State has a statistically significant effect on financial performance. However, this effect lies on the borderline between a trivial effect and a very low effect. Again, we cannot talk about any important factor of financial performance at the level of a simple bivariate correlation in this area of enterprise characteristics.
Multidimensional model
To assess the degree of joint effect of the aforementioned independent variables on financial performance, we can try to formulate a linear regression model [3]:
Financial performance = β_0 + β_1*Importance of Owners + β_2*Importance of Employees + β_3*Importance of Customers + β_4*Importance of Suppliers + β_5*Importance of Creditors + β_6*Importance of the State + β_7*Importance of Communities + β_8*Number of Management Levels + β_9*Span of Control + β_{10}*Number of Management Levels per Employee + β_{11}*Industry
The following assumptions were assessed: appropriate variable type, non-zero variance of predictors, no strong multicollinearity, homoscedasticity, independent errors, normally distributed errors, independent values of outcome, and linear relationship between the outcome and predictors. Predictors Size Medium-sized and Size Large were removed because of collinearity with Number of Management Levels, Span of Control, and Number of Management Levels per Employee. The summary of the model obtained from the forced entry linear regression is shown in the following table.

Table 5-21: Summary of the regression model describing stakeholder orientation of an enterprise and characteristics of its organizational structure
| R | R Square | Adjusted R Square | Std. Error of the Estimate | R Square Change | F Change | df1 | df2 | p |
|---|---|---|---|---|---|---|---|---|
| 0.261 | 0.068 | 0.039 | 0.887 | 0.068 | 2.349 | 11 | 353 | 0.008 |

[3] It includes Number of Independent Branches in the Czech Republic, Number of Independent Branches Abroad, Number of Relatively Autonomous Organizational Units, Average Autonomy of the Organizational Units.","The multiple correlation value of 0.261 is below the medium effect size, as the model explains about 7% of the financial performance variability. Still, the model is significant. Estimates of the parameters of this model are listed in the following table.

Table 5-22: Coefficients of the regression model describing stakeholder orientation of an enterprise and characteristics of its organizational structure
| Model | B (unstandardized) | Std. Error | Beta (standardized) | t | p |
|---|---|---|---|---|---|
| (Constant) | -0.695 | 0.494 | | -1.406 | 0.161 |
| Importance of Owners | 0.059 | 0.054 | 0.057 | 1.092 | 0.276 |
| Importance of Employees | 0.022 | 0.062 | 0.021 | 0.346 | 0.729 |
| Importance of Customers | 0.171 | 0.087 | 0.114 | 1.962 | 0.051 |
| Importance of Suppliers | -0.111 | 0.060 | -0.113 | -1.842 | 0.066 |
| Importance of Creditors | -0.053 | 0.040 | -0.072 | -1.319 | 0.188 |
| Importance of the State | 0.127 | 0.045 | 0.171 | 2.849 | 0.005 |
| Importance of Communities | -0.057 | 0.047 | -0.070 | -1.208 | 0.228 |
| Number of Management Levels | -0.005 | 0.054 | -0.005 | -0.093 | 0.926 |
| Span of Control | 0.003 | 0.005 | 0.035 | 0.675 | 0.500 |
| Number of Management Levels per Employee | 5.567 | 3.203 | 0.092 | 1.738 | 0.083 |
| Industry | 0.223 | 0.124 | 0.099 | 1.801 | 0.073 |

As with the zero-order correlations, only a few variables statistically significantly affect financial performance. These include Importance of Customers, Importance of the State, Number of Management Levels per Employee, and Industry, with only Importance of the State significant at the level of α = 0.05.
Both higher Importance of Customers and higher Importance of the State imply higher financial performance of a company in case the values of the other variables remain unchanged. Even the variable Number of Management Levels per Employee has a positive impact on financial performance. Furthermore, the construction industry is more profitable than manufacturing (the reference value of zero is represented by the manufacturing industry). The relatively strongest predictor is Importance of the State,","Castek_1_Castek 23.1.2015 7:37 Stránka 108 108 5 IDENTIFYING FACTORS OF COMPETITIVENESS USING BIVARIATE ANALYSES... followed by Importance of Customers, while the impact of Industry and Number of Management Levels per Employee is similarly strong. 5.5 Owners This part of the questionnaire dealt with the ownership and property structure of the company. As Šedová (2007) stated, “a company’s ownership structure significantly affects the creation of a certain model of ownership. The individual ownership models, applied in countries with market economies, allow different mechanisms of promoting the interests of owners.” The relationships and interests of the owners towards the company were examined at two levels, as proposed by Kučera (2005): the relationship between the owners and other stakeholders, and the relationship of owners with one another. The definition of the different types of ownership took into account the specifics of exerting ownership in the Czech Republic. The following is a list of variables with an explanation of their content: Table 5-23: Contents of variables describing the ownership and property structure Question Variable Values Is your enterprise part Concern Yes/No of a concern? What is the legal form Natural person/Legal Owner’s Legal Form of the owner? person/Both What is the ownership Ownership Concentration Sole owner/Majority concentration? owner/More big owners Are there any foreign direct FDI Yes/No investments in the enterprise? 
Are owners whose stake in the company exceeds 5% present Owners in Board of Directors Yes/No in the board of directors? Are owners whose stake in the company exceeds 5% present Owners in Supervisory Board Yes/No in the supervisory board? Are owners whose stake in the company exceeds 5% present Owners in Top Management Yes/No in the top management? What is the level of the principal Assets Level 1 – obsolete/5 – top‐level components of tangible assets?","Castek_1_Castek 23.1.2015 7:37 Stránka 109 5.5 OWNERS 109 The characteristics of the variables are significantly heterogeneous here. Part of the variables are dichotomous, part are nominal variables of three values, and one of them is ordinal, which we will consider quasi-interval for the purpose of linear regression. Bivariate associations between the variables of this section and the financial performance of a company follow here, and tests of their significance using ANOVA and t-test: Table 5-24: Variables describing the ownership and property structure: bivariate relationships with CFP Variable Coefficient Association p (two‐tailed) Concern η ‐ n.s. Owner’s Legal Form: Natural Person vs. Legal η 0.138 0.028 Person Owner’s Legal form: Natural Person \& Legal η Person vs. Natural Person ‐ n.s. Owner’s Legal Form: Natural Person \& Legal η ‐ n.s. Person vs. Legal Person Ownership Concentration η ‐ n.s. FDI η ‐ n.s. Owners in Board of Directors η 0.104 0.035 Owners in Supervisory Board η 0.169 0.001 Owners in Top Management η 0.116 0.019 Assets Level τ c 0.190 0.001 The table shows that there is a statistically significant difference between companies whose owner is a natural person (m = 0.386) and companies whose owner is a legal person (m = 0.128). Higher performance was found in enterprises owned by a natural person. The effect size is, however, low (η = 0.138). What also plays a role is how the owners are involved in the management of the companies. 
If the owners are present in the board of directors or the supervisory board, the performance of such enterprises is statistically significantly lower (m = 0.365 vs. m = 0.125 for the board of directors, and m = 0.376 vs. m = -0.005 for the supervisory board). Conversely, if the owners are present in top management, the financial performance of such enterprises is higher (m = 0.129 vs. m = 0.355). However, once again the effect size is rather low. The positive effect of Assets Level on financial performance is low to moderate. Again, it is not possible to talk about any important factor of financial performance in the area of enterprise characteristics at the level of a simple bivariate association.","Multidimensional model To assess the degree of joint effect of the aforementioned independent variables on financial performance, we can try to formulate a linear regression model in this form:
Financial performance = β₀ + β₁*Concern + β₂*Owner’s Legal Form Legal Person + β₃*Owner’s Legal Form Legal Person \& Natural Person + β₄*Majority Owners + β₅*Big Owners + β₆*FDI + β₇*Board of Directors + β₈*Supervisory Board + β₉*Top Management + β₁₀*Assets Level + β₁₁*Industry + β₁₂*Company Size Medium-sized + β₁₃*Company Size Large
The following assumptions were assessed: appropriate variable type, non-zero variance of predictors, no strong multicollinearity, homoscedasticity, independent errors, normally distributed errors, independent values of outcome, and linear relationship between the outcome and predictors. Predictors Size Medium-sized and Size Large were removed because of collinearity with Number of Management Levels, Span of Control, and Number of Management Levels per Employee. The summary of the model obtained from the forced entry linear regression is shown in the following table. 
Table 5-25: Summary of the regression model describing the ownership and property structure
| R | R Square | Adjusted R Square | Std. Error of the Estimate | R Square Change | F Change | df1 | df2 | p |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0.431 | 0.186 | 0.153 | 0.851 | 0.186 | 5.644 | 13 | 321 | 0.000 |
In the case of this model, the multiple correlation value of 0.431 ranges from a medium to a substantial effect. The model explains almost 19% of the variability of financial performance. Also, the model is significant. Estimates of the parameters of this model are listed in Table 5-26. Ownership concentration and the presence of FDI are statistically insignificant in this model. Furthermore, Industry and the shift from natural persons as owners to the mix of both natural and legal persons are also statistically insignificant. In the case of the latter, this is probably caused by the small number of companies with this mix among limited liability companies and joint stock companies. This model, too, shows that the larger the company is, the worse its financial performance becomes (measured by the relative indicators of ROA and asset growth). The indications from the bivariate associations that the presence of the owners in the board of directors hurts financial performance while the presence of the owners in top management helps it are confirmed here (in both cases the reference value of zero is","represented by the absence of the owners). In addition, financial performance is enhanced by higher Assets Level, inclusion in a Concern Structure (the reference value of zero means that the company is not part of a concern), and ownership by natural persons only (the reference value of zero is represented by businesses owned by natural persons only). The relatively strongest predictor is Assets Level, followed at a considerable distance by the inclusion in the Concern structure, Company Size, Owners in Top Management, Owners in Board of Directors, and Natural Persons as Owners. 
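The summary statistics reported in Table 5-25 are tied together by standard identities, so the adjusted R Square and the F value can be recomputed from R Square and the two degrees of freedom. The Python sketch below is an illustrative check added here, not the procedure used by the authors. The function names are hypothetical, and the formulas assume an ordinary least squares model with an intercept, so that n - 1 = df1 + df2.

```python
def adjusted_r2(r2, df1, df2):
    # Adjusted R Square; with an intercept, n - 1 equals df1 + df2
    return 1.0 - (1.0 - r2) * (df1 + df2) / df2


def f_statistic(r2, df1, df2):
    # F test of the whole model against the intercept-only model
    return (r2 / df1) / ((1.0 - r2) / df2)


# Values reported in Table 5-25: R Square = 0.186, df1 = 13, df2 = 321
r2, df1, df2 = 0.186, 13, 321
print(round(adjusted_r2(r2, df1, df2), 3))
print(round(f_statistic(r2, df1, df2), 2))
```

Small deviations from the tabulated F value are to be expected, because R Square enters the calculation rounded to three decimals.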
Table 5-26: Coefficients of the regression model describing the ownership and property structure
| Model | B (unstandardized) | Std. Error | Beta (standardized) | t | p |
| --- | --- | --- | --- | --- | --- |
| (Constant) | -0.519 | 0.246 | | -2.111 | 0.036 |
| Concern | 0.372 | 0.146 | 0.181 | 2.547 | 0.011 |
| Owner’s Legal Form: Natural Person vs Legal Person | -0.255 | 0.148 | -0.135 | -1.722 | 0.086 |
| Owner’s Legal Form: Natural Person \& Legal Person vs Natural Person | -0.161 | 0.189 | -0.047 | -0.851 | 0.395 |
| Ownership Concentration: Majority Owner vs. Sole Owner | 0.014 | 0.136 | 0.006 | 0.101 | 0.919 |
| Ownership Concentration: Multiple Big Owners vs. Sole Owner | -0.099 | 0.140 | -0.053 | -0.704 | 0.482 |
| FDI | -0.041 | 0.144 | -0.020 | -0.283 | 0.777 |
| Owners in Board of Directors | -0.208 | 0.114 | -0.112 | -1.817 | 0.070 |
| Owners in Supervisory Board | -0.169 | 0.123 | -0.084 | -1.373 | 0.171 |
| Owners in Top Management | 0.300 | 0.135 | 0.158 | 2.215 | 0.027 |
| Assets Level | 0.277 | 0.054 | 0.268 | 5.162 | 0.000 |
| Industry | 0.105 | 0.122 | 0.046 | 0.861 | 0.390 |
| Company Size: Medium-Sized vs. Small | -0.238 | 0.114 | -0.126 | -2.085 | 0.038 |
| Company Size: Large vs. Small | -0.304 | 0.134 | -0.150 | -2.261 | 0.024 |
","5.6 Employees This section of the questionnaire focused on several framing questions: what is the structure of employees, what is their turnover and what are the reasons for the Table 5-27: Contents of variables describing the stakeholder group of employees Question Variable Types and values What is the number of employees? Number of Employees Interval What is the ratio of women? Ratio of Women Interval What is the ratio of university graduates? Ratio of Graduates Interval What is the ratio of technical and administrative Ratio of Technical and Interval staff? Administrative Staff What is the ratio of workers? Ratio of Workers Interval Staff Turnover Rate: Below 2 %/2 – What is the rate of staff turnover? Low vs. 
High 10 %/above 10 % What is the average frequency of these reasons for staff turnover: reorganization, low wage or salary, Reasons for Staff uninteresting work, lack of career development op‐ Turnover Interval portunities, poor workplace relationships, or per‐ sonal reasons? Do you monitor systematically the reasons for Staff Turnover Moni‐ Yes/No employees’ leaving? toring What is the proportion of the variable component Motivation of Top Interval of salary for top management? Management What is the proportion of the variable component Motivation of Workers Interval of wage for workers? What is the impact of the variable component Effect of Motivation 1 – insignificant/5 – of wage or salary on staff motivation? Component of Wage very high What is the average intensity of using motivation Intensity of Using Mo‐ devices, such as stock options, transport allowances, tivation Devices Interval etc.? What resources does the company spend on Expenses on Employee employee benefits (as a percentage of labour Interval Benefits costs)? What is the impact of these employee benefits on Effect of Employee 1 – insignificant/5 – staff motivation? Benefits very high What resources does the company spend on Expenses on Employee Interval employee training (as a percentage of labour costs)? Training","Castek_1_Castek 23.1.2015 7:37 Stránka 113 5.6 EMPLOYEES 113 turnover, and how are they motivated? The previous Table 5-27 presents a list of variables with an explanation of their contents. Most of the variables are interval, three of them are ordinal, and one is dichotomous. Interval variables paired with financial performance for the most part fail to meet the condition of two-dimensional normality of distribution; therefore we use Spearman’s rs coefficient. 
Only the effect of the Staff Turnover Rate on financial performance (this variable takes three values) will be analysed using ANOVA, and the effect of Staff Turnover Monitoring on financial performance (this variable takes two values) will be analysed with the t-test. The following are bivariate associations between variables of this area and financial performance of a company:
Table 5-28: Variables describing the stakeholder group of employees: bivariate relationships with CFP
| Variable | Coefficient | Association | p (two-tailed) | N |
| --- | --- | --- | --- | --- |
| Number of Employees | rs | -0.101 | 0.041 | 405 |
| Ratio of Women | rs | -0.229 | 0.001 | 390 |
| Ratio of Graduates | rs | 0.129 | 0.010 | 393 |
| Ratio of Technical and Administrative Staff | rs | 0.114 | 0.023 | 394 |
| Ratio of Workers | rs | -0.114 | 0.023 | 394 |
| Staff Turnover Rate: Low vs. High | η | 0.125 | 0.039 | 152 |
| Reasons for Staff Turnover | rs | -0.138 | 0.007 | 375 |
| Staff Turnover Monitoring | η | - | n.s. | 394 |
| Motivation of Top Management | rs | -0.023 | 0.676 | 345 |
| Motivation of Workers | rs | 0.065 | 0.231 | 345 |
| Effect of Motivation Component of Wage | rs | 0.094 | 0.064 | 391 |
| Intensity of Using Motivation Devices | rs | 0.146 | 0.003 | 408 |
| Expenses on Employee Benefits | rs | 0.010 | 0.859 | 334 |
| Effect of Employee Benefits | rs | 0.018 | 0.715 | 393 |
| Expenses on Employee Training | rs | 0.065 | 0.235 | 336 |
","Most of the variables show a statistically significant relationship with the financial performance of a company. This is true for Number of Employees, Ratio of Graduates, Ratio of Technical and Administrative Staff, Ratio of Workers, Staff Turnover Rate, Reasons for Staff Turnover, Effect of Motivation Component of Wage, and Intensity of Using Motivation Devices. Although these variables have a statistically significant effect on financial performance, these effects are factually weak (|rs| = 0.094 to 0.146). Only in the case of Ratio of Women can we talk about an almost medium effect on financial performance. 
This effect is negative (rs = -0.229). As far as Staff Turnover Rate is concerned, a lower Staff Turnover Rate implies a higher average financial performance than a higher Staff Turnover Rate (m = 0.358 vs. m = 0.081). Again, we cannot talk about any important factor of financial performance at the level of a simple bivariate association in this area of enterprise characteristics. Multidimensional model To assess the degree of joint effect of the aforementioned independent variables on financial performance, we can try to formulate a linear regression model. The variable Staff Turnover Rate, which is ordinal, will be treated as nominal since it takes only three values. The model will be as follows:
Financial Performance = β₀ + β₁*Ratio of Women + β₂*Ratio of Technical and Administrative Staff + β₃*Low Staff Turnover Rate + β₄*High Staff Turnover Rate + β₅*Reasons for Staff Turnover + β₆*Staff Turnover Monitoring + β₇*Motivation of Top Management + β₈*Effect of Motivation Component of Wage + β₉*Intensity of Using Motivation Devices + β₁₀*Effect of Employee Benefits + β₁₁*Industry + β₁₂*Company Size Medium-sized + β₁₃*Company Size Large
The following assumptions were assessed: appropriate variable type, non-zero variance of predictors, no strong multicollinearity, homoscedasticity, independent errors, normally distributed errors, independent values of outcome, and linear relationship between the outcome and predictors. The predictor Number of Employees was removed because of heterogeneity of variance, and it was replaced by dummy variables Company Size Medium-sized and Company Size Large. Predictors Ratio of Graduates, Ratio of Technical and Administrative Staff, Ratio of Workers, Motivation of Workers, Expenses on Employee Benefits, and Expenses on Employee Training were also removed because of variance heterogeneity. The summary of the model obtained from the forced entry linear regression is shown in Table 5-29. 
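Treating the three-valued Staff Turnover Rate as nominal means entering it through indicator (dummy) variables. The Python sketch below illustrates one way this coding might look. The function name and the level labels are hypothetical, and taking the middle category (2 – 10 %) as the reference is an assumption inferred from the fact that the model contains only the Low and High terms.

```python
def dummy_code(value, levels, reference):
    # One 0/1 indicator per non-reference level; the reference maps to all zeros
    if value not in levels:
        raise ValueError(f'unknown level: {value}')
    return {level: int(value == level) for level in levels if level != reference}


# Hypothetical labels for the three Staff Turnover Rate categories
levels = ['low', 'medium', 'high']
for observed in levels:
    print(observed, dummy_code(observed, levels, reference='medium'))
```

With this coding, each dummy coefficient expresses the difference in expected financial performance between that category and the reference category, other predictors held constant.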
In the case of this model, the multiple correlation value of 0.335 ranges from medium to substantial. The model explains about 11% of the variability of financial performance and is significant. Estimates of the parameters of this model are listed in Table 5-30.","Table 5-29: Summary of the regression model describing the stakeholder group of employees
| R | R Square | Adjusted R Square | Std. Error of the Estimate | R Square Change | F Change | df1 | df2 | p |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0.335 | 0.112 | 0.077 | 0.901 | 0.112 | 3.177 | 13 | 326 | 0.000 |
Table 5-30: Coefficients of the regression model describing the stakeholder group of employees
| Model | B (unstandardized) | Std. Error | Beta (standardized) | t | p |
| --- | --- | --- | --- | --- | --- |
| (Constant) | -0.437 | 0.518 | | -0.844 | 0.399 |
| Ratio of Women | -0.007 | 0.003 | -0.160 | -2.827 | 0.005 |
| Ratio of Technical and Administrative Staff | 0.864 | 0.321 | 0.145 | 2.687 | 0.008 |
| Staff Turnover Rate: Low | 0.032 | 0.123 | 0.015 | 0.261 | 0.794 |
| Staff Turnover Rate: High | -0.017 | 0.158 | -0.006 | -0.106 | 0.916 |
| Reasons for Staff Turnover | -0.198 | 0.163 | -0.070 | -1.220 | 0.223 |
| Staff Turnover Monitoring | -0.090 | 0.105 | -0.047 | -0.854 | 0.394 |
| Motivation of Top Management | 0.000 | 0.003 | -0.007 | -0.130 | 0.896 |
| Effect of Motivation Component of Wage | 0.082 | 0.055 | 0.081 | 1.493 | 0.136 |
| Intensity of Using Motivation Devices | 0.531 | 0.194 | 0.157 | 2.741 | 0.006 |
| Effect of Employee Benefits | -0.043 | 0.055 | -0.045 | -0.789 | 0.431 |
| Industry | 0.036 | 0.135 | 0.015 | 0.268 | 0.789 |
| Company Size: Medium-sized vs. Small | -0.115 | 0.125 | -0.060 | -0.926 | 0.355 |
| Company Size: Large vs. Small | -0.226 | 0.136 | -0.110 | -1.653 | 0.099 |
","Only four predictors are statistically significant in this model, and one of them – the shift in Company Size to a large enterprise – is on the borderline of an acceptable p-value (p = 0.099). 
The other three statistically significant predictors are Ratio of Women, Ratio of Technical and Administrative Staff, and Intensity of Using Motivation Devices. If the values of other variables remain constant, a higher Ratio of Women decreases the financial performance of a company, whereas a higher Ratio of Technical and Administrative Staff and a higher average Intensity of Using Motivation Devices increase it. Company size has, as usual, a negative impact on financial performance. As far as the relative effect on financial performance is concerned, Ratio of Women, Intensity of Using Motivation Devices, and Ratio of Technical and Administrative Staff have a very similar effect. On the other hand, Company Size has a lower relative effect. 5.7 Customers Customers as a stakeholder group are mentioned throughout the literature on stakeholder theory. In the vast majority of empirical studies they are referred to as the most important stakeholder group (e.g. Šimberová, 2008), which is also one of the findings of this research. As Klapalová (2008) noted, many authors have worked on customer orientation as a factor of long-term business success since the 1950s, including Peter Drucker among others. As for more recent authors focusing on market orientation, which includes customer orientation, we can name e.g. Kohli et al. (1993) or Narver and Slater (1990). In this section, the questionnaire provided us with 70 variables focused on the following areas: business strategy, long-term relationships with customers and their stability, their territorial structure, and specificity of company products. As already noted for the other sections of the questionnaire, the researchers were also determining the importance of customers for a company, level of customer care, their bargaining power, degree of flexibility in adjusting products to customer demands, level of innovation activity, quality assessment, and evaluating goodwill or brand of a company. 
The above-mentioned variables from this section of the questionnaire were transformed into six key variables, whose meaning is explained in Table 5-31. Three of the variables are nominal: Strategy, Strategy Focus and Exports. The other variables are interval: Stability of Customers, Share of Exports and Product Specificity. Because the distribution of their values is very different from normal (bimodal for Product Specificity, multimodal for Stability of Customers, and very positively skewed for Share of Exports), we measure their relationship with financial performance using the τc coefficient. Since a large number of companies do not export, we also test the effect on financial performance based on the division of businesses into two groups: exporting and non-exporting. In this case and in the case of the variable Strategy we will use the t-test and ANOVA. The following are bivariate associations between the variables from this field and financial performance of a company:","Table 5-31: Contents of variables describing customers of a company Question Variable Types and values What business strategy does Strategy Differentiation/Differentiation your company follow? focus/Cost leader/Cost leader focus Strategy Focus Yes/No Strategy Focus Yes/No What is the stability of your Stability of Customers Interval customers? What is the share of your Share of Exports Interval foreign customers? Does your company have any Exports Yes/No foreign customers? What is the ratio of specific Product Specificity Interval products to all your products? Table 5-32: Variables describing customers: bivariate relationships with CFP p Variable Coefficient Association N (two‐tailed) Strategy: Cost Leadership vs. η 0.157 0.057 182 Differentiation Focus Strategy: Differentiation vs. η 0.157 0.085 204 Differentiation Focus Strategy: Cost Focus vs. Differentiation Focus n.s. 
Strategy: Focus Yes/No η 0.138 0.010 351 Strategy: Cost/Differentiation n.s. Stability of Customers τ c 0.136 0.001 393 Share of Exports τ c ‐0.056 0.124 376 Exports: Yes/No η 0.128 0.013 376 Product Specificity τ c 0.119 0.001 387 Except for Share of Exports in its original form (the share of exports in all products), all the other variables show a statistically significant association with the financial performance of the company. Exports also become statistically significant if they are measured only as Yes/No (the number of completely non-exporting","companies is 89 out of the 376 for which this information is available). The average performance of non-exporting companies (m = 0.454) is significantly higher than the average performance of exporting companies (m = 0.170). As far as Strategies are concerned, the highest performance can be observed in companies that pursue the strategy of Differentiation Focus (m = 0.464), followed by companies pursuing the strategy of Cost Focus (m = 0.261), then companies with the Differentiation Strategy (m = 0.144), and the lowest performance in our sample was found in companies pursuing the strategy of Cost Leadership (m = 0.103). There are statistically significant differences between strategies Cost Leadership and Differentiation Focus, and between strategies Differentiation and Differentiation Focus. No statistically significant difference was found between the two strategies oriented on costs and the two strategies oriented on differentiation; however, there is a statistically significant difference between businesses pursuing focused strategies (Strategy Focus), which have a higher average performance (m = 0.384), and businesses pursuing strategies without this focus, whose average performance is lower (m = 0.126). 
Other variables that affect financial performance positively include Stability of Customers and Product Specificity. Nevertheless, it should be noted that although these associations are statistically significant, their factual significance, i.e. effect size, is very low. Again, we cannot talk about any important factor of financial performance at the level of simple bivariate associations in this area of enterprise characteristics. Multidimensional model To assess the degree of joint effect of the aforementioned independent variables on financial performance, we can try to formulate a linear regression model. The variable Exports will be included in its dichotomous form. The variable Strategy will be included in the form of Strategy Focus Yes/No, since although its impact on financial performance is lower, it is statistically significant at the more broadly accepted statistical significance level (α = 10%). Hence, the model will be as follows:
Financial Performance = β₀ + β₁*Strategy Focus + β₂*Stability of Customers + β₃*Exports Yes/No + β₄*Product Specificity + β₅*Industry + β₆*Company Size Medium-sized + β₇*Company Size Large
The following assumptions were assessed: appropriate variable type, non-zero variance of predictors, no strong multicollinearity, homoscedasticity, independent errors, normally distributed errors, independent values of outcome, and linear relationship between the outcome and predictors. The summary of the model obtained from the forced entry linear regression is shown in the following table.","Table 5-33: Summary of the regression model describing the stakeholder group of customers
| R | R Square | Adjusted R Square | Std. Error of the Estimate | R Square Change | F Change | df1 | df2 | p |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0.311 | 0.097 | 0.076 | 0.903 | 0.097 | 4.641 | 7 | 303 | 0.000 |
The multiple correlation value of 0.311 is medium in the case of this model. 
The model explains less than 10% of the financial performance variability. Still, the model is significant. Estimates of the parameters of this model are listed in the following table.
Table 5-34: Coefficients of the regression model describing the stakeholder group of customers
| Model | B (unstandardized) | Std. Error | Beta (standardized) | t | p (two-tailed) |
| --- | --- | --- | --- | --- | --- |
| (Constant) | -0.327 | 0.299 | | -1.094 | 0.275 |
| Strategy Focus Yes/No | -0.137 | 0.108 | -0.073 | -1.271 | 0.205 |
| Stability of Customers | 0.201 | 0.064 | 0.180 | 3.146 | 0.002 |
| Exports Yes/No | -0.278 | 0.155 | -0.125 | -1.788 | 0.075 |
| Product Specificity | 0.003 | 0.001 | 0.118 | 2.091 | 0.037 |
| Industry | 0.182 | 0.161 | 0.077 | 1.128 | 0.260 |
| Company Size: Medium-sized vs. Small | -0.162 | 0.123 | -0.086 | -1.312 | 0.190 |
| Company Size: Large vs. Small | -0.205 | 0.140 | -0.097 | -1.469 | 0.143 |
Statistically significant predictors in this model include Stability of Customers, Exports and Product Specificity. Both Stability of Customers and Product Specificity have a positive impact on the financial performance of a company if the values of the other variables remain unchanged. Conversely, if a company exports, its financial performance is hurt (the reference value of zero is represented by non-exporting companies). The relative power of the individual predictors is highest for Stability of Customers and lowest for Exports and Product Specificity.","5.8 Suppliers Supplier relationship management is very important in strategies aimed at costs (Gaddle, Hakansson, 2002) as well as strategies aimed at quality. There are studies examining the relationship of supplier stability and corporate competitiveness (O’Toole, Donaldson, 2000), supplier specialization and competitiveness (Hope, Spencer, 2001), or the method of selecting suppliers and company competitiveness (Kannan, Tan, 2004). 
Therefore, the variables verified here examine – as an analogy to the group of customers – the long-term nature of the relationship, its stability, territorial structure of suppliers, and specificity of supplies. Furthermore, criteria for selecting new suppliers were also examined. This section of the questionnaire provided us with the total of 97 variables that were transformed into 11 variables listed below, whose meaning is explained in the following table. Table 5-35: Contents of variables describing suppliers of a company Question Variable Types and values What is the stability of your suppliers? Stability of suppliers Interval What is the percentage of supplies Foreign Supplies Interval from abroad? What is the proportion of specific Specificity of Supplies Interval supplies to total supplies? How important is price as a criterion Supplier Selection Criterion: 1 – totally unimportant/5 for selecting suppliers? Price – very important How important are the payment terms Supplier Selection Criterion: 1 – totally unimportant/5 as a criterion for selecting suppliers? Payment Terms – very important How important are other terms of Supplier Selection Criterion: 1 – totally unimportant/5 delivery as a criterion for selecting Other Terms of Delivery – very important suppliers? How important is product quality as Supplier Selection Criterion: 1 – totally unimportant/5 a criterion for selecting suppliers? Product Quality – very important How important is a quality certificate Supplier Selection Criterion: 1 – totally unimportant/5 as a criterion for selecting suppliers? Quality Certificate – very important How important is a supplier’s history Supplier Selection Criterion: 1 – totally unimportant/5 as a criterion for selecting suppliers? Supplier’s History – very important How important are references as Supplier Selection Criterion: 1 – totally unimportant/5 a criterion for selecting suppliers? 
Reference – very important How important is the compatibility Supplier Selection Criterion: 1 – totally unimportant/5 of CSR activities as a criterion CSR Compatibility – very important for selecting suppliers?","The first three variables are interval while the others are ordinal. Because the distribution of values of the interval variables is very different from normal, similarly to the variables describing the stakeholder group of customers (bimodal for Specificity of Supplies, multimodal for Stability of Suppliers, and very positively skewed for Foreign Supplies), we will use the τc coefficient to measure their correlation with financial performance. Since many businesses have no supplies from abroad, we will also test the relationship of financial performance and businesses classified into two groups: with foreign supplies and without them. In this case, the t-test will be used. The following are bivariate associations between the variables of this area and financial performance of a company: Table 5-36: Variables describing suppliers of a company: bivariate relationships with CFP p (two‐ Variable Coefficient Association N tailed) Stability of Suppliers τ c 0.064 0.058 402 Foreign Supplies τ c ‐0.022 0.517 390 Foreign Supplies: Yes/No η n.s. 
Specificity of Supplies τ c 0.013 0.707 390 Supplier Selection Criterion: Price r ‐0.110 0.027 403 Supplier Selection Criterion: Payment r ‐0.162 0.001 401 terms Supplier Selection Criterion: Other r ‐0.037 0.455 401 Terms of Delivery Supplier Selection Criterion: Product r 0.083 0.098 403 Quality Supplier Selection Criterion: Quality Cer‐ r ‐0.010 0.846 401 tificate Supplier Selection Criterion: Supplier’s r 0.049 0.331 401 History Supplier Selection Criterion: Reference r 0.108 0.031 401 Supplier Selection Criterion: CSR com‐ r 0.147 0.003 398 patibility Financial performance statistically significantly correlates with most of the Supplier Selection Criteria, i.e. Price, Payment Terms, Reference, and CSR Compatibility. A worse but still acceptable p-value is found in Product Quality as a Supplier Selection Criterion, and in Stability of Suppliers. Product Quality is the least significant even in the factual aspect (r = 0.083), which also applies to Stability of","Suppliers (τc = 0.064). Other associations can be described as at least factually weak. Unlike in the stakeholder group of customers, if we recode the interval variables of Stability of Suppliers, Foreign Supplies and Specificity of Supplies into Yes/No or High/Low, we do not reach statistically significant associations for these recoded variables describing the stakeholder group of suppliers. Again, we cannot talk about any important factor of financial performance at the level of a simple bivariate association in this area of enterprise characteristics. Multidimensional model To assess the degree of joint effect of the aforementioned independent variables on financial performance, we can try to formulate a linear regression model. The variable Foreign Supplies will be included in a dichotomous form, as the original variable does not show a linear correlation to financial performance. 
For the same reason, the variable Specificity of Supplies will be included in the form of Low/High⁴. Hence, the model will be as follows:
Financial performance = β₀ + β₁*Stability of Suppliers + β₂*Foreign Supplies + β₃*Specificity of Supplies + β₄*Supplier Selection Criterion: Price + β₅*Supplier Selection Criterion: Payment Terms + β₆*Supplier Selection Criterion: Other Terms of Delivery + β₇*Supplier Selection Criterion: Product Quality + β₈*Supplier Selection Criterion: Quality Certificate + β₉*Supplier Selection Criterion: Supplier’s History + β₁₀*Supplier Selection Criterion: Reference + β₁₁*Supplier Selection Criterion: CSR Compatibility + β₁₂*Industry + β₁₃*Company Size Medium-sized + β₁₄*Company Size Large
The following assumptions were assessed: appropriate variable type, non-zero variance of predictors, no strong multicollinearity, homoscedasticity, independent errors, normally distributed errors, independent values of outcome, and linear relationship between the outcome and predictors. The summary of the model obtained from the forced entry linear regression is shown in the following table.
Table 5-37: Summary of the regression model describing the stakeholder group of suppliers
| R | R Square | Adjusted R Square | Std. Error of the Estimate | R Square Change | F Change | df1 | df2 | p |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0.387 | 0.149 | 0.093 | 0.968 | 0.149 | 2.636 | 14 | 210 | 0.001 |
⁴ Low specificity = 0 – 10 % of specific supplies, High specificity = 90 – 100 % of specific supplies.","The multiple correlation value of 0.387 ranges from medium to substantial in the case of this model, which explains slightly less than 15% of financial performance variability. Also, the model is significant. Estimates of the parameters of this model are listed in the following table. 
Table 5-38: Coefficients of the regression model describing the stakeholder group of suppliers
Model | B (unstandardized) | Std. Error | Beta (standardized) | t | p
(Constant) | -1.054 | 0.787 | | -1.339 | 0.182
Stability of Suppliers | 0.155 | 0.094 | 0.112 | 1.648 | 0.101
Foreign Supplies: Yes/No | -0.155 | 0.174 | -0.067 | -0.891 | 0.374
Specificity of Supplies | -0.044 | 0.139 | -0.021 | -0.315 | 0.753
Supplier Selection Criterion: Price | -0.182 | 0.101 | -0.133 | -1.805 | 0.073
Supplier Selection Criterion: Payment Terms | -0.078 | 0.081 | -0.072 | -0.955 | 0.341
Supplier Selection Criterion: Other Terms of Delivery | -0.143 | 0.084 | -0.126 | -1.714 | 0.088
Supplier Selection Criterion: Product Quality | 0.384 | 0.143 | 0.190 | 2.684 | 0.008
Supplier Selection Criterion: Quality Certificate | -0.020 | 0.066 | -0.021 | -0.310 | 0.757
Supplier Selection Criterion: Supplier’s History | -0.015 | 0.080 | -0.013 | -0.182 | 0.856
Supplier Selection Criterion: Reference | 0.139 | 0.078 | 0.125 | 1.792 | 0.075
Supplier Selection Criterion: CSR Compatibility | 0.150 | 0.057 | 0.181 | 2.646 | 0.009
Industry | 0.246 | 0.205 | 0.090 | 1.203 | 0.230
Company Size: Medium-sized vs. Small | -0.247 | 0.157 | -0.120 | -1.576 | 0.117
Company Size: Large vs. Small | -0.317 | 0.177 | -0.139 | -1.792 | 0.075","Similarly to the bivariate associations, the effect of the following supplier selection criteria is statistically significant, at least at the level of α = 10%: Price, Product Quality, Reference, and CSR Compatibility. The effect of Payment Terms as a selection criterion is no longer statistically significant; instead, the effect of Other Terms of Delivery is statistically significant. Other Terms of Delivery and Price have a negative impact on financial performance, in contrast to e.g. Product Quality, which is relatively the strongest predictor of financial performance. 
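To make the significance levels behind this reading explicit, the sketch below recomputes each t value as B divided by its standard error and sorts the five supplier selection criteria discussed here by the conventional thresholds α = 5% and α = 10%. It is an illustration only: the values are the rounded entries of Table 5-38, the helper name is hypothetical, and the recomputed t ratios therefore match the printed ones only approximately.

```python
# (B, Std. Error, reported t, reported p) copied from Table 5-38
rows = {
    'Price': (-0.182, 0.101, -1.805, 0.073),
    'Other Terms of Delivery': (-0.143, 0.084, -1.714, 0.088),
    'Product Quality': (0.384, 0.143, 2.684, 0.008),
    'Reference': (0.139, 0.078, 1.792, 0.075),
    'CSR Compatibility': (0.150, 0.057, 2.646, 0.009),
}

def t_ratio(b, se):
    '''t statistic of a coefficient: estimate divided by its standard error.'''
    return b / se

# Predictors passing each threshold, based on the reported p-values
significant_at_5 = sorted(k for k, v in rows.items() if v[3] < 0.05)
significant_at_10 = sorted(k for k, v in rows.items() if v[3] < 0.10)

for name, (b, se, t, p) in rows.items():
    print(name, round(t_ratio(b, se), 2), t, p)
print(significant_at_5)
print(significant_at_10)
```

Only Product Quality and CSR Compatibility pass the stricter 5% threshold; Price, Reference and Other Terms of Delivery are significant at 10% only.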
Stability of Suppliers, which was statistically significant in the bivariate analysis (even though at a lower level of statistical significance, p = 0.064), now narrowly fails even at the level of α = 10%. Out of the general characteristics of enterprises, financial performance in this model is also influenced by a shift of company size from small to large, and the direction is again negative. In terms of the relative power of predictors, the strongest predictor is Product Quality as a supplier selection criterion, followed closely by CSR Compatibility as a supplier selection criterion; the remaining statistically significant predictors follow at some distance and are approximately equal in strength.
5.9 Corporate social responsibility
The last section of the questionnaire examined what forms of social responsibility enterprises engage in, in what areas, and also what codes (e.g. ethical) they have. This was a way of expressing corporate social performance, which has been demonstrated many times as a factor of competitiveness (e.g. a meta-analysis by Allouche, Laroche, 2005). The variables from this section were transformed into two, i.e. how many forms of CSR activities a company is engaged in, and whether a company has at least one code. No company indicated that it was not engaged in any form of CSR activities; the maximum was six different CSR activities. Whether a company has at least one code is a dichotomous variable; therefore, a t-test will be used.
Table 5-39: Variables describing CSR activities of a company: bivariate relationships with CFP
Variable | Coefficient | Association | p (two-tailed) | N
CSR | r | 0.153 | 0.002 | 392
Ethical code | η | n.s. | 394
Only the number of CSR activities is statistically significantly related to financial performance. 
The effect of the number of CSR activities on financial performance can be described as weak, so even in this area there is no important factor of corporate competitiveness.","Multidimensional model
To assess the joint effect of the aforementioned independent variables on financial performance, we can try to formulate a linear regression model. The model will be as follows:
Financial performance = β₀ + β₁*CSR + β₂*Ethical Code + β₃*Industry + β₄*Company Size Medium-sized + β₅*Company Size Large
The following assumptions were assessed: appropriate variable type, non-zero variance of predictors, no strong multicollinearity, homoscedasticity, independent errors, normally distributed errors, independent values of outcome, and linear relationship between the outcome and predictors. The summary of the model obtained from the forced entry linear regression is shown in the following table.
Table 5-40: Summary of the regression model describing corporate social performance
R | R Square | Adjusted R Square | Std. Error of the Estimate | R Square Change | F Change | df1 | df2 | p
0.235 | 0.055 | 0.043 | 0.928 | 0.055 | 4.388 | 5 | 375 | 0.001
(The last five columns are the change statistics.)
The multiple correlation value of 0.235 ranges from low to medium in the case of this model, with the model explaining less than 6% of financial performance variability. Still, the model is statistically significant. Estimates of the parameters of this model are listed in the following table.
Table 5-41: Coefficients of the regression model describing corporate social performance
Model | B (unstandardized) | Std. Error | Beta (standardized) | t | p
(Constant) | -0.009 | 0.150 | | -0.060 | 0.952
CSR | 0.158 | 0.046 | 0.182 | 3.431 | 0.001
Ethical Code | -0.029 | 0.100 | -0.015 | -0.287 | 0.774
Industry | 0.178 | 0.120 | 0.075 | 1.478 | 0.140
Company Size: Medium-sized vs. Small | -0.194 | 0.114 | -0.101 | -1.694 | 0.091
Company Size: Large vs. Small | -0.375 | 0.128 | -0.178 | -2.922 | 0.004","Similarly to the bivariate associations, of the two CSR variables only the number of CSR activities has a statistically significant effect. If the values of the other variables remain unchanged, a higher number of CSR activities has a positive effect on the financial performance of a company. The effect of Company Size is also statistically significant, since it is still true that the larger a company is, the lower its financial performance becomes. The number of CSR activities is a slightly stronger predictor than a shift in Company Size to large and a considerably stronger one than a shift in Company Size to medium-sized.
5.10 Comprehensive model
In the preceding sections we examined the sub-models of financial performance, each of which focused on one area of a business. Now we will try to construct a model of the predictors that were statistically significant in the individual sub-models. The number of such predictors exceeds 33, which is a relatively high number for a regression model given the available number of measurements. Therefore, we will exclude missing values pairwise rather than listwise, which is how we have treated them thus far. We will also rely on assertions by Green (1991) concerning the minimum acceptable sample size. Green postulates that for testing the overall fit of a regression model, the minimum sample size is 50 + 8*p, where p is the number of predictors. In our case this would mean the need for 314 observations. Another rule set by Green is that if we want to test individual predictors, which is our case, there is a need for 104 + p observations. This would require 137 observations in our case. A more sophisticated rule is set by Miles and Shevlin (2001). This takes into account the expected effect size of the model: the larger the expected effect, the fewer observations are needed. 
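Green's two rules of thumb quoted above are simple enough to verify directly. The sketch below is only an arithmetic check with hypothetical function names; it assumes p = 33, the predictor count implied by the two thresholds given in the text.

```python
def green_overall_fit(p):
    '''Minimum sample size for testing the overall fit of the model: 50 + 8p.'''
    return 50 + 8 * p

def green_individual_predictors(p):
    '''Minimum sample size for testing individual predictors: 104 + p.'''
    return 104 + p

p = 33  # predictor count implied by the thresholds quoted in the text
print(green_overall_fit(p), green_individual_predictors(p))  # prints: 314 137
```

Both thresholds grow linearly with the number of predictors, but the overall-fit rule grows eight times faster, which is why it is the binding one here.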
We expect a rather large effect, which would require approximately 100 observations, according to Miles and Shevlin. However, the actually obtained effect is rather medium-sized, resulting in a need for approximately 250 observations. All of these rules are met, as there are between 351 and 409 observations in this regression model. The discussed model will therefore be tested as follows:
Financial performance = β₀ + β₁*Innovation Activity + β₂*Other Costs + β₃*Access to Funding + β₄*Competitive Rivalry + β₅*Interest in Employment + β₆*Support from the State + β₇*Market Progress + β₈*Importance of Customers + β₉*Importance of the State + β₁₀*Concern + β₁₁*Owner’s Legal Form: Legal Person + β₁₂*Owner’s Legal Form: Natural Person \& Legal Person + β₁₃*Owners in Board of Directors + β₁₄*Owners in Top Management + β₁₅*Assets Level + β₁₆*Ratio of Women + β₁₇*Ratio of Technical and Administrative Staff + β₁₈*Intensity of Using Motivation Devices + β₁₉*Strategy Focus + β₂₀*Stability of Customers + β₂₁*Exports Yes/No + β₂₂*Product Specificity + β₂₃*Supplier Selection Criterion: Price + β₂₄*Supplier Selection Criterion: Other Terms of Delivery + β₂₅*Supplier Selection Criterion: Product Quality + β₂₆*Supplier Selection Criterion: Reference + β₂₇*Supplier Selection Criterion: CSR","Compatibility + β₂₈*CSR + β₂₉*Industry + β₃₀*Company Size Medium-sized + β₃₁*Company Size Large
The following assumptions were assessed: appropriate variable type, non-zero variance of predictors, no strong multicollinearity, homoscedasticity, independent errors, normally distributed errors, independent values of outcome, and linear relationship between the outcome and predictors. The predictor Number of Management Levels per Employee was removed because of collinearity with Company Size. 
Two observations were removed because of their unduly large influence on the regression model; their standardized residuals exceeded 3.5. The summary of the model obtained by the forced entry linear regression is shown in the next table.
Table 5-42: Summary of the comprehensive regression model
R | R Square | Adjusted R Square | Std. Error of the Estimate | R Square Change | F Change | df1 | df2 | p
0.551 | 0.304 | 0.235 | 0.832 | 0.304 | 4.399 | 31 | 312 | 0.000
(The last five columns are the change statistics.)
The multiple correlation value of 0.551 is substantial in the case of this model. The model explains over 30% of financial performance variability and is statistically significant. When compared with the sub-models, it is obvious that this comprehensive model can explain a much larger proportion of the variability of the response variable. Estimates of the parameters of this model are listed in Table 5-43. Approximately half of the predictors tested have statistically significant effects on the financial performance of a company. In all of them, the directions of action are consistent with those found in the sub-models. Hence, if the values of the other variables remain unchanged, greater financial performance is associated with lower Competitive Rivalry, widening Market Progress, greater Importance of the State, Owner’s Legal Form as a Natural Person, the absence of Owners in Board of Directors, higher Assets Level, lower Ratio of Women, higher Ratio of Technical and Administrative Staff, higher Intensity of Using Motivation Devices, emphasis on Strategy Focus (either Cost Leadership Focus or Differentiation Focus), higher Stability of Customers, not exporting its products, not considering Other Terms of Delivery important as a supplier selection criterion, and being smaller rather than larger. The relatively strongest predictor is clearly Assets Level. 
The other predictors, whose relative strength is approximately equal, follow after a considerable interval.","Table 5-43: Coefficients of the comprehensive regression model
Model | B (unstandardized) | Std. Error | Beta (standardized) | t | p
(Constant) | -1.845 | 0.743 | | -2.485 | 0.013
Innovation Activity | -0.014 | 0.049 | -0.016 | -0.288 | 0.773
Other Costs | -0.098 | 0.060 | -0.080 | -1.623 | 0.106
Access to Funding | 0.048 | 0.047 | 0.055 | 1.025 | 0.306
Competitive Rivalry | -0.130 | 0.061 | -0.112 | -2.131 | 0.034
Interest in Employment | 0.071 | 0.064 | 0.057 | 1.100 | 0.272
Support from the State | 0.061 | 0.054 | 0.058 | 1.137 | 0.256
Market Progress | 0.113 | 0.058 | 0.098 | 1.945 | 0.053
Importance of Customers | 0.023 | 0.079 | 0.015 | 0.290 | 0.772
Importance of the State | 0.069 | 0.041 | 0.088 | 1.675 | 0.095
Concern | 0.156 | 0.130 | 0.074 | 1.199 | 0.232
Owner’s Legal Form: Natural Person vs. Legal Person | -0.221 | 0.131 | -0.115 | -1.681 | 0.094
Owner’s Legal Form: Natural Person \& Legal Person vs. Natural Person | -0.004 | 0.195 | -0.001 | -0.020 | 0.984
Owners in Board of Directors | -0.183 | 0.100 | -0.096 | -1.838 | 0.067
Owners in Top Management | 0.187 | 0.121 | 0.097 | 1.548 | 0.123
Assets Level | 0.189 | 0.062 | 0.175 | 3.065 | 0.002
Ratio of Women | -0.004 | 0.003 | -0.097 | -1.748 | 0.081
Ratio of Technical and Administrative Staff | 0.685 | 0.307 | 0.114 | 2.231 | 0.026
Intensity of Using Motivation Devices | 0.330 | 0.185 | 0.096 | 1.787 | 0.075
Strategy Focus | -0.169 | 0.098 | -0.089 | -1.714 | 0.088
Stability of Customers | 0.116 | 0.059 | 0.106 | 1.979 | 0.049
Exports Yes/No | -0.268 | 0.138 | -0.120 | -1.936 | 0.054
Product Specificity | 0.001 | 0.001 | 0.029 | 0.551 | 0.582
Supplier Selection Criterion: Price | 0.030 | 0.071 | 0.022 | 0.415 | 0.679
Supplier Selection Criterion: Other Terms of Delivery | -0.113 | 0.056 | -0.107 | -2.013 | 0.045
Supplier Selection Criterion: Product Quality | 0.078 | 0.094 | 0.043 | 0.831 | 0.406
Supplier Selection Criterion: Reference | 0.040 | 0.054 | 0.039 | 0.741 | 0.460
Supplier Selection Criterion: CSR Compatibility | 0.040 | 0.044 | 0.050 | 0.910 | 0.363","6 Interpretation of the results achieved
This chapter focuses on the characteristics of the sample and its individual subgroups depending on the selected variables. There are 10 variables that were selected on the basis of the application of the DAF 74 method as factors that have the greatest impact on the financial performance of companies. These variables were supplemented by an additional 6 variables with regard to the interpretation needs. The order of these sixteen variables was adjusted with respect to the logic of interpretation. They are listed in Table 6-44. As mentioned in Chapter 2, the sample of companies analysed was evaluated using the sum of two indicators, i.e. ROA and Assets Growth. Out of the 432 companies for which data were obtained from the questionnaire survey, 408 could be evaluated, while the remaining 24 had insufficient data. We consider this set of 408 companies a sample for the needs of this chapter. 
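The grouping used throughout this chapter (rank the 408 companies by the sum of ROA and Assets Growth, highest first, and cut the ranking into four equal groups A to D) can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the function name is hypothetical, the scores are made-up placeholder values rather than survey data, and the input is assumed to contain at least four companies with no tied scores.

```python
def assign_quartile_groups(scores):
    '''Rank companies by score (highest first) and label four equal groups A-D.'''
    ranked = sorted(scores, key=scores.get, reverse=True)
    size = len(ranked) // 4  # 102 companies per group when there are 408
    labels = 'ABCD'
    # min(..., 3) keeps any remainder companies in the last group
    return {company: labels[min(i // size, 3)] for i, company in enumerate(ranked)}

# Hypothetical data: score = ROA + Assets Growth for 408 made-up companies
scores = {'company_%03d' % i: (i * 37 % 408) / 10.0 for i in range(408)}
groups = assign_quartile_groups(scores)
counts = {g: list(groups.values()).count(g) for g in 'ABCD'}
print(counts)
```

Group A then holds the companies with the highest financial performance and group D those with the lowest.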
These businesses were ranked using the sum value of the indicators, ranging from companies with the highest value, i.e. businesses with the highest financial performance, to companies with the lowest financial performance. The sample was then divided into quartiles comprising 102 companies each. In the following text, these quartiles are referred to as groups A, B, C and D. Companies with the highest financial performance are in group A, while companies with the lowest financial performance are included in group D.","Table 6-44: Overview of variables
 | Variable | Note | Variable type
P | Industry | 1 = manufacturing, 2 = construction industry | nominal
P | Legal form | 1 = Ltd., 2 = joint-stock company | nominal
P | Company size | 1 = 50-99, 2 = 100-249, 3 = 250 or more employees | ordinal
P | Concern | 1 = no, 2 = yes | dichotomous
F4 | FDI | 1 = without foreign capital, 2 = with foreign capital | nominal
P | Ownership concentration | 1 = the only owner, 2 = majority owner, 3 = several big owners | nominal
F1 | Strategy | 1 = cost leadership, 2 = differentiation, 3 = cost focus, 4 = differentiation focus | nominal
F5 | Share of exports | the share of exports on sales (%) | interval
F6 | Span of control | an average number of employees per manager (%) | interval
F2 | Motivation of workers | the ratio of the average upper limit of the motivation component of workers’ wage to their base wage (%) | interval
F3 | Motivation of top management | the ratio of the average upper limit of the motivation component of top managers’ salary to their base salary (%) | interval
F8 | Ratio of technical and administrative staff | the ratio of technical and administrative staff to the total number (%) | interval
F9 | Ratio of workers | the ratio of workers to the total number of staff (%) | interval
F7 | Ratio of women | the ratio of women to the total number of staff (%) | interval
F10 | Reasons for staff turnover rate | 1 = less than 2%, 2 = 2-10%, 3 = more than 10% | ordinal
P | Assets level | 1 = low (obsolete machinery), 2, 3, 4, 5 = high (top-of-the-range machinery) | ordinal","6.1 Overall characteristics of the sample
In this section, the sample as a whole is characterized separately according to each of the sixteen variables mentioned above; moreover, groups A, B, C and D are described within the sample.
Industry
The data were obtained from 408 companies, i.e. 100% of the sample. Companies from the manufacturing industry represent 80.4% of the sample, while companies from the construction industry account for 19.6% of the sample. As shown in Graph 6-3, companies from the construction industry show a significantly higher financial performance than companies from the manufacturing industry. Out of the total number of manufacturing companies, 22.9% of them belong to group A, 23.2% to group B, 27.4% to group C and 26.5% to group D. Out of the total number of construction companies, 33.8% of them belong to group A, 32.5% to group B, 15.0% to group C and 18.8% to group D. In other words: the percentage of manufacturing companies that are included in the two top quartiles, i.e. in groups A and B, is only about 46% of their total number. On the other hand, the percentage of construction companies included in groups A and B exceeds 66%.
Graph 6-3: Relative frequency of quartiles in manufacturing and construction
Legal form
The data were obtained for 408 companies, i.e. 100% of the sample. The sample includes 55.4% of private limited companies and 44.6% of joint-stock companies. 
As shown in Graph 6-4, private limited companies show a significantly higher financial performance than joint-stock companies.","Out of the total number of private limited companies whose financial performance was analysed, 31.0% of them belong to group A, 27.4% to group B, 20.4% to group C and 21.2% to group D. Out of the total number of joint-stock companies, 17.6% of them belong to group A, 22.0% to group B, 30.8% to group C and 29.7% to group D. In other words: the percentage of private limited companies that are included in the two top quartiles, i.e. in groups A and B, exceeds 58% of their total number. On the other hand, the percentage of joint-stock companies included in groups A and B accounts for less than 40%.
Graph 6-4: Relative frequencies of quartiles in private limited companies and in joint-stock companies
Company size
The data were obtained from 408 companies, i.e. 100% of the sample. The sample includes 29.2% of small companies with 50 to 99 employees, 41.7% of medium-sized companies with 100 to 249 employees, and 29.2% of large companies with at least 250 employees. As shown in Graph 6-5, smaller companies show a higher financial performance than larger ones. Out of the total number of companies with 50 to 99 employees, 29.4% of them belong to group A, 28.6% to group B, 18.5% to group C and 23.5% to group D. Out of the total number of companies with 100 to 249 employees, 27.1% of them belong to group A, 20.6% to group B, 24.7% to group C and 27.6% to group D. Out of the total number of companies with at least 250 employees, 17.6% of them belong to group A, 27.7% to group B, 31.9% to group C and 22.7% to group D. The percentage of small businesses that are included in the two top quartiles, i.e. 
in groups A and B, accounts for 58% of the total number, while the percentage of medium-sized companies is less than 48%, and the percentage of large companies is only about 45%.","Graph 6-5: Relative frequencies of quartiles according to company size
Concern
The data were obtained from 401 companies, i.e. 98.3% of the sample. The percentage of companies from the sample that belong to a concern is less than a third (namely 28.7%), while companies that are not part of a concern represent 71.3% of the sample. As shown in Graph 6-6, companies that belong to a concern show a slightly higher financial performance. Out of the total number of companies that are part of a concern, 25.2% of them belong to group A, 29.6% to group B, 27.8% to group C and 17.4% to group D. Out of the total number of companies that are not part of a concern, 25.2% of them belong to group A, 22.7% to group B, 24.1% to group C and 28.0% to group D. The percentage of businesses belonging to a concern that are included in the two top quartiles, i.e. in groups A and B, accounts for almost 55% of their total number; on the other hand, the percentage of companies not belonging to a concern that are included in groups A and B is less than 48%.
Graph 6-6: Relative frequencies of quartiles according to concern membership","FDI
The data were obtained from 347 companies, i.e. 85.0% of the sample. 72.2% of companies from the sample are owned locally, while in the remaining 27.4% of companies foreign capital is involved in their ownership; in most cases this means a large majority stake or even 100% ownership. As shown in Graph 6-7, there is almost no difference in financial performance between companies that are owned locally and companies with foreign owners. 
Out of the total number of companies that are owned only locally, 24.2% of them belong to group A, 26.6% to group B, 22.6% to group C and 26.6% to group D. Out of the total number of companies with foreign owners, 25.3% of them belong to group A, 26.3% to group B, 27.4% to group C and 21.1% to group D. The percentage of locally owned businesses that are included in the two top quartiles, i.e. in groups A and B, accounts for approximately 51% of their total number, while the percentage of companies with foreign owners included in groups A and B is about 52%.
Graph 6-7: Relative frequencies of quartiles in companies with and without FDI
Ownership concentration
The data were obtained from 395 companies, i.e. 96.8% of the sample. The sample includes 38.5% of companies with a single owner, 21.8% of companies with a majority owner and 39.7% of companies owned by several big owners. As shown in Graph 6-8, companies with several big owners show the highest financial performance while companies with a majority owner show the lowest financial performance. Out of the total number of companies with a single owner, 27.0% of them belong to group A, 23.0% to group B, 23.7% to group C and 26.3% to group D. Out of the total number of companies with a majority owner, only 18.6% of them belong to group A, 25.6% to group B, 29.1% to group C and 26.7% to group D. Out of the total number of companies owned by several big owners, 26.1% of them belong to group A, 27.4% to group B, 22.9% to group C and 23.6% to group D.","The percentage of companies with a single owner that are included in the two top quartiles, i.e. in groups A and B, accounts for 50% of their total number; the percentage of companies with a majority owner included in groups A and B accounts for only about 44%. On the other hand, the percentage of companies owned by several big owners included in groups A and B exceeds 53%. 
Graph 6-8: Relative frequencies of quartiles according to the ownership concentration
Strategy
The data were obtained from 351 companies, i.e. 86.0% of the sample. The sample includes 23.1% of companies using the cost-leadership strategy, 29.3% of companies using the differentiation strategy, 18.8% of companies using the cost-focus strategy, and 28.8% of companies using the differentiation-focus strategy. As shown in Graph 6-9, the association between the applied strategy and financial performance is very weak. Yet, there is some effect of strategy: companies using the differentiation strategy show lower financial performance. The companies using one of the three remaining strategies show higher financial performance and do not differ very much from one another. Out of the total number of companies using the cost-leadership strategy, 18.5% of them belong to group A, 33.3% to group B, 17.3% to group C and 30.9% to group D. Out of the total number of companies using the differentiation strategy, 24.3% of them belong to group A, 20.4% to group B, 26.2% to group C and 29.1% to group D. Out of the total number of companies using the cost-focus strategy, 28.8% of them belong to group A, 24.2% to group B, 19.7% to group C and 27.3% to group D. Out of the total number of companies using the differentiation-focus strategy, 28.7% of them belong to group A, 25.7% to group B, 27.7% to group C and 17.8% to group D. The percentage of companies using the cost-leadership strategy that are included in the two top quartiles, i.e. in groups A and B, accounts for almost 52% of their total","number. The percentage of companies using the differentiation strategy included in groups A and B accounts for less than 45%. The percentage of companies using the cost-focus strategy included in groups A and B accounts for 53%. Finally, the percentage of companies using the differentiation-focus strategy included in groups A and B accounts for approximately 54%.
Graph 6-9: Relative frequencies of quartiles according to generic strategy
Share of exports
The data were obtained from 376 companies, i.e. 92.2% of the sample. Companies with predominant exports represent a minority in the sample. Companies whose share of exports on sales is more than fifty per cent account for 34.3% of the sample. Only 4% of companies export all their production, while the percentage of companies that do not export at all is 23.6%. It should be pointed out that there are striking differences between the manufacturing industry and construction: in the manufacturing industry, the percentage of companies with more than a 50% export share on sales is 43.1% of their total number, whereas there are no such companies in the construction industry. The largest share of exports on sales in construction is 30%, and only three companies show this figure. It appears that the export orientation of companies is generally negatively associated with their financial performance. As shown in Graph 6-10, group A includes only 30.3% of companies with more than a 50% share of exports on sales, group B includes 34.0% of such export-oriented companies, in group C their representation is highest (40.9%), and group D includes 36.1% of such companies.","Graph 6-10: Share of companies with predominant exports in the quartiles
Span of control
The data were obtained from 393 companies, i.e. 96.3% of the sample. The average number of employees who report to one manager in a company is 13.1 for all companies from the sample. The average number of employees who report to one manager in a company is 12.5 for all companies from group A, 15.0 for companies from group B, 12.2 for companies from group C and 13.1 for companies from group D. 
Graph 6-11: The average span of control in the quartiles","We can conclude that there is no zero-order relationship between the span of control and financial performance.
Motivation of workers
The data were obtained from 345 companies, i.e. 84.6% of the sample. The average upper limit of the motivation components of workers’ wages is 26.7% of the basic wage for all companies from the sample. The results shown in Graph 6-12 imply a certain, though very weak correlation: companies with a higher motivation component of workers’ wages tend to be the companies with higher financial performance. The average upper limit of the motivation components of workers’ wages is 28.7% of the basic wage in companies from group A, 26.2% in companies from group B, 26.3% in companies from group C and 25.2% in companies from group D.
Graph 6-12: The average upper limit of the motivation components of workers’ wages in the quartiles
Motivation of top management
The data were obtained from 345 companies, i.e. 84.6% of the sample. The average upper limit of the motivation components of top managers’ salaries is 34.4% of the basic salary for all companies from the sample. The average upper limit of the motivation components of top managers’ salaries is 35.9% of the basic salary in companies from group A, 31.2% in companies from group B, 39.5% in companies from group C and 33.0% in companies from group D. We can conclude that there is no zero-order relationship between the motivation component of the salaries and financial performance.","Graph 6-13: The average upper limit of the motivation components of top managers’ salaries in the quartiles
Ratio of technical and administrative staff
The data were obtained from 394 companies, i.e. 96.6% of the sample. 
The average percentage of technical and administrative staff from the total number of employees is 28.7% for all companies from the sample. Broken down by industry, the average percentage of technical and administrative staff is lower in manufacturing companies (28.0%) than in construction ones (31.6%).
Graph 6-14: The average percentage of technical and administrative staff in the quartiles","The results shown in Graph 6-14 imply a certain, though very weak correlation: companies with a higher ratio of technical and administrative staff tend to be the companies with higher financial performance. The average percentage of technical and administrative staff is 32.9% in companies from group A, 27.8% in companies from group B, 29.2% in companies from group C and 25.6% in companies from group D.
Ratio of workers
The data were obtained from 394 companies, i.e. 96.6% of the sample. The average percentage of workers from the total number of employees is 71.3% for all companies from the sample. The results shown in Graph 6-15 imply a certain, though very weak correlation: companies with a lower ratio of workers tend to be the companies with higher financial performance. The average percentage of workers is 67.1% in companies from group A, 72.2% in companies from group B, 70.8% in companies from group C and 74.4% in companies from group D.
Graph 6-15: The average percentage of workers in the quartiles
Ratio of women
The data were obtained from 390 companies, i.e. 95.6% of the sample. The average percentage of women from the total number of employees is 26.9% for all companies from the sample. 
The results shown in Graph 6-16 suggest that companies with a lower ratio of women tend to be the companies with higher financial performance.","The average percentage of women is 21.5% in companies from group A, 26.1% in companies from group B, 27.2% in companies from group C and 32.8% in companies from group D.
Graph 6-16: The average percentage of women in the quartiles
Reasons for the staff turnover rate
The data were obtained from 375 companies, i.e. 91.9% of the sample. The average frequency of reasons for the staff turnover rate (from the employee’s initiative) is 1.73 points for all companies from the sample.
Graph 6-17: The average frequency of reasons for the staff turnover rate (from the employee’s initiative) in the quartiles","The results shown in Graph 6-17 suggest that there is a lower frequency of reasons for the staff turnover rate (from the employee’s initiative) in companies with higher financial performance. The average frequency of reasons for the staff turnover rate (from the employee’s initiative) is 1.68 points in companies from group A, 1.70 points in companies from group B, 1.73 points in companies from group C and 1.78 points in companies from group D.
Assets level
The data were obtained from 400 companies, i.e. 98.0% of the sample. The average tangible assets level is 3.54 points for all companies from the sample. The results in Graph 6-18 suggest that companies with a higher tangible assets level show higher financial performance. The average tangible assets level is 3.84 points in companies from group A, 3.58 points in companies from group B, 3.54 points in companies from group C and 3.22 points in companies from group D. 
Graph 6-18: The average tangible assets level in the quartiles","Castek_1_Castek 23.1.2015 7:38 Stránka 143 6.2 IDENTIFICATION OF TYPICAL COMBINATIONS OF FACTOR VALUES LEADING ... 143 6.2 Identification of typical combinations of factor values leading to certain types of financial performance As we mentioned in the previous chapter, the entire sample of companies analysed was divided according to their financial performance into four quartiles containing 102 companies each, labelled as either group A, B, C or D. In the subsequent considerations, we assume that the financial performance typical for each group can be achieved in several ways, i.e. combining different values of those variables that crucially affect the financial performance of companies. In order to identify these typical combinations, businesses in each of the groups A, B, C and D were subjected to a cluster analysis, performed using sixteen of the above-mentioned variables. While respecting the reasonable range of the text, we focus only on one of the groups when presenting the interpretation of the obtained results. With regard to the principle of “best practices”, we chose group A. We want to use it to demonstrate what different combinations of variable values (characteristics) lead to high financial performance of a company. First, in relation to the previous subchapter, we characterize group A as a whole, then the individual subgroups A 1 to A 5, which result from the application of the cluster analysis. Group A as a whole Group A shows a relatively high proportion of companies from the construction industry. Out of their total number in the sample, nearly 34% of them belong here, while the proportion of companies from the manufacturing industry is only less than 23% of their total number. Limited liability companies also have a relatively higher representation in this group. 
Out of their total number, 31% of them are represented in this group, while less than 18% of all joint-stock companies belong to group A. Companies with 50–99 employees have the relatively highest representation in group A: out of their total number more than 29% of businesses belong here. In the case of companies with 100–249 employees, about 27% of their total number belong here. As for companies with at least 250 employees, less than 18% belong to group A. The representation of companies in group A does not differ depending on whether a company is or is not part of a concern. In both cases the percentage is the same and amounts to slightly more than 25%. The representation of companies depending on whether they are owned locally or by a foreign owner is very similar. Out of the total number of businesses that are owned only locally, approximately 24% belong to group A, while a little more than 25% of companies that are partly or fully owned by a foreign owner belong here.","Castek_1_Castek 23.1.2015 7:38 Stránka 144 144 6 INTERPRETATION OF THE RESULTS ACHIEVED However, after assessing the ownership structure more comprehensively, the relationships sought turn out to be clearer. In group A, two types have the largest relative proportion, namely: a) companies that have a single owner and are part of a concern (75% of the total number of companies that are part of a concern), b) companies that have several large owners and are not part of a concern (almost 56% of the total number of enterprises that are not part of a concern). These two types of ownership structure are present in nearly 60% of the companies included in group A. Companies exercising the cost focus and differentiation focus strategies are relatively more frequent in group A. 
In the case of cost focus, this group makes up nearly 29% of the total number of companies from the sample that exercise this strategy; in the case of differentiation focus, the figure is also almost 29%. On the contrary, companies using the cost leadership and differentiation strategies achieve lower financial performance on average. As for cost leadership, group A contains less than 19% of the total number of businesses applying this strategy, while in the case of differentiation, it is about 24%. The dominant strategies of cost focus and differentiation focus are closely related to a strong concentration on specific products. Of the total number of companies in group A, 71% have a prevailing focus on specific products. Companies in group A are characterized by low export orientation, which is somewhat surprising. As for group A, the representation of companies in which the share of exports in sales is predominant, i.e. higher than 50%, is the lowest compared with the other three groups (B, C and D). Only 30% of the companies from group A have a predominant share of exports in sales, i.e. they supply more products to foreign markets than to the domestic one. This situation could be explained by the fact that in the period analysed there was a long-term appreciation of the domestic currency against the euro and the dollar. A certain influence can also be attributed to the relatively high representation of companies from the construction industry, whose exports are limited for technical reasons typical of this sector. The construction companies from the sample do not contain a single business whose share of exports in sales would exceed the 50% threshold. The value of span of control is 12.5 for businesses from group A. It is almost equal to the value valid for group C and it is lower than in the other two groups. 
Systems of motivation in companies from group A provide a relatively large space for the motivational wage component both for workers – nearly 29% on average – and top management – almost 36% on average. In other words, according to the wage regulations, the upper limit of workers’ wages is on average about 1.29 times the base salary; in the case of top management, it is 1.36 times the base salary. Businesses represented in group A have – in comparison with those in groups B, C and D – a relatively higher proportion of technical and administrative staff of the total number of employees (almost 33%). It can be assumed that this situation is","Castek_1_Castek 23.1.2015 7:38 Stránka 145 6.2 IDENTIFICATION OF TYPICAL COMBINATIONS OF FACTOR VALUES LEADING ... 145 related to the specificity of products, the predominant focus on differentiation strategies, and consequently to the higher proportion of intellectual work (research, development, marketing, and management). Another explanation may be relatively high labour productivity in manufacturing, caused by the fact that blue-collar occupations are largely equipped with technology. Companies from group A show a relatively low proportion of women in the total workforce. The fact that the figure is on average less than 22% can be explained by a relatively high proportion of construction companies in group A, where the employment of women is low. Similar reasons can be seen also in the structure of the categories in the manufacturing industry, where companies operating in the fields traditionally dominated by women, such as textile manufacturing, food production, etc., are included only to a limited extent. Compared with the other groups (B, C and D), companies in group A show the relatively lowest staff turnover initiated by employees. It amounts to 1.68 points on a three-point scale (1 – lowest to 3 – highest). 
This is undoubtedly related to the excellent economic results of companies in group A and the consequent pay conditions. The level of the significant components of tangible assets amounts to the value of 3.84 points on a five-point scale (1 – lowest to 5 – highest) in companies from group A, and it is the highest compared with the other three groups. This suggests an influence of this factor on the financial performance of the companies analysed. Now, let us focus our attention in more detail on the particular groups A 1 to A 5. Group A 1 It includes 15 companies. They all belong to the manufacturing industry. This group includes 60% of private limited companies and 40% of joint-stock companies. All size categories are represented: small companies (50-99 employees), accounting for approximately 47%, medium-sized companies (100-249 employees), accounting for 20%, and large companies (at least 250 employees) with a share of about 33%. Most companies (about 73%) are not part of a concern. About half of the companies in this group are owned by local owners, while the other half is partly or fully owned by foreign owners. As for the business strategies, the vast majority of these businesses implement strategies aimed at low cost (cost leadership and cost focus). These companies account for about 73% in group A 1. Only a small proportion of companies from group A 1 (about 15%) is export-oriented, i.e. their share of exports in sales is higher than 50%. The average share of exports in sales in a company amounts to approximately 33%. The span of control varies significantly from about four to almost twenty-two in the individual companies.","Castek_1_Castek 23.1.2015 7:38 Stránka 146 146 6 INTERPRETATION OF THE RESULTS ACHIEVED In companies from group A 1, the average motivation component of workers’ wages is almost 20%, which is less than the average for the whole group A (about 29%). 
Similar findings hold for the average motivation component of top management salaries. The share of this component is about 32% in companies from group A 1, while the average for the whole group A is nearly 36%. The ratio of technical and administrative staff to the total number of employees differs substantially between companies. On average it reaches nearly 37% and it is higher than the average of group A (about 33%) as well as the entire sample (less than 29%). Considerably volatile values are also observed for the ratio of workers to the total number of employees. The nature of the calculation shows that it is complementary to the previous indicator: on average it amounts to about 63% and it is lower than the average of group A (approximately 67%) as well as the entire sample (about 71%). The ratio of women to the total number of employees is very high in the individual companies from group A 1. It ranges from 33% to 80%, and the average reaches almost 55% of the total number of company employees. It is significantly higher than the average for the whole group A (about 22%) as well as the entire sample (27%). Industry specifics seem to have the main effect: the highest ratio of women (75 to 80%) occurs here in four companies, operating in the field of baking, toy manufacturing, textile production, and book publishing. Staff turnover initiated by employees, measured on a scoring scale of 1 – lowest to 3 – highest, reaches the average value of 1.73 points. It is equal to the average value of the entire sample and it is higher than the average value for the whole group A (1.68). The level of the significant components of tangible assets, measured on a scoring scale of 1 – lowest to 5 – highest, reaches the average value of 3.67 points. It is lower than the average value in group A (3.84), but higher than the value of the entire sample (3.54). Group A 2 It includes 28 companies. They all belong to the manufacturing industry. 
In terms of the legal form of business, group A 2 contains significantly more private limited companies. There are 20 of them; thus, their proportion is more than 71%. The rest are joint-stock companies. Size categories are represented roughly equally: small companies (50-99 employees) with a share of about 32%, medium-sized companies (100-249 employees), accounting for approximately 36%, and large companies (at least 250 employees) with a share of about 32%. The share of companies that are not part of a concern is higher, but not significantly. There are 16 companies (more than 57%) compared with 12 companies (less than 43%) that are part of a concern.","Castek_1_Castek 23.1.2015 7:38 Stránka 147 6.2 IDENTIFICATION OF TYPICAL COMBINATIONS OF FACTOR VALUES LEADING ... 147 Local owners own approximately two-thirds of companies from this group, and one-third is fully or partly owned by foreign owners. Foreign owners own almost all the businesses that are part of a concern. Businesses in group A 2 predominantly apply strategies aimed at the specific nature of their products with the option to demand higher prices. Application of strategies aimed at low cost is rare. Specifically: of the twenty-six companies (two did not provide information), 16 (i.e. approximately 61.5%) pursue the strategy of differentiation focus, and 3 companies (approximately 11.5%) apply the differentiation strategy. In contrast, 6 companies apply the cost focus strategy (about 23%) and only one company (less than 4%) applies the cost leadership strategy. Group A 2 is characterized by its exceptionally high export orientation. The average share of exports in sales is nearly 85% for businesses in this group. All 23 companies that answered this question report a share of exports in sales of at least 50%. Of these companies, 12 report that their export share is 90% or more; 4 of these businesses export their entire production. 
Similarly to the previous group, the span of control is very volatile in these companies, ranging from about three to almost twenty-seven. In companies from group A 2, the average motivation component of workers’ wages is almost 24%. This is less than the average for group A as a whole (approximately 29%) and also less than the average of the sample (nearly 27%). Similar findings also hold for the average motivation component of top management salaries. The share of this component is about 29% in companies from group A 2, while the average for the whole group A is almost 36% and the average for the entire sample is about 34%. Similarly to the previous group, even here the ratio of technical and administrative staff and the ratio of workers to the total number of employees in each company differ substantially. On average, the ratio of technical and administrative staff is higher than 31%. This value is lower than the average of group A (about 33%), but higher than the average of the entire sample (less than 29%). The nature of the calculation shows that the ratio of workers is complementary to the previous indicator and it amounts to about 69% for companies in group A 2. It is higher than the average of group A (approximately 67%), but lower than the average of the entire sample (about 71%). A more detailed analysis shows that companies that are part of a concern have a smaller share of technical and administrative staff than companies that are not part of a concern. This is undoubtedly related to the centralization within the concern when a number of administrative, technical, or other similar activities that enterprises not included in a concern must perform themselves, are performed by the headquarters or a specialized organization for the concern units. The average ratio of women to the total number of employees is about 25% in the companies from group A 2. This value is significantly lower than the average value for the companies in group A 1 (nearly 55%). 
It is, however, higher than the average for the whole group A (about 22%) and slightly lower than the average of the entire","Castek_1_Castek 23.1.2015 7:38 Stránka 148 148 6 INTERPRETATION OF THE RESULTS ACHIEVED sample (27%). The highest ratio of women (80%) was found in a company engaged in the production of textiles. Staff turnover initiated by employees, measured on a scoring scale of 1 – lowest to 3 – highest, reaches an average of 1.7 points. It is higher than the average for the whole group A (1.68), but lower than the average value of the entire sample (1.73). The level of the significant components of tangible assets, measured on a scoring scale of 1 – lowest to 5 – highest, reaches an average value of 4.11 points. It is the highest value in group A as well as in the remaining three groups B, C and D. For the sake of comparison, the average value of the companies in group A is 3.84 points, while the average value of the entire sample is 3.54 points. Upon closer analysis we find that the average value of the level of tangible assets in companies that are part of a concern is 4.16 points. It can be assumed that this is due to inflows of investment from the concerns’ headquarters. Group A 3 It includes only three companies. They all belong to the manufacturing industry and related categories: two belong to category 28 – Manufacture of metal constructions and fabricated metal products, and one to category 29 – Manufacture and reconstruction of machinery and equipment. In terms of their legal forms of business, they are private limited companies. One of the companies belongs to the size group of 50 to 99 employees and two companies belong to the size group of 100-249 employees. The smaller company is not part of a concern and it is owned by a local owner, while both larger businesses are part of a concern and they have foreign owners. 
The companies mainly apply strategies focused on the specific nature of their products, with the possibility of achieving higher prices (differentiation and differentiation focus strategies). All the businesses have a strong export orientation. The share of exports in sales amounts to 70% for one of the companies and 95% for the two remaining companies. The span of control is clearly higher (16.8) for the company that is not part of a concern than for the companies that are part of a concern (7.48 and 9.92). The ratio of technical and administrative staff amounts to 16%, 23% and 24% in these companies. The values are significantly lower than the average of group A (about 33%), and the average of the entire sample (less than 29%). The ratio of women to the total number of employees is low, which is undoubtedly caused by the industry focus (manufacture of fabricated metal products). In one of the companies it is only 5%, while in the remaining two companies it is about 13% and 15%. Staff turnover initiated by employees, measured on a scoring scale of 1 – lowest to 3 – highest, which amounts to 1.4, 1.2 and 2.2 points in the individual companies, can be considered average, on the whole.","Castek_1_Castek 23.1.2015 7:38 Stránka 149 6.2 IDENTIFICATION OF TYPICAL COMBINATIONS OF FACTOR VALUES LEADING ... 149 The level of the significant components of tangible assets, measured on a scoring scale of 1 – lowest to 5 – highest, reaches an average value of 3 points. It is lower than both the average value in group A (3.84) and the average value of the entire sample (3.54). Group A 4 It includes 43 companies. While in the previous three groups, there were no businesses from the construction industry, there are many of them in this group. The number of construction companies is 23, which represents 27.7% of their total number in the sample (83). 
There are 20 companies from the manufacturing industry, but this represents only 5.7% of the total number of these companies in the sample (349). In terms of the legal form of business, this group includes 27 private limited companies and 16 joint-stock companies. These businesses are mainly small and medium-sized companies. The group includes 15 companies from the category of 50 to 99 employees (about 35%), 22 companies from the category of 100 to 249 employees (about 51%), and only 6 companies with at least 250 employees (14%). The share of companies that are part of a concern is not big: there are 10 companies of this type from the total number of 43 companies included in this group, i.e. about 23%. Only a few enterprises (6) are owned by foreign owners. These are the businesses that are part of a concern. As in the case of companies from groups A 2 and A 3, the strategies of companies in this group focus on the specific nature of their products with the possibility to demand higher prices rather than on achieving low cost. However, compared with the previous two groups, this orientation is slightly less significant in group A 4. 14 companies (i.e. about 35.9%) of 39 (4 did not respond) pursue the differentiation strategy while 10 companies (about 25.6%) follow the differentiation focus strategy. In contrast, the strategy of cost leadership is followed by 8 companies (approximately 20.5%) and 7 companies (about 17.9%) implement the strategy of cost focus. If we compare this strategic orientation in manufacturing and construction companies, we find that this orientation is more significant in the construction industry. While approximately 61.5% of the total number of companies in group A 4 pursue the strategies of differentiation and differentiation focus, the percentage of manufacturing companies applying these strategies is less than 59%; on the other hand, the percentage of construction companies amounts to nearly 64%. 
This fact can be attributed to industry specificities. Export orientation of companies in group A 4 is very low; more than half of the businesses do not export at all. Of the 40 companies that answered the respective question, there are only 9 (i.e. 22.5%) whose share of exports in sales is 10% or more. The maximum export share is 40% and it is reported by one company. “Non-export” orientation is significant especially for construction companies, which is typical for","Castek_1_Castek 23.1.2015 7:38 Stránka 150 150 6 INTERPRETATION OF THE RESULTS ACHIEVED the construction industry. Zero export is reported by 16 of the 23 construction companies. In contrast, there are only two construction companies whose share of exports in sales amounts to 10% or more. The span of control is very volatile in the individual businesses, even more than in groups A 1 and A 2. It can be said that it has virtually no explanatory value as a characteristic of the companies from this group. The average motivation component of workers’ wages is lower than 25% in the companies of group A 4. This is less than the average for the whole group A (approximately 29%) and also less than the average of the sample (nearly 27%). Similar figures are found even for the average motivation component of top management salaries. In companies from group A 4, the share of this component is less than 33%, while the average for the whole group A is almost 36% and for the entire sample it is about 34%. Similarly to the previous groups, even here the ratio of technical and administrative staff and the ratio of workers to the total number of employees differ substantially in the individual companies; on average, the ratio of technical and administrative staff reaches more than 36%. This value is higher than the average for group A (about 33%) and significantly higher than the average for the entire sample (less than 29%). 
Eight companies from the total number of 42 in group A 4 reported that their ratio of technical and administrative staff is 50% and higher; three of these businesses are manufacturing companies and five are construction companies. In addition to their production activities, these companies are to a large extent engaged in engineering and design activities. Businesses in group A 4 have on average a low ratio of women to the total workforce; the value is about 13%. This value is lower than the average for the whole group A (about 22%), and significantly lower than the average for the entire sample (27%). It is interesting that there is no big difference between the manufacturing and construction companies. The average ratio of women is about 14% in the manufacturing companies and about 12% in the construction companies. A company engaged in the production of computers and medical devices reported the highest ratio of women, reaching the value of 43%. Staff turnover initiated by employees, measured on a scoring scale of 1 – lowest to 3 – highest, reaches an average value of 1.66 points. It is approximately the same as the average value for the whole group A (1.68), but lower than the average value for the entire sample (1.73). The level of the significant components of tangible assets, measured on a scoring scale of 1 – lowest to 5 – highest, reaches an average value of 3.81 points. This value is similar to the average for the whole group A (3.84). However, it is significantly higher than the average value for the entire sample (3.54). Group A 5 It includes 13 companies. Similarly to the previous group A 4, this group includes both manufacturing companies (9 businesses) and construction companies (4). It represents about 2.6%","Castek_1_Castek 23.1.2015 7:38 Stránka 151 6.2 IDENTIFICATION OF TYPICAL COMBINATIONS OF FACTOR VALUES LEADING ... 
151 of the total number of manufacturing companies in the sample and approximately 4.8% of the total number of construction companies. With the exception of two businesses that are joint-stock companies, they are private limited companies (11 businesses). They are mostly medium-sized companies, i.e. they belong to the category of companies with 100 to 249 employees; their number in this group is 9 (about 70%). The number of small businesses, i.e. the category of companies with 50 to 99 employees, is 3 (about 23%). There is only one large company (about 7%), whose number of employees is at least 250. Only one of the companies in this group is part of a concern. Most businesses are owned by local owners. The differentiation strategy is the most commonly applied business strategy; of the ten companies that responded to this question, six pursue it. As for the other strategies, two businesses apply the cost focus strategy and two businesses use the cost leadership strategy. Export orientation of the companies is low. Of the ten companies that responded to this question, 4 companies do not export at all, 3 of which belong to the construction industry. For the remaining six companies, the share of exports in sales is up to 50%. The number of workers who report to one superior in the businesses of group A 5 is considerably volatile, like in the previous groups. It ranges from about five to about thirty-six. Nevertheless, the explanatory power of this indicator is problematic. The average motivation component of workers’ wages is extremely high in the businesses of group A 5 as it amounts to more than 55%. This is substantially more than the average for the whole group A (approximately 29%) as well as the average for the sample (nearly 27%). Similarly, the average motivation component of top management salaries is also extremely high. 
In companies from group A 5, the amount of this component is more than 68%, while the average for the whole group A is about 36% and for the entire sample it is about 34%. The ratio of technical and administrative staff to the total workforce is not considerably volatile, as it was in the previous group; it amounts to 25% on average. This value is significantly lower than the average for group A (about 33%), and lower than the average for the entire sample (less than 29%). Similarly to the businesses in group A 4, the average ratio of women to the total number of company employees is also low in the companies from group A 5; the value is less than 13%. This value is lower than the average for the whole group A (about 22%), and significantly lower than the average for the entire sample (27%). The ratio of women in construction companies is lower than it is in manufacturing companies as it ranges from 5 to 10%. Staff turnover initiated by employees, measured on a scoring scale of 1 – lowest to 3 – highest, reaches an average value of 1.68 points. It is identical with the average for the whole group A (1.68), but lower than the average value for the entire sample (1.73).","Castek_1_Castek 23.1.2015 7:38 Stránka 152 152 6 INTERPRETATION OF THE RESULTS ACHIEVED The level of the significant components of tangible assets, measured on a scoring scale of 1 – lowest to 5 – highest, reaches an average of 3.7 points. This value is lower than the average for the whole group A (3.84); however, it is higher than the average value for the entire sample (3.54). 6.3 Summary Based on the conducted analysis and its subsequent interpretation, we will now focus on those variables that best describe the specifics of companies included in the above groups. They are listed in table 6-45 and table 6-46. In the following explanation to table 6-45 and table 6-46, we deal with only four of the five groups listed above, namely groups A 1, A 2, A 4 and A 5. 
Group A 3, containing only three businesses, will not be dealt with due to its insignificant size. It may be said, though, that this group is very close to group A 2 in terms of the variables that we are focusing on here. It is typical for group A 1 that it includes only businesses from the manufacturing industry. However, there is considerable branch heterogeneity within this industry. Compared with the other three groups, it has the highest share of businesses owned by foreign owners (46% of companies). Unlike in the other three groups, businesses in this group very significantly prefer those strategies that are aimed at low cost (72% of companies). These are mainly domestic-market suppliers. The average share of exports in company sales amounts to 33%. The level of the significant components of tangible assets is slightly below average; compared with companies in the other three groups it is the lowest. It is also typical for group A 2 that it includes only businesses from the manufacturing industry. Branch heterogeneity is slightly lower than in the case of group A 1. Compared with the other three groups, the share of businesses that are part of a concern owned by foreign investors is the highest here. Companies in this group typically pursue the differentiation-type strategies, namely the differentiation focus (61.5%) and differentiation (11.5%) strategies. Companies whose strategies are aimed at low cost represent a minority. Businesses in group A 2 have an extremely high export orientation. The average share of exports in company sales in this group amounts to as much as 85%. A high level of the significant components of tangible assets is related to the implementation of the differentiation-type strategies and the ability to succeed in foreign markets; this level is significantly higher than in the other three groups. 
It can be assumed that in many cases these are companies included in concerns that were equipped with high-quality machinery and equipment by their headquarters. Unlike groups A 1 and A 2, group A 4 includes construction companies as well as manufacturing companies.","Castek_1_Castek 23.1.2015 7:38 Stránka 153 6.3 SUMMARY 153 Table 6-45: Summary of frequencies of categorical variables in the sample, group A and its subgroups. Each cell gives N (%); columns: A 1 | A 2 | A 3 | A 4 | A 5 | A | Whole Sample. INDUSTRY, Manufacturing: 15 (100) | 28 (100) | 3 (100) | 20 (47) | 9 (69) | 75 (74) | 328 (80). INDUSTRY, Construction: 0 (0) | 0 (0) | 0 (0) | 23 (53) | 4 (31) | 27 (26) | 80 (20). FDI, Without FDI: 7 (54) | 16 (64) | 1 (33) | 30 (83) | 6 (75) | 60 (71) | 252 (73). FDI, With FDI: 6 (46) | 9 (36) | 2 (67) | 6 (17) | 2 (25) | 25 (29) | 95 (27). CONCERN MEMBERSHIP, Yes: 4 (27) | 12 (43) | 2 (67) | 10 (23) | 1 (8) | 29 (29) | 115 (29). CONCERN MEMBERSHIP, No: 11 (73) | 16 (57) | 1 (33) | 33 (77) | 11 (92) | 72 (71) | 286 (71). STRATEGY, Low Cost (CL+CF): 8 (73) | 7 (27) | 0 (0) | 15 (38) | 4 (40) | 34 (39) | 147 (42). STRATEGY, Differentiation (DIF+DF): 3 (27) | 19 (73) | 2 (100) | 24 (62) | 6 (60) | 54 (61) | 204 (58). Table 6-46: Summary of average values of chosen variables in the sample, group A and its subgroups. Columns: A 1 | A 2 | A 3 | A 4 | A 5 | A | Whole Sample. EXPORT, Share on Sales (%): 33 | 85 | 87 | 5 | 20 | 34 | 37. ASSETS LEVEL (1 - lowest, 5 - highest): 3.67 | 4.11 | 3.00 | 3.81 | 3.70 | 3.83 | 3.54.","Castek_1_Castek 23.1.2015 7:38 Stránka 154 154 6 INTERPRETATION OF THE RESULTS ACHIEVED A large majority of the companies (83%) are owned by a local owner, which is the biggest share of all the groups analysed. The share of companies in the group that are part of a concern (23%) can be considered slightly below average. Companies tend to prefer the differentiation-type strategies. The differentiation strategy is applied by 35.9% of the companies, while the differentiation focus strategy is used by 25.6% of companies. In most cases the companies are domestic-market suppliers. The average share of exports in company sales is only 5% in this group. None of the exporting companies shows a higher share of exports in sales than 40%. 
The level of the significant components of tangible assets is close to the average for the whole group A. Group A 5 also includes both manufacturing and construction companies. Most companies (75%) are owned by a local owner. With the exception of one company, the businesses in this group are not part of a concern. Slightly more than half of the companies implement differentiation-type strategies, while the remaining businesses focus on low-cost strategies. The companies tend to be domestic-market suppliers; the average share of exports in company sales in this group is 20%. The level of the significant components of tangible assets is lower than the average for the whole group A. The analyses conducted suggest that the characteristics of groups A 1 and A 2 are distinct and mutually very different. The business philosophy of most companies in group A 1 could be “Keeping costs low and using the know-how of foreign owners”. In contrast, the business philosophy of most businesses in group A 2 might be “Meeting the specific requirements of foreign customers. Focusing on quality, using high-quality machinery and equipment”. The characteristics of groups A 4 and A 5 are neither as distinct nor as mutually different. These groups are quite similar, and to some extent they can be located between groups A 1 and A 2. Their common business philosophy can be formulated as follows: “Relying mainly on good knowledge of, and contacts in, the domestic markets”.","7 Conclusion The experience gained during the research confirmed the assumption that the task was nontrivial and the methodology examined for its solution rather ambitious. In this final recapitulation we focus on the following areas, including the problems associated with them. 
a) Input data Input data were obtained from two sources: financial statements published in the Albertina database, and questionnaires that the participating companies completed during the empirical investigation. Data from the financial statements can be considered relatively accurate. The term “relatively” refers to the fact that in some cases not all the data in the respective time series were available; moreover, the inaccuracies and distortions that accounting practice can never completely exclude could occur here as well. In principle, however, the information was objective in character. The situation was different for the data obtained from the questionnaires. Although the respondents were asked for objective information, such as the number of employees in each category or the share of each type of customer in sales, it was clear that the data would be predominantly estimates, given the difficulty of obtaining or calculating such indicators. In addition, some questions anticipated subjective responses by their very wording. For instance, one of these questions was “What is the level of tangible assets of your business?” and the answer was to be recorded on a scale of 1 to 5, where “1” meant obsolete equipment and “5” state-of-the-art equipment. It was left up to the respondents how they would assess the equipment on the scale. Another important question concerned the business strategy applied by a company. Four strategies were characterized very briefly (Cost Leadership, Differentiation, Cost Focus and Differentiation Focus), and it was left up to the respondents under which of these generic strategies they would subsume the strategy of their company.","It is obvious that in many cases the researchers had to rely on the respondents’ experience, qualifications and sense of responsibility. 
However, it should also be noted that the answers to the subjective questions eventually produced nominal and ordinal variables, and these proved more useful for further statistical processing and for the final interpretation than the numeric interval variables. b) Preparation of variables for entering the process of identifying financial performance factors based on learning approaches Using a questionnaire, the researchers obtained data that were expected to describe each business as comprehensively as possible. The questionnaire had been prepared in previous research. It was a compromise between the comprehensiveness and detail of the data on the one hand, and a questionnaire length acceptable for completion by respondents and for the subsequent analysis of the data on the other. Once it had been determined what specific requirements the feature selection methods from the field of statistical pattern recognition impose on the input variables, a substantial modification of the set of variables began. A problem that arises with learning approaches is the required size of the so-called “training set” relative to the dimensionality of the feature space, i.e. the number of features. In our case, the relevant ratio is that of the number of available observations to the number of tested variables (explanatory variables in the case of a regression model). A frequently used rule of thumb says that this ratio should be at least 10:1. More sophisticated rules have been formulated for specific methods and for different uses of these methods; but even under this general rule, using all of the almost 700 variables obtained from the questionnaires would have required questionnaires from at least 7,000 companies. 
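The arithmetic behind this rule of thumb can be sketched in a few lines of Python. The function names are hypothetical and serve only as an illustration; the sample size of 408 companies is taken from the whole-sample counts in Table 6-45 (328 manufacturing and 80 construction companies).

```python
def min_observations(n_features, ratio=10):
    # Rule of thumb from the text: at least `ratio` observations
    # for every tested (or explanatory) variable.
    return n_features * ratio


def max_features(n_observations, ratio=10):
    # The same rule read the other way round: how many variables
    # a sample of a given size can support.
    return n_observations // ratio


print(min_observations(700))  # 7000 questionnaires for almost 700 variables
print(max_features(408))      # 40 variables for a sample of 408 companies
```

Under the 10:1 rule, a sample of this size would support only about 40 tested variables, which indicates how far the original set of variables had to be reduced.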
In a situation where the whole population included fewer than half of the 7,000 companies required, this was, of course, an impossible requirement. Therefore, in the series of experiments conducted during the research, the original variables were repeatedly selected and, in many cases, recoded. This measure undoubtedly had a very positive impact on solving the given task. c) Identification of financial performance factors through statistical methods The original idea was to look for factors of corporate performance without an a priori model. The literature commonly reports bivariate tests of correlations between potential performance factors and performance. The correlations found in this way, however, tend to be rather weak, which reflects the fact that these factors do not act separately. Effects that arise only in combination with other factors, which may themselves have no direct influence on performance, are thus not taken into account. Although bivariate tests can be used to search for conditional or partial correlations, this is feasible only for a small number of","input variables and/or when testing a priori notions about these relationships. Another option is to use multidimensional models, the most common being multiple regression models (general linear models), path analysis and structural equation models. These take into account the interaction of multiple variables and are able to test complex causal models. Nevertheless, they still require a precise a priori idea of the relationships to be tested. Therefore, most studies of corporate performance focus on a specific area of a business, such as HRM, IT or ownership structure. 
Although this covers the complicated links among the explanatory variables and between the explanatory and dependent variables, it does so only within this one area and not across areas. However, even variables from an “adjacent” area may play a role in explaining the effects of another variable, as also occurred in the results of our research. We therefore pursued an exploratory type of research, investigating the characteristics of businesses from various areas without a pre-formulated idea of their anticipated effect. This, however, led to a vast increase in the number of variables required to describe each company sufficiently, i.e. in the dimensionality of the problem. Without an a priori model, which would at least partially reduce the number of possible interactions between the input variables, it was impossible to solve such a task in its entirety using the above-mentioned methods. This created the need for a different methodology, and methods of feature selection from the field of statistical pattern recognition proved suitable. Feature selection methods typically consist of an optimization procedure and a criterion of quality of the examined feature subset. The optimization procedure searches the space of subsets step by step, guided by the criterion values evaluated on the previously investigated configurations. Criterion functions may be based on parametric models of the data; however, they can also be defined without any model assumptions. In our context, we primarily examined non-parametric modelling; in this case, the criterion function worked directly with the data in the data file instead of selecting a model explicitly. This was achieved by introducing a criterion function that evaluates only the statistics of metric distances between samples in the currently evaluated subspace. 
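The two components named above, a search procedure and a criterion, can be illustrated by a minimal Python sketch. It is an illustration under simplifying assumptions, not the DAF method or the criterion used in this research: the criterion is assumed to be the leave-one-out accuracy of a nearest-neighbour rule computed from distances in the examined subspace, the search is a plain sequential forward selection, missing values are simply skipped when distances are computed, and all function names and the toy data are hypothetical.

```python
def subspace_distance(a, b, subset):
    # Mean absolute difference over the selected features that both
    # samples actually have; missing values (None) are skipped.
    diffs = [abs(a[j] - b[j]) for j in subset
             if a[j] is not None and b[j] is not None]
    return sum(diffs) / len(diffs) if diffs else None


def loo_1nn_accuracy(rows, labels, subset):
    # Criterion: leave-one-out accuracy of a 1-nearest-neighbour rule,
    # using only distances in the currently examined subspace.
    hits = 0
    for i, row in enumerate(rows):
        best_dist, best_label = None, None
        for k, other in enumerate(rows):
            if k == i:
                continue
            dist = subspace_distance(row, other, subset)
            if dist is not None and (best_dist is None or dist < best_dist):
                best_dist, best_label = dist, labels[k]
        hits += int(best_label == labels[i])
    return hits / len(rows)


def forward_selection(rows, labels, target_size):
    # Optimization procedure: greedily add the feature that gives the
    # best criterion value together with the features selected so far.
    selected = []
    remaining = list(range(len(rows[0])))
    while remaining and len(selected) < target_size:
        best_feature, best_score = None, -1.0
        for j in remaining:
            score = loo_1nn_accuracy(rows, labels, selected + [j])
            if score > best_score:
                best_feature, best_score = j, score
        selected.append(best_feature)
        remaining.remove(best_feature)
    return selected


# Toy data: feature 0 separates the two groups, feature 1 is noise,
# feature 2 is informative but has missing values.
rows = [
    [0.1, 5.0, None],
    [0.2, 1.0, 0.3],
    [0.3, 4.0, 0.2],
    [0.9, 4.5, 0.8],
    [1.0, 1.5, None],
    [1.1, 5.5, 0.9],
]
labels = [0, 0, 0, 1, 1, 1]

print(forward_selection(rows, labels, 1))  # [0]
```

Because the criterion is re-evaluated for every candidate together with the features already selected, the value of a feature depends on its context, which is the kind of behaviour the text describes for the selected variables.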
The solution adopted in our research lies on the borderline between “nearest neighbour” methods and kernel estimation methods, and it made it possible to solve the problem of occasionally missing values of some features elegantly. By applying the feature selection methods, we were able to reduce the input variables to those that best assigned a company to the correct performance group or best predicted its performance. The procedure that yielded the most accurate and stable prediction (a combination of the DAF method and a non-linear regression model) had one disadvantage: it offered no guidance on how to interpret the variables selected in this way. Whereas a correlation coefficient shows the direction of an effect and a linear regression model offers an equation explaining the dependent variable, the non-linear regression model, which we used as a measure of prediction accuracy, provided no such global guideline. The non-linear model offered higher accuracy thanks to its ability to capture context-dependent","relationships between the features, i.e. relationships that differ in various parts of the feature space. Observing local dependencies between the features did not allow easy generalization to global conclusions about the features; it turned out that the same features could often manifest themselves as significant, insignificant or even counterproductive, always depending on the other currently selected features. d) Economic interpretation Interpretation of the variables selected as “most informative” or as best predicting corporate performance became a separate subsequent step. It is clear from the nature of the task that the purpose of interpretation is not to quantify the individual, locally dependent characteristics of the variables but to generalize this information meaningfully into an easily interpretable result. 
It involves a description of the space of variables, which has the potential to further clarify the features or feature groups defined by the variables describing the companies, and in particular their mutual relations. For this purpose, the companies were divided into smaller groups within the existing company groups formed according to their financial performance; the grouping was based on the similarity of the values of the identified factors of corporate performance. These groups, which were created using a quantitative approach, then had to be interpreted using a qualitative approach. It was necessary to synthesize the values of the individual factors and other explanatory variables, taking into account the variability of these values, so that a representative of each of these groups of companies could be characterized in the terminology of business studies. Such a process obviously demands substantive knowledge of the issues as well as inventiveness and imagination. Nevertheless, it was this step that completed the entire task and made it possible to assess the correctness and relevance of the results achieved in economic terms. In conclusion, although the task was highly complex and its solution very laborious, the results obtained can be regarded as beneficial. The objective laid down in the project can be considered satisfied, as a series of experiments verified the effective applicability of feature selection methods from the field of statistical pattern recognition in identifying the factors of financial performance of companies.","8 Bibliography [1] ABOR, J., BIEKPE, N. (2007). Corporate Governance, Ownership Structure and Performance of SMEs in Ghana: Implications for Financing Opportunities. Corporate Governance, Vol. 3, Issue 7, pp. 288–300. DOI: 10.1108/14720700710756562 [2] ALEXE, G., ALEXE, S., HAMMER, P. L., VIZVARI, B. (2006). 
Pattern-based feature selection in genomics and proteomics. Annals of Operations Research, Vol. 148, Issue 1, pp. 189–201. DOI: 10.1007/s10479-006-0084-x [3] ALLOUCHE, J., LAROCHE, P. (2005). A Meta-Analytical Investigation of the Relationship Between Corporate Social And Financial Performance. Revue de Gestion des Ressources Humaines, Vol. 2005, Issue 57, pp. 18–41. [4] AMBASTHA, A., MOMAYA, K. (2004). Competitiveness of firms: Review of theory, frameworks, and models. Singapore Management Review, Vol. 26, Issue 1, pp. 45–61. [5] ANDREWS, R., BOYNE, G. A. (2010). Capacity, leadership, and Organizational Performance: Testing the Black Box Model of Public Management. Public Administration Review, Vol. 2010, Issue May/June, pp. 443–454. [6] ARTIACH, T., LEE, D., NELSON, D., WALKER, J. (2010). The determinants of corporate sustainability performance. Accounting and Finance, Vol. 50, Issue 1, pp. 31–51. DOI: 10.1111/j.1467-629X.2009.00315.x [7] BERANOVÁ, M. (2008). Modelling of Knowledge as an Instrument to Improve Retail Business Competitiveness. Trendy ekonomiky a managementu, Vol. 2, Issue 2, pp. 13–19. [8] BERMAN, S., WICKS, A., KOTHA, S., JONES, T. (1999). Does stakeholder orientation matter: The relationship between stakeholder management models and firm financial performance. Academy of Management Journal, Vol. 42, Issue 5, pp. 488–506. DOI: 10.2307/256972 [9] BLAŽEK, L. et al. (2007). Konkurenční schopnost podniků (Primární analýza výsledků empirického šetření). Brno: Masarykova univerzita, 302 p. ISBN 978-80-210-4456-2. [10] BLAŽEK, L. et al. (2008). Konkurenční schopnost podniků: Analýza faktorů hospodářské úspěšnosti. (in Czech), Brno: Masarykova univerzita, 211 p. ISBN 978-80-210-4734-1. [11] BLAŽEK, L. et al. (2009). Konkurenční schopnost podniků. Analýza faktorů hospodářské úspěšnosti. Druhá etapa. Brno: Masarykova univerzita, 349 p. ISBN 978-80-210-5058-7.","[12] BLAŽEK, L. et al. (2011). 
Nadnárodní společnosti v České republice II. Brno: Masarykova univerzita, 345 p. ISBN 978-80-210-5677-0. [13] BLUM, A., LANGLEY, P. (1997). Selection of Relevant Features and Examples in Machine Learning. Artificial Intelligence, Vol. 97, Issue 1–2, pp. 245–271. DOI: 10.1016/S0004- 3702(97)00063-5 [14] BOTTAZZI, G., SECCHI, A., TAMAGNI, F. (2008). Productivity, profitability and financial performance. Industrial and Corporate Change, Vol. 17, Issue 4, pp. 711–751. DOI: 10.1093/icc/dtn027 [15] CAGWIN, D., BARKER, K. J. (2006). Activity-based costing, total quality management and business process reengineering: their separate and concurrent association with improvement in financial performance. Academy of Accounting and Financial Studies Journal, Vol. 10, Issue. 1, pp. 49–77. [16] CARTER, J. R. (1977). In search of synergy – a structure-performance test. The Review of Economics and Statistics, Vol. 59, Issue 3, pp. 279–279. [17] COLES J., LEMMON, M., MESCHKE, F. J. Structural models and endogeneity in corporate finance: The link between managerial ownership and corporate performance. Journal Of Financial Economics, Vol. 103, Issue, pp. 149–168. DOI: 10.2307/1925046 [18] COVER, T. M., HART, P. E. (1967). Nearest neighbor pattern classification. IEEE Transactions on Information Theory, Vol. 13, Issue 1, pp. 21–27. DOI: 10.1109/TIT.1967.1053964 [19] DAS, S. (2001). Filters wrappers and a boosting-based hybrid for feature selection. In: ICML '01: Proc. 18th Int. Conf on Machine Learning, pp. 74–81. [20] DASH, M., CHOI, K., SCHEUERMANN, P., LIU, H. (2002). Feature selection for clustering - a Filter solution. In: ICDM '02: Proc. 2002 IEEE Int. Conf on Data Mining, IEEE Computer Society, pp. 15–22. [21] DE VAUS, D. (2002). Analyzing Social Science Data. 50 Key Problems in Data Analysis. 1st edition. Sage: London, 401 p. ISBN 0-7619-5937-8. [22] DEVIJVER, P. A., KITTLER, J. (1982). Pattern Recognition: A Statistical Approach, London: Prentice-Hall. ISBN 0-136542360. 
[23] DUDA, R. O., HART, P. E., STORK, D. G. (2000). Pattern Classification, 2nd edition, Wiley-Interscience, 41p. ISBN 0-471-05669. [24] DUNNE, K., CUNNINGHAM, P., AZUAJE, F. (2002). Solutions to Instability Problems with Sequential Wrapper-based Approaches to Feature Selection. Technical Report TCD-CS-2002-28, Dublin: Department of Computer Science, Trinity College. [25] EUROPEAN COMMISSION (2011). European Competitiveness Report 2011. 289 p., [cit. 12. 7. 2014], available on WWW . [26] FERRI, F. J., PUDIL, P., HATEF, M., KITTLER, J. (1994). Comparative Study of Techniques for Large-Scale Feature Selection. Machine Intelligence and Pattern Recognition, Vol. 1994, Issue 16, pp. 403–413. [27] FIELD, A. (2009). Discovering statistics using SPSS. London: Sage Publications Ltd. 850 p. ISBN 978-1-84787-906-6. [28] FORMAN, G. (2003). An extensive empirical study of feature selection metrics for text classification. The Journal of machine learning research, Vol. 3, Issue 1, pp. 1289–1305. [29] FRAJ-ANDRÉS, E., MARTINEZ-SALINAS, E., MATUTE-VALLEJO, J. (2009). A multidimensional approach to the influence of environmental marketing and orientation","on the firm’s organizational performance. Journal of Business Ethics, Vol. 88, Issue 2, pp. 263–286. DOI: 10.1007/s10551-008-9962-2 [30] FUKUNAGA, K. (1990). Introduction to Statistical Pattern Recognition. San Diego: Academic Press. 598 p. ISBN 0-12-269851-7. [31] GADDE, L. E., HAKANSSON, H. (2001). Supply network strategies. Chichester: John Wiley \& Sons, Ltd. [32] GREEN, S. B. (1991). How many subjects does it take to do a regression analysis? Multivariate Behavioral Research, Vol. 26, Issue 3, pp. 499–510. DOI: 10.1207/s15327906mbr2603_7 [33] GUYON, I., ELISSEEFF, A. (2003). An introduction to variable and feature selection. Journal of Machine Learning Research, Vol. 2003, Issue 3, pp. 1157–1182. [34] HANSEN, G. S., WERNERFELT, B. (1989). 
Determinants of firm performance: The relative importance of economic and organizational factors. Strategic Management Journal. Vol. 10, Issue 5, pp. 399–411. DOI: 10.1002/smj.4250100502 [35] HOMBURG, C., KROHMER, H., WORKMAN, J.P. (1999). Strategic consensus and performance: The role of strategy type and market-related dynamism. Strategic Management Journal, Vol. 20, Issue 4, pp. 339–357. DOI: 10.1002/(SICI)1097- 0266(199904)20:4<339::AID-SMJ29>3.0.CO;2-T [36] HOPE, R. D., SPENCER, C. (2001). SRM is not yet a suite spot. Gartner Group. Vol. 2001, Issue 10. 5 s. [cit. 13. 7. 2012], available on WWW: . [37] HULT, G. T. M. ET AL. (2008). An assessment of the measurement of performance in international business research. Journal of International Business Studies, Vol. 39, Issue 6, pp. 1064–1080. DOI: 10.1057/palgrave.jibs.8400398 [38] HUSSEIN, F., WARD, R., KHARMA, N. (2001). Genetic algorithms for feature selection and weighting, a review and study. In: Document Analysis and Recognition, 2001. Proceedings. Sixth International Conference on. IEEE. pp. 1240–1244. [39] JAIN, A. K., DUIN, R. P. W., MAO, J. (2000). Statistical Pattern Recognition: A Review. Pattern Analysis and Machine Inteligence. IEEE Transactions on, Vol. 22, Issue 1, pp. 4– 37. [40] JAIN, A. K., ZONGKER, D. (1997). Feature Selection: Evaluation, Application and Small Sample Performance. Pattern Analysis and Machine Intelligence, IEEE Transactions on, Vol. 19, Issue 2, pp. 153–158. [41] JIRÁSEK, J. A. (2000). Konkurenčnost: Vítězství a porážky na kolbišti trhu. Praha: Professional Publishing, ISBN 80-86419-11-8. [42] KALOUSIS, A., PRADOS J., HILARIO, M. (2007). Stability of feature selection algorithms: a study on high-dimensional spaces. In Knowledge and information systems, Issue 12, No. 1, 2007, pp. 95–116. [43] KANNAN, V. R., TAN, K. CH. (2004). Supplier alliances: differences in attitudes to supplier and quality management of adopters and non-adopters. Supply Chain Management, Vol. 9, Issue 4. 
DOI: 10.1108/13598540410550028 [44] KAPLAN, R.S., NORTON, D.P. (2004). Strategy maps: converting intangible assets into tangible outcomes. Boston: Harvard Business School. ISBN 1-59139-134-2. [45] KESSLER, A. (2007). Success factors for new businesses in Austria and the Czech Republic. Entrepreneurship and regional development, Vol. 19, Issue 5, pp. 381–403. DOI: 10.1080/08985620701439959","Castek_1_Castek 23.1.2015 7:38 Stránka 162 162 8 BIBLIOGRAPHY [46] KIRBY, J. (2005). Toward a Theory of High Performance. Harvard Business Review, Vol. 83, Issue 7/8, pp. 30–39. [47] KLAPALOVÁ, A. (2008). Význam zákazníka pro konkurenceschopnost podniku. Vývojové tendence podniků IV (Svazek I). Brno: Masarykova univerzita, p. 111–160, ISBN 978-80-210-4723-5. [48] KOHAVI, R., JOHN, G. H. (1997). Wrappers for Feature Subset Selection. Artificial Intelligence, Vol. 97, Issue 1–2, pp. 273–324. [49] KOHLI, A. K. ET AL. (1993). Markor: a Measure of Market Orientation. Journal of Marketing Research. Vol. 30, Issue 4, pp. 467–477. DOI: 10.2307/3172691 [50] KONONENKO, I. (1994). Estimating attributes: Analysis and extensions of Relief. In: Machine Learning: ECML-94. Springer Berlin Heidelberg, pp. 171–182. [51] KRIŠTOF, M. (2006). Měření ekonomické efektivnosti podniku. Vývojové tendence podniků (Svazek I). Brno: Masarykova univerzita, pp. 263–280, ISBN 80-210-4133-1. [52] KRIVOGORSKY, V., GRUDNITSKI, G. (2010). Country-specific institutional effects on ownership: concentration and performance of continental European firms. Journal of Management \& Governance, Vol. 14, Issue. 2, pp. 167–193. [53] KŘÍŽEK, P., KITTLER, J., HLAVÁČ, V. (2007). Improving stability of feature selection methods. In: Proc. 12th Int. Conf on Computer Analysis of Images and Patterns, Springer-Verlag, Vol. 4673, pp. 929–936. [54] KUČERA, R. (2005). Vztahy k vlastníkům (akcionářům). Vývojové tendence podniků (Svazek I). Brno: Masarykova univerzita, pp. 255–270. ISBN 80-210-3847-0. [55] KUDO, M., SKLANSKY J. (2000). 
Comparison of Algorithms that Select Features for Pattern Classifiers. Pattern Recognition, Vol. 33, Issue 1, pp. 25–41. DOI: 10.1016/S0031- 3203(99)00041-2 [56] KUNCHEVA, L. I. (2007). A stability index for feature selection. In Artificial intelligence and applications, 2007, pp. 421-427. [57] LIU, P.-L., CHEN, W.-CH., TSAI, CH.-H. (2004). An empirical study on the correlation between knowledge management capability and competitiveness in Taiwan’s industries. Technovation, Vol. 24, Issue 12, pp. 971–977. DOI: 10.1016/S0166-4972(03)00061-0 [58] LIU, H., YU, L. (2005). Toward Integrating Feature Selection Algorithms for Classification and Clustering. Knowledge and Data Engineering, IEEE Transactions on, Vol. 17, Issue 4, pp. 491–502. [59] MARILL, T., GREEN, D. (1963). On the effectiveness of receptors in recognition systems. Information Theory, IEEE Transactions on, Vol. 9, Issue 1, pp. 11–17. DOI: 10.1109/TIT.1963.1057810 [60] MAYER, H.A., SOMOL, P., HUBER, R., PUDIL, P. (2000). Improving Statistical Measures of Feature Subsets by Conventional and Evolutionary Approaches. In: Advances in Pattern Recognition. Springer Berlin Heidelberg. ISBN 978-3-540-44522- 7. [61] MCLACHLAN, G. J. (2004). Discriminant Analysis and Statistical Pattern Recognition. 2nd edition. New York: John Wiley \& Sons, ISBN 0-471-69115-1. [62] MILES, J. N. V., SHEVLIN, M. (2001). Applying regression and correlation: a guide for students and researchers. London: Sage Publications Ltd. 272 p. ISBN 978-0761962304. [63] MOLINA, M. A., DEL PINO, I. B., RODRIGUEZ, A. C. (2004). Industry, Management Capabilities and Firms Competitiveness: An Empirical Contribution. Managerial and decision economics, Vol. 25, Issue 5, pp. 265–281.","Castek_1_Castek 23.1.2015 7:38 Stránka 163 8 BIBLIOGRAPHY 163 [64] NADARAYA, E. A. (1964). On Estimating Regression. Theory of Probability and its Applications, Vol. 9, Issue 1, pp. 141–142. DOI: 10.1137/1109020 [65] NARVER, J. C., SLATER, S. F. (1990). 
The effect of a market orientation on a business profitability. Journal of Marketing, Vol. 1990, Issue 10, pp. 20–35. DOI: 10.2307/1251757 [66] NOVOVIČOVÁ, J., PUDIL, P. (1997). Feature selection and classification by modified model with latent structure. In: Dealing With Complexity: Neural Network Approac. Springer Verlag, Vol. 1997, pp. 126–140. [67] NOVOVIČOVÁ, J., PUDIL, P., KITTLER, J. (1996). Divergence Based Feature Selection for Multimodal Class Densities. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 18, Issue 2, pp. 218–223. DOI: 10.1109/34.481557 [68] O’TOOLE, T., DONALDSON, B. (2000). Managing buyer-supplier relationship archetypes. Irish Marketing Review, Vol. 13, Issue 1, pp. 12–20. [69] POKORNÁ, J., ČÁSTEK, O. (2013). How to measure organizational performance in search for factors of competitiveness. Acta universitatis agriculturae et silviculturae Mendelianae Brunensis, Vol. LXI, Issue 2, pp. 451–461. DOI: 10.11118/actaun201361020451 [70] POWELL, T, DENT-MICALLEF, A. (1997). Information Technology as Competitive Advantage: The Role of Human, Business, and Technology Resources. Strategic Management Journal. Vol. 18, Issue 5, pp. 375-405. DOI: 10.1002/(SICI)1097- 0266(199705)18:5<375::AID-SMJ876>3.0.CO;2-7 [71] PRESTON, L. E., O’BANNON, D. P. (1997). The Corporate Social-Financial Performance Relationship. Business and Society. Vol. 36, Issue 4, pp. 419–420. DOI: 10.1177/000765039703600406 [72] PUDIL, P., BLAŽEK, L., SOMOL, P., ČÁSTEK, O., GRIM, J. (2013). Identification of Corporate Competitiveness Factors – Comparing Different Approaches. In: Proceedings of the International Conference on Management, Leadership and Governance. Bangkok, 07.02.2013 – 08.02.2013. Reading: Academic Conferences and Publishing International Limited, pp. 259–267. ISBN 978-1-909507-01-2. [73] PUDIL, P., BLAŽEK, L., SOMOL, P., POKORNÁ, J., PIROŽEK, P. (2012). 
Searching Factors of Corporate Competitiveness Using Statistical Pattern Recognition Techniques, In: Proceedings of the 8th European Conference on Management Leadership and Governance. Pafos, 08.11.2012 – 09.11.2012. Reading: Academic Conferences and Publishing International Limited, pp. 556–559. ISBN 978-1-908272-76-8. [74] PUDIL, P., NOVOVIČOVÁ, J., CHOAKJARERNWANIT, N., KITTLER, J. (1995). Feature selection based on approximation of class densities by finite mixtures of special type. Pattern Recognition, Vol. 28, Issue 9, pp. 1389–1398. DOI: 10.1016/0031- 3203(94)00009-B [75] PUDIL, P., PIROŽEK, P., SOMOL, P. (2002). Selection of Most Informative Factors in Merger and Acquisition Process by Means of Pattern Recognition. Signal Processing, Pattern Recognition, and Application, IASTED, ACTA Press, pp. 224–229. [76] PUDIL, P., NOVOVIČOVÁ, J. (1998). Novel Methods for Subset Selection with Respect to Problem Knowledge. IEEE Transactions on Intelligent Systems, Special Issue on Feature Transformation and Subset Selection, Vol. 453, pp. 66–74. [77] PUDIL, P., NOVOVIČOVÁ, J., KITTLER, J. (1994). Floating Search Methods in Feature Selection. Pattern Recognition Letters, Vol. 15, Issue 11, pp. 1119–1125. DOI: 10.1016/0167-8655(94)90127-9","Castek_1_Castek 23.1.2015 7:38 Stránka 164 164 8 BIBLIOGRAPHY [78] PUDIL, P., SOMOL, P. (2008). Identifying the most Informative Variables for Decision- Making Problems – a Survey of Recent Approaches and Accompanying Problems. Acta Oeconomica Pragensia, Vol. 2008, Issue 4, pp. 37–55. [79] RAUDYS, Š. (2006). Feature Over-Selection. Lecture Notes in Computer Science LNCS Springer, Vol. 4109, pp. 622–631. DOI: 10.1007/11815921_68 [80] REUNANEN J. (2003). Overfitting in making comparisons between variable selection methods, Journal of Machine Learning Research. Vol. 3, 2003, pp. 1371–1382. [81] RICHARD, P.J. ET AL. (2009). Measuring organizational performance: towards methodological best practice. Journal of Management, Vol. 
35, Issue 3, pp. 718–804. DOI: 10.1177/0149206308330560 [82] RIPLEY, B. (1996). Pattern Recognition and Neural Networks. Cambridge: Cambridge University Press, Massachusetts, ISBN 0-521-46086-7. [83] RIPLEY, B. (2005). Pattern Recognition and Neural Networks. Cambridge: Cambridge University Press, Massachusetts, ISBN 978-052171770. [84] SAEYS Y., INAKI I. I., LARRANGA P. L. (2007). A review of feature selection techniques in bioinformatics, Bioinformatics, 23(19), 2007, pp. 2507–2517. DOI: 10.1093/bioinformatics/btm344 [85] SALAPPA, A., DOUMPOS, M., ZOPOUNIDIS, C. (2007). Feature selection algorithms in classification problems: an experimental evaluation. Optimization Methods and Software, Vol. 22, Issue 1, pp. 199–214. DOI: 10.1080/10556780600881910 [86] SÁNCHEZ-BALLESTA, J. P., GARCÍA-MECA, E. (2007). A Meta-Analytic Vision of The Effect of Ownership Structure on Firm Performance. Corporate Governance, Vol. 15, Issue 5, pp. 879-892. [87] SCHWAB, K. (2012). The Global Competitiveness Report 2012 – 2013. Ženeva: World Economic Forum. 545 s. [cit 10. 7. 2014], available on WWW : . [88] SEBASTIANI, F. (2002). Machine learning in automated text categorization. ACM Computing Surveys, Vol. 34, Issue 1, pp. 1–47. DOI: 10.1145/505282.505283 [89] SEBBAN, M., NOCK, R. (2002). A hybrid filter/wrapper approach of feature selection using information theory. Pattern Recognition, Vol. 35, Issue 4, pp. 835–846. DOI: 10.1016/S0031-3203(01)00084-X [90] ŠEDOVÁ, J. (2007). Vlastníci a jejich vztah k podniku. Vývojové tendence podniků III (Svazek I). Brno: Masarykova univerzita, pp. 137–180, ISBN 978-80-210-4466-1. [91] SIEDLECKI, W., SKLANSKY, J. (1993). On automatic feature selection. World Scientific Publishing Co., Inc.: River Edge, NJ, USA. pp. 63–87. [92] ŠIMBEROVÁ, I. (2008). Řízení vztahů se stakeholdry na průmyslových trzích v kontextu současných marketingových koncepcí. Vědecké spisy vysokého učení technického v Brně. Brno: VUTIUM. Sv. 251. p. 38. ISSN 1213-418X. 
[93] SIMONOFF, J. S. (1996). Smoothing Methods in Statistics. Springer. ISBN 0-387-94716- 7. [94] ŠIŠKA, J. (2012). Analýza výsledků dotazníkového šetření v nadnárodních společnostech: Bachelor diploma thesis. Brno: Masarykova univerzita, Head of the thesis: KRÁLOVÁ, M. [95] ŠIŠKA, L. (2008). Skupiny podniků vytvořené dle ukazatelů hospodářské úspěšnosti. Konkurenční schopnost podniků (Analýza faktorů hospodářské úspěšnosti). Brno: Masarykova univerzita, pp. 51 – 66, ISBN 978-80-210-4734-1.","Castek_1_Castek 23.1.2015 7:38 Stránka 165 8 BIBLIOGRAPHY 165 [96] SKURICHINA, M. (2001). Stabilizing Weak Classifiers. PhD thesis, Pattern Recognition Group, Delft University of Technology, Netherlands. [97] SOMOL, P., BAESENS, B., PUDIL, P., VANTHIENEN, J. (2005). Filter- versus wrapper- based feature selection for credit scoring. International Journal of Intelligent Systems, Vol. 20, Issue 10, pp. 985–999. DOI: 10.1002/int.20103 [98] SOMOL, P., GRIM, J., PUDIL, P. (2011). Fast Dependency-Aware Feature Selection in Very-High-Dimensional Pattern Recognition. In: Proceedings of the IEEE International Conference on Systems, Man and Cybernetics. Anchorage, 09. 10. 2011 – 12. 10. 2011. San Diego: IEEE Computer Society, ISBN 978-1-4577-0652-3. [99] SOMOL, P., NOVOVIČOVÁ, J. (2008a). Evaluating the stability of feature selectors that optimize feature subset cardinality. In Structural, Syntactic, and Statistical Pattern Recognition, Vol. 5342, pp. 956–966. [100]SOMOL, P., NOVOVIČOVÁ, J., GRIM, J., PUDIL, P. (2008b). Dynamic oscillating search algorithms for feature selection. In: Pattern Recognition, 2008. ICPR 2008. 19th International Conference on. IEEE, pp. 1–4. [101]SOMOL, P., NOVOVIČOVÁ, J., PUDIL, P. (2006). Flexible-hybrid sequential floating search in statistical feature selection. In: Structural Syntactic and Statistical Pattern Recognition, Vol. 4109, pp. 632-639. [102]SOMOL, P., PUDIL, P. (2000). 
Oscillating search algorithms for feature selection. In: Proceedings of the 15th IAPR Int. Conference on Pattern Recognition, Conference B: Pattern Recognition and Neural Networks, pp. 406–409. [103] SOMOL, P., PUDIL, P., ČÁSTEK, O., POKORNÁ, J. (2014). Improved Model for Attribute Selection on High-Dimensional Economic Data. In: Proceedings of the 2nd International Conference on Management, Leadership and Governance [CD]. Boston, 20.03.2014 – 21.03.2014. Reading: Academic Conferences and Publishing International Limited, pp. 276–285. ISBN 978-1-909507-99-9. [104] SOMOL, P., NOVOVIČOVÁ, J. (2010). Evaluating Stability and Comparing Output of Feature Selectors that Optimize Feature Subset Cardinality. IEEE Transactions on PAMI, Vol. 32, Issue 11, pp. 1921–1939. DOI: 10.1109/TPAMI.2010.34 [105] SOMOL, P., PUDIL, P. (2000). Oscillating Search Algorithms For Feature Selection. Proc. 15th IAPR International Conference on Pattern Recognition, Barcelona, Spain, pp. 406–409. [106] SOMOL, P., PUDIL, P., KITTLER, J. (2004). Fast Branch \& Bound Algorithms for Optimal Feature Selection. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 26, Issue 7, pp. 900–912. DOI: 10.1109/TPAMI.2004.28 [107] SOMOL, P., PUDIL, P., NOVOVIČOVÁ, J., PACLÍK, P. (1999). Adaptive Floating Search Methods in Feature Selection. Pattern Recognition Letters, Vol. 20, Issue 11, 12, 13, pp. 1157–1163. DOI: 10.1016/S0167-8655(99)00083-5 [108] ŠPALEK, J., ČÁSTEK, O. (2010). Přínos učících se metod statistického rozpoznávaní obrazu při hledáni konkurenceschopnosti českých podniku (in Czech). Journal of Economics, Vol. 58, Issue 9, pp. 922–937. [109] STEARNS, S. D. (1976). On selecting features for pattern classifiers. In: Proceedings of the 3rd International Joint Conference on Pattern Recognition. pp. 71–75. [110] SUCHÁNEK, P. (2005). Hodnocení konkurenceschopnosti podniku. Vyvojové tendence podniků (Svazek I.). Brno: Masarykova univerzita, pp. 271–278, ISBN 80-210-3847-0.
[111] THEODORIDIS, S., KOUTROUMBAS, K. (2006). Pattern Recognition. USA: Academic Press, 3rd edition.","[112] TSAMARDINOS, I., ALIFERIS, C. (2003). Towards Principled Feature Selection: Relevancy, Filters, and Wrappers. In: Proceedings of the ninth international workshop on Artificial Intelligence and Statistics. Morgan Kaufmann Publishers: Key West, FL, USA. [113] VAFAIE, H., IMAM, I. (1994). Feature Selection Methods: Genetic Algorithms vs. Greedy-like Search. In: Proceedings of the International Conference on Fuzzy and Intelligent Control Systems. [114] WEBB, A. (2002). Statistical Pattern Recognition, 2nd ed., Chichester: John Wiley \& Sons, 2002. ISBN 0-470-84514-7. [115] WHITE, R. E. (1986). Generic Business Strategies, Organizational Context and Performance: An Empirical Investigation. Strategic Management Journal, Vol. 7, Issue 3, pp. 217–231. DOI: 10.1002/smj.4250070304 [116] WHITNEY, A. W. (1971). A direct method of nonparametric measurement selection. IEEE Trans. Comput. Vol. 20, Issue 9, pp. 1100–1103. DOI: 10.1109/T-C.1971.223410 [117] WILLIAMSON, P. J., VERDIN, P. J. (1992). Age, experience and corporate synergy: When are they sources of business unit advantage? British Journal of Management, Vol. 3, Issue 4, pp. 221–235. DOI: 10.1111/j.1467-8551.1992.tb00047.x [118] XING, E. P. (2003). Feature Selection in Microarray Analysis. Springer, 2003, pp. 110–129. [119] YANG, Y., PEDERSEN, J. O. (1997). A comparative study on feature selection in text categorization. In: ICML '97: Proc. 14th Int. Conf. on Machine Learning. Morgan Kaufmann Publishers Inc., San Francisco 1997, pp. 412–420. [120] YANG, J., HONAVAR, V. (1998). Feature Subset Selection Using a Genetic Algorithm. IEEE Intelligent Systems. Vol. 453, Issue 13, pp. 44–49. DOI: 10.1109/5254.671091 [121] YILMAZ, C., ALPKAN, L., ERGUN, E. (2005).
Cultural determinants of customer- and learning-oriented value systems and their joint effects on firm performance. Journal of Business Research. Vol. 58, Issue 10, pp. 1340–1352. DOI: 10.1016/j.jbusres.2004.06.002 [122] YOUNDT, M. A., SNELL, S. A., DEAN, J. W., LEPAK, D. P. (1996). Human resource management, manufacturing strategy and firm performance. Academy of Management Journal, Vol. 39, Issue 4, pp. 836–866. [123] YU, L., LIU, H. (2003). Feature selection for high-dimensional data: A fast correlation-based filter solution. In: Proceedings of the 20th International Conference on Machine Learning, pp. 56–63. [124] YUSTA, S. C. (2009). Different metaheuristic strategies to solve the feature selection problem. Pattern Recogn. Lett., Vol. 30, Issue 5, pp. 525–534. DOI: 10.1016/j.patrec.2008.11.012","List of Figures Figure 1-1: The methodology of previous research activities 13 Figure 1-2: Methodology of current research activities 17 Figure 2-3: Quintiles method 31 Figure 2-4: Descriptive statistics of informativeness values produced by different methods 34 Figure 2-5: Summary of the SFFS algorithm output for the exponent method 34 Figure 2-6: Fluctuation analysis 35 Figure 3-7: Peaking Phenomenon – correct classification rate is a non-monotonic function of the number of features 43 Figure 3-8a: Scheme of FS 43 Figure 3-8b: Scheme of FE 43 Figure 3-9: Feature selection algorithms can be viewed as black box procedures generating a sequence of candidate subsets with respective criterion values, among which intermediate solutions are chosen 47 Figure 3-10: In this 2D case neither feature 1 nor feature 2 alone is sufficient to distinguish patterns from the classes of rectangles and circles.
Only when information from both features is combined can the classes be separated (dotted line) 48 Figure 3-11: Sequential Forward Floating Selection Algorithm 50 Figure 3-12: Comparing the course of search (current subset size depending on time) in standard sequential search methods 51 Figure 3-13: Sub-optimal FS methods’ optimization performance on 3-NN wrapper 60 Figure 3-14: Sub-optimal FS methods’ performance verified using 3-NN on independent data 60 Figure 4-15a: Informativeness of isolated features 66 Figure 4-15b: Synergic effect of more features 66 Figure 4-16: Cases of possible univariate analysis failure (illustrative example). Let the dots represent the companies with poor CFP and the rectangles represent the well-performing companies. Univariate analysis is not capable of revealing a) competitiveness factor redundancy, b) multivariate factor dependency leading to a crucial model accuracy improvement in a subspace of more than one dimension. 69 Figure 4-17: Comparing SFFS and DAF stability and performance when selecting 5 or 15 features out of 37, with k-NN for k=1, 2, 3.
Diamonds show k-NN accuracy achieved; other lines show stability 70 Figure 4-18: Given existing samples S_1, .., S_3 and distance function d(), the y_n value of sample S_n is predicted as the sum over w_i * y_i for i=1, .., 3, where w_i reflects the d(S_n, S_i) distance from the 1-D Gaussian kernel centered in S_i 75 Figure 4-19: Errors of four regression models – each dot represents a company, positioned according to its Assets Growth (GA) and Return on Assets (ROA) values, a higher dot diameter depicts a higher regression error; black and grey colours depict its positive or negative value, respectively: a) 1-Nearest Neighbour with missing values substituted by mean values, b) Kernel Regressor with missing values substituted by mean values, c) 1-Nearest Neighbour with missing values","treated pessimistically, d) (best) Kernel Regressor with missing values treated pessimistically. 76 Figure 4-20: Importance of single company characteristics according to the best-achieved regression model. The graph represents growing subsets of features, features added according to the highest DAF1 coefficients. Note that model accuracy markedly improves after adding the first 8 features, and then after adding roughly the next 7 features. 77 Figure 4-21: Importance of single company characteristics according to best-achieved classification using 1-NN classifier and 3-class data. The graph represents growing subsets of features, features added according to highest DAF1 coefficients. Note that model accuracy markedly improves after adding the first 8 features, and then after adding roughly the next 10 features. 79 Figure 4-22: Comparing Euclidean distance to alternative distances.
Higher-order distance functions improve the accuracy of the kernel-based regression model on subspaces roughly of up to D/2 where D is the total number of features 81 Figure 4-23: Comparing accuracy of a pseudo-kernel regression model with a globally optimized multiplication constant m to a model with a locally optimized m. Note that a local optimization of m (optimization on a subspace instead of on the full space) improves accuracy when a small number of features is used 81 Figure 4-24: Comparing the accuracy of the default model to that of the improved model. Comparison provided separately for the 37- and 74-dimensional datasets representing the same domain. Default configuration compared to the best configuration identified in this paper 85 Figure 5-25: The effect of industry on financial performance controlled for the company size 96 Figure 5-26: The effect of company size on financial performance controlled for the industry 97 Figure 5-27: The combined effect of company size and industry on financial performance 97 List of Tables Table 3-1: Evolution of sequential search methods 49 Table 3-2: Single features in descending order, first best 7 then last worst 7, according to individual criterion values (i.e., “individual discriminative power”), 4-class, 38-dimensional Australian credit scoring data 53 Table 4-3: Analogy of problems of statistical pattern recognition and the problem of determining competitiveness factors 68 Table 4-4: Comparing feature orderings yielded by model using Euclidean distance and L7 distance 82 Table 4-5: Comparing feature orderings yielded by a model using a globally versus locally optimized kernel width multiplier constant. Euclidean distance is used in both cases.
84 Table 4-6: Top 25 features selected using the optimized model on 74-dim and 37-dim data 86","Table 5-7: Frequency distribution of a categorized company size 93 Table 5-8: The association between the company size and the financial performance 93 Table 5-9: The association between the industry and the financial performance 94 Table 5-10: The association between the industry and the financial performance, controlling for the size of the company: statistics 95 Table 5-11: The association between the industry and the financial performance, controlling for the size of the company: differences in CFP 95 Table 5-12: The influence of company size on financial performance, controlling for the industry: statistics 96 Table 5-13: Variables describing the internal environment of a company: bivariate correlations with CFP 98 Table 5-14: Summary of the internal factors regression model 99 Table 5-15: Coefficients of the internal factors regression model 100 Table 5-16: Variables describing the external environment of a company: bivariate correlations with CFP 101 Table 5-17: Summary of the external factors regression model 102 Table 5-18: Coefficients of the external factors regression model 103 Table 5-19: Contents of the variables describing the stakeholder orientation and characteristics of an organizational structure 104 Table 5-20: Variables describing stakeholder orientation of an enterprise and characteristics of an organizational structure: bivariate correlations with CFP 105 Table 5-21: Summary of the regression model describing stakeholder orientation of an enterprise and characteristics of its organizational structure 106 Table 5-22: Coefficients of the regression model describing stakeholder orientation of an enterprise and characteristics of its organizational structure 107 Table 5-23: Contents of variables describing the ownership and property structure 108 Table 5-24: Variables describing the ownership and property
structure: bivariate relationships with CFP 109 Table 5-25: Summary of the regression model describing the ownership and property structure 110 Table 5-26: Coefficients of the regression model describing the ownership and property structure 111 Table 5-27: Contents of variables describing the stakeholder group of employees 112 Table 5-28: Variables describing the stakeholder group of employees: bivariate relationships with CFP 113 Table 5-29: Summary of the regression model describing the stakeholder group of employees 115 Table 5-30: Coefficients of the regression model describing the stakeholder group of employees 115 Table 5-31: Contents of variables describing customers of a company 117 Table 5-32: Variables describing customers: bivariate relationships with CFP 117 Table 5-33: Summary of the regression model describing the stakeholder group of customers 119 Table 5-34: Coefficients of the regression model describing the stakeholder group of customers 119","Table 5-35: Contents of variables describing suppliers of a company 120 Table 5-36: Variables describing suppliers of a company: bivariate relationships with CFP 121 Table 5-37: Summary of the regression model describing the stakeholder group of suppliers 122 Table 5-38: Coefficients of the regression model describing the stakeholder group of suppliers 123 Table 5-39: Variables describing CSR activities of a company: bivariate relationships with CFP 124 Table 5-40: Summary of the regression model describing corporate social performance 125 Table 5-41: Coefficients of the regression model describing corporate social performance 125 Table 5-42: Summary of the comprehensive regression model 127 Table 5-43: Coefficients of the comprehensive regression model 128 Table 6-44: Overview of variables 130 Table 6-45: Summary of frequencies of categorical variables in the sample, group A and its subgroups 153 Table 6-46: Summary of average values of chosen variables in the
sample, group A and its subgroups 155 List of Graphs Graph 2-1: Clusters of cluster analysis 27 Graph 2-2: Hyperbola 29 Graph 6-3: Relative frequency of quartiles in manufacturing and construction 131 Graph 6-4: Relative frequencies of quartiles in private limited companies and in joint stock companies 132 Graph 6-5: Relative frequencies of quartiles according to company size 133 Graph 6-6: Relative frequencies of quartiles according to concern membership 133 Graph 6-7: Relative frequencies of quartiles in companies with and without FDI 134 Graph 6-8: Relative frequencies of quartiles according to the ownership concentration 135 Graph 6-9: Relative frequencies of quartiles according to generic strategy 136 Graph 6-10: Share of companies with predominant exports in the quartiles 137 Graph 6-11: The average span of control in the quartiles 137 Graph 6-12: The average upper limit of the motivation components of workers’ wages in the quartiles 138 Graph 6-13: The average upper limit of the motivation components of top-managers’ salaries in the quartiles 139 Graph 6-14: The average percentage of technical and administrative staff in the quartiles 139 Graph 6-15: The average percentage of workers in the quartiles 140 Graph 6-16: The average percentage of women in the quartiles 141 Graph 6-17: The average frequency of reasons for the staff turnover rate (from the employee’s initiative) in the quartiles 141 Graph 6-18: The average tangible assets level in the quartiles 142","","Board of Scientific Editors: prof. PhDr. Ladislav Rabušic, CSc. Mgr. Iva Zlatušková Ing. Radmila Droběnová, Ph.D. Mgr. Michaela Hanousková doc. PhDr. Jana Chamonikolasová, Ph.D. doc. JUDr. Josef Kotásek, Ph.D. Mgr. et Mgr. Oldřich Krpec, Ph.D. prof. PhDr. Petr Macek, CSc. PhDr. Alena Mizerová doc. Ing. Petr Pirožek, Ph.D. Mgr. Petra Polčáková doc. RNDr. Lubomír Popelínský, Ph.D. Mgr. Kateřina Sedláčková, Ph.D. prof. RNDr. David Trunec, CSc. prof. MUDr.
Anna Vašků, CSc. prof. PhDr. Marie Vítková, CSc. doc. Mgr. Martin Zvonař, Ph.D. Pavel Pudil, Ladislav Blažek, Ondřej Částek, Petr Somol, Jana Pokorná, Maria Králová Identifying Corporate Performance Factors Based on Feature Selection in Statistical Pattern Recognition. METHODS, APPLICATION, INTERPRETATION Editorial Board: doc. Ing. Petr Pirožek, Ph.D., doc. Ing. Petr Suchánek, Ph.D., doc. RNDr. Milan Viturka, CSc., doc. Ing. Vladimír Hyánek, Ph.D., Ing. Eva Hýblová, Ph.D., Ing. Daniel Němec, Ph.D., Ing. Mgr. Markéta Matulová, Ph.D. Cover design: RNDr. Petr Somol, Ph.D. Layout and typeset: Lea Novotná Published by Masaryk University in cooperation with University of Economics, Prague Brno 2014 First Edition ISBN 978-80-210-7557-3 ISBN 978-80-210-7672-3 (online : pdf) DOI: 10.5817/CZ.MUNI.M210-7557-2014",""];