
Contributions to Industry 4.0 issues

As a consequence of Industry 4.0, the labour market is changing drastically, because technological change requires workers to obtain new skills and competences. One source of skills is education, which needs to prepare for future changes.

My dissertation contributes to defining the relationship between education and the labour market. Examining the horizontal match of education programs and jobs is a hard problem. The method developed in this thesis contributes to a more accurate examination of the early career paths of graduates, based on an administrative database covering the whole cohort of graduates.

The number of STEM graduates and/or the number of people working in STEM jobs are often used as indicators of the readiness of regions for Industry 4.0. Still, the relationship between these two indicators has not been examined in depth. Based on the results, it can be concluded that the two indicators show similar phenomena. The matching of engineering programs and jobs is high. However, not all STEM graduates work in STEM jobs, and not only STEM graduates work in STEM jobs.

Chapter 3

Modularity based attractivity in a spatial network

Abstract How are ownership relationships distributed in geographical space? Is physical proximity a significant factor in investment decisions? What is the impact of the capital city? How can the structure of investment patterns characterize the attractiveness and development of economic regions? To explore these issues, we analyse the network of company ownership in Hungary and determine how connections are distributed in geographical space. Based on the calculation of the internal and external linking probabilities, we propose several measures to evaluate the attractiveness of towns and geographic regions. Community detection based on several null models indicates that modules of the network coincide with administrative regions, in which Budapest is the absolute centre, and where county centres function as hubs. Gravity model-based modularity analysis highlights that, besides the strong attraction of Budapest, geographical distance has a significant influence over the frequency of connections and the target nodes play the most significant role in link formation, which confirms that the analysis of the directed company-ownership network gives a good indication of regional attractiveness.

3.1 Introduction

Mining valuable information from social networks is a hard problem due to their dynamic nature [115, 116], complex structure [117, 118] and multidimensionality [119]. This chapter deals with the structural issues, as it tries to evaluate regional attractiveness based on a set of goal-oriented null models identified to describe the geographical distribution of company ownership relations.

Complex multivariate socio-economic data is widely used to monitor regional policy [120, 121]. As the usage of different sets of variables results in different rankings, the definition and selection of socio-economic variables is the key issue in these applications. The drawback of these indicator-based approaches is that, although economic behaviour is socially constructed, embedded in networks of interpersonal relations [122] and strongly related to location [123], the network structure of the economy is neglected.

This chapter adds a viewpoint to regional studies based on the analysis of how the network of personal investments and the founding of companies relate to the settlement hierarchy. We assume that the socially embedded economy must have a network-based imprint in the company-ownership network which is a good indication of regional attractiveness.

Attractiveness is meaningful in preferential attachment networks, where the likelihood of a new connection is proportional to the degree [51] and fitness [124] of the node. These models were generalized to handle initial attractiveness [125] and latecomer nodes with a higher degree of fitness [32, 124]. It is important to note that these models generate power-law (degree) distributions that are similar to the distribution of socio-economic variables of settlements, indicating that preferential attachment is a process that can be used to describe city growth [126, 127, 128, 129, 130]. In the case of geographically distributed networks, the likelihood of link formation depends on distance due to the cost of establishing connections and spatial constraints [41].
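As a rough illustration of the attachment rule described above, the following sketch computes the probability that a new link attaches to each existing node when that probability is proportional to the product of degree and fitness. The function name `attachment_probabilities`, the multiplicative form and the example values are assumptions chosen for illustration; they are not taken from the cited models or from the data studied in this chapter.

```python
def attachment_probabilities(degrees, fitness):
    """Return P(i) proportional to degree_i * fitness_i.

    The multiplicative combination is an illustrative assumption.
    """
    if len(degrees) != len(fitness):
        raise ValueError("degrees and fitness must have the same length")
    scores = [k * f for k, f in zip(degrees, fitness)]
    total = sum(scores)
    if total <= 0:
        raise ValueError("at least one node needs positive degree and fitness")
    # Normalize so that the probabilities sum to one.
    return [s / total for s in scores]


# Three hypothetical nodes: a hub, a mid-sized node and a fit latecomer.
probs = attachment_probabilities([10, 5, 1], [1.0, 1.0, 5.0])
```

In this toy setting the latecomer with degree 1 and fitness 5 attracts new links as often as the node with degree 5 and fitness 1, which is the effect that fitness-based generalizations are meant to capture.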

Connection costs also favour the formation of cliques and thus increase the clustering coefficient [131]. Space is important in social networks as most individuals connect with their spatial neighbours [131] to minimize their effort and maintain social ties [132], e.g. the majority of our friends are in our spatial neighbourhood [133]. The probability $P(d)$ that distance $d$ separates two connected individuals is found to behave as $P(d) \sim d^{-2}$ in terms of Belgian mobile phone data [134], or generally $P(d) \sim d^{-\delta}$, as has been shown in the case of the social network of more than one million bloggers in the USA [135], in the friendship network of Facebook users, and in email communication networks [136, 137].
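The distance-decay relation above can be made concrete with a small sketch that estimates the exponent δ from pairs of distances and connection probabilities by a least-squares fit on log-log axes. This is a simple estimator shown for illustration only, not necessarily the fitting procedure used in the cited studies; the function name `fit_decay_exponent` and the synthetic data are assumptions.

```python
import math


def fit_decay_exponent(distances, probabilities):
    """Estimate delta in P(d) ~ d**(-delta) by least squares on log-log data."""
    if len(distances) != len(probabilities) or len(distances) < 2:
        raise ValueError("need at least two (distance, probability) pairs")
    xs = [math.log(d) for d in distances]
    ys = [math.log(p) for p in probabilities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # The slope of log P against log d equals -delta.
    return -sxy / sxx


# Synthetic, noise-free data generated with delta = 2 (not real measurements).
distances = [1.0, 2.0, 4.0, 8.0, 16.0]
probabilities = [d ** -2 for d in distances]
delta = fit_decay_exponent(distances, probabilities)
```

On noisy empirical data the result depends on how distances are binned, so a fit of this kind is best treated as a first approximation.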

The attractiveness of airports [138], countries for foreign investments [139] and tourist destinations [140] is evaluated based on socio-economic variables. As many origins and destinations are present in these applications, the theory of bilateral trade flows accounts for the relative attractiveness of origin-destination pairs. The gravity model is one of the most successful empirical models implemented in economics to describe such interactions across space [141]. Originally developed by Newton, the law of gravity can also be used in economics to describe the extent of interactions between two points of mass in networks. Almost 40 years ago, before the emergence of network science, Anderson suggested that, like a force between two mass points, the number of trips from location i to location j follows the economic version of the 'gravity' law, $F(d) \sim P(d) \sim I_1^{\alpha} I_2^{\alpha} d^{-\delta}$ [142]. Nowadays, many complex networks are embedded in space, and spatial constraints may affect their connectivity patterns. Examples such as trade markets [143], migration [144], traffic flow [145] and mobile communication [134] can be successfully modelled by a gravity model, which has also been applied successfully in link prediction [146].
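To illustrate the gravity law quoted above, the sketch below evaluates an expected interaction strength between two locations from their "masses" and their distance. The function name `gravity_weight`, the `scale` constant and the default exponents (α = 1, δ = 2) are placeholder assumptions, not values estimated in this work.

```python
def gravity_weight(mass_i, mass_j, distance, alpha=1.0, delta=2.0, scale=1.0):
    """Expected interaction ~ mass_i**alpha * mass_j**alpha * distance**(-delta).

    alpha, delta and scale are placeholder values for illustration only.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return scale * (mass_i ** alpha) * (mass_j ** alpha) / (distance ** delta)


# Two hypothetical towns with the same masses, once 10 and once 20 units apart.
near = gravity_weight(1000.0, 2000.0, 10.0)
far = gravity_weight(1000.0, 2000.0, 20.0)
```

With δ = 2, doubling the distance reduces the expected interaction by a factor of four, which is the kind of distance dependence a gravity-based null model is intended to absorb.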

It is assumed that regions that rely heavily on local resources have more internal connections, which form modules in networks, so the modularity of networks that reflect socio-economic relationships can be used to measure regional attractiveness. The goal of modularity analysis is to separate the network into groups of vertices that have fewer connections between them than inside the communities [33]. In social network analysis, community detection is a basic step in understanding the structure, function and semantics of networks [118]. Community analysis is performed in two separate phases: first, detection of a meaningful community structure in a network, and second, evaluation of the appropriateness of the detected community structure [72]. Systematic deviations from a random configuration allow us to define a quantity called modularity, which is a measure of the quality of partitions. Newman-Girvan modularity considers only the degree of nodes as a null model, which is equivalent to rewiring the network whilst preserving the degree sequence [147, 60]. This random model overlooks the spatial nature of the network, so its modules are blind to spatial anomalies and it fails to uncover modules determined by factors other than mere physical proximity [41], which is why several distance-dependent null models have been proposed recently [148, 72, 41, 149].
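The degree-based null model can be sketched in a few lines. The following example computes a Newman-Girvan style modularity for a directed, weighted edge list, using the standard expectation that the weight between nodes i and j is proportional to the out-strength of i times the in-strength of j. The function name, the edge-list representation and the toy network are assumptions made for illustration and do not reproduce the implementation used in this work.

```python
from collections import defaultdict


def newman_girvan_modularity(edges, community):
    """Directed, weighted modularity with a degree-preserving null model.

    edges: iterable of (source, target, weight)
    community: dict mapping every node to a community label
    """
    edges = list(edges)
    total = sum(w for _, _, w in edges)
    if total <= 0:
        raise ValueError("the network needs positive total weight")
    out_by_comm = defaultdict(float)
    in_by_comm = defaultdict(float)
    observed_inside = 0.0
    for source, target, weight in edges:
        out_by_comm[community[source]] += weight
        in_by_comm[community[target]] += weight
        if community[source] == community[target]:
            observed_inside += weight
    # Expected weight inside each community under the degree-based null model.
    expected_inside = sum(
        out_by_comm[c] * in_by_comm[c] for c in list(out_by_comm)
    ) / total
    return (observed_inside - expected_inside) / total


# Toy network: two disconnected triangles, each edge stored in both directions.
triangle_a = [("a1", "a2"), ("a2", "a3"), ("a1", "a3")]
triangle_b = [("b1", "b2"), ("b2", "b3"), ("b1", "b3")]
edges = []
for u, v in triangle_a + triangle_b:
    edges.append((u, v, 1.0))
    edges.append((v, u, 1.0))
split = {n: n[0] for n in ["a1", "a2", "a3", "b1", "b2", "b3"]}
merged = {n: "all" for n in split}
q_split = newman_girvan_modularity(edges, split)
q_merged = newman_girvan_modularity(edges, merged)
```

Placing each triangle in its own community gives a clearly positive score, whereas putting every node in a single community gives zero, since observed and expected internal weights then coincide.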

The goal is to use the tools of network community detection to evaluate the attractiveness of the elements of settlement hierarchies (towns, statistical sub-regions, counties, regions) based on their modularities as well as their internal and external connection densities. The internal connections of the ownership network were studied from the point of view of Newman-Girvan, spatial and gravity-based null models. As modularity is based on the difference between the actual edge weights and the values estimated by the null model, the more accurately the null model describes the real spatial network, the closer the total modularity is to zero, so the remaining modules highlight hidden structural similarities. A visualization technique was developed to analyse these unknown effects on community structure, which can explain the attractiveness of a settlement or region.
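The remark that a better null model drives the total modularity towards zero can be checked with a minimal sketch in which the null model is supplied explicitly as a table of expected edge weights. The function name `modularity_with_null`, the dictionary representation and the toy weights are hypothetical and only serve to illustrate the idea.

```python
def modularity_with_null(observed, expected, community):
    """Sum of (observed - expected) weights inside communities, normalized.

    observed, expected: dicts mapping (source, target) to a weight
    community: dict mapping every node to a community label
    """
    total = sum(observed.values())
    if total <= 0:
        raise ValueError("the observed network needs positive total weight")
    score = 0.0
    for source, target in set(observed) | set(expected):
        if community[source] == community[target]:
            score += observed.get((source, target), 0.0)
            score -= expected.get((source, target), 0.0)
    return score / total


# Toy directed network on three hypothetical nodes.
observed = {("a", "b"): 3.0, ("b", "a"): 1.0, ("a", "c"): 2.0}
community = {"a": 0, "b": 0, "c": 1}
nodes = ["a", "b", "c"]

# A crude null model: the total weight spread evenly over all ordered pairs.
flat_null = {(i, j): 1.0 for i in nodes for j in nodes if i != j}
# An idealized null model that reproduces the observed weights exactly.
perfect_null = dict(observed)

q_flat = modularity_with_null(observed, flat_null, community)
q_perfect = modularity_with_null(observed, perfect_null, community)
```

The crude null model leaves a positive residual inside the community of `a` and `b`, while the idealized one leaves none; in practice, whatever structure survives a well-fitting spatial or gravity null model is what is not explained by size and distance.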

Besides measuring attractiveness, the Louvain community detection algorithm [150, 151] was utilized to identify closely related regions. The complete investment network of Hungarian companies was examined to explore how the ownership connections are geographically distributed, what the structure of the network is, what the common connection directions are, and how the extracted information is correlated with the settlement hierarchy. The studied database contains information about the owners and addresses of the companies.

The results highlight that the distance dependence of the investment connections is more significant than that found in online social networks [137, 152, 133]. The analysis shows that the network is hierarchical and modular, and that it is shaped according to the settlement hierarchy, in which Budapest is the absolute centre and the county centres function as hubs.

The outline of this chapter is as follows: Section 3.2.1 presents the company ownership network and the metrics related to attractiveness. Section 3.2.3 describes the null models designed in this work to measure modularity and handle physical proximity, and presents how closely related regions can be explored based on the modularity-related merging of towns and sub-regions. The results and discussion are provided in Section 3.3.

3.2 Problem formulation: settlement hierarchy and