Prolegomenon

This document publishes all data used in analysing the Nationale Forschungsdateninfrastruktur (NFDI) and its consortia. The data source is the binding Letters of Intent (LoI) of the consortia, in which they name their collaboration partners.1 The analysis only includes documents that were submitted to the DFG as binding pre-applications (binding Letters of Intent) in 2019 and 2020. Consortia that did not submit a binding Letter of Intent in 2019 or 2020 are not included. All data reflect the state at the time the LoI were submitted.


  1. See the GitHub repository of Dorothea Strecker (https://github.com/dorothearrr/NFDI_Netzwerk), where the data have been distilled from the various LoI.

Presettings

We make sure that every run uses a common seed.
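As a minimal sketch of what a common seed achieves: resetting the random number generator to the same state yields the same random draws, which is what keeps randomised steps (such as force-directed layouts) reproducible between runs. The seed value 1234 is an arbitrary placeholder and not necessarily the value used for this analysis.

```r
# Fix the state of the random number generator.
# The seed value 1234 is a placeholder, not the value used by the authors.
set.seed(1234)
first_draw <- runif(3)

# Resetting the same seed reproduces exactly the same random numbers.
set.seed(1234)
second_draw <- runif(3)

identical(first_draw, second_draw)
```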

We define colors we will use for grouping consortia.
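One possible way to hold the colors is a plain character vector indexed by group number. The color codes below are those of the NFDI conference system table in this document; the variable names `group_colors` and `example_groups` are assumptions made for illustration.

```r
# Color codes per conference group, in the order of the group numbers 1 to 5.
# The variable name `group_colors` is hypothetical.
group_colors <- c(
  "#f5ac9f", # (1) Medizin
  "#e43516", # (2) Lebens- und Geowissenschaften
  "#f9b900", # (3) Geistes- und Sozialwissenschaften
  "#007aaf", # (4) Ingenieurwissenschaften und Mathematik
  "#6ca11d"  # (5) Chemie und Physik
)

# Look up the color of each consortium by its group number.
# These three group numbers are made-up example values.
example_groups <- c(3, 1, 5)
example_colors <- group_colors[example_groups]
example_colors
```

Because the group numbers run from 1 to 5, they can be used directly as an index into the vector.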

Some presettings for plotting the networks.

We can define a function for rescaling the vertex size. This makes plots of the 2019 and 2020 networks visually comparable.
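The sketch below shows one plausible form of such a rescaling: a linear mapping from a shared input range to a fixed range of vertex sizes. The function name `rescale_size`, its parameters, the target range of 5 to 20 and the example degree values are all assumptions; the function actually used in this analysis may differ.

```r
# Hypothetical helper: map values linearly from [from_min, from_max]
# to the vertex size range [to_min, to_max].
rescale_size <- function(x, from_min, from_max, to_min = 5, to_max = 20) {
  if (from_max == from_min) {
    # All input values are equal: use the middle of the target range.
    return(rep((to_min + to_max) / 2, length(x)))
  }
  to_min + (x - from_min) / (from_max - from_min) * (to_max - to_min)
}

# Made-up degree values for the two years.
degree_2019 <- c(1, 4, 7)
degree_2020 <- c(2, 10)

# One shared input range, so that equal values get equal sizes in both plots.
shared_range <- range(c(degree_2019, degree_2020))
size_2019 <- rescale_size(degree_2019, shared_range[1], shared_range[2])
size_2020 <- rescale_size(degree_2020, shared_range[1], shared_range[2])
```

Using the same input range for both years is the design choice that makes the sizes comparable; rescaling each year against its own minimum and maximum would not.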

Next we set up a function that applies the same presettings to every data frame.

Data sets

The core of this publication is the sets of edges, i.e. the collaboration links between the consortia. So far only two data sets are available: one for 2019 and one for 2020.
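To illustrate how a set of edges can be represented, here is a minimal sketch using the igraph package. The consortium names A, B and C are placeholders, and reading the `from` column as "the consortium whose LoI names the partner in `to`" is an assumption about the direction of the edges.

```r
library(igraph)

# Hypothetical edge list; A, B and C are placeholders, not real consortia.
# Assumption: "from" is the consortium whose LoI mentions "to".
edges <- data.frame(
  from = c("A", "A", "B"),
  to   = c("B", "C", "C"),
  stringsAsFactors = FALSE
)

# Build a directed graph; the vertices are derived from the names in the edge list.
g <- graph_from_data_frame(edges, directed = TRUE)

# Convert the edges of the graph back into a data frame for display.
edge_table <- igraph::as_data_frame(g, what = "edges")
edge_table
```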

2019

Next comes the information about the consortia: the group allocated according to the NFDI conference system (column group) and the funding status (column funded). Since none of the consortia had been funded when the LoI were submitted, funded is 0 for all consortia.

2020

What follows is the data for the year 2020 and the collaboration intentions between new and already funded consortia.

Edges

Let us now display the edges for the data sets we have so far. We do this using a particular function.

2019

2020

Data vectors

In the following we present and explain the data and their origin.

Funded

Whether a consortium is funded or not cannot be calculated. We therefore passed this information as a column to the consortia data above.

The column funded has either the value 0 (consortium has not been funded) or 1 (consortium has been funded). Since we look at the consortia at the time the binding Letters of Intent were submitted, the value for 2019 is 0 for every consortium.

Now we can plot the networks and for 2020 we will highlight the funded consortia.
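A minimal sketch of how funded consortia could be highlighted with igraph: the funding status is attached as a vertex attribute and then translated into a frame color. The consortium names, group numbers, funding values and the two colors are made up for illustration.

```r
library(igraph)

# Hypothetical consortia data; names, groups and funding status are placeholders.
consortia <- data.frame(
  name   = c("A", "B", "C"),
  group  = c(1, 3, 5),
  funded = c(1, 0, 0)
)
edges <- data.frame(from = c("B", "C"), to = c("A", "A"))

# The extra columns of `consortia` become vertex attributes of the graph.
g <- graph_from_data_frame(edges, directed = TRUE, vertices = consortia)

# Give funded consortia a distinct frame color (colors chosen arbitrarily).
V(g)$frame.color <- ifelse(V(g)$funded == 1, "black", "grey80")
```

Passing `g` to `plot()` then draws the funded consortium with a black frame, because igraph reads the vertex attribute `frame.color` when plotting.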

NFDI conference system (group)

Each consortium has been allocated to one group of the NFDI conference system.

The first column shows the consortium's name, the second column (group) the number of the allocated group. There are five groups, each with its own color code.

| No. | Conference group | Color code |
|-----|------------------|------------|
| (1) | Medizin | #f5ac9f |
| (2) | Lebens- und Geowissenschaften | #e43516 |
| (3) | Geistes- und Sozialwissenschaften | #f9b900 |
| (4) | Ingenieurwissenschaften und Mathematik | #007aaf |
| (5) | Chemie und Physik | #6ca11d |

: NFDI conference system. Each consortium is allocated to one specific group.

Plotting grouped network

First we define a common legend for the networks, since we want to be able to tell apart the groups by which the consortia are colored.

Now we can go on and plot the network with colored consortia nodes.

Cluster

The value of the column cluster has been calculated by the function cluster_optimal. Each consortium is allocated to a cluster.

Plotting clustered network

With the algorithm cluster_optimal we can perform community detection.

Now we highlight only the connections between consortia in different clusters.
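The two steps can be sketched as follows on a small, made-up undirected network of two triangles joined by a single edge. The vertex names and colors are placeholders, and treating the network as undirected is a simplification for this example. Note that cluster_optimal relies on GLPK support in the installed igraph build.

```r
library(igraph)

# Hypothetical network: triangle a-b-c and triangle d-e-f, joined by the edge c-d.
edges <- data.frame(
  from = c("a", "a", "b", "d", "d", "e", "c"),
  to   = c("b", "c", "c", "e", "f", "f", "d")
)
g <- graph_from_data_frame(edges, directed = FALSE)

# Community detection by exact modularity maximisation (needs GLPK support).
communities <- cluster_optimal(g)
V(g)$cluster <- as.integer(membership(communities))

# TRUE for every edge whose two endpoints lie in different clusters.
between_clusters <- crossing(communities, g)

# Highlight only the edges that connect different clusters (colors are arbitrary).
E(g)$color <- ifelse(between_clusters, "red", "grey80")
```

In this toy network only the edge between c and d connects two different clusters, so it is the only edge colored red.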

Number of edges

There are three different ways of counting the edges of a node: all edges, incoming edges and outgoing edges.

All edges (degree.total)

We count all edges of each node with the function

degree(<GRAPH-OF-DATA-FRAME>, mode="total")

and receive the following table.

Incoming edges (degree.in)

Counting incoming edges requires a directed network. The function degree is then applied with a different value for mode.

degree(<GRAPH-OF-DATA-FRAME>, mode="in")

Outgoing edges (degree.out)

As before the function degree with a different value for mode is applied.

degree(<GRAPH-OF-DATA-FRAME>, mode="out")
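The three calls can be compared side by side on a small example. This is a sketch on a made-up directed network; the consortium names are placeholders, and the column names mirror the section headings above.

```r
library(igraph)

# Hypothetical directed network: A -> B, A -> C, B -> C.
edges <- data.frame(from = c("A", "A", "B"), to = c("B", "C", "C"))
g <- graph_from_data_frame(edges, directed = TRUE)

# Collect the three degree variants in one table, one row per node.
degree_table <- data.frame(
  name         = V(g)$name,
  degree.total = degree(g, mode = "total"),
  degree.in    = degree(g, mode = "in"),
  degree.out   = degree(g, mode = "out")
)
degree_table
```

For every node, degree.total equals the sum of degree.in and degree.out.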

Vertex and edge betweenness centrality (betweenness)

The vertex betweenness has been calculated by the function betweenness()^[https://igraph.org/r/doc/betweenness.html].
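As a small illustration, betweenness can be computed on a made-up directed path. The node names are placeholders; the call to edge_betweenness() is included as an assumption about how the edge variant named in the heading would be obtained.

```r
library(igraph)

# Hypothetical directed path: A -> B -> C.
edges <- data.frame(from = c("A", "B"), to = c("B", "C"))
g <- graph_from_data_frame(edges, directed = TRUE)

# Number of shortest paths that pass through each vertex.
vertex_btw <- betweenness(g, directed = TRUE)

# Number of shortest paths that pass along each edge.
edge_btw <- edge_betweenness(g, directed = TRUE)
```

Only B lies on a shortest path between two other vertices (from A to C), so it is the only vertex with a non-zero betweenness.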

Closeness centrality

Closeness centrality measures how many steps are required to access every other vertex from a given vertex.^[https://igraph.org/r/doc/closeness.html]

We distinguish three variants of closeness centrality, depending on the value of mode.

closeness.total

closeness(<GRAPH-OF-DATA-FRAME>, mode="total")

closeness.in

closeness(<GRAPH-OF-DATA-FRAME>, mode="in")

closeness.out

closeness(<GRAPH-OF-DATA-FRAME>, mode="out")
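The three variants can be compared on a small example. This sketch uses a made-up directed cycle, in which every vertex can reach every other vertex, so all three variants are well defined; the node names are placeholders.

```r
library(igraph)

# Hypothetical directed cycle: A -> B -> C -> A.
edges <- data.frame(from = c("A", "B", "C"), to = c("B", "C", "A"))
g <- graph_from_data_frame(edges, directed = TRUE)

# Unnormalised closeness: the inverse of the summed distances to the other vertices.
closeness_table <- data.frame(
  closeness.total = closeness(g, mode = "total"),
  closeness.in    = closeness(g, mode = "in"),
  closeness.out   = closeness(g, mode = "out")
)
closeness_table
```

With mode="total" the edge directions are ignored, so each vertex is one step away from both others; along the directed edges one of the two is two steps away, which lowers closeness.in and closeness.out.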

Data Collection

Here is an overview of all the data we have gathered so far.

2019

2020

Further stats

Wilcoxon rank sum test

Number of Letters of Intent with collaborations mentioned

Nodes and edges in the network

Nodes

We can easily count all the nodes in a network of a particular year by using the function gorder.^[https://igraph.org/r/doc/gorder.html]

Edges

We can easily count all the edges in a network of a particular year by using the function gsize.^[https://igraph.org/r/doc/gsize.html]
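A short sketch of both counts on a made-up network: gorder returns the number of nodes and gsize the number of edges. The node names are placeholders.

```r
library(igraph)

# Hypothetical directed network with four nodes and three edges.
edges <- data.frame(from = c("A", "A", "A"), to = c("B", "C", "D"))
g <- graph_from_data_frame(edges, directed = TRUE)

n_nodes <- gorder(g) # number of nodes (vertices)
n_edges <- gsize(g)  # number of edges
```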