
Affiliation Analysis

Bibliometric studies of computer science and database publications have so far focused mainly on the number of papers and citations per author or per journal. Because (commercial) bibliographic systems concentrate on journals, there has been little analysis of author affiliations in computer science and database research.

We analyze the author affiliations of publications to determine the main institutions contributing research to a specific field. For instance, we determine the top affiliations by number of papers (productivity) and also aggregate these numbers at varying levels of detail, e.g. by city, country, and continent.

Author affiliations are given in publications in quite heterogeneous forms. Before these data can be analyzed, the affiliation mentions that denote the same real-world institution have to be aligned. To this end, we investigated web-based affiliation recognition, matching, and clustering (cf. our publications).

When multiple-author papers are interpreted as collaborations, bonds within and across institutions, cities, countries, and continents become visible (see the illustration for an example).


Illustrating collaborations within and across major countries publishing database research
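
As a rough sketch of how such collaboration bonds could be counted, the following Python snippet treats each paper as a list of author countries and counts one bond per country pair per paper. The function name, the input layout, and the sample country codes are assumptions made for illustration; they do not reflect the pipeline used for the analysis.

```python
from collections import Counter
from itertools import combinations


def collaboration_counts(papers):
    """Count, per pair of countries, the papers on which both appear.

    `papers` is a list of papers; each paper is a list with one country
    code per author (hypothetical input layout). A pair (c, c) counts
    papers with at least two authors from country c.
    """
    counts = Counter()
    for author_countries in papers:
        per_country = Counter(author_countries)
        # Bonds across countries: each unordered pair once per paper.
        for a, b in combinations(sorted(per_country), 2):
            counts[(a, b)] += 1
        # Bonds within a country: two or more authors from the same country.
        for country, n_authors in per_country.items():
            if n_authors >= 2:
                counts[(country, country)] += 1
    return counts


# Made-up sample data, for illustration only.
papers = [
    ["DE", "DE", "US"],
    ["US", "CN"],
    ["DE"],
]
counts = collaboration_counts(papers)
```

The same counting could be applied at the institution, city, or continent level by swapping the country code for the corresponding key.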

Project Members

Publications

Aumueller, David; Rahm, Erhard
Affiliation analysis of database publications
ACM SIGMOD Record, Vol. 40, No. 1, pp. 26-31, March 2011

Aumueller, David; Rahm, Erhard
Web-based Affiliation Matching
14th International Conference on Information Quality (ICIQ’09), November 2009

Aumueller, David
Retrieving metadata for your local scholarly papers
BTW 2009, March 2009

Aumueller, David
Towards web supported identification of top affiliations from scholarly papers
BTW 2009, March 2009

See also Citation Analysis and Semantic Content

Dataset example

The following archive provides some of our data for download. It contains:

  • affiliation strings, mostly as available from ACM, though in some cases the original PDFs were also taken into account
  • correspondences between affiliation strings at the institution level, i.e. ignoring departments etc.
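
Pairwise correspondences of this kind can be turned into institution-level groups by taking connected components. The sketch below does this with a small union-find structure. It assumes the correspondences are available as pairs of strings; the actual file layout of the archive is not described here, and the function name and sample strings are made up for illustration.

```python
def cluster_affiliations(strings, correspondences):
    """Group affiliation strings that are linked by correspondences.

    `correspondences` is an iterable of (string_a, string_b) pairs
    (assumed layout). Returns a list of clusters, each a sorted list.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        # Path halving keeps the trees shallow.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for s in strings:
        find(s)
    for a, b in correspondences:
        parent[find(a)] = find(b)

    clusters = {}
    for s in parent:
        clusters.setdefault(find(s), []).append(s)
    return sorted(sorted(members) for members in clusters.values())


# Made-up sample data, for illustration only.
strings = ["Univ A", "University A, Dept. of CS", "A University", "Lab B"]
correspondences = [
    ("Univ A", "University A, Dept. of CS"),
    ("A University", "Univ A"),
]
clusters = cluster_affiliations(strings, correspondences)
```

Strings without any correspondence end up in a cluster of their own, so every affiliation string is assigned to exactly one group.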

Download: affiliationstrings.zip

Note: Other object matching datasets are available via Benchmark datasets for entity resolution.

Exemplary results of ten years of database publications

The following tables present initial results of an affiliation analysis of publications from the decade 2000–2009 that appeared at the top conferences SIGMOD and VLDB and in the journals VLDBJ and TODS. The results can also be browsed by affiliation via our publication categorizer.

Notes on table headings:

  • papers: productivity of regarded entity using total counting of papers
  • frac: fractional counting (other columns always total counting)
  • affils: number of affiliations within entity
  • first and second: the years 2000–2004 and 2005–2009, respectively
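
To make the difference between the two counting schemes concrete, here is a minimal Python sketch. It assumes one common convention: under total counting every affiliation on a paper receives a full credit of 1, while under fractional counting each of a paper's n authors contributes 1/n to his or her affiliation. Whether the tables split credit per author or per affiliation is not stated here, so this convention, the function name, and the sample data are assumptions.

```python
from collections import defaultdict


def count_papers(papers):
    """Return (total, frac) paper counts per affiliation.

    `papers` is a list of papers; each paper is a list with one
    affiliation per author (hypothetical input layout).
    """
    total = defaultdict(int)
    frac = defaultdict(float)
    for author_affils in papers:
        n_authors = len(author_affils)
        # Total counting: one full credit per distinct affiliation.
        for affil in set(author_affils):
            total[affil] += 1
        # Fractional counting (assumed per-author split): 1/n per author.
        for affil in author_affils:
            frac[affil] += 1 / n_authors
    return dict(total), dict(frac)


# Made-up sample data, for illustration only.
papers = [
    ["Univ A", "Univ A", "Univ B", "Lab C"],
    ["Univ B"],
]
total, frac = count_papers(papers)
```

With fractional counting the credits of all affiliations add up to the number of papers, whereas total counts add up to more whenever a paper involves several affiliations.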


Data overview at the continental level (subsuming Africa, Oceania, and South America under Southern Hemisphere)



Summary per five-year span and per decade



Base data by venue



Base data by year



Top countries



Top authors



Countries by research papers only



Countries by industrial papers only



Countries by demo papers only