A method for identifying different types of university research teams
Z. Cheng, Y. Zou, et al.
This study by Zhe Cheng, Yihuan Zou, and Yueyang Zheng proposes a method for classifying university research teams into four types. Using case studies of two leading universities in Materials Science and Engineering, it finds that backbone-based research groups dominate and that collaboration patterns differ between the two institutional contexts.
Introduction
The study addresses two questions: how to identify different types of university research teams, and what characterizes research groups within universities. Team science research spans small and large teams, with varying definitions and scopes. Prior work has identified teams via co-authorship, projects, or centers, but large-scale, performance-oriented analyses often require bibliometric identification. Existing approaches (e.g., social network analysis, community detection algorithms) typically target a single team type, struggle with temporal consistency and with members who belong to several teams, and do not yield stable categorizations. This study proposes algorithms grounded in team definitions and bibliometrics to identify diverse university research teams, and applies them to two leading universities in Materials Science and Engineering. Research questions: (1) How can we identify different types of university research teams? (2) What are the characteristics of research groups within universities?
Literature Review
Traditional identification methods rely on project member lists or institutional webpages, which may omit contributors, include names of people who did not actually collaborate, or provide no measure of collaboration intensity or member ability. Bibliometric approaches use co-authorship and citation networks, employing tools and algorithms such as UCINET, CiteSpace, Fast-Unfolding, Louvain, FP-Growth, faction/CL-leader algorithms, and topic-based clustering. Their limitations include fixed algorithm rules that are hard to customize, overemphasis on connections without alignment to team definitions, temporal inconsistencies in co-authorship, instability across standards, and inability to assign members to multiple teams. The present study incorporates project information and first-author roles to address temporal and spatial collaboration, uses Price’s Law and Everett’s Rule for member classification, and handles multiple affiliations via Jaccard similarity and Louvain in order to classify university research teams.
Methodology
Overview: The method operationalizes a team as researchers collaborating toward shared objectives, using co-authorship relations as evidence of interaction. It proposes algorithms to identify four entities: project-based research teams (Pbrt), individual-based research teams (Ibrt), backbone-based research groups (Bbrg), and representative research groups (Rrg).

Member classification: Members are classified by contribution and collaboration using two criteria:
(a) Price’s Law for prolific authors: an author is prolific if their publication count is ≥ 0.749 × sqrt(Pmax), where Pmax is the maximum publication output in the group.
(b) Everett’s Rule (k-plex criterion): each member should be directly connected to at least (n−1)/2 members to qualify as a stable factional tie.

Definitions:
- Core members: prolific authors who collaborate with ≥ half of prolific authors.
- Backbone members: prolific authors who collaborate with < half of prolific authors.
- Ordinary members: non-prolific authors who collaborate with ≥ half of prolific authors.
- Marginal members: non-prolific authors who collaborate with < half of prolific authors.

Team identification:
1) Pbrt: Extract funding numbers; cluster papers sharing a funding number; build a co-authorship network per cluster; classify members via the above criteria.
2) Ibrt: For papers without funding, treat the first author as initiator; gather all of that author's papers within the identification period to form a paper group; build the co-authorship network and classify members.
3) Bbrg: Identify discipline backbone members at the university level via Price’s Law (authors meeting the 0.749 × sqrt(Pmax) threshold). Consolidate all Pbrts and Ibrts in which a discipline backbone serves as a core member to form preliminary Bbrgs. Merge Bbrgs using Jaccard similarity on their sets of core and backbone members: Jaccard(A,B) = |A ∩ B| / |A ∪ B|; merge if ≥ 0.5. Because projects and teams can be reused across groups, further merge groups whose paper sets are highly similar, applying the principle "if connected, then merged" to ensure heterogeneity across the final groups.
4) Rrg: Extract core and backbone members from all Bbrgs and run the Louvain algorithm to form representative research groups without member redundancy, aggregating important contributors across Bbrgs.

Validation: Following Boyack & Klavans (2014), use SCIVAL topic clusters to test consistency across the top three research areas of core/backbone members within Bbrgs; high consistency indicates valid identification.

Data collection and preprocessing:
- Field: Materials Science and Engineering (MSE) at Tsinghua University (THU) and Nanyang Technological University (NTU).
- Source: Clarivate Analytics subject categories.
- Period: Publications from 2017–2021. To account for project durations, papers from 2011–2022 that are associated with projects appearing in the 2017–2021 publications are also included.
- Inclusion: Papers whose first or corresponding author is affiliated with the target university.
- Name disambiguation: Standardize capitalization; remove hyphens; use the Python dedupe library to resolve ambiguous names; unify variants based on affiliations and collaborators (e.g., LONG, W.H → LONG, WENHUI).

Analytical computations: Network density D = 2R/[N(N−1)]. Community detection uses Louvain; merging uses a Jaccard threshold of ≥ 0.5. Computed outputs include counts of Ibrt, Pbrt, Bbrg, merged components, and Rrg; network densities among core members, core+backbone members, and all members; and group outputs (publications, per-capita outputs, field-weighted citation impact (FWCI), and citations per publication).
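The member classification and the Jaccard merging step can be illustrated with a small Python sketch. This is not the authors' code. The function names (`classify_members`, `merge_groups`) and the toy data are hypothetical, "half of prolific authors" is assumed to mean half of the other prolific authors, and the transitive "if connected, then merged" rule is applied in a single pass over member sets rather than in the two stages described above.

```python
import math
from itertools import combinations


def classify_members(papers):
    """Assign a role to each author from a list of per-paper author lists."""
    pub_counts, coauthors = {}, {}
    for authors in papers:
        unique = set(authors)
        for a in unique:
            pub_counts[a] = pub_counts.get(a, 0) + 1
            coauthors.setdefault(a, set())
        for a, b in combinations(unique, 2):
            coauthors[a].add(b)
            coauthors[b].add(a)

    # Price's Law cutoff: 0.749 * sqrt(Pmax)
    cutoff = 0.749 * math.sqrt(max(pub_counts.values()))
    prolific = {a for a, n in pub_counts.items() if n >= cutoff}

    roles = {}
    for a in pub_counts:
        # Assumption: compare against the *other* prolific authors only.
        others = prolific - {a}
        well_connected = len(coauthors[a] & others) >= len(others) / 2
        if a in prolific:
            roles[a] = "core" if well_connected else "backbone"
        else:
            roles[a] = "ordinary" if well_connected else "marginal"
    return roles


def jaccard(a, b):
    """Jaccard(A, B) = |A intersect B| / |A union B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_groups(groups, threshold=0.5):
    """Merge member sets linked by Jaccard >= threshold, transitively."""
    parent = list(range(len(groups)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(groups)), 2):
        if jaccard(groups[i], groups[j]) >= threshold:
            parent[find(i)] = find(j)

    merged = {}
    for i, group in enumerate(groups):
        merged.setdefault(find(i), set()).update(group)
    return list(merged.values())


# Toy data (hypothetical): five papers, six authors.
toy_papers = [["A", "B", "C"], ["A", "B", "C"], ["A", "D"], ["D", "M"], ["A", "B", "O"]]
print(classify_members(toy_papers))
print(merge_groups([{"A", "B", "C"}, {"A", "B", "D"}, {"E", "F"}]))
```

In the toy data, Pmax is 4, so the cutoff is about 1.5 and any author with two or more papers counts as prolific. The union-find structure is one simple way to realize transitive merging; a production version would also need the paper-set similarity pass and the Louvain step, which are omitted here.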
Key Findings
- Identification coverage and counts: For MSE, THU and NTU have many Pbrts, indicating broad funding support; fewer Ibrts exist. Most Ibrts and Pbrts are encompassed by Bbrgs. Rrgs total 39 at each university. Counts:
  THU: Ibrt 332; Pbrt 8853; Bbrg 260; merged Ibrt in Bbrg 328; merged Pbrt in Bbrg 6281; Rrg 39.
  NTU: Ibrt 476; Pbrt 4842; Bbrg 221; merged Ibrt in Bbrg 206; merged Pbrt in Bbrg 4002; Rrg 39.
- Topic consistency validation: Average consistency of the most concentrated research area across Bbrgs is 0.93 (e.g., in a Bbrg of 10 core/backbone members, ~9.3 share the same main area). Across sizes, the top-area consistency is ≥ 0.90, supporting validity.
- Group scale and member ratios: Core member distribution is similar across universities; Bbrgs with 6–10 core members are most common, followed by 0–5. Average core members per Bbrg: 7.08. Proportion of core and backbone members in groups: ~11.22% (THU) to 13.88% (NTU), overall 12.45%, consistent with Price’s Law.
- Network structure: Network density declines with group size; it is highest among core members and lowest among all members. Averages:
  Core density: THU 0.86 vs NTU 0.86 (no significant difference).
  Core+backbone density: THU 0.53 vs NTU 0.64 (significantly lower at THU).
  All-members density: THU 0.09 vs NTU 0.16 (significantly lower at THU).
  Rrg network density: THU 0.022 vs NTU 0.028.
  This indicates closer collaboration among NTU’s prolific authors beyond the core.
- Interdisciplinarity: Rrgs span multiple departments/schools at both universities (e.g., Materials, Engineering, Physics, Chemistry, Medicine; external institutes at NTU), evidencing interdisciplinary MSE teams.
- Outputs and impact: Bbrg average outputs (THU vs NTU):
  Total publications: 75.88 vs 42.19 (THU higher, p<0.01).
  Publications per core+backbone capita: 3.92 vs 3.00 (THU higher, p<0.01).
  Most prolific author’s per-group publications: 40.95 vs 25.19 (THU higher, p<0.01).
  FWCI: 2.08 vs 2.68 (NTU higher, p<0.01).
  Citations per publication: 25.58 vs 34.65 (NTU higher, p<0.01).
- Case studies: THU (Kang Feiyu) vs NTU (Liu Zheng) Bbrgs: Core 52 vs 79; backbone 46 vs 67; marginal 1777 vs 1827; total papers 702 vs 497; FWCI 3.28 vs 3.15. Network density: core 1.0 for both; core+backbone 0.273 (THU) vs 0.307 (NTU); all members 0.008 vs 0.009. The THU team shows strong central sub-team structures; the NTU team has more dispersed centrality.
- Funding concentration: The prevalence of Bbrgs and the member ratios indicate concentration of research resources and alignment with Price’s Law; a "rich club" effect is suggested.
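The drop in density from core members to all members follows directly from the density formula in the Methodology, D = 2R/[N(N−1)]. The sketch below reads R as the number of co-authorship ties and N as the number of members, which is the usual reading but is not spelled out in the summary. The function name `network_density` and the numbers are illustrative assumptions, not data from the study.

```python
def network_density(num_ties, num_members):
    """Density of an undirected network: D = 2R / (N * (N - 1))."""
    if num_members < 2:
        return 0.0
    return 2 * num_ties / (num_members * (num_members - 1))


# Hypothetical core of 4 members who have all co-authored with each other:
# 4 * 3 / 2 = 6 ties, so the core is fully connected.
core_density = network_density(num_ties=6, num_members=4)

# Add 6 peripheral members, each tied to exactly one core member:
# 6 + 6 = 12 ties among 10 members.
all_density = network_density(num_ties=12, num_members=10)

print(f"core: {core_density:.3f}, all members: {all_density:.3f}")
```

Because the number of possible ties grows roughly with N², adding many members who each co-author with only a few others pulls density down sharply, which fits the pattern of a fully connected core alongside very low all-member density in the case studies.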
Discussion
The proposed multi-stage identification approach—integrating Price’s Law, Everett’s Rule, Jaccard merging, and Louvain clustering—successfully distinguishes project-based, individual-based, backbone-based, and representative research teams, while classifying member roles. Validation using SCIVAL topic clusters shows high topical cohesion among core/backbone members, supporting accuracy. Empirically, most university research activity is organized around backbone-based research groups, consistent with concentrated funding and productivity patterns. THU’s Bbrgs are more prolific in terms of publication outputs, whereas NTU’s exhibit higher academic influence (FWCI, citations per publication), suggesting that collaboration patterns and institutional cultures may shape impact beyond sheer productivity. The cross-departmental distribution of Rrgs confirms the interdisciplinary nature of MSE research, aligning with prior evidence that interdisciplinary teams occupy central positions and attain strong funding performance. Member structure adheres to Price’s Law: core/backbone members constitute roughly 10–15% of group membership yet account for the bulk of output, reflecting the role of leading scholars and stable senior researchers. Comparative analysis indicates that hierarchical, leader-centric structures (as in THU) may drive higher volume, while more distributed, open collaboration (as in NTU) correlates with higher impact per publication. These findings inform research management and funding strategies: balancing support between elite groups and smaller teams could mitigate over-concentration and potentially enhance overall influence.
Conclusion
This study contributes a comprehensive, programmable methodology to identify and classify university research teams by integrating co-authorship and project information with Price’s Law, Everett’s Rule, Jaccard-based merging, and Louvain community detection. It delineates four team types (Pbrt, Ibrt, Bbrg, Rrg) and member roles (core, backbone, ordinary, marginal), improving clarity on collaboration scope and intensity. Applying the method to two top MSE universities shows that Bbrgs predominate, member distributions follow Price’s Law, research groups are interdisciplinary, and collaboration structures differ across institutional cultures, with productivity-impact tradeoffs observed between THU and NTU. Future research could extend to additional fields and institutions, refine disambiguation and temporal modeling of teams, analyze the dynamics and roles of member categories over time, and evaluate how funding allocation strategies affect team structures, interdisciplinarity, and impact.
Limitations
- Scope: Analysis is limited to two universities and one field (Materials Science and Engineering), potentially constraining generalizability.
- Data sources: Reliance on co-authorship and funding acknowledgments may omit informal collaborations or misattribute team boundaries.
- Name disambiguation: Despite cleaning and deduplication, residual ambiguities may persist.
- Temporal consistency: Publication-based networks may not perfectly align with actual collaboration periods; project inclusion windows were approximated (2011–2022) to cover 2017–2021 outputs.
- Method thresholds: Use of the Price’s Law cutoff, Everett’s Rule, and the Jaccard similarity threshold (≥0.5) may influence team delineations; alternative thresholds could yield different results.
- Member multiplicity: Researchers can belong to multiple teams; while addressed via similarity merging and Louvain clustering, some overlapping affiliations may remain.
- Granularity: Rrgs aggregate key members and remove redundancies but do not capture detailed collaboration content within teams.

