Abstract
Network modeling transforms data into a structure of nodes and edges such that edges represent relationships between pairs of objects, then extracts clusters of densely connected nodes in order to capture high-dimensional relationships hidden in the data. This efficient and flexible strategy holds potential for unveiling complex patterns concealed within massive datasets, but standard implementations overlook several key issues that can undermine research efforts. These issues range from data imputation and discretization to correlation metrics, clustering methods, and validation of results. Here, we enumerate these pitfalls and provide practical strategies for alleviating their negative effects. These guidelines increase prospects for future research endeavors as they reduce type I and type II (false-positive and false-negative) errors and are generally applicable for network modeling applications across diverse domains.
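The pipeline described in the abstract (pairwise relationships become edges, then densely connected nodes are grouped) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy `profiles` data, the use of Pearson correlation, the hard threshold of 0.9, and the use of connected components as a crude stand-in for a real clustering method. These are exactly the kinds of choices the abstract flags as potential pitfalls.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant profile: treated as uncorrelated (assumption)
    return cov / (sx * sy)


def build_network(profiles, threshold=0.9):
    """Adjacency dict with an edge wherever |r| >= threshold (assumed cutoff)."""
    adj = {name: set() for name in profiles}
    for a, b in combinations(profiles, 2):
        if abs(pearson(profiles[a], profiles[b])) >= threshold:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def clusters(adj):
    """Connected components, a simple stand-in for community detection."""
    seen, result = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        result.append(sorted(comp))
    return result


# Hypothetical toy data: four objects measured across five samples.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [2.1, 3.9, 6.2, 8.0, 9.9],
    "g3": [5.0, 1.0, 4.0, 2.0, 3.0],
    "g4": [10.0, 2.2, 8.1, 3.9, 6.0],
}
network = build_network(profiles)
found = clusters(network)
print(found)
```

Changing the threshold or the correlation measure can merge or split the resulting groups, which is one reason such choices need validation rather than default settings.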
| Original language | American English |
|---|---|
| Journal | Patterns |
| Volume | 2 |
| DOIs | |
| State | Published - Dec 2021 |
Keywords
- clustering
- community detection
- correlation
- gene co-expression analysis
- high-dimensional patterns
- network analysis
Disciplines
- Data Science