Abstract
Network modeling transforms data into a structure of nodes and edges such that edges represent relationships between pairs of objects, then extracts clusters of densely connected nodes in order to capture high-dimensional relationships hidden in the data. This efficient and flexible strategy holds potential for unveiling complex patterns concealed within massive datasets, but standard implementations overlook several key issues that can undermine research efforts. These issues range from data imputation and discretization to correlation metrics, clustering methods, and validation of results. Here, we enumerate these pitfalls and provide practical strategies for alleviating their negative effects. These guidelines increase prospects for future research endeavors as they reduce type I and type II (false-positive and false-negative) errors and are generally applicable for network modeling applications across diverse domains.
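
The pipeline summarized in the abstract (pairwise relationships become edges, then densely connected nodes are grouped) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the toy profiles `g1`..`g4`, the use of absolute Pearson correlation, the hypothetical `threshold=0.8` cutoff, and connected components as a deliberately simple stand-in for a real clustering or community-detection method.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def build_network(profiles, threshold=0.8):
    """Connect two nodes when |correlation| >= threshold (illustrative cutoff)."""
    edges = {name: set() for name in profiles}
    for a, b in combinations(profiles, 2):
        if abs(pearson(profiles[a], profiles[b])) >= threshold:
            edges[a].add(b)
            edges[b].add(a)
    return edges


def connected_components(edges):
    """Group nodes reachable from one another; a crude proxy for clustering."""
    seen, clusters = set(), []
    for start in edges:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(edges[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


# Toy data (hypothetical): g1/g2 rise together, g3/g4 share a different pattern.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.1, 5.9, 8.0],
    "g3": [4.0, 1.0, 3.0, 2.0],
    "g4": [8.1, 2.0, 6.2, 3.9],
}

network = build_network(profiles)
clusters = connected_components(network)
print(sorted(sorted(c) for c in clusters))
```

Even in this toy form, the result depends on choices the abstract flags as pitfalls: the correlation metric and the threshold decide which edges exist, and the grouping method decides how those edges become clusters.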
Original language | American English |
---|---|
Journal | Patterns |
Volume | 2 |
DOIs | |
State | Published - Dec 2021 |
Keywords
- clustering
- community detection
- correlation
- gene co-expression analysis
- high-dimensional patterns
- network analysis
Disciplines
- Data Science