Identifying Influential Software Developers Using Social Network Analysis
Tasks
With the rise of social coding, software developers can track activity and knowledge across many projects on project hosting platforms (e.g., GitHub). Software development depends heavily on the participants and the roles they play in the process. Each developer has specific skills and interests and contributes to projects in different ways, with varying consequences for their reputation.
Because software development is collaborative, the network structures that form on social coding sites are worth investigating. Community ties in developer networks are thought to be instrumental to project success. Most research has focused on factors affecting team-level performance, with little systematic work on what drives the performance of individual developers. Yet top contributors are critical to the community: they drive the development of projects and are essential for creating and sharing knowledge. Identifying these influential individuals and the factors that lead to “stardom” could therefore be highly valuable.
Subtasks
- Contribute to the body of knowledge on social coding by investigating the network structure of social coding on GitHub and possibly Stack Overflow.
- Construct a social network of software developers (developer-developer and project-project relationship graphs).
- Choose appropriate analysis strategies to identify influential developers (compute various characteristics of the graphs using a social network approach consistent with the core principles of structural/relational analysis).
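As a rough illustration of the last two subtasks, the sketch below builds developer-developer and project-project graphs from contribution records and ranks developers by degree centrality. The records, the developer and project names, and the helper functions `project` and `degree_centrality` are all hypothetical and chosen only for illustration. Degree centrality is just one of many possible measures; choosing appropriate ones is part of the task.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (developer, project) contribution records. Real records would
# have to be collected from GitHub; that step is not shown here.
contributions = [
    ("alice", "proj_a"),
    ("bob", "proj_a"),
    ("carol", "proj_a"),
    ("alice", "proj_b"),
    ("bob", "proj_b"),
    ("dave", "proj_c"),
    ("alice", "proj_c"),
]


def project(pairs, key_index):
    """One-mode projection: link two nodes for every group they share.

    key_index=1 groups by project and links developers;
    key_index=0 groups by developer and links projects.
    Returns {node: {neighbor: number_of_shared_groups}}.
    """
    groups = defaultdict(set)
    for pair in pairs:
        groups[pair[key_index]].add(pair[1 - key_index])

    graph = defaultdict(lambda: defaultdict(int))
    for members in groups.values():
        for member in members:
            graph[member]  # touch the key so isolated nodes are kept
        for u, v in combinations(sorted(members), 2):
            graph[u][v] += 1
            graph[v][u] += 1
    return {node: dict(neighbors) for node, neighbors in graph.items()}


def degree_centrality(graph):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(graph)
    if n <= 1:
        return {node: 0.0 for node in graph}
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


dev_graph = project(contributions, key_index=1)   # developer-developer
proj_graph = project(contributions, key_index=0)  # project-project
centrality = degree_centrality(dev_graph)

# Rank developers by centrality, breaking ties alphabetically.
ranking = sorted(centrality, key=lambda dev: (-centrality[dev], dev))
for dev in ranking:
    print(f"{dev}: {centrality[dev]:.2f}")
```

Edge weights here count shared projects (or shared developers), so the same graphs could also feed weighted measures. For data at realistic scale, a dedicated graph library would likely be a better fit than plain dictionaries.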
Expectations
- Strong analytical skills and passion for data
- Skills in R or Python
- Very good knowledge of English
- Ability to work independently and in a goal-oriented manner
Ideally:
- Experience in graph theory
- Experience in SQL
- Experience in machine learning
- Experience using a high-performance computing (HPC) cluster
