Skills, division of labor and performance in collective inventions: Evidence from open source software

https://doi.org/10.1016/j.ijindorg.2009.07.004

Abstract

This paper investigates the skills and the division of labor among participants in collective inventions. Our analysis draws on a large sample of projects registered at SourceForge.net, the world's largest incubator of open source software activity. We test the hypothesis that skill variety of participants is associated with project performance. We also explore whether the level of modularization of project activities is correlated with performance. Our econometric estimations show that skill heterogeneity is associated with project survival and performance. However, the relationship between skill diversity and performance is non-monotonic. Design modularity is also positively associated with the performance of the project. Finally, the interaction between skill heterogeneity and modularity is negatively associated with performance.

Introduction

Collective inventions among profit-seeking individuals and organizations have become popular in the economics literature since the seminal paper of Robert Allen (1983) on the iron district of Cleveland in the nineteenth century. More recently, collective inventions have come to the forefront of economists' attention because of the diffusion of open source software (OSS hereafter). OSS can be viewed as a ‘virtual’ community of practice made up of inventors who voluntarily contribute to multiple collective inventions. OSS offers expert developers the opportunity to participate in innovation networks which are, to some extent, reminiscent of the communities of users in the early age of computing (Steinmueller, 1996, Torrisi, 1998) or other user-centered innovation processes such as those analyzed by von Hippel (1988).

Most studies have attempted to explain why a growing number of independent developers (‘hackers’) voluntarily disclose their inventions. Several theoretical works seek to understand not only the motivations for disclosure of the source code, but also the social norms and the patterns of collaboration among distributed developers, and the implications for efficiency and social welfare (e.g. Raymond, 1999, von Hippel, 2001, Lerner and Tirole, 2002, Johnson, 2002, Harhoff et al., 2003, Dalle and David, 2005).

Empirical studies (e.g. Lakhani and von Hippel, 2003, Hertel et al., 2003, Lakhani and Wolf, 2005) also ask why hackers freely reveal information and what individual participants contribute to the productivity of specific OSS projects. However, little is known about the determinants of OSS projects' performance on a larger scale.

Our paper uses a large sample of OSS teams to study the association between a project's performance (measured by bugs and patches fixed, new feature requests completed, new file releases and changes made to the project's source code) and two important dimensions of team production — skill composition and the level of modularity of project activities.

Our analysis draws on two streams of the literature. The first one is rooted in team production theory. Team production requires collaborative skills, i.e. communication ability (people skills), leadership, and the ability to carry out multiple tasks. These skills add to specialized technical skills, thereby expanding production possibilities. Collaborative skills also favor the “discovery of ways to assign, organize, and perhaps alter tasks to produce more efficiently” (Hamilton et al., 2003: p. 470). Most importantly for this paper, heterogeneity among team members favors mutual learning and intra-team bargaining, creating opportunities for nonmonetary benefits such as a stimulating working environment, peer recognition and decisional authority (Hamilton et al., 2003).

In this setting we analyze the association between project performance and skill heterogeneity of its members. We expect that skill heterogeneity is positively associated with project performance in teams of open source software developers (Galunic and Riordan, 1998, Sutton and Hargadon, 1997). We focus on two dimensions of skill heterogeneity. First, individual participants must be prepared to carry out multiple tasks whose fulfillment requires a variety of skills. Second, open source participants may have different levels of commitment to single projects. In particular, we can distinguish core developers, who are highly committed and presumably highly experienced people, from the varied community of contributors, who occasionally participate in problem solving by supplying patches, reporting bugs or asking for assistance. It is therefore likely that the level and composition of skills vary across different categories of participants.

The second research line is associated with a key characteristic of modern organization design, namely modularity (Milgrom and Roberts, 1990, Milgrom and Roberts, 1995). Modularity in design and production has been defined as a strategy for “building a complex product or process from smaller subsystems that can be designed independently” (Baldwin and Clark, 1997, p. 84). In modular production, the value generated by each module can be separated from the total outcome. Moreover, modularity allows for experimentation, increases the efficiency of design activities, favors mutual learning between team members, and stimulates innovation (Baldwin and Clark, 1997, Langlois, 2002, Pil and Cohen, 2006). Thus, we posit that the level of design modularization or division of tasks at the project level is correlated with observable differences in performance across open source software projects.

This paper provides a novel empirical contribution to the literature on the economics of collective inventions. Our contribution is twofold. First, unlike many previous works that have focused on one or a few open source software projects, we provide an empirical investigation based on a large sample of OSS projects hosted by the SourceForge.net website, one of the largest repositories of OSS activity. To our knowledge, this is one of the few attempts to provide a systematic empirical analysis of multiple dimensions of OSS projects. Second, we focus on a crucial economic issue and examine open source projects with the aim of understanding the association between performance and key project characteristics — team members' skill composition and design modularity.

This paper is organized as follows. Section 2 discusses the theoretical background. Section 3 presents the data. Section 4 illustrates the methodology for estimating the relationship between skills and modularity and project performance. Section 5 analyzes the empirical results. Section 6 concludes.

Theoretical background

Our paper focuses on two dimensions of collective inventions: (i) the diversity of skills of team members and (ii) modularity.

Data and variables

Our empirical analysis draws on a unique dataset containing information on OSS projects and individuals (registered users). The dataset has been built from data provided by SF.net over the period November 3rd 1999 to January 10th 2003.

The number of projects registered at SF.net has increased quite rapidly since its foundation. In May 2008, the number of registered projects was about 178,000 and the number of registered contributors was more than 1,800,000. Other websites hosting open source

Econometric method

To study how skills and modularity are associated with project survival and performance we have estimated three sets of equations.

First, we estimated the probability of project survival with a logistic regression. The dependent variable is ACTIVE, which is equal to 1 when at least one of the following events has occurred during the period October 2002 to December 2002: a fixed bug, a fixed patch, a fulfilled feature request, a change to the project source code (a CVS commit sent to the CVS
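As a rough illustration of the ACTIVE indicator described above, the following sketch builds a 0/1 flag from per-project event counts and shows the logistic link used in a logit model. The parameter names are hypothetical and are not taken from the SF.net data. The list of qualifying events is cut off in the text, so only the four events visible there are included, and the authors' actual definition may cover more. No coefficients are estimated here.

```python
import math


def is_active(bugs_fixed=0, patches_fixed=0,
              feature_requests_fulfilled=0, cvs_commits=0):
    """Return 1 if at least one tracked event occurred in the window, else 0.

    All argument names are hypothetical; they stand for event counts
    observed for one project during the observation period.
    """
    counts = (bugs_fixed, patches_fixed,
              feature_requests_fulfilled, cvs_commits)
    return int(any(c > 0 for c in counts))


def logit_probability(linear_index):
    """Logistic link: map a linear index x'beta to P(ACTIVE = 1)."""
    return 1.0 / (1.0 + math.exp(-linear_index))
```

In an actual estimation the linear index would combine the skill and modularity regressors with fitted coefficients, which a statistics package would supply. The function above only shows how such an index translates into a survival probability.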

Descriptive statistics

Table 1 summarizes the variables used in the empirical analysis and provides some descriptive statistics for the sample of 9076 projects. For about 26% of these projects (2372) data on skills are missing or not reported. Given the importance of this variable in our analysis, we compared these two categories of projects and checked for statistically significant differences in the two distributions. About 29.13% of sample projects that report data on skills are active while the same percentage is

Discussion and conclusions

Our analysis provides novel empirical evidence on an important example of collective inventions. The analysis draws on two streams of the literature. First, the theory of teams has put forward contrasting hypotheses about the effect of team composition on team performance. In particular, this paper has addressed the association between skill heterogeneity and performance. Second, the economics of modern production has explored the importance of modular design and production for

Acknowledgements

We thank Gaia Rocchetti for her valuable collaboration in the early stages of our research project and the administrators of SourceForge.net for providing us with their data on open source software projects and individual contributors. We also thank a sample of SourceForge developers for helping us to interpret some key variables in the dataset. We thank Bruno Cassiman, Paul David, Neil Gandal, Walter Garcia Fontes, Joachim Henkel and Gregorio Robles for useful comments on earlier drafts of the

References (52)

  • G. Robles et al. (2006). Beyond source code: the importance of other artifacts in software development (a case study). Journal of Systems and Software.
  • C.Y. Baldwin et al. (2006). The architecture of participation: does code architecture mitigate free riding in the open source development model? Management Science.
  • C.Y. Baldwin et al. (1997). Managing in an age of modularity. Harvard Business Review.
  • K.A. Bantel et al. (1989). Top management and innovations in banking: does the composition of the top team make a difference? Strategic Management Journal.
  • R. Blundell et al. (1995). Dynamic count data models of technological innovation. Economic Journal.
  • T.F. Bresnahan et al. (2002). Information technology, workplace organization and the demand for skilled labor: firm-level evidence. Quarterly Journal of Economics.
  • S. Brusoni et al. (2001). Unpacking the black box of modularity: technologies, products and organizations. Industrial and Corporate Change.
  • E. Caroli et al. (2001). Skill-biased organizational change? Evidence from a panel of British and French establishments. Quarterly Journal of Economics.
  • K. Crowston et al. Information systems success in free and open source software development: theory and measures.
  • J.M. Dalle et al. The allocation of software development resources in open source production mode.
  • M.S. Elliot et al. (2003). Free software: a case study of software development in a virtual organizational culture.
  • S. Faraj et al. (2000). Coordinating expertise in software development teams. Management Science.
  • Fershtman, C., Gandal, N., 2004. The determinants of output per contributor in open source projects: an empirical...
  • D.C. Galunic et al. (1998). Resource recombinations in the firm: knowledge structures and the potential for Schumpeterian innovation. Strategic Management Journal.
  • B.H. Hall et al. (2001). The patent paradox revisited: an empirical study of patenting in the US semiconductor industry 1979–1995. RAND Journal of Economics.
  • B.H. Hamilton et al. (2003). Team incentives and worker heterogeneity: an empirical analysis of the impact of teams on productivity and participation. Journal of Political Economy.