This essay seeks to examine the rapidly emerging production method of "Free Software" or "Opensource Software". It outlines analogues between Economic theory and Software Engineering in order to bring economic analysis to bear on the area.
Specifically, it aims to provide a quantitative analysis of what has until now been primarily examined in a qualitative way. Free Software has existed in one form or another since the very early days of computing, but very little attention has been paid to it until recently. Many Free Software projects have achieved significant or even dominant positions in their marketplace, and more firms are starting to utilise or release Free Software.
The two major ideological principles underlying Free Software are the protection of user/programmer choice, and the belief that the best solutions must be shared.
The first principle arose from Richard M. Stallman's dismay at the rise of proprietary software as the dominant format. Stallman believes that proprietary (also known as closed-source) software is a violation of the individual's right to choose other packages. He argues that access to the sourcecode grants freedoms to modify and augment without being "locked in" to one company's whims. Further, he argues that sourcecode access gives users a choice to go their own way, in defiance of the company's wishes should they be detrimental.
The second principle, that good solutions should be shared, arises from the so-called "Hacker Culture". Within this culture, brainpower is seen as a limited resource, which should not be wasted on unnecessarily reinventing the wheel. It is reasonable, therefore, that all solutions (embodied in sourcecode) should be available for anyone to use. The corollary is that withholding solutions (or source) is effectively evil or sad, inasmuch as it is wasteful of resources. A sense of futility in this approach is summed up neatly by Philip Greenspun: "We're not fond of Bill Gates, but it still hurts to see Microsoft struggle with problems that IBM solved in the 1960s. Thus, we share our source code with others in the hopes that programmers overall can make more progress by building on each other's works than by trying blindly to replicate what was done decades ago."
The legal foundations of Free Software stem from a careful blend of Contract and Copyright laws. Free Software Licenses use the principles of Contract law to create their terms. Typically, these ensure that an author's version of the source is always available and that any modification made by anyone is likewise available under the same terms. Some licenses go so far as to impose these terms onto software where opensource code has been added as an external source.
If the user doesn't agree to the License, the law of Contract renders the terms of the License effectively powerless without punitive terms. However, at this point, standard Copyright laws take effect, and the user is granted no rights whatsoever. There is significant coercion, then, to accept the terms of the License on a contractual basis.
To place Free Software in an economic framework is considerably more difficult, but quite profitable. There is a range of issues and outcomes that emerge naturally from the application of elementary economic thought to a Free Software "economy". The main body of this essay assumes that just such a framework has been established. However, the description of such a framework is long and outside the scope of the main body of this essay, so a short treatise on the elementary economic framework of Free Software can be found in Appendix IV.
According to Fred Brooks' law, adding people to a late project makes it later. It's like adding gas to a fire. New people need time to familiarise ... their training takes up the time of ... [other] people ... and merely increasing the number of people increases the complexity and amount of project communication. Brooks points out that the fact that a woman can have a baby in nine months does not imply that nine women can have a baby in one month. Managers need to understand ... More workers working doesn't necessarily mean more work will get done.
--Steve McConnell, "Code Complete"
In his essay "The Magic Cauldron", Eric S. Raymond estimates that almost 95% of software development is "in-house". This is the traditional meal-ticket of programmers and software engineers. It is from this heartland that Brooks' Law is drawn.
Brooks' Law - specifically - is that adding people to a late project willy-nilly will only make it later. Brooks derived this law from his own personal experience as a project manager on IBM's original OS/360 project. In his book, The Mythical Man-Month, Brooks pointed out the fallacy of simply throwing more "man-hours" (labour units) at the project in order to deliver it earlier.
According to McConnell, Brooks' analysis of his own law suggests an exception to the rule: "... if a project's tasks are partitionable, you can divide them further and assign them to ... people who are added late to the project."
In short, we can summarise Brooks' Law in two parts:
Brooks gave several justifications for his law, outlined by McConnell above. One of the easier to seize upon, in economic and mathematical terms, is the complexity problem. It is argued that programming requires a large amount of communication between workers. It can be shown mathematically that if the number of programmers rises linearly, the number of possible communication paths between them rises quadratically: n programmers have n(n-1)/2 possible pairings. This is illustrated by the diagram below.
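The quadratic growth is easy to check numerically. The following is a minimal sketch in Python; the function name communication_paths is our own label, and it assumes that every pair of programmers forms exactly one potential path.

```python
def communication_paths(programmers: int) -> int:
    """Count the distinct pairs among `programmers` people: n(n-1)/2."""
    if programmers < 0:
        raise ValueError("programmers must be non-negative")
    return programmers * (programmers - 1) // 2


if __name__ == "__main__":
    # Doubling the team roughly quadruples the number of paths.
    for n in (2, 4, 8, 16):
        print(n, communication_paths(n))
```

Doubling a team from 8 to 16 programmers raises the number of possible paths from 28 to 120.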
But Brooks' Law is not original. In fact, the first famous instances of the principles that Brooks expounded are not found in software engineering - they are found in a rice paddy.
In the classic example of the Law of Diminishing Returns, many textbook authors ask us to imagine a rice paddy. We might start with one worker on this paddy, who is barely able to care for and harvest even a fraction of it. Compared to its neighbours, this paddy is woefully inefficient.
So another worker is added. Productivity rises sharply, as two workers can now work the field. We can measure this rise in productivity in terms of the total output and the change in the output - the marginal output.
We continue to add workers. At first, they replicate each other's work: each, say, takes a quadrant of the paddy and works it. Later, some will specialise. Some will care for the rice, some will harvest it. Productivity continues to rise.
But the trend is not endless. At a certain point, adding more workers no longer causes a rise in productivity. Perhaps these extra workers need to be trained by other workers. Perhaps they get in one another's way, or there are workers standing by, idle, as excess working capacity. In any case, the marginal rate of productivity begins to fall, followed by the average total output.
It is not difficult to draw parallels with software engineering. Indeed, if Adam Smith had been working today, he might have used a software project as his example!
Let us take a project with one programmer. This programmer has begun to code, but the project is large. He is unable to produce many lines of code on his own, having to continuously stop and refer to manuals for unfamiliar areas, even to remind himself of what part of the project he is dealing with.
Let us add another programmer. Suddenly, they can divide the work amongst them, working on two different parts of the program at once. Then we can add more programmers - including specialists. The program is divided and subdivided into smaller units of specific purpose, and the specialists can focus on these parts.
The subdivision of units and the matching of specialists with program components means that the productivity rises.
But again, we come to a certain point where it begins to falter. Programmers are added who need to be trained on the deep secrets of the existing work. They need to be introduced to procedures and their tools, diverting time from existing programmers. They add overhead to communications paths, and some may spend time fallow, dragging down the average total output once more.
In economic terminology, Brooks' Law might be re-summarised thusly:
This is an example of a typical Law of Diminishing Returns graph. One line shows Average Total Output (ATO), and the other line shows Marginal Output (MO).
The real signature of diminishing returns is the marginal output line. The MO line, in mathematical terms, is the derivative of the ATO line. It shows the rate of change of the ATO at any given "x-value", or units of labour.
The classic Law of Diminishing Returns MO is shown. It rises, peaks, and then crosses zero at the point where ATO peaks. It becomes negative, causing the ATO curve to nose over and dive.
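The relationship between the two lines can be sketched in a few lines of Python. With whole units of labour, the derivative becomes a simple difference between successive output figures. The numbers below are invented purely for illustration (they are not GNOME data), and the names marginal_output, ato and mo are our own.

```python
def marginal_output(total_output):
    """Return the change in output contributed by each additional worker.

    total_output[i] is the output with i + 1 workers, so the result has one
    fewer entry than the input.
    """
    return [after - before for before, after in zip(total_output, total_output[1:])]


# Hypothetical figures, invented purely for illustration (not GNOME data):
# output of the paddy with 1, 2, ..., 8 workers.
ato = [10, 25, 45, 60, 70, 74, 72, 65]
mo = marginal_output(ato)

# Worker count at which output peaks, and the first added worker whose
# marginal contribution is negative.
peak_workers = ato.index(max(ato)) + 1
first_negative_worker = next(i + 2 for i, change in enumerate(mo) if change < 0)

print(mo)
print(peak_workers, first_negative_worker)
```

With these invented figures the marginal output rises from 15 to 20, declines, and turns negative with the seventh worker, immediately after output peaks at six workers.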
Sources from which hard, actual data could be drawn are legion. In general, this pattern will occur wherever the Law holds. Indeed, it is the recurrence of this pattern that is often used as proof of the Law's applicability.
This graph takes a slightly different tack. Rather than showing Labour-units vs output, it shows "Men" vs "Months". What we see is the 'fruitbowl' curve. This curve is the same curve we see in the Law of Increasing Costs - the logical twin of the Law of Diminishing Returns. Compare this graph with the one below.
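One way to see why a "Men" vs "Months" curve takes a fruitbowl shape is to combine perfectly divisible work with the pairwise communication overhead discussed earlier. The Python sketch below is an assumption made for illustration, not a formula taken from Brooks or from the graph: the model months = effort/men + overhead * men(men-1)/2, and the parameter values 120 and 0.5, are all hypothetical.

```python
def months_to_finish(men, effort=120.0, overhead=0.5):
    """Toy schedule model (an assumption, not Brooks' own formula).

    effort   -- total man-months of perfectly partitionable work (hypothetical)
    overhead -- months lost per communication path (hypothetical)
    """
    if men < 1:
        raise ValueError("need at least one man")
    partitioned_work = effort / men
    communication = overhead * men * (men - 1) / 2
    return partitioned_work + communication


# Schedule for teams of 1 to 20 men, and the team size with the shortest one.
schedule = {men: months_to_finish(men) for men in range(1, 21)}
best_team_size = min(schedule, key=schedule.get)

print(best_team_size, schedule[best_team_size])
```

Under these invented parameters the schedule shortens until the team reaches six men and lengthens thereafter, which is the bowl shape described above.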
So it becomes reasonable to assert that the Law of Diminishing Returns and Brooks' Law are roughly equivalent. The terminology between them fluctuates, but the meaning and the consequent graphs are highly similar. And, just as the Law of Diminishing Returns has been demonstrated to appear over and over, so has Brooks' Law. It is not a case of a coincidental match of graphs.
For the rest of this essay, Brooks' Law and the Law of Diminishing Returns will be assumed to be functionally equivalent. This being so, it becomes viable to apply certain basic economic analyses to Brooks' Law. In particular, we investigate the bold claim that Brooks' Law can be broken.
The Bazaar concept is somewhat multi-faceted. Some key elements of Bazaar projects include:
In essence, Raymond argues that when Bazaar conditions exist, the opposite of Brooks' Law becomes true: more programmers mean higher productivity. In doing so, he proposes several reasons why this might be so. These arise, almost naturally, from a combination of what Raymond considers the 'Hacker mindset'[4], ease of communication, and the open availability of sourcecode.
Raymond asserts that, for these reasons, open-source projects can break Brooks' Law. In particular, he points to the parallel nature of open-source development, going so far as to say "Given enough eyeballs, all bugs are shallow", dubbing this "Linus' Law", in honour of the Linux system.
Previously, we sought to establish a link between Brooks' Law and the Law of Diminishing Returns. The outcome was that Brooks' Law is analogous to the Law of Diminishing Returns, but was derived as a single case from one field of human endeavour, rather than as a general law. The two were shown to be equivalent in describing a process.
By implication, then, Raymond has asserted that the Law of Diminishing Returns can be broken in a Bazaar environment.
The GNOME project has properties which make it an ideal source of data:
It is also the last property - the extensive use of CVS - that renders GNOME a useful source of data. The CVS system automatically keeps extensive logs of all programmer activity. For the GNOME project, all of this data is publicly available. It is these logs that form the tables on which this essay is based.
GNOME also happens to be a quite large project. It is not the largest Free Software project (the largest probably being the Linux operating system), but it is one of the largest with public-access CVS logs. It is its size which allows for the construction of smoother curves.
The five graphs reflect the two ways of viewing the GNOME project. The first view is GNOME as a single, conglomerated project. All data is combined for representation in these graphs; all programmers, all output in the form of lines of code changed (LOC).
There are three likely conclusions that we could draw from examination of these graphs. In answer to the question "Does Free Software Production in a Bazaar Obey the Law of Diminishing Returns?", we might find that:
These conclusions are dependent on the logical chain of argument that the preceding body of the essay has sought to establish. In particular, it assumes that Brooks' Law and the Law of Diminishing Returns are functionally equivalent and that a Bazaar environment is a production process in the short-run. There are alternative conclusions if these assumptions are challenged, more detail on which is available in Appendix III.
We will now examine the GNOME project from two perspectives: as an aggregate set of data, and as a collection of subprojects.
This graph displays a smooth gradation in the project's ATO as contributors are added. Furthermore, the curve displays slight acceleration. This happens to be the same pattern as a very early-stage ATO graph in a Diminishing Returns situation - marginal output is still accelerating. However, this is not a conclusive proof, as the graph could (theoretically) now take any one of an infinite number of paths as more contributors are added. Thus, we can take the evidence presented by this graph as circumstantial only.
These graphs probably present a better picture of Bazaar behaviour than the Single Project view. This is because each project is largely self-contained and self-led. Whilst a certain proportion of the GNOME project is planned on a higher level than the subproject, a larger fraction is formed by projects that are operated somewhat independently of the larger project.
The subproject level also has a higher 'concentration' of data. In the aggregate graph, the contributions of programmers are counted once. However, contributors to GNOME often contribute to multiple projects. This means that in contributor terms, GNOME hosts a 'virtual' Labour force of more than its 300 or so programmers (this number varies, see Appendix III for more detail).
In the project-level view, the contributions made to each project are individually counted and then aggregated across the project. This means that GNOME's entire 'virtual Labour force' is wholly represented, instead of actual contributors.
The 'virtual Labour force' does pose some problems for dealing with the data, however. Refer to Appendix III.
Our first graph is the Average Total Output over Subprojects. This graph is formed by breaking down each project into levels of programmers versus LOC changed. These figures are then averaged across all the graphs.
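The averaging step can be read in more than one way, and the author's own listings (Appendix V, in Perl) are not reproduced here. The Python sketch below shows one possible reading only: each subproject supplies its LOC changed at each programmer level, and every level is averaged over the subprojects that reach it. The subproject names and figures are hypothetical, as is the function name average_output_by_level.

```python
from collections import defaultdict


def average_output_by_level(subprojects):
    """Average LOC changed at each programmer level across subprojects.

    subprojects maps a subproject name to a list whose entry i is the LOC
    changed with i + 1 programmers. A level is averaged only over the
    subprojects that actually reach it (one possible reading of the method).
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for levels in subprojects.values():
        for programmers, loc in enumerate(levels, start=1):
            totals[programmers] += loc
            counts[programmers] += 1
    return {n: totals[n] / counts[n] for n in sorted(totals)}


# Hypothetical subprojects and figures, invented for illustration only.
sample = {
    "alpha": [100, 250, 450],
    "beta": [80, 200],
    "gamma": [120],
}
averaged = average_output_by_level(sample)
print(averaged)
```

Note that, under this reading, higher programmer levels are averaged over fewer subprojects, which is one plausible source of roughness in the resulting curve.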
This graph is similar to our very first graph. Once again, it rises steadily upwards. However, the graph is not nearly so smooth. Most notably, there are two sharp dips in the graph in the mid-fifties and low seventies. It must be noted that the general trend observed before persists, however. But, for some reason, projects in the region of mid-fifties and seventies seem to have a lower productivity than projects elsewhere in the curve.
This evidence, whilst less smooth than the aggregate single-project curve, does roughly concur with it. There are no surprises here, in that sense. If we are to repeat the interpretation, we would again say that there is circumstantial evidence of the Law of Diminishing Returns applying, but that this evidence is by no means conclusive in its nature.
Our second graph is the Marginal Output curve, generated from the same data.
This graph is the most interesting from an economist's standpoint. A standard Law of Diminishing Returns Marginal Output graph would have, in theory, started positive and then dipped below into negative values.
This graph has done no such thing.
What we see instead is a pattern of wide oscillation that generally does not fall below zero at all. Instead, what start as small, almost insignificant 'pulses' grow in amplitude as the number of contributors increases. From this graph, it is not possible to make any conclusive points with regard to the applicability of the Law of Diminishing Returns.
If GNOME had obeyed the Law of Diminishing Returns, we would have seen the distinctive hill-and-valley curve in the marginal output. We did not. Therefore, GNOME may be breaking the Law of Diminishing Returns.
If GNOME had obeyed the Law of Diminishing Returns, we would expect to see the bell-shaped curve in the average total output graphs. We saw what might have been the beginning of such a curve in both cases. Therefore, GNOME may be obeying the Law of Diminishing Returns.
As the astute reader has noticed, these are contradictory. We are now forced to fall back on the balance of likelihood to make our conclusion.
Seeing as the Law of Diminishing Returns has not yet been broken, the author is inclined to say that:
GNOME probably obeys the Law of Diminishing Returns, but has not yet reached its production turning point.
The first is that the GNOME data-set was too small to be conclusive. Whilst there are hundreds of actual volunteers on the project, and a 'virtual Labour force' several times as large, the data simply has not shown us anything conclusive. A much larger set of data will be needed, in particular to:
The second consideration is the oscillatory nature of the MO curve. A standard MO curve moves in a definite way. This MO curve bears no resemblance to the standard one, giving credence to the possibility that Bazaars may, in fact, break the Law of Diminishing Returns.
In terms of assisting this research, a better range of tools needs to be made available and used for the gathering of project data. Whilst the CVS system keeps logs, these are by themselves not enough. Primarily, they are stored in a format which makes meaningful extraction of data difficult and less useful.
Sourcecode is the 'recipe' for a task a computer might perform. It tells a computer how the data it is using is structured, how to access it, and what to do with it. This can be expressed in a number of artificial "programming languages". If one has access to the sourcecode, it is possible to intimately understand how a program works. It also becomes possible to expand or modify its functionality; or to reuse pre-existing sourcecode in new programs.
Raymond's use of the term is, in fact, the original computer-world meaning. He takes a hacker to be a person who delights in problem solving (especially in programming), and who believes that once a problem is solved, the solution ought to be shared.
The more common use - intruder - is distinguished by traditional hackers with the word "crackers". Hackers go to great lengths to distance themselves from crackers and their activities.
Apart from its usefulness in centralising code storage and management, CVS provides change-tracking capabilities. It keeps 'delta-files', which list every change made to any file, at any time, by any programmer. This is normally used as a sort of super-powered "undo" function.
In order, the appendices are:
Firstly, then, I would like to thank my first teacher of economics, ("Sir") Andrew Boukaseff ("QC"). Boukaseff - known as bouka, seffsta, andrew, drew, joke and bloke to friends and students - was the kind of teacher everyone loves to get. He made economics real and relevant. In a subject where the numbers are critical, he reminded us that beneath the national income accounts and development theories there are real human beings. The lesson that sticks with me best is that economics is just one of the parts of the greater human story.
Secondly, I'd like to thank Eric S. Raymond. Eric's theories and papers provided the jumping-off point for a lot of my own work. He has given me moral support as I have slowly chipped away at this essay for the last two years; he was the first to agree to review it.
Thirdly, I'd like to thank my Theory of Knowledge teacher, Mrs Forbes-Harper. We've had our angry exchanges and disagreements over the years, but Mrs F-H has helped me more (than almost anyone else) to understand just how one goes about proving things.
Fourthly, I'd like to thank my father, Barry Chester. My dad has been a useful wellspring of critique over the years. Certainly, when I want to critique my own work, I find myself thinking "What would Dad say?"
Fifthly, I'd like to thank my second economics teacher, Tony Trickey. The trickster is apt to get angry when poked with a stick or likewise provoked, but he certainly hammers home the point that the numbers do matter. Don't worry sir, you didn't fret for nothing!
Sixthly, I'd like to thank Michael Zucchi, without whose help I would be paddling up a certain proverbial creek sans paddle. His programming ability and expertise with the GNOME CVS system made him the ideal candidate to generate the raw data I needed for this essay.
Next, I'd like to thank the band of online reviewers who saw my essay in various incomplete and generally spotty incarnations. Some of these have already been listed above, others include Miguel de Icaza (leader of the GNOME project) and the license-discuss list hosted by the Open Source Initiative. Thanks, guys.
Finally, but perhaps most of all, I would like to thank Richard M. Stallman.
More than anyone else on this list, Stallman (RMS, as he is widely known) is a 'demi-god' in the hacker community. Even so, he has taken out much of his time to kindly help me with my work, pointing out items of representation and wording that could be fixed.
He is a programming genius, probably one of the greatest of all time. His calm approach, and unflinching resolve to uphold his principles, make him a leader and visionary who will - I think - one day be listed alongside the likes of Martin Luther King and Gandhi.
To anyone else I have neglected to mention - thank you. Thank you so very, very much for helping me on this long journey. I feel like I have been doing this forever, and now it's done! Best wishes to all of you, and be well.
The intent of the Open Source Definition is to write down a concrete set of criteria that we believe capture the essence of what the software development community wants "Open Source" to mean -- criteria that ensure that software distributed under an open-source license will be available for independent peer review and continuous evolutionary improvement and selection, reaching levels of reliability and power no closed product can attain.
For the evolutionary process to work, we have to counter short-term incentives for people to stop contributing to the software gene pool. This means the license terms must prevent people from locking up software where very few people can see or modify it.
Open source doesn't just mean access to the source code. The
distribution terms of open-source software must comply with the
following criteria:
1. Free Redistribution
The license may not restrict any party from selling or giving away the
software as a component of an aggregate software distribution containing
programs from several different sources. The license may not require a
royalty or other fee for such sale.
By constraining the license to require free redistribution, we
eliminate the temptation to throw away many long-term gains in order
to make a few short-term sales dollars. If we didn't do this, there
would be lots of pressure for cooperators to defect.
We require access to un-obfuscated source code because you can't evolve
programs without modifying them. Since our purpose is to make
evolution easy, we require that modification be made easy.
The mere ability to read source isn't enough to support independent
peer review and rapid evolutionary selection. For rapid evolution to
happen, people need to be able to experiment with and redistribute
modifications.
Encouraging lots of improvement is a good thing, but users have a
right to know who is responsible for the software they are using.
Authors and maintainers have reciprocal right to know what they're
being asked to support and protect their reputations.
Accordingly, an open-source license must guarantee that
source be readily available, but may require that it be
distributed as pristine base sources plus patches. In this way,
"unofficial" changes can be made available but readily distinguished
from the base source.
In order to get the maximum benefit from the process, the maximum
diversity of persons and groups should be equally eligible to
contribute to open sources. Therefore we forbid any open-source
license from locking anybody out of the process.
Some countries, including the United States, have export restrictions
for certain types of software. An OSD-conformant license may warn
licensees of applicable restrictions and remind them that they are
obliged to obey the law; however, it may not incorporate such
restrictions itself.
The major intention of this clause is to prohibit license traps
that prevent open source from being used commercially. We want
commercial users to join our community, not feel excluded from it.
This clause is intended to forbid closing up software by indirect means
such as requiring a non-disclosure agreement.
This clause forecloses yet another class of license traps.
Distributors of open-source software have the
right to make their own choices about their own software.
Yes, the GPL is conformant with this requirement. GPLed libraries
'contaminate' only software to which they will actively be linked
at runtime, not software with which they are merely distributed.
Again, due to constraints placed upon the main body of this essay,
there is scant space in which to fairly and properly address these
items of contention. As such, the author finds it necessary to
instead provide a series of possible problems and flaws in the
essay. In the spirit of laissez faire, these are the
caveats - literally, the "Bewares".
These come from two sources. One is from the Author's own attempt
to self-critique. The second is from a small body of reviewers who
have offered varying degrees of insight. These people are listed at
length in Appendix I.
These caveats can take several 'genres':
But really, such categorisations are without limit. Instead, we
treat the caveats where they arise: essentially, from the logical
structure of the essay. Where either a premise is made or a
derivation drawn, there is room for contention. Indeed, at many
links in the logical chain, there arose caveats.
A summary of the essay's logical structure might be rendered
thusly:
If, at each stage, you accept the essay's assumptions and premises,
the logical flow is basically self-consistent and valid. But, if
you do not, there is room for contention.
The major caveats here are twofold. Firstly, there is the assignment
of factors of production. It may be that ideas are not Land at all,
but are merely Capital. In this case, the fixed factor of the essay
would be Capital.
Secondly, there is contention about whether it is a fixed factor at
all. Whilst the fixed-factor argument is justified, it is open to
attack, primarily by the argument that Bazaars are evolutionary
in nature and thus do not have a fixed set of ideas to work to at
all.
There is a minor issue of understanding, also, with the use of the
word "Free". The word "Free" appears in two contexts in Appendix
IV. The first is in the sense of liberty: Free for Freedom. The other
is in the Economic sense: Free for Free Good.
Richard Stallman noted that "This is a legitimate question for study, but to put the question in
context, it is important to note at the beginning that economics can
only partly explain what happens in free software. Free software
development as an activity often has strong noneconomic motivations,
including political idealism, and free software as a social phenomenon
is influenced by many noneconomic factors, such as community spirit."
Richard asked me to switch from using the term "Free" in reference to
Free Goods, in favour of "zero-price". Whilst this term would reduce
confusion, I felt it would fail to convey the additional meaning the
word "Free" carries in economics.
In particular, the exact wording of Brooks' Law is that "Adding
people to a late project makes it later"; whereas I had included
the "efficient partitioning" component.
With regard to the 'extended' version of Brooks' Law, I rely on Steve
McConnell's Code Complete as my authority. He describes
further analysis that Brooks applies to his own law, which includes
the 'extended' version.
Firstly, in the extreme case, this section may be entirely wrong.
The reasoning is by analogy and graphical comparison. There is no
formal or mathematical equivalence at work. The equivalence is
defensible only in qualitative terms.
Secondly, in the less extreme case, Eric S. Raymond says that
"I strongly suspect [that] ... Brooks's Law is not precisely equivalent to LODR, but is rather a special
case of it involving particular nonlinear scaling phenomena. Accordingly,
one may assert that the bazaar mode repeals Brooks's Law without making
any commitment about the applicability of the LODR in general."
Otherwise, this section was not contended by the reviewers. Most
notably, Eric S. Raymond (author of the paper in which the Bazaar
was introduced) did not contend.
For the purposes of the essay, it is assumed that the premises
are both true. If they are not, then this derived answer would
be thrown into doubt.
Firstly, there is the data itself. Whilst the GNOME project has
relatively usable data in its CVS system, this data is not
likely to be entirely accurate. This is due largely to the
problem of contributor ambiguity.
The CVS system is only able to keep track of contributors
based on the identity they supply. As such, there is a strong
possibility that some of the contributors are in fact 'duplicates',
where the work contributed by one GNOME project member is
logged against several identities.
Estimating the scale of this ambiguity is difficult. The
raw CVS data (as supplied to the author by Michael Zucchi)
included a list of 'possible other contributors'. These
were based on emails, but once again, email addresses can
(and do, in the GNOME project membership) change.
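The effect of duplicate identities on the contributor count can be illustrated with a small Python sketch. It assumes a hand-built alias table mapping alternate identities to a canonical one; the identities, LOC figures, and the function name merge_identities are all hypothetical, and nothing here reflects the format of the raw CVS data.

```python
def merge_identities(loc_by_identity, aliases):
    """Sum LOC per contributor after folding known aliases together.

    loc_by_identity -- LOC changed, keyed by the identity CVS recorded
    aliases         -- maps an alternate identity to its canonical identity
    """
    merged = {}
    for identity, loc in loc_by_identity.items():
        canonical = aliases.get(identity, identity)
        merged[canonical] = merged.get(canonical, 0) + loc
    return merged


# Hypothetical identities and figures, invented for illustration only.
raw = {"jsmith": 400, "john.smith": 150, "akira": 300}
aliases = {"john.smith": "jsmith"}

merged = merge_identities(raw, aliases)
print(len(raw), "identities ->", len(merged), "contributors")
```

Each undetected duplicate inflates the apparent number of contributors by one and understates that contributor's individual output, so the horizontal axis of the graphs is shifted whenever aliases are missed.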
Second is the formation of the graphs themselves. There is
a possibility that the author's construction of the graphs
is flawed. The reader is advised that Appendix V includes
program listings for this essay, in the popular programming
language, Perl.
Third is the size of the data set. Whilst it was large
enough to cause the author headaches, it may not be large
enough to iron out statistical anomalies and was certainly
not large enough to reach a decisive conclusion.
A more in-depth study would ideally test data against an
expected value or values in any number of variables.
Yet very little in the way of research has been done in this area. It rests as
a vast, untapped goldmine for the social sciences. Certain elementary
rules appear to have changed: but the changes have thus far been remarkably
successful. Reasons as to why this is so are likely to form the basis of
many fruitful years of research to come.
This appendix attempts to take some of the most elementary tenets of
Economics and apply them to the Free Software world. It serves two
functions. In the first and more general instance, it provides a kind
of "translating dictionary" function for both Economists and the
Free Software community. It hopefully allows for meaningful insight
on both sides of the fence.
In the second instance, it exists as a supporting document to the
main body of this essay. As well as highlighting areas of connection,
it shows which things are considered to be 'true' for the
purposes of the essay. This is an important issue, where even assigning
Free Software items into the Factors of Production is fraught with
possible oversights and flaws. This essay will point out these danger
zones, and indicate which 'truth' the essay relies upon for its
theses and conclusions.
This appendix will address these issues in order. This list may seem
like the table of contents for an economics textbook. This is
purely because typing " .. and Free Software" is annoying and
redundant for both author and reader.
Sourcecode is a series of instructions in a language
designed to be readable by humans. In the act of programming,
the most common process is to transcribe ideas into a computer
program by creating sourcecode describing the ideas.
This sourcecode is then 'compiled'. Compilation is the act
of turning the sourcecode into machine-code, the language that
the computer itself understands.
Once a program has been compiled into machine-code, there are
two major implications:
The first step is to try to classify Free Software as a Good.
In economics, we discuss 'Free Goods' and 'Economic Goods'. The
economic meaning of the word 'Free' is very different from the
meaning intended by 'Free Software', so it is worth clearing this
point up.
The word 'Free', as used in 'Free Software', explicitly refers
to liberty: the freedoms that the software grants its users. It
does not refer to the economic meanings of price or scarcity.
In order to define the sourcecode into Free or Economic categories,
we summarise the definitions of both.
An Economic Good is a good where:
For reasons that will be further outlined in this appendix, it
is arguable that Free Software is:
We define choice as the act of distinguishing between
alternative courses of action. Choice is usually based on the
concepts of scarcity and opportunity cost, where
opportunity cost is "the next best choice".
However, we have already defined Free Software as being a
Free Good. Therefore, the opportunity cost of obtaining it
is near-zero. However, the opportunity cost of producing
Free Software is an entirely other item of concern. By
no means is there a near-zero opportunity cost for the
production of Free Software. It takes time and considerable
mental effort to create quality sourcecode. This is time and
mental effort that could be expended in any number of alternative
ways; some including increased monetary payment.
The Free Software programmer is part of a production process.
They are producing sourcecode. As with all production
processes, they are implicitly or explicitly resolving three
fundamental questions:
As regards What to Produce, there are three basic
hypotheses:
This list is abbreviated from several competing, similar hypotheses
that have been proposed. Essentially, they cover the field. Some
also tend to include "Big Itch" in "Homesteading", as an example of
satisfaction through altruism.
As regards How to Produce, the potential answers are varied.
One popular option is Raymond's "Bazaar" model. One of the claims
made about the Bazaar model is examined by this essay, and the
Bazaar model itself has already been outlined.
As regards For Whom to Produce, the answer is heavily
influenced by the What question. Programmers have either
worked in their self-interest (Developer's Itch, Homesteading); or
they have worked for 'everyone' towards an ideal or an
altruistic motive (Big Itch).
The economic meaning of the word "Land" probably causes more
confusion than all other economic jargon combined. While, certainly,
Land can include real estate, it is not only real estate
that can be Land. Land is considered to be any naturally occurring
thing included in the production process. It can be the real estate
used for a farm or a factory, it can be minerals extracted from
the ground, it can even (for a tourist destination) be constant,
dependable sunshine.
Assuming that Free Software takes place in a networked, online
environment (as the Bazaar states it must), real estate is of
almost no concern to a Free Software production process. The
only possibly naturally occurring items in Free Software are the
ideas from which software is made. More on that below.
Labour is easier to map to general terms. Labour is the
component that does human work. In Free Software, the programmers
can be counted as Labour.
After Land, Capital is probably the next most confusing
hijacking of a word by economics. In a production process, Capital
does not mean cash. It refers to manufactured items which assist
production. Essentially, this means machines and components. Just
as with Land, it can either be a part of something (just as mineral
ores are used to make steel) or it can help something to occur
(just as sunshine encourages tourists to visit).
Capital in the Free Software world breaks down into two areas,
then: Tools and Libraries. Tools are programs that
facilitate the creation of sourcecode: sourcecode editing
programs, compilers, revision management systems and the like.
Libraries are components which can provide significant portions
of any new sourcecode's functionality.
Enterprise is the factor which draws all the others
together. It is the component that makes the decisions, takes
the risks, and receives the bulk of the profits for those
risks. In Free Software, Enterprise is usually a matter of
trust rather than official authority. Since there are no means
of punitive enforcement over the Labour, Enterprise programmers
can lead only by merit and consensus. Otherwise, their Labour
will desert them for other projects.
Whether ideas are Land (that is, naturally occurring) has been an issue of
debate for thousands of years. Rather than engaging in an
epistemological argument, this essay assumes that ideas are
naturally occurring. It also takes the line that programming, as
a production process, turns ideas into sourcecode.
It can then be said that a program design is a selection of
which ideas to implement. Since most Free Software projects
work towards the implementation of only a few ideas at a time,
it is taken that Land (in the form of ideas) is the
fixed factor in Free Software production. This allows the essay
to apply the Law of Diminishing Returns.
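To make the Law of Diminishing Returns concrete: with ideas held fixed, each additional programmer is expected to add less output than the one before. The sketch below is an illustration under stated assumptions, not the essay's own analysis. The figures are invented, the function names are hypothetical, and "never rises" is a simplified test, since the Law strictly says only that marginal output falls eventually.

```perl
use strict;
use warnings;

# Given total output indexed by number of programmers (hypothetical
# figures), return the marginal output of each additional programmer.
sub marginal_output {
    my @total = @_;
    my @marginal;
    for my $n (1 .. $#total) {
        push @marginal, $total[$n] - $total[$n - 1];
    }
    return @marginal;
}

# True when marginal output never rises as programmers are added.
sub shows_diminishing_returns {
    my @marginal = @_;
    for my $i (1 .. $#marginal) {
        return 0 if $marginal[$i] > $marginal[$i - 1];
    }
    return 1;
}

# Invented data: total lines of code with 0, 1, 2, 3 and 4 programmers.
my @total_loc = (0, 1000, 1800, 2400, 2800);
my @marginal  = marginal_output(@total_loc);

print "marginal LOC per added programmer: @marginal\n";
print "diminishing returns: ",
    (shows_diminishing_returns(@marginal) ? "yes" : "no"), "\n";
```

With these invented totals the marginal figures are 1000, 800, 600 and 400 lines, so each added programmer contributes less than the last. A Bazaar that "broke" the Law would show marginal output holding steady or rising instead.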
In private enterprise, it is common that Ownership and Control
are effectively divided. Shareholders, while ultimately in
control of their holding, give authority to corporate executives
to take risks on their behalf. In return, they reap the
benefits.
In Free Software, Ownership rests with the author of the code.
This is in keeping with standard copyright laws. Control,
however, is markedly reduced. Free Software licenses specifically
waive much of the legally-enforced control that an author has
over their sourcecode.
This means that control effectively rests with community
consensus. If the sourcecode producing community wishes that
the sourcecode goes in a certain direction, the chances are
that it will. If there is both a powerful desire to take a
certain direction and powerful resistance to that direction,
sourcecode often 'forks'. This means that the community that
has grown up around certain sourcecode splits up, each taking
their own copy of the code. This is akin to genetic mutation
in ecosystems.
The same sourcecode is propagated across many points of
access, and anyone on the internet is able to access any
of these points easily. The sourcecode is identical or
close to identical across these points. The internet allows
for nearly-instantaneous transmission of information about
suppliers' products throughout the market.
Finally, there is a push in Free Software to sell by
product differentiation. One such firm is Red Hat, which
has established brand-recognition in the marketplace by
selling pre-packaged, easy-to-use Free Software. While
anyone else can use this software, Red Hat has differentiated
itself from the market effectively enough to assume a
commanding position.
These scripts were written for Perl, version 5.005_03 built for BeOS on
the x86 platform. Note that the auto-execution line ("#!/") is set to
the BeOS default, and will not work on Linux or other Unix systems without
adjustment.
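As a hedged illustration of the adjustment mentioned above, the sketch below rewrites the BeOS interpreter line into a more portable one. It is not part of the original scripts. The function name is hypothetical, and it assumes Perl 5.6 or later, because "use warnings" is not available in 5.005_03. The "-w" switch is replaced because Linux passes everything after the interpreter path as a single argument, so "env perl -w" may fail there.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Replace a leading interpreter line with a portable one. The -w switch
# is dropped and "use warnings;" is added in its place.
sub portable_shebang {
    my ($source) = @_;
    $source =~ s{\A#![^\n]*\n}{#!/usr/bin/env perl\nuse warnings;\n};
    return $source;
}

my $original = "#!/boot/home/config/bin/perl -w\nprint \"hello\\n\";\n";
print portable_shebang($original);
```

Note that "use warnings" is scoped to the file in which it appears, whereas "-w" enables warnings globally, so the two are close but not identical. Scripts with no interpreter line are returned unchanged.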
2. Source Code
The program must include source code, and must allow distribution in
source code as well as compiled form. Where some form of a product is
not distributed with source code, there must be a well-publicized
means of obtaining the source code for no more than a reasonable
reproduction cost -- preferably, downloading via the Internet without
charge. The source code must be the preferred form in which a
programmer would modify the program. Deliberately obfuscated source
code is not allowed. Intermediate forms such as the output of a
preprocessor or translator are not allowed.
3. Derived Works
The license must allow modifications and derived works, and must allow
them to be distributed under the same terms as the license of the original
software.
4. Integrity of The Author's Source Code.
The license may restrict source-code from being distributed in modified
form only if the license allows the distribution of "patch files" with
the source code for the purpose of modifying the program at build time.
The license must explicitly permit distribution of software built from
modified source code. The license may require derived works to carry a
different name or version number from the original software.
5. No Discrimination Against Persons or Groups.
The license must not discriminate against any person or group of persons.
7. Distribution of License.
The rights attached to the program must apply to all to whom the program
is redistributed without the need for execution of an additional license
by those parties.
8. License Must Not Be Specific to a Product.
The rights attached to the program must not depend on the program's being
part of a particular software distribution. If the program is extracted
from that distribution and used or distributed within the terms of the
program's license, all parties to whom the program is redistributed should
have the same rights as those that are granted in conjunction with the
original software distribution.
9. License Must Not Contaminate Other Software.
The license must not place restrictions on other software that is distributed
along with the licensed software. For example, the license must not insist
that all other programs distributed on the same medium must be open-source
software.
Appendix III: Caveats
The deep waters of Free Software and research are untested. There
are areas where deep points of disagreement can arise in any
paper which strives (as this one does) to objectively analyse
the field.
1) Describe Free Software
in terms of elementary
Economic theory
|
2) Define Free Software
|
3) Describe Brooks' Law
|
4) Use analogy to equate
Brooks' Law and the
Law of Diminishing
Returns
|
5) Introduce Raymond's
Bazaar, including
"Breaks Brooks' Law"
assertion.
|
6) From premise that
Brooks' Law = LODR
and from premise
that Bazaars break
Brooks' Law, derive
that Bazaars can
break the LODR.
|
7) Select and introduce
the GNOME test-case.
|
8) Using GNOME data,
create graphs.
|
9) Use graphical patterns
to draw conclusions on
Raymond's assertion.
Describe Free Software in terms of elementary economic
theory.
This is not part of the main body, but appears largely in Appendix
IV.
Define Free Software
There were no real objections to this part of the essay. Reviewers
commented on the use of words, rather than the content. It does not
significantly alter the logical flow of the essay. Rather,
it serves to introduce the reader to the subject of study.
Describe Brooks' Law
There was some disagreement about my description of Brooks' Law.
Use analogy to equate Brooks' Law and the Law of Diminishing
Returns
This is the most important part of the essay. There are
two levels of objection here.
Introduce Raymond's Bazaar, including "Breaks Brooks' Law"
assertion
This section does not have any caveats in and of itself. There is
the possibility that Raymond's hypotheses are in themselves flawed.
If this is the case, this essay is built upon a flawed base.
From premise that Brooks' Law = LODR and from premise that Bazaars
break Brooks' Law, derive that Bazaars can break the LODR.
This section is, in terms of deductive logic, valid. If the
two premises are true, the conclusion must also be true.
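The structure of this argument can be written out symbolically. The notation below is introduced here for illustration and does not appear in the main body of the essay.

```latex
% P(x): Brooks' Law holds for production method x
% Q(x): the Law of Diminishing Returns holds for x
% b:    the Bazaar model
\begin{align*}
&\text{Premise 1 (the analogy):}         && \forall x\,\bigl(P(x) \leftrightarrow Q(x)\bigr) \\
&\text{Premise 2 (Raymond's assertion):} && \neg P(b) \\
&\text{Conclusion:}                      && \neg Q(b)
\end{align*}
```

Instantiating Premise 1 at b gives Q(b) implies P(b); together with Premise 2, modus tollens yields the conclusion. The form is valid, so any doubt about the conclusion has to fall on one of the two premises.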
Select and introduce the GNOME test-case.
There is some contention about the choice of the GNOME project
as a test-case. Some reviewers argued
that each major Free Software project has its own way of
organising itself, and that the conclusion would not be generally
applicable across all Free Software projects.
Using GNOME data, create graphs.
This is perhaps the richest vein of caveats.
Use graphical patterns to draw conclusions on Raymond's assertion.
Like the earlier use of graph comparison, the use of graphs
to draw conclusions is not perfect. At best, we will be able
to assert qualitative outcomes on quantitative data.
Appendix IV: Economic Principles of Free Software
As a field of research, the world of Free Software and/or Opensource Software
is relatively untouched. It is the child of a thriving subculture and
several obscure accidents of history; but its implications have changed,
change, and will continue to change the world. Among the children and siblings
of Free Software, we can count the Internet, email, the World Wide Web,
the GNU/Linux operating system; and we can count things that rely on such
software. Amongst these ranks are endless websites, commercial firms, governments
and educational institutions.
Sourcecode as a Good
This essay assumes that the sourcecode is the product.
Hence, the act of compilation has been ignored in this essay.
Instead, the focus is on sourcecode as the final outcome of the
Free Software production process.
A Free Good is a good where:
Therefore, it is reasonable to assert that such sourcecode is a
Free Good.
Scarcity and Choice
In Economics, our focus is on decisions made in conditions of
scarcity. Indeed, many textbooks cite economics as the
study of meeting 'unlimited wants with limited means'.
The Three Questions of Production
And so we shift from the scarcity-choice (or rather, the
abundance-choice) world of the sourcecode consumer to the
scarcity-choice world of the programmer.
The Factors of Production
The production process, in economic terms, is the act of putting
the four Factors of Production in a black box and shaking them
up a little. The four factors are Land, Labour, Capital and Enterprise.
We will show how these factors map to Free Software.
Ownership and Control
Ownership and Control are two aspects of the production
process that are related but separate. Ownership is the means
whereby one gains the rewards for possession of something;
Control is the means whereby one takes risks with something.
The Marketplace
Sourcecode has a marketplace of consumers and producers.
Indeed, it is probably the purest form of the Free Market
that there is. Specifically:
Appendix V: Program Listings
This appendix provides sourcecode listings for the Perl programs used
in the creation of the essay's graphs. In the spirit of Free Software,
every listing in this appendix is provided under the terms of the
GNU General Public License. No Warranty is provided; these programs are
provided As-Is.
strip_pvloc.pl
#!/boot/home/config/bin/perl -w
#
# strip_pvloc.pl: Produces data for Programmers vs LOC graphs based on CVS
# data extracts.
# Load file
print "\nstartup succeeds";
$InFile = 'scanall.out';
$OutFile = 'gnome.strip';
&SpliceToFile;
sub SpliceToFile {
open(CVSIN, $InFile) or die "\nno such file.\n";
open(OUTFILE, ">$OutFile");
print "\nopen file succeeds\n";
# Parse file
$k = -1; # A nasty hack, as $i is outside the scope of the
# if block which outputs the stripped data to file.
# $k is set to -1 to get past the first line cleanly.
for($i=0;
strip_proj.pl
#!/boot/home/config/bin/perl -w
#
# strip_proj.pl: Produces data for Programmers vs LOC/Project graphs based on
# CVS data extracts.
# Load file
print "\nstartup succeeds";
$InFile = 'gnome.exp.csv';
$OutFile = 'gnome.proj.strip';
&SpliceToFile;
sub SpliceToFile {
open(CVSIN, $InFile) or die "\nno such file.\n";
open(OUTFILE, ">$OutFile");
print "\nopen file succeeds\n";
# Parse file
$k = -1; # A nasty hack, as $i is outside the scope of the
# if block which outputs the stripped data to file.
# $k is set to -1 to get past the first line cleanly.
for($i=0;
prog_v_loc.pl
#!/boot/home/config/bin/perl -w
#
# Note that the above auto-exec line is for the BeOS, not Linux/Unix.
#
# prog_v_loc.pl: Produces data for Programmers vs LOC graphs based on massaged
# CVS data.
# Load file
print "\nstartup succeeds";
$InFile = 'gnome.strip.sorted';
$OutFile = 'gnome.pvloc.dat';
# Load up the @GraphData array.
open(INFILE, $InFile) or die "\nno such file.\n";
for($i=0;
prog_v_projloc.pl
#!/boot/home/config/bin/perl -w
#
# Note that the above auto-exec line is for the BeOS, not Linux/Unix.
#
# prog_v_loc.pl: Produces data for Programmers vs LOC graphs based on massaged
# CVS data.
# Load file
print "\nstartup succeeds";
$InFile = 'gnome.strip.sorted';
$OutFile = 'gnome.pvloc.dat';
# Load up the @GraphData array.
open(INFILE, $InFile) or die "\nno such file.\n";
for($i=0;
prog_v_mloc.pl
#!/boot/home/config/bin/perl -w
#
# Note that the above auto-exec line is for the BeOS, not Linux/Unix.
#
# prog_v_mloc.pl: Produces data for Marginal TO vs No. Programmers graphs based
# on massaged CVS data.
# Load file
#print "\nstartup succeeds";
$InFile = 'gnome.proj.strip-sorted';
$OutFile = 'gnome.tp_v_mloc.dat';
# Load up the @GraphData array.
open(INFILE, $InFile) or die "\nno such file.\n";
for($i=0;