In March 2006, David A. Patterson wrote an article for Communications of the ACM entitled "Computer science education in the 21st century". In this article -- which, sadly, you cannot read unless you are an ACM member -- he advocated a few fundamental changes to computer science education. One of the changes he advocated: the inclusion of courses in open source software development in the standard undergraduate computer science curriculum.
David A. Patterson was, at the time, the president of the Association for Computing Machinery, the world's largest educational and scientific computing society. One might think that such a clarion call, made by someone of such obvious influence, would generate a groundswell of enthusiasm. When such a luminary proclaims "time to teach open source development," the world of academia must certainly follow, yes?
It's a little more complicated than that.
We've spent a lot of time over the past few years talking to computer science professors. Mostly we've asked lots of questions -- actually, the same ones over and over.
1. Do you use open source software in your classes? (Increasingly.)
2. Are your students interested in open source? (Increasingly.)
3. Do you or your students participate in open source software? (Rarely.)
4. Do you teach open source development practices? (Almost never.)
For these last two, the follow-up question is, invariably, why not?
And the answer is, invariably, "because it's hard".
There are good reasons why professors don't teach the practice of open source. It's easy for open source advocates to explain away these reasons. At a certain point, though, one must accept the idea that most professors are well-intentioned, but bound by circumstances that make it frustratingly difficult to introduce students to open source development.
So why bother?
The answer is simple: the skills required to succeed in an open source software project are the exact same skills required to succeed in any large software project. The biggest difference is that, with just a bit of guidance, anyone can build their programming skills in the open source world.
Our hope is that this textbook helps to provide that guidance to a whole generation of students.
Almost every modern computer science degree program requires its students to complete a Big Project. Sometimes it's the "Senior Project," and sometimes it's the "Capstone Project". Whatever it's called, the purpose of this Big Project is to expose students to "real" software engineering practices.
In other words: coding with other people. Which, up until this point in a student's education, is usually strictly discouraged as "cheating".
The problem is that these Big Projects actually tend to focus on extremely bounded problems. Most of the time, a small team of students works on a small project for a semester, and the result is, quite naturally, a small project. Which actually does very little to teach students about Real Big Projects.
To find Real Big Projects, one must venture out into the world, where there are Real Big Problems. The real world is full of gigantic applications that require build systems and revision control and defect tracking and prioritization of work. They are written in languages that one may or may not know, by people one may or may not ever meet. And in order to successfully navigate through these Real Big Projects, the novice developer must possess one skill above all others: the ability, in the words of co-author Dave Humphrey, to be "productively lost".
The great advantage of open source, for the learner, is that the Real Big Projects of the open source world provide unparalleled opportunities to be "productively lost". Complex codebases are immediately accessible to anyone who wants to participate. Which is crucial to the learner, since participating in an activity is by far the most effective way to learn that activity.
Sooner or later, the coder aspirant must work at scale, with teammates. Open source provides that opportunity when nothing else can.
This textbook exists because professors asked for it, but the textbook's fundamental approach -- teaching the basic skills of open source development incrementally, through real involvement in meaningful projects -- should make it suitable for self-learners as well. In either case, the student should follow three principles to get the most value out of this textbook.
First, always be contributing. The majority of exercises in this textbook are designed to lead to direct and useful contributions to a project, no matter how small. Even a simple act, like adding comments to a part of the code you don't understand, can add real value to a project; that's the great thing about community-developed software. Contribution matters, and legitimate contributions, no matter how small, are always welcome.
Second, ask for help when you're stuck. If you have trouble with an exercise -- and at some point you will -- look to your fellow contributors for help. Your chosen project will have mechanisms for getting in touch with the more advanced developers: mailing lists, or IRC channels, or forums, or all of the above. Communicating with those around you is not only "not cheating," it's key to establishing greater understanding.
Remember, though: in the real world, people are most likely to help those who are trying to help themselves. Before you ask someone a question on IRC, ask the same question of Google. A good rule of thumb: if you can't figure something out in 15 minutes of searching, it's probably okay to ask for a bit of help.
And third, be bold. Try things. Break stuff. Don't be afraid to play around with the code; it's only code, after all, and if you break something, you can always revert to the previous version. The one exception: don't commit known broken code to the repository (and don't worry if you don't know what that means yet; we'll get to that crucial detail later on).
This is, first and foremost, a textbook about how to create software collaboratively, using a community development model.
Some people call the result of such work by the name "free software". Some people call it "open source software". Some folks call it both: "free and open source software". Some people throw in "libre" for good measure, and call it "free/libre open source software". Frequently one will see these abbreviated into the terms "FOSS" or "FLOSS".
There are valid reasons for the usage of these different terms in different contexts, but for the sake of simplicity, we will use the term "open source software" exclusively in this book, with the following rationale:
1. The meaning of "free" can be ambiguous in English; it can be read as either "gratis" (free of charge) or "libre" (liberated). "Free software" refers to the latter, but is often confused with the former.
2. Acronyms are bad.
3. The differences between the terms, while important from a philosophical point of view, are negligible from the point of view of the practitioner. The "why" of participation may vary among the various communities, but the "how" does not.
In closing, let us offer the wisdom of Richard Stallman:
The term “open source” software is used by some people to mean more or less the same category as free software. It is not exactly the same class of software: they accept some licenses that we consider too restrictive, and there are free software licenses they have not accepted. However, the differences in extension of the category are small: nearly all free software is open source, and nearly all open source software is free.
Enough of the pep talk. It's time to get started.