Open Source Intelligence

The term Open Source Intelligence has two meanings. In the context of traditional intelligence operations (government and private secret services), it refers to information that can be extracted from publicly available sources, including newspapers, websites, and books. This is contrasted with information gathered through espionage, surveillance, or informants, which is not available outside the intelligence organization. An open source, in this context, is any information resource that can be accessed by the public. Recently, the term has also been used to denote a practice of collaborative knowledge-building inspired by the Free and Open Source Software movement. It was first used in this sense in an article published in the on-line journal First Monday in 2002 [1]. This dictionary entry focuses on this second meaning of the term.

Definition and context

Over the last two decades, the Free and Open Source Software movement has established a collaborative development process capable of producing high-quality information goods (for example, the operating system Linux). Yet this process is much closer to scientific knowledge-building than to the production of commercial goods. It rests on a few general principles: unrestricted access to information created by the community; peer review of contributions by members of the community; authority-based rather than sanctions-based governance; and flexible levels of involvement and responsibility.

In practice, this means the following. Everyone can access all information created as part of the collaborative project, and nobody can remove information from the common pool. Copyleft licenses, such as the GNU General Public License, provide legal safeguards [2]. Community members decide collaboratively which contributions are valid enough to be entered into the common knowledge pool and which will not be accepted. This peer-review process is important for ensuring the high quality of the common resource. Open Source projects are not chaotic. Most of them are led by one or more people who have earned the trust of the community over the course of the project’s life cycle. They are usually the ultimate arbiters in case of conflict. Their leadership is based on the authority ascribed to them by the community, rather than on their ability to impose sanctions on anyone. The entire process is based on voluntary contributions, hence the most important task of the project leader is to keep the community happy and motivated. Lastly, flexible levels of involvement allow nearly everyone to participate in a project, ranging from minor one-time contributions to major, long-term involvement, depending on interest, qualifications, and resources.

These principles, while pioneered in the development of software over the Internet, are now being applied to the creation of other forms of knowledge, a practice which is termed Open Source Intelligence.

History and practice

Collaborative knowledge-building on the Internet is as old as the Internet itself. For a long time, the free sharing of information has been a central aspect of Internet culture (‘information wants to be free’ used to be a popular slogan). Simple collaborative environments such as email lists have existed since the early 1970s, and slightly more advanced systems such as Usenet and Bulletin Board Systems (BBSs) since the early 1980s. However, with the commercialization of the Internet in the 1990s and the success of Free and Open Source Software, it has been recognized that collaborative development of knowledge is a particular and innovative practice that requires a distinct social, technical, and legal framework to succeed. Socially, it requires that participants understand the merits of collaboration and are willing not just to share their own knowledge, but also to accept that other people might transform that knowledge in unexpected ways. Instead of a clear separation between author and audience, expert and novice, we have a situation in which everyone is entitled to read and write. Technically, specialized platforms support this process by making it easy for people to build on other people’s contributions and by ensuring transparency about how a resource has been transformed. Legally, the traditional conception of copyright, which gives the author nearly unlimited control over the use of his or her works, is being replaced by copyleft, which is based on nearly unlimited rights to distribute and adapt works.

Perhaps the most prominent example of an Open Source Intelligence project is the free on-line encyclopedia Wikipedia [3]. Technically, it is based on a Wiki platform, which allows everyone with a standard web browser to edit a page. The software keeps a history of the modifications of a page, making it easy to see how the content developed over time and, if necessary, to remove unqualified modifications or vandalism. Founded in early 2001, it had grown to over 450,000 articles by the end of 2004, all written by volunteers without central editing or coordination. It is published under the GNU Free Documentation License. Many of these articles can match the level of quality found in commercial encyclopedias, while others are still below standard. As a recent IBM study showed [4], the project, despite its openness, is surprisingly stable and resistant to misuse. Its history so far gives reason for optimism that articles will improve over time, as knowledgeable people fill in the gaps that still exist and collaborative editing removes the errors that are still present. Eric S. Raymond, a leading analyst of the Free and Open Source Software movement, once expressed the underlying assumption in the following way: “Given enough eyeballs, all bugs are shallow.” [5] He meant that people with different skills and knowledge domains will find different errors easy to spot and correct. What is a hard problem for one person might be trivial for someone else. If enough people look at a page, all errors will be found and removed. Like all collaborative projects, Wikipedia is a living project that changes constantly, rather than a fixed product to which updates are made available to the public from time to time.
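The role of the page history can be illustrated with a small sketch. This is not how any particular Wiki software is implemented; the class and method names (WikiPage, edit, revert_to) are hypothetical and chosen only for illustration. It shows the one property described above: every modification is kept, so an unwanted change can be undone without losing the record of what happened.

```python
from dataclasses import dataclass, field


@dataclass
class Revision:
    author: str
    text: str


@dataclass
class WikiPage:
    """Hypothetical page object that keeps every revision ever saved."""
    title: str
    revisions: list = field(default_factory=list)

    def edit(self, author, text):
        # Each edit is appended; nothing is overwritten.
        self.revisions.append(Revision(author, text))

    def current(self):
        # The visible text is simply the latest revision.
        return self.revisions[-1].text if self.revisions else ""

    def revert_to(self, index, author):
        # A revert is itself a new revision that copies an older one,
        # so the vandalized version stays visible in the history.
        self.edit(author, self.revisions[index].text)


page = WikiPage("Copyleft")
page.edit("alice", "Copyleft is a licensing practice.")
page.edit("vandal", "asdf")
page.revert_to(0, "bob")
```

After these three steps the page again shows the first author's text, while the history still holds all three revisions, including the vandalism and the revert.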

Other prominent examples include the Indymedia network of community-based alternative news [5], the collaborative editing site kuro5hin [6], and, to a lesser degree, user-built databases such as the CDDB, which contains information on sound files [7].

Potential

Open Source Intelligence seems to work best for domains in which knowledge is widely but unevenly distributed. There are tens of thousands of people who have some specialist knowledge with which to improve Wikipedia, yet they all recognize that they are fundamentally dependent on others if they want the resource to be comprehensive. Nobody could write it alone.

For areas in which there is no established authority, either because the area is too dynamic or because the knowledge is too specific to a community or a project, there is often no other way than to create knowledge collaboratively. The examples of Open Source Intelligence can help to organize this process efficiently.

Critical issues

The central problem of Open Source Intelligence is how to validate the knowledge thus produced. For Free and Open Source Software, this is relatively simple, because one can always run the code and see what happens. If the new version is faster than the old, or can do new things, it has improved. If it crashes, it needs improvement. In the case of a traditional encyclopedia (or a newspaper), a team of specialized editors, supported by a staff of fact checkers, archivists, and so on, reviews each article before its publication. We know (or at least assume) that all the information we see has gone through this process. Thus, specialists validate information for us and we trust the publisher to employ the right people.

Most Open Source Intelligence projects, on the other hand, let everyone contribute, and information is made accessible immediately. In other words, we usually don’t know whether something represents a first draft or the collective wisdom of a community. Furthermore, an open editorial process might be biased towards the conventional wisdom, the accepted majority opinion, which, in any community, might be wrong on certain issues [8]. Thus a large number of ‘eyeballs’ might reinforce, rather than remove, a bug. There is no easy method, like running the code, by which to assess the quality of the collaborative work.

There is currently a lot of discussion about how to deal with these problems. Some propose to re-introduce the distinction between experts and novices [9], while others, such as the discussion platform Slashdot [10], have developed a moderation scheme that ranks contributions and then enables filtering based on this ranking. The issue is non-trivial and there is no one-size-fits-all solution. Rather, different contexts are likely to develop different solutions, depending on their particular needs and capacities.
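A ranking-and-filtering scheme of this kind can be sketched in a few lines. The following is only a loose illustration of the general idea, not Slashdot's actual system: the score bounds, the default score, and the names Comment, moderate, and visible are assumptions made for the example.

```python
from dataclasses import dataclass

# Assumed score bounds, for illustration only.
MIN_SCORE, MAX_SCORE = -1, 5


@dataclass
class Comment:
    author: str
    text: str
    score: int = 1  # assumed starting score

    def moderate(self, delta):
        # Moderators nudge the score up or down; it is clamped to the bounds.
        self.score = max(MIN_SCORE, min(MAX_SCORE, self.score + delta))


def visible(comments, threshold):
    # Each reader picks a threshold and sees only comments at or above it.
    return [c for c in comments if c.score >= threshold]


comments = [
    Comment("ann", "Detailed correction with sources."),
    Comment("bob", "First post!"),
    Comment("cy", "A short clarification."),
]
comments[0].moderate(+1)
comments[0].moderate(+1)
comments[1].moderate(-1)
comments[1].moderate(-1)
```

With a threshold of 1, a reader would see the comments by ann and cy but not the one by bob; raising the threshold to 3 leaves only ann's. Nothing is deleted: the ranking changes what each reader chooses to see, not what is stored.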

With the rise of blogs (web-logs, software that facilitates journal-type publication by individuals), the terms open source journalism, citizen journalism, and peer-to-peer journalism have been coined to describe similar collaborative processes in the creation and discussion of news reporting. The most prominent advocate is the technology journalist Dan Gillmor, who has recently published a book on the phenomenon called We the Media [11].

References

[1] Stalder, Felix; Hirsh, Jesse (2002). Open Source Intelligence. First Monday (June). Vol. 7 No. 6 http://firstmonday.org/issues/issue...

[2] Liang, Lawrence (2004). Guide to Open Content Licenses. Rotterdam, NL, Piet Zwart Institute for Media Design http://pzwart.wdka.hro.nl/mdr/resea... This is the best introduction to open content licenses currently available.

[3] Wikipedia, http://www.wikipedia.org, main project in English, with many sizeable editions in other languages, including French and Spanish.

[4] IBM (2004). History Flow: visualizing dynamic, evolving documents and the interactions of multiple collaborating authors: a preliminary report. http://researchweb.watson.ibm.com/h... Shows with graphics the development of Wikipedia articles.

[5] Raymond, Eric S. (1998). The Cathedral and the Bazaar. First Monday. Vol. 3 No. 3 http://www.firstmonday.dk/issues/is... and http://www.catb.org/esr/writings/ca...

[6] http://www.kuro5hin.org

[7] http://www.freeccdb.org

[8] McHenry, Robert (2004). The Faith-Based Encyclopedia. Tech Central Station, Nov. 15. http://www.techcentralstation.com/1... Critical discussion of the editorial process of Wikipedia by the former editor-in-chief of the Encyclopedia Britannica.

[9] Sanger, Larry (2004). Why Wikipedia Must Jettison Its Anti-Elitism. Kuro5hin.org (Dec. 30). http://www.kuro5hin.org/story/2004/...

[10] http://www.slashdot.org

[11] Gillmor, Dan (2004). We the Media: Grassroots Journalism by the People, for the People. Sebastopol, CA, O’Reilly http://www.oreilly.com/catalog/weme... This book contains many links and references to current projects.

16 January 2006

This text is an extract from the book Word Matters: multicultural perspectives on information societies. This book, which was coordinated by Alain Ambrosi, Valérie Peugeot, and Daniel Pimienta, was released on November 5, 2005 by C & F Éditions.

The text is under the Creative Commons licence, by, non commercial.


Vecam
