Courses/Computer Science/CPSC 203/CPSC 203 2007Fall L04/CPSC 203 2007Fall L04 TermProjects/Internet Plagiarism

Group Members
Maryan Crisostomo, Ryan Ferguson, Navdeep Hari, Ken Hsu, Theresa Duong, Jordan Valer

Project Statement
Since its launch in 2001, the website Wikipedia has expanded into an extremely popular online reference tool. However, by allowing individuals to personally and freely edit its contents, issues have arisen regarding the reliability and authenticity of the contributing articles. The prominence of the online encyclopedia and its widespread accessibility of information have resulted in mass incidents of copyright infringement and “copy and pasting” by its users. Our goal is to investigate the credibility of Wikipedia and look into the allegations of plagiarism that the website faces. We will also examine the occurrences of plagiarism by Wikipedia users and suggest improvements for reducing information bootlegging on the internet in general.

History of Wikipedia
Wikipedia was created on January 15th, 2001. Nupedia, a leading expert-driven internet encyclopedia, created Wikipedia to increase user participation and to boost its popularity. Wikipedia quickly overtook Nupedia, as the number of contributors led to a much larger database. The international expansion of Wikipedia began in March 2001, when the Dutch version of the site launched. Following the Dutch lead, many nations created their own versions. Wikipedia has grown to its current size and popularity because of its users' ability to edit the articles. It now has several million articles in hundreds of languages. With millions of contributors, almost anything you could want to find is on Wikipedia.



Although the developers of Wikipedia dispute who created the website, it is widely accepted that Jimmy Wales came up with the idea of an encyclopedia to which all users could contribute. Larry Sanger claims that it was he, not Wales, who brought forward the idea of a wiki (a user-edited internet page). He also claims that he suggested it be added to Nupedia. Wales denies both claims on the grounds that Sanger was working for him when the site was created.

The visual appearance of Wikipedia has changed 8 times over the site's history. Most of the changes have been minor.

How Wikipedia Works
Wikipedia is a self-proclaimed “multi-lingual, web-based, free content encyclopedia project”. The articles are freely written by contributors around the world and are edited entirely by volunteers. The website uses a simple markup language known as “wiki markup” that is extremely user-friendly, and it offers on-site tutorials and instructions.

Editing Pages
Anyone is allowed to edit the contents of Wikipedia articles, with the exception of a few semi-protected and protected pages. Changing and adding information is as easy as clicking the “Edit this Page” tab on any Wikipedia article. By clicking “Save Page” when finished, any alterations made become accessible to millions of users around the world.

The success of Wikipedia relies on a self-policing community of writers and editors striving for the greatest authenticity and impartiality in its articles. However, not every individual is granted the same privileges and authority within the Wiki community.


 * Any Individual can access and edit pages.


 * Registered Users can start new pages, edit semi-protected pages, rename pages and upload images.


 * Administrative Privileges are granted to several hundred long-time contributors, allowing them to delete pages, block IP addresses and ban disruptive users.

In order to become an Administrator, editors must apply for adminship and display trustworthiness, impartiality, fairness and familiarity with Wikipedia. The Wiki community must then come to a consensus on whether or not the individual should be granted adminship.
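The three privilege levels above can be pictured as a simple lookup table. The following is only a rough sketch, not how Wikipedia's software actually works: the names `Role`, `REQUIRED_ROLE` and `can` are made up for illustration, and it assumes each level also keeps the privileges of the level below it.

```python
from enum import IntEnum


class Role(IntEnum):
    """Hypothetical names for the three levels in the list above."""
    ANONYMOUS = 0      # "Any Individual"
    REGISTERED = 1     # "Registered Users"
    ADMINISTRATOR = 2  # holders of "Administrative Privileges"


# Minimum level needed for each action, taken from the list above.
# Assumption: a higher level can do everything a lower level can.
REQUIRED_ROLE = {
    "read_page": Role.ANONYMOUS,
    "edit_page": Role.ANONYMOUS,
    "start_new_page": Role.REGISTERED,
    "edit_semi_protected_page": Role.REGISTERED,
    "rename_page": Role.REGISTERED,
    "upload_image": Role.REGISTERED,
    "delete_page": Role.ADMINISTRATOR,
    "block_ip_address": Role.ADMINISTRATOR,
    "ban_user": Role.ADMINISTRATOR,
}


def can(role: Role, action: str) -> bool:
    """Return True if `role` is at least the minimum level for `action`."""
    return role >= REQUIRED_ROLE[action]
```

For example, `can(Role.ANONYMOUS, "edit_page")` is true, while `can(Role.REGISTERED, "delete_page")` is false.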

Features

 * Minor Edit option- registered users can choose this option when making small, insignificant changes to an article, such as correcting a typo. This type of edit is not usually reviewed or disputed.
 * Major Edit option- signals a major alteration to an article that SHOULD be reviewed by editors.
 * Revert option- reverts a page to a previous version if vandalism or incorrect editing occurs.
 * Revision history- provides a summary of the time and nature of all edits made on a page and the user that performed them.
 * Watchlist- allows readers, editors and administrators to track the revisions made to certain articles.
 * Tags- used to flag areas which may seem suspicious or require fact-checking.
 * Disclaimers- warn readers of possible disputes regarding the neutrality of some sensitive articles.

Who Pays for Wikipedia?
Wikipedia falls under the umbrella of the non-profit Wikimedia Foundation, which is based in St. Petersburg, Florida, and also manages a number of other related wiki sites. Wikimedia relies heavily on individual donations to keep running. It has also run a fundraising campaign that reportedly topped $1 million. Many of the contributions were made by large corporations such as Dell, Craigslist Inc., Virgin Unite and Two Sigma Investments. The funds help to cover day-to-day expenses and pay for server upgrades.

What is Plagiarism?
Plagiarism is submitting another author’s published or unpublished work, in whole, in part, or in paraphrase, as one’s own work, without fully and properly crediting the other author with footnotes, citations or other bibliographical reference.

Online plagiarism
The internet has made plagiarism much easier, since people need only copy and paste the material they want from websites. This phenomenon is also known as content scraping, and it affects both established sites and blogs.

Copyright Laws
Copyright laws are the laws that protect intellectual property, and they make copying protected work without permission illegal. At one time, the laws protected only works that carried the copyright notice (the © symbol). However, according to laws established in 1989, works are now protected by copyright with or without this symbol.

Anyone who reproduces copyrighted material improperly can be prosecuted in a court of law. It does not matter if the form or content of the original has been altered -- as long as any material can be shown to be substantially similar to the original, it may be considered a violation of the Copyright Act.

However, not all published works are protected by copyright. The Copyright Act only protects works that express original ideas or information. People can borrow liberally from works published by the government and from facts that are not the result of original research (such as the fact that there are fifty U.S. states, or that carrots contain Vitamin A) without fear of plagiarizing.

Plagiarism on Wikipedia
Worldwide, Wikipedia has endured many disputes over copyright and over its accuracy and credibility. Because of its voluntary nature, it has also been accused of lacking comprehensiveness, and many of these complaints have gone unnoticed by a majority of the public. From minor misinformation to full-blown lawsuits, Wikipedia has experienced it all.

First Study on Plagiarism on Wikipedia
A study by Daniel Brandt tested the legitimacy of Wikipedia, the well-known encyclopedia of the new era. Brandt analyzed 16,750 selected Wikipedia articles for plagiarism. His filtering method was to take a sentence from an article, put it in quotation marks and enter it into a search engine. Because the sentence is in quotation marks, the wording has to match exactly. He continued in this way, repeating his trials with more quoted sentences from each article.
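The quoted-sentence search described above can be sketched in a few lines. This is only an illustration of the general idea, not Brandt's actual procedure: the function names, the simple sentence splitter and the default of three sentences per article are all assumptions, and the sketch stops at building the search strings rather than contacting any search engine.

```python
import re
from typing import List


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def exact_phrase_queries(article_text: str, max_sentences: int = 3) -> List[str]:
    """Wrap the first few sentences in double quotes for exact-match search.

    Any double quotes already inside a sentence are removed so that the
    whole sentence stays one quoted phrase.
    """
    sentences = split_sentences(article_text)[:max_sentences]
    return ['"' + s.replace('"', "") + '"' for s in sentences]


if __name__ == "__main__":
    sample = "Wikipedia is free. Anyone can edit it! Is it reliable?"
    for query in exact_phrase_queries(sample, max_sentences=2):
        print(query)
```

Each string printed would be pasted into a search engine; a hit on another website with the same wording suggests the sentence was copied in one direction or the other.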

Brandt's study brought 142 articles to Wikipedia's attention. These files averaged 2.38 sentences flagged as plagiarized.

Plagiarism by Wikipedia Users
Plagiarism by students can be deemed a very serious offence, leading to suspension or even expulsion. Students plagiarize by blatantly copying and pasting information, hoping that the information from the source is actually true.

A study was done to determine the extent to which university students plagiarize their written work. The website www.ahfb.blogspot.com allows the public to view papers that are left behind the “Fishbowl”. Many of these papers had taken information from Wikipedia without properly citing the source.

Another incident involved an English student in Michigan. The student failed to cite information taken from a Wikipedia article on the Columbine High School shooting.

Comments on Plagiarism
Wikipedia allows its users to add, change and delete information regardless of expertise. Plagiarism is not uncommon on the World Wide Web; however, if a website claims to be an "encyclopedia", readers should be able to accept the information it presents as true. Wikipedia must improve, maintain a high standard and refuse to tolerate such acts of plagiarism.

"There is no need nor intention to be vindictive, but at the same time, we can not tolerate plagiarism. Let me say quite firmly that for me, the legal issues are important, but far far far more important are the moral issues. We want to be able, all of us, to point at Wikipedia and say: we made it ourselves, fair and square." -Jimmy Wales (founder of Wikipedia)

Negative Impact of Wikipedia's Credibility
Credibility is a huge concern for most users of Wikipedia. The ability of almost anyone to change or post information creates a problem for researchers. Often someone will enter false information into an article, and viewers may accept these inaccuracies as truths. This is why Wikipedia is constantly criticized, and even boycotted by some.

Criticisms
Critics of wikis and online documents have raised several objections. Schneider (2005) finds Wikipedia inherently untrustworthy and questions the scope and balance of its articles. In a move inspired by Wikipedia’s success, the Los Angeles Times started a wiki on its Web site for readers to write about the Iraq war (Martinez, 2005); the paper’s former editorial page editor criticised the wiki for diluting the paper’s reputation (Burgard, 2005). He said the content could not be claimed to be reasoned and informed, and that the paper should be checking all the claims made by the wiki’s authors, which, like Wikipedia, it was not doing. Someone, he claims, needs to be guardian and trustee of the information that is published. Cronin (2005) feels the same way about Wikipedia. A recent study, though, contradicts the above claims.

The Study
258 academics were asked to participate in the study. 27 percent agreed to take part, with 21 percent actually completing the survey. The experts were assigned to one of two experimental conditions. In Condition 1, the expert was given a Wikipedia article related to their field of expertise. In Condition 2, the expert was asked to critique a random article. Wikipedia’s own random article selection feature was used to assign a different article to each Condition 2 respondent. After reading the article, each respondent was asked to critique both the article’s credibility and the author’s credibility. To do so, each respondent received the same online questionnaire to measure credibility.
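To put the percentages above into head counts, the arithmetic can be worked through as follows. This is a rough calculation that assumes both percentages are taken out of the 258 academics who were invited; the text does not say so explicitly, so the figures are approximate.

```python
INVITED = 258          # academics asked to participate (from the text)
AGREED_RATE = 0.27     # share who agreed to take part
COMPLETED_RATE = 0.21  # share who actually completed the survey

# Assumption: both percentages are out of the 258 invited academics.
agreed = round(INVITED * AGREED_RATE)
completed = round(INVITED * COMPLETED_RATE)

print(f"Agreed to take part: about {agreed}")      # about 70
print(f"Completed the survey: about {completed}")  # about 54
```

Under that assumption, roughly 70 academics agreed to take part and roughly 54 completed the survey.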

Results

The study concluded that Wikipedia is a reliable source of information, with a mean number of errors comparable to that of published articles. Although the average percentage of errors in Wikipedia articles was 13%, Wikipedia proves to be a solid source of information. Table 1 shows that the mean number of errors in both conditions is very close, and the credibility of Wikipedia is more than comparable to that of published articles.



Conclusion
The issue of Wikipedia's credibility can be summed up in one question: Where did you get that information? Did the user make it up? Did the user copy it word for word from an article they had read? These are questions that the site does not answer. Concerns about credibility have harmed Wikipedia’s reputation and discouraged its use. The study cited in this article eases these worries.

Improvements
Plagiarism runs rampant on the internet, and there are several ways to remedy this problem. One solution is to disable the right-click and Ctrl+C commands. This would prevent a majority of users from copying, because these commands make it easy to transfer information from the site to their own document. Taking away this tool would force users to put some effort into researching on their own.

Wikipedia is an online encyclopaedia that can be edited by the general public. To ensure that entries are not copied directly from other sources, there should be a system in which changes to a page are verified before being posted. However, since there can be thousands of changes, only the larger changes should be checked, because larger entries are the most likely to be copied from another source. For example, there could be a 24-hour gap between the time the information is entered and the time it is posted on the website. During this period, respected users would verify that the information is legitimate and not plagiarised from an encyclopaedia or other source.
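The review system proposed above could look something like the following. This is a minimal sketch of one possible reading of the proposal, not an existing Wikipedia feature: the class and function names are hypothetical, and the 500-character cutoff for a "larger change" is an arbitrary value chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

REVIEW_DELAY = timedelta(hours=24)  # the 24-hour gap proposed above
LARGE_EDIT_CHARS = 500              # hypothetical cutoff for a "larger change"


@dataclass
class PendingEdit:
    page: str
    added_text: str
    submitted_at: datetime
    approved: bool = False  # set to True once a respected user verifies it


def needs_review(edit: PendingEdit) -> bool:
    """Only larger additions are held back for checking."""
    return len(edit.added_text) >= LARGE_EDIT_CHARS


def can_publish(edit: PendingEdit, now: datetime) -> bool:
    """Small edits go live at once; large ones need approval and the full delay."""
    if not needs_review(edit):
        return True
    waited_long_enough = now - edit.submitted_at >= REVIEW_DELAY
    return edit.approved and waited_long_enough
```

In this reading, a typo fix is posted immediately, while a long new passage stays hidden until a reviewer has approved it and 24 hours have passed.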

Another method to prevent plagiarism on Wikipedia is to use a watermark. A watermark is an image or statement that is placed in a document after the material is copied and pasted. It could be a complete reference to Wikipedia, which could include a URL, the date, the time of the page's last update and other relevant information. If the watermark is deleted, the information that was copied will also be erased. This would prevent users from taking information from Wikipedia (or any other site) and claiming it as their own intellectual property.
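The text part of such a watermark might be built as in the sketch below. It only shows how a reference line could be attached to copied text; the function name and the wording of the reference are assumptions, and the sketch does not attempt the harder part of the proposal, which is erasing the copied text if the watermark is removed.

```python
from datetime import date


def with_watermark(copied_text: str, page_url: str,
                   last_updated: date, copied_on: date) -> str:
    """Return the copied text followed by a reference line naming its source."""
    watermark = (
        f"Source: Wikipedia, {page_url} "
        f"(page last updated {last_updated.isoformat()}, "
        f"copied {copied_on.isoformat()})"
    )
    return f"{copied_text}\n\n{watermark}"
```

Anything pasted into a student's document would then arrive with its source already attached.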

Results
By investigating Wikipedia’s credibility and the allegations of plagiarism, we have obtained some conflicting results. On one hand, one study concluded that, out of over 16,000 articles, 142 files averaged 2.38 sentences flagged as plagiarized. On the other hand, another study found that the average number of errors detected in Wikipedia articles is comparable to that of similar published works. So, is Wikipedia still a good resource to use?

Plagiarism on the Internet is a major issue, and Wikipedia provides a forum where this problem can grow. It can be used as a basis for gathering general information, but should never be cited as a primary source. All users should carefully consider the sources of the article (which can be found at the bottom of the page) to determine the integrity of the information. Users should not only question the reliability of Wikipedia, but also be wary of information on other websites and always properly cite their sources in order to avoid plagiarism.