creadur-dev mailing list archives

From codesite-nore...@google.com
Subject [apache-rat-pd] r47 committed - Most regular expressions for recognising functions have been modified...
Date Sun, 16 Aug 2009 23:44:36 GMT
Revision: 47
Author: maka82
Date: Sun Aug 16 16:43:54 2009
Log: Most regular expressions for recognising functions have been modified
so that they are no longer too greedy and no longer recognise two or more
functions as one.
Also, regular expressions for recognising comments have been modified
to prevent a StackOverflowError when comments are too long.
While testing, I noticed a bug: the application breaks when a string
contains $. That bug is fixed now.
Javadoc is written for almost all methods. BasicCodeSearchParser is deleted
because it is not used anymore.
Static methods in the PlagiarismDetector class are no longer static.
http://code.google.com/p/apache-rat-pd/source/detail?r=47
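
The three regex problems named in the log can be illustrated with a minimal sketch. The class, method names and patterns below are hypothetical and are not taken from apache-rat-pd; the actual expressions in the checkers are more involved. The sketch assumes that the comment overflow came from a group alternation such as `(.|\n)*?`, which java.util.regex evaluates recursively, and that the `$` failure came from a replacement string being read as a group reference. Neither cause is confirmed by the log.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; none of these names come from apache-rat-pd.
final class RegexPitfallsSketch {

    // Greedy: ".*" runs to the LAST closing brace, so two functions match as one.
    static final Pattern GREEDY =
            Pattern.compile("\\w+\\(\\)\\s*\\{.*\\}", Pattern.DOTALL);

    // Reluctant: ".*?" stops at the FIRST closing brace.
    // Limitation: nested braces inside a body would still be cut short.
    static final Pattern RELUCTANT =
            Pattern.compile("\\w+\\(\\)\\s*\\{.*?\\}", Pattern.DOTALL);

    // Block comment matched with a single-character loop under DOTALL,
    // used here instead of a group alternation such as "(.|\\n)*?".
    static final Pattern BLOCK_COMMENT =
            Pattern.compile("/\\*.*?\\*/", Pattern.DOTALL);

    // Collects every non-overlapping match of the pattern in the input.
    static List<String> findAll(Pattern pattern, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    // Replaces a literal target with a literal replacement. Without
    // quoteReplacement, a '$' in the replacement is read as a group reference.
    static String replaceLiteral(String text, String target, String replacement) {
        return text.replaceAll(Pattern.quote(target),
                Matcher.quoteReplacement(replacement));
    }
}
```

With the input `a() { x; } b() { y; }`, the greedy pattern yields a single match spanning both functions, while the reluctant one yields two.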

Deleted:
  /trunk/src/main/java/org/apache/rat/pd/engines/BasicCodeSearchParser.java
Modified:
  /trunk/src/main/java/org/apache/rat/pd/core/PdCommandLine.java
  /trunk/src/main/java/org/apache/rat/pd/core/PlagiarismDetector.java
  /trunk/src/main/java/org/apache/rat/pd/core/SourceCodeAnalyser.java
  /trunk/src/main/java/org/apache/rat/pd/engines/ISearchEngine.java
  /trunk/src/main/java/org/apache/rat/pd/engines/Managable.java
  /trunk/src/main/java/org/apache/rat/pd/engines/RetryManager.java
  /trunk/src/main/java/org/apache/rat/pd/engines/SearchResult.java
  /trunk/src/main/java/org/apache/rat/pd/engines/google/GoogleCodeSearchParser.java
  /trunk/src/main/java/org/apache/rat/pd/engines/google/MultilineRegexGenerator.java
  /trunk/src/main/java/org/apache/rat/pd/engines/google/RegexGenerator.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/BruteForceHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/HeuristicCheckerResult.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/IHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/ActionScriptCommentHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/CCommentHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/CPPCommentHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/CSharpCommentHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/CobolCommentHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/ColdFusionCommentsheuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/CommentHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/DelphiCommentHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/FortranCommentHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/HTMLCommentHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/JavaCommentHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/JavaScriptCommentHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/LispCommentHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/PHPCommentHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/PascalCommentHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/PerlCommentHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/PythonCommentHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/RubyCommentHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/SQLCommentHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/ShellCommentHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/VisualBasicCommentHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/functions/ActionScriptFunctionHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/functions/CFunctionHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/functions/CPPFunctionHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/functions/CSharpFunctionHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/functions/DelphiFunctionHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/functions/FortranFunctionHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/functions/FunctionHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/functions/JavaFunctionHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/functions/JavaScriptFunctionHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/functions/PHPFunctionHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/functions/PascalFunctionHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/functions/VisualBasicFunctionHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/misspellings/Dictionary.java
  /trunk/src/main/java/org/apache/rat/pd/heuristic/misspellings/MisspellingsHeuristicChecker.java
  /trunk/src/main/java/org/apache/rat/pd/report/HtmlReportGenerator.java
  /trunk/src/main/java/org/apache/rat/pd/report/ReportEntry.java
  /trunk/src/test/java/org/apache/rat/pd/engines/google/RegexGeneratorTest.java
  /trunk/src/test/java/org/apache/rat/pd/heuristic/functions/CFunctionHeuristicCheckerTest.java
  /trunk/src/test/java/org/apache/rat/pd/heuristic/functions/JavaFunctionHeuristicCheckerTest.java
  /trunk/src/test/java/org/apache/rat/pd/heuristic/functions/PascalFunctionHeuristicCheckerTest.java

=======================================
--- /trunk/src/main/java/org/apache/rat/pd/engines/BasicCodeSearchParser.java	Wed May 13 15:37:57 2009
+++ /dev/null
@@ -1,59 +0,0 @@
-/*
- *
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *   http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing,
- * software distributed under the License is distributed on an
- * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- * KIND, either express or implied.  See the License for the
- * specific language governing permissions and limitations
- * under the License.
- *
- */
-package org.apache.rat.pd.engines;
-
-
-import java.io.BufferedReader;
-import java.io.IOException;
-import java.io.InputStream;
-import java.io.InputStreamReader;
-import java.net.MalformedURLException;
-import java.net.URL;
-import java.net.URLConnection;
-
-public class BasicCodeSearchParser {
-	/**
-	 *  just reads url
-	 */
-	public String getPage(String query) {
-		String result = "";
-		try {
-			URL url = new URL(query);
-			URLConnection uc = url.openConnection();
-			InputStream content = (InputStream) uc.getInputStream();
-			BufferedReader in = new BufferedReader(new InputStreamReader(
-					content));
-			String line;
-			while ((line = in.readLine()) != null) {
-				result = result + line;
-				// System.out.println(line);
-			}
-			in.close();
-		} catch (MalformedURLException e) {
-			System.out.println("Invalid URL");
-			e.printStackTrace();
-		} catch (IOException e) {
-			System.out.println("IOException");
-		}
-		return result;
-	}
-
-}
=======================================
--- /trunk/src/main/java/org/apache/rat/pd/core/PdCommandLine.java	Sat Aug  8 08:59:03 2009
+++ /trunk/src/main/java/org/apache/rat/pd/core/PdCommandLine.java	Sun Aug 16 16:43:54 2009
@@ -74,7 +74,7 @@
  			+ "\n\t.xml  - report will be xml document"
  			+ "\n\t.html - report will be html document"
  			+ "\n\t.txt  - report will be txt document"
-			+ "\nDefault is report.txt.";
+			+ "\nDefault is report.html.";
  	private static final String HELP_OPT_DESC = "print this message";
  	private static final String VERBOSE_OPT_SHORT = "v";
  	private static final String LANGUAGE_OPT_SHORT = "lang";
@@ -85,10 +85,10 @@
  	private static final String CODEBASE_OPT_SHORT = "cb";
  	private static final String REPORT_OPT_SHORT = "r";
  	private static final String HELP_OPT_SHORT = "h";
-	public static final int DEFAULT_LIMIT = 75;
+	public static final int DEFAULT_LIMIT = 10;
  	public static final String DEFAULT_HEURISTIC = "BruteForceHeuristicChecker";
  	public static final String DEFAULT_ENGINE = "GoogleCodeSearchParser";
-	public static final String DEFAULT_REPORT = "report.txt";
+	public static final String DEFAULT_REPORT = "report.html";
  	private static final String DEFAULT_HEURISTIC_MSG = "Heuristic is not set.In use is default checker: ";
  	private static final String DEFAULT_ENGINE_MSG = "Engines is not set.In use is default engine: ";
  	private static final String DEFAULT_LIMIT_MSG = "Limit is not set.In use is default limit: ";
@@ -171,6 +171,8 @@
  	 *
  	 * @param args
  	 *            command line arguments
+	 * @param out
+	 *            output stream for displaying current information
  	 */
  	public PdCommandLine(String[] args, PrintStream out) {
  		this.out = out;
@@ -283,7 +285,8 @@
  	 *
  	 * @param line
  	 *            CommandLine
-	 * @throws Exception
+	 * @throws NumberFormatException
+	 *             , IllegalArgumentException
  	 */
  	private void parseCommandLineArguments(CommandLine line)
  			throws NumberFormatException, IllegalArgumentException {
@@ -330,18 +333,31 @@
  		}
  	}

+	/**
+	 * @return limit property
+	 */
  	public int getLimit() {
  		return this.limit;
  	}

+	/**
+	 * @return verbose flag
+	 */
  	public boolean isVerbose() {
  		return this.verbose;
  	}

+	/**
+	 * @return filename of the report
+	 */
  	public String getReport() {
  		return this.report;
  	}

+	/**
+	 * @return format of report, PdCommandLine.HTML_MODE, PdCommandLine.XML_MODE
+	 *         or PdCommandLine.TEXT_MODE
+	 */
  	public String getReportFormat() {
  		String toRet = getReport() != null ? getReport().replaceFirst(".*\\.",
  				"") : "";
@@ -355,18 +371,30 @@
  		return toRet;
  	}

+	/**
+	 * @return path to code base
+	 */
  	public String getCodebase() {
  		return this.codebase;
  	}

+	/**
+	 * @return list of engine names
+	 */
  	public List<String> getEngines() {
  		return this.engines;
  	}

+	/**
+	 * @return list of heuristic names
+	 */
  	public List<String> getHeuristics() {
  		return this.heuristics;
  	}

+	/**
+	 * @return programming language name
+	 */
  	public String getLanguage() {
  		return this.language;
  	}
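
The `getReportFormat()` hunk above derives the report format from the file name with `replaceFirst(".*\\.", "")`. A minimal sketch of that one expression follows; the class and method names are hypothetical, and the later mapping onto the `PdCommandLine` mode constants is elided in the diff, so it is not reproduced here.

```java
// Hypothetical helper; only the replaceFirst expression mirrors the diff.
final class ReportFormatSketch {

    // The greedy ".*\\." consumes everything up to and including the LAST dot,
    // so what remains is the file extension. A name without a dot is
    // returned unchanged because the pattern does not match.
    static String extensionOf(String reportName) {
        return reportName != null ? reportName.replaceFirst(".*\\.", "") : "";
    }
}
```

Here greediness is what makes the expression work: a reluctant `.*?\\.` would stop at the first dot and leave `report.xml` for an input such as `my.report.xml`.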
=======================================
--- /trunk/src/main/java/org/apache/rat/pd/core/PlagiarismDetector.java	Sat Aug  8 08:59:03 2009
+++ /trunk/src/main/java/org/apache/rat/pd/core/PlagiarismDetector.java	Sun Aug 16 16:43:54 2009
@@ -86,12 +86,16 @@
   */
  public class PlagiarismDetector {

+	// stream for printing current information
+	private static PrintStream out;
  	/**
  	 * @param args
  	 * @throws RatReportFailedException
  	 */
  	public static void main(String[] args) throws RatReportFailedException,
  			IOException {
+		//instance of PlagiarismDetector
+		PlagiarismDetector pd = new PlagiarismDetector();

  		PdCommandLine pdCommandLine = new PdCommandLine(args, System.out);
  		if (!pdCommandLine.isAllArgumentsCorrect()) {
@@ -99,20 +103,20 @@
  			System.exit(1);
  		}
  		// stream for printing current information
-		final PrintStream out = getProperPrintStream(pdCommandLine);
+		out = pd.getProperPrintStream(pdCommandLine);
  		// we add all rules we are interested in
  		final List<IHeuristicChecker> algorithmsForChecking =
-			configureHeuristicCheckers( pdCommandLine, out);
+			pd.configureHeuristicCheckers( pdCommandLine);
  		// making one or more parsers
-		final List<ISearchEngine> searchEngines = configureSearchEngines(
-				pdCommandLine, getLanguage(pdCommandLine,
-						algorithmsForChecking, out), out);
+		final List<ISearchEngine> searchEngines = pd.configureSearchEngines(
+				pdCommandLine, pd.getLanguage(pdCommandLine,
+						algorithmsForChecking));

  		// we check whole directory
  		final DirectoryWalker walker = new DirectoryWalker(new File(
  				pdCommandLine.getCodebase()));
  		// report document
-		Report reportDocument = configureReport(pdCommandLine);
+		Report reportDocument = pd.configureReport(pdCommandLine);

  		// analyzer is sliding-window checker
  		final SourceCodeAnalyser analyser = new SourceCodeAnalyser(
@@ -123,7 +127,7 @@

  		walker.run(plagiarismDetectorReport);
  		// Saving generated report
-		saveReport(pdCommandLine.getReport(), reportDocument.getReport());
+		pd.saveReport(pdCommandLine.getReport(), reportDocument.getReport());
  	}

  	/**
@@ -132,15 +136,15 @@
  	 * @param pdCommandLine command line object to read report format from
  	 * @return
  	 */
-	private static Report configureReport(PdCommandLine pdCommandLine) {
+	private  Report configureReport(PdCommandLine pdCommandLine) {
  		Report reportDocument;
-		if (pdCommandLine.getReportFormat().equals(PdCommandLine.HTML_MODE)) {
-			reportDocument = new HtmlReportGenerator();
+		if (pdCommandLine.getReportFormat().equals(PdCommandLine.TEXT_MODE)) {
+			reportDocument = new TxtReportGenerator();
  		} else if (pdCommandLine.getReportFormat().equals(
  				PdCommandLine.XML_MODE)) {
  			reportDocument = new XmlReportGenerator();
  		} else {
-			reportDocument = new TxtReportGenerator();
+			reportDocument = new HtmlReportGenerator();
  		}
  		return reportDocument;
  	}
@@ -151,17 +155,16 @@
  	 *
  	 * @param pdCommandLine command line object to read engines from
  	 * @param language name of programming language
-	 * @param out stream where output messages are printed
-	 * @return
+	 * @return list of chosen search engine parsers
  	 */
-	private static List<ISearchEngine> configureSearchEngines(
-			PdCommandLine pdCommandLine, String language, PrintStream out) {
+	private  List<ISearchEngine> configureSearchEngines(
+			PdCommandLine pdCommandLine, String language) {
  		List<ISearchEngine> toRet = new ArrayList<ISearchEngine>();
  		for (String searchEngine : pdCommandLine.getEngines()) {
  			if (searchEngine.equals("GoogleCodeSearchParser")) {
  				toRet.add(new GoogleCodeSearchParser(language, out));
  			}
-			// TODO add more parsers
+			// add more parsers here(koders and krugle)
  		}
  		return toRet;
  	}
@@ -170,12 +173,12 @@
  	 * This method can create objects defined from command line heuristic
  	 * checker arguments
  	 *
-	 * @param pdCommandLine
-	 * @return
+	 * @param pdCommandLine command line object to read engines from
+	 * @return list of chosen heuristic checkers
  	 * @throws IOException
  	 */
-	private static List<IHeuristicChecker> configureHeuristicCheckers(
-			PdCommandLine pdCommandLine, PrintStream out) throws IOException {
+	private  List<IHeuristicChecker> configureHeuristicCheckers(
+			PdCommandLine pdCommandLine) throws IOException {
  		List<IHeuristicChecker> toRet = new ArrayList<IHeuristicChecker>();

  		for (String heuristicChecker : pdCommandLine.getHeuristics()) {
@@ -311,7 +314,7 @@
  	 * @param raportName file name
  	 * @param reportContent content of report file
  	 */
-	private static void saveReport(String raportName, String reportContent) {
+	private  void saveReport(String raportName, String reportContent) {
  		try {
  			FileManipulator.saveFile(raportName, reportContent);
  		} catch (FileNotFoundException e) {
@@ -322,13 +325,12 @@
  	}

  	/**
-	 * @param algorithmsForChecking
-	 * @param out
+	 * @param algorithmsForChecking list of heuristic checkers
  	 * @return Language of source code. It depends by selected heuristic
  	 *         checkers.
  	 */
-	private static String getLanguageFromHeuristic(
-			List<IHeuristicChecker> algorithmsForChecking, PrintStream out) {
+	private  String getLanguageFromHeuristic(
+			List<IHeuristicChecker> algorithmsForChecking) {
  		String toRet = IHeuristicChecker.ALL_LANGUAGES;
  		Set<String> langs = new TreeSet<String>();
  		for (IHeuristicChecker heuristicChecker : algorithmsForChecking) {
@@ -345,17 +347,16 @@
  	 * Decides which programming language will be used.
  	 *
  	 * @param commandLine commandLine object to read language information from
-	 * @param algorithmsForChecking
-	 * @param out
+	 * @param algorithmsForChecking list of heuristic checkers
  	 * @return programming language name
  	 */
-	private static String getLanguage(PdCommandLine commandLine,
-			List<IHeuristicChecker> algorithmsForChecking, PrintStream out) {
+	private  String getLanguage(PdCommandLine commandLine,
+			List<IHeuristicChecker> algorithmsForChecking) {
  		String toRet = IHeuristicChecker.ALL_LANGUAGES;
  		if (commandLine.getLanguage() != null) {
  			toRet = commandLine.getLanguage();
  		} else {
-			toRet = getLanguageFromHeuristic(algorithmsForChecking, out);
+			toRet = getLanguageFromHeuristic(algorithmsForChecking);
  		}
  		return toRet;
  	}
@@ -366,9 +367,9 @@
  	 * output stream.
  	 *
  	 * @param commandLine commandLine object to read verbose information from
-	 * @return
+	 * @return chosen output stream. If -v is not specified, all information will not be shown
  	 */
-	private static PrintStream getProperPrintStream(PdCommandLine pdCommandLine) {
+	private  PrintStream getProperPrintStream(PdCommandLine pdCommandLine) {
  		return pdCommandLine.isVerbose() ? System.out : new PrintStream(new OutputStream() {
  			@Override
  		    public void write(int b) {} // Do nothing when printing is requested.
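
The `getProperPrintStream` hunk above picks between `System.out` and a stream that discards everything. A self-contained sketch of the same idea is given here; the class and method names are hypothetical, and a plain boolean stands in for `pdCommandLine.isVerbose()`.

```java
import java.io.OutputStream;
import java.io.PrintStream;

// Hypothetical stand-alone version of the verbose/quiet stream selection.
final class QuietStreamSketch {

    // Returns System.out when verbose, otherwise a PrintStream whose
    // underlying OutputStream silently drops every byte written to it.
    static PrintStream select(boolean verbose) {
        if (verbose) {
            return System.out;
        }
        return new PrintStream(new OutputStream() {
            @Override
            public void write(int b) {
                // Intentionally empty: output is discarded.
            }
        });
    }
}
```

Callers can then print unconditionally, for example `QuietStreamSketch.select(false).println("hidden")`, without checking the verbose flag at every call site.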
=======================================
--- /trunk/src/main/java/org/apache/rat/pd/core/SourceCodeAnalyser.java	Sat Aug  8 08:59:03 2009
+++ /trunk/src/main/java/org/apache/rat/pd/core/SourceCodeAnalyser.java	Sun Aug 16 16:43:54 2009
@@ -49,12 +49,22 @@
  	private final List<ISearchEngine> searchEngines;
  	private final List<IHeuristicChecker> algorithmsForChecking;
  	private final Report reportDocument;
-
+
  	private final PrintStream out;
  	private PauseListener pauseListener;
  	private volatile boolean paused = false;
  	private int progressMessageLength;

+	/**
+	 * @param searchEngines
+	 *            list of chosen search engine parsers
+	 * @param algorithmsForChecking
+	 *            list of chosen heuristic algorithms
+	 * @param reportDocument
+	 *            chosen report
+	 * @param out
+	 *            printStream wnere current information are be printed
+	 */
  	public SourceCodeAnalyser(List<ISearchEngine> searchEngines,
  			List<IHeuristicChecker> algorithmsForChecking,
  			Report reportDocument, PrintStream out) {
@@ -69,13 +79,12 @@
  	/**
  	 * initialize key listener for pause/resume
  	 */
-	private void initializeKeyListener(){
+	private void initializeKeyListener() {
  		this.pauseListener = new PauseListener() {
  			@Override
  			public void onPause() {
  				System.out.println("apache-rat-pd is paused...");
-				System.out
-						.println("to resume application, pres Enter");
+				System.out.println("to resume application, pres Enter");
  				SourceCodeAnalyser.this.paused = true;
  			}

@@ -86,29 +95,36 @@
  			}

  		};
-		//to ensure that this process will end together with the program
+		// to ensure that this process will end together with the program
  		pauseListener.setDaemon(true);
  		pauseListener.start();
-
-	}
-
+
+	}
+
  	/**
  	 * this method will make algorithm to sleep until flag paused == true
  	 */
  	private void pauseIfNeeded() {
  		while (paused)
  			try {
-				Thread.currentThread().sleep(1000);
+				Thread.sleep(1000);
  			} catch (InterruptedException e) {
  				e.printStackTrace();
  			}
  	}

+	/*
+	 * (non-Javadoc)
+	 *
+	 * @see
+	 * org.apache.rat.document.IDocumentAnalyser#analyse(org.apache.rat.document
+	 * .IDocument)
+	 */
  	@Override
  	public void analyse(IDocument document) throws RatDocumentAnalysisException {
  		try {

-			System.out.println(document.getName());
+			System.out.println("\n");
  			String code = readFile(document.reader());

  			checkOneFile(searchEngines, code, algorithmsForChecking, document
@@ -118,9 +134,15 @@
  		}
  	}

-	// TODO encoding is system default now!!!!
+	/**
+	 * @param reader
+	 *            object which can read a file
+	 * @return content of file
+	 * @throws IOException
+	 */
  	private String readFile(Reader reader) throws IOException {
  		String toret = "";
+		// TODO encoding is system default now!!!!
  		BufferedReader input = new BufferedReader(reader);
  		try {
  			String line = null;
@@ -136,11 +158,16 @@
  	/**
  	 * Implementation of sliding-window algorithm.
  	 *
-	 * @param searchEngine
+	 * @param searchEngines
+	 *            list of chosen search engine parsers
  	 * @param code
+	 *            code part to be checked
  	 * @param heuristicCheckers
+	 *            list of chosen heuristic checkers
+	 * @param fileName
+	 *            name of file we are currently checking
  	 */
-	public void checkOneFile(List<ISearchEngine> searchEngine, String code,
+	public void checkOneFile(List<ISearchEngine> searchEngines, String code,
  			List<IHeuristicChecker> heuristicCheckers, String fileName) {

  		String[] tokens = tokeniseString(code);
@@ -156,13 +183,13 @@

  				StringBuffer toCheck = combineTokens(tokens, i, j);
  				j++;
-				printProgress(tokens.length, j);
+				printProgress(tokens.length, j, fileName);

  				HeuristicCheckerResult heurResult = isCheckOnInternetNeaded(
  						heuristicCheckers, toCheck.toString());
  				if (heurResult.isCheckOnInternetNeaded()) {

-					ReportEntry lastEntry = searchOnInternet(searchEngine,
+					ReportEntry lastEntry = searchOnInternet(searchEngines,
  							heurResult.getCodeSuggestedToBeChecked(), fileName);
  					if (lastEntry != null) {
  						raportEntry = lastEntry;
@@ -173,7 +200,6 @@
  				}
  			}
  			i = j;
-			// print if there is something found
  			if (raportEntry != null) {
  				printReport(raportEntry);
  				reportDocument.addReportEntry(raportEntry);
@@ -190,9 +216,8 @@
  	 */
  	private void printReport(ReportEntry reportEntry) {
  		out.println("chunk of code copied:");
-		//out.println(reportEntry.getUrl());
  		out.println("--------------------------------------------");
-		out.println(reportEntry.getCode().toString());
+		out.println(reportEntry.getCode());
  		out.println("--------------------------------------------");
  	}

@@ -201,8 +226,12 @@
  	 * returned Otherwise, null will be returned.
  	 *
  	 * @param searchEngines
-	 * @param replace
-	 * @return
+	 *            list of chosen search engine parsers
+	 * @param code
+	 *            code to be checked
+	 * @param fileName
+	 *            source file name
+	 * @return an report entry
  	 */
  	private ReportEntry searchOnInternet(List<ISearchEngine> searchEngines,
  			String code, String fileName) {
@@ -221,11 +250,12 @@

  	/**
  	 * @param heuristicCheckers
+	 *            lit of chosen heuristic checkers
  	 * @param toCheck
+	 *            code to be checked
  	 *
-	 *            then in sliding window algorithm we use this checkers in part
-	 *            of algorithm where we know current window content
-	 * @return
+	 * @return heuristicCheckerResult with proper information for sliding window
+	 *         algorithm
  	 */
  	private HeuristicCheckerResult isCheckOnInternetNeaded(
  			List<IHeuristicChecker> heuristicCheckers, String toCheck) {
@@ -245,10 +275,10 @@
  	/**
  	 * Append tokens from start position until end.
  	 *
-	 * @param tokens
-	 * @param start
-	 * @param end
-	 * @return
+	 * @param tokens list of tokens to combine
+	 * @param start position
+	 * @param end position
+	 * @return appended tokens
  	 */
  	private StringBuffer combineTokens(String[] tokens, int start, int end) {

@@ -263,10 +293,13 @@

  	/**
  	 * extract tokens
+	 *
+	 * @param code to be tokenised
+	 * @return list of words
  	 */
-	private String[] tokeniseString(String file) {
-		file = file.replaceAll("\\n", "\n ");
-		String[] tokens = file.split(STRING_DELIMETER_REGEX);
+	private String[] tokeniseString(String code) {
+		code = code.replaceAll("\\n", "\n ");
+		String[] tokens = code.split(STRING_DELIMETER_REGEX);
  		// this simple tokeniser returns array {""} when "" is tokenised
  		// I must avoid that behavior
  		if (tokens.length == 1 && tokens[0].equals(""))
@@ -277,21 +310,23 @@
  	/**
  	 * This method just prints in console information about current progress.
  	 *<p>
-	 * Example: Progress: 2/200 (1%)
+	 * Example:Analyzing file : c:\HelloWorld.java Progress: 2/200 (1%)
  	 *
  	 * @param whole
  	 *            is number of all tokens we iterate
  	 * @param current
  	 *            is current position of iteration
+	 * @param fileName
+	 *            is file name of source file
  	 */
-	private void printProgress(int whole, int current) {
+	private void printProgress(int whole, int current, String fileName) {
  		// clear previous state
  		for (int i = 0; i < progressMessageLength; i++) {
  			System.out.print("\b");
  		}

-		String message = "Progress: " + current + "/" + whole + " ("
-				+ (current * 100 / whole) + "%)";
+		String message = "Analyzing file: " + fileName + " Progress: "
+				+ current + "/" + whole + " (" + (current * 100 / whole) + "%)";
  		this.progressMessageLength = message.length();
  		System.out.print(message);
  	}
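
The reworked `printProgress` above erases the previous message with backspaces and then prints the file name together with the token progress. The sketch below separates the two string computations so they can be checked in isolation. The names are hypothetical, and the guard against a zero token count is an addition that does not appear in the diff.

```java
// Hypothetical helpers mirroring the string building in printProgress.
final class ProgressSketch {

    // Builds the progress line. Integer division truncates, so 2 of 200
    // tokens reports 1%. The whole <= 0 guard is an assumption added here
    // to avoid a division by zero; the diff has no such check.
    static String progressMessage(String fileName, int current, int whole) {
        int percent = whole > 0 ? current * 100 / whole : 0;
        return "Analyzing file: " + fileName + " Progress: "
                + current + "/" + whole + " (" + percent + "%)";
    }

    // Produces one backspace per character of the previous message, which
    // moves the console cursor back so the next message overwrites it.
    static String eraser(int previousLength) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < previousLength; i++) {
            sb.append('\b');
        }
        return sb.toString();
    }
}
```

Printing `eraser(previous.length())` followed by the new message reproduces the in-place update, provided the console honours the backspace character.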
=======================================
--- /trunk/src/main/java/org/apache/rat/pd/engines/ISearchEngine.java	Sat Aug  8 08:59:03 2009
+++ /trunk/src/main/java/org/apache/rat/pd/engines/ISearchEngine.java	Sun Aug 16 16:43:54 2009
@@ -22,15 +22,13 @@

  import java.util.List;

-
  /**
- * @author maka: this is interface which must implement all code search engine
- *         parsers in this program
+ * This is interface which must implement all code search engine parsers in this
+ * program.
+ *
+ * @author maka:
   */
  public interface ISearchEngine {
-	// TODO maybe to change this interface according to
-	// API of google code search engine:
-	// http://code.google.com/intl/en/apis/codesearch/

  	/**
  	 * Checks if code part exist on this code search engine
@@ -41,9 +39,11 @@
  	boolean isCodeFound(String posibleCutAndPastedCode);

  	/**
-	 * This method can returnsearch results with link where we can see exactly what is found
+	 * This method can return search results with link where we can see exactly
+	 * what is found
  	 *
-	 * @return list of SearchResults for report if code is found on search engine
+	 * @return list of SearchResults for report if code is found on search
+	 *         engine
  	 */
  	List<SearchResult> getSearchResults();
  }
=======================================
--- /trunk/src/main/java/org/apache/rat/pd/engines/Managable.java	Wed Jun 10 15:44:45 2009
+++ /trunk/src/main/java/org/apache/rat/pd/engines/Managable.java	Sun Aug 16 16:43:54 2009
@@ -28,7 +28,7 @@
   * This interface define one function to implement. This function can retrieve
   * information about potentially plagiarised code from search engine.
   *
- * @author Maka
+ * @author maka
   *
   */
  public interface Managable {
=======================================
--- /trunk/src/main/java/org/apache/rat/pd/engines/RetryManager.java	Sat Aug  1 05:56:47 2009
+++ /trunk/src/main/java/org/apache/rat/pd/engines/RetryManager.java	Sun Aug 16 16:43:54 2009
@@ -29,14 +29,13 @@
   *
   * This is separate class because all ISearchEngine implementations will use it.
   *
- * @author Maka
+ * @author maka
   *
   */
  public class RetryManager {
  	private final Managable searchengine;
  	private final int numberOfRetry;
  	private final int timeout;
-	private final PrintStream out;

  	/**
  	 * @param searchengine
@@ -46,11 +45,10 @@
  	 * @param timeout
  	 *            time between two requests
  	 */
-	public RetryManager(Managable searchengine, int numberOfRetry, int timeout, PrintStream out) {
+	public RetryManager(Managable searchengine, int numberOfRetry, int timeout) {
  		this.searchengine = searchengine;
  		this.timeout = timeout;
  		this.numberOfRetry = numberOfRetry;
-		this.out = out;
  	}

  	/**
=======================================
--- /trunk/src/main/java/org/apache/rat/pd/engines/SearchResult.java	Sat Aug  8 08:59:03 2009
+++ /trunk/src/main/java/org/apache/rat/pd/engines/SearchResult.java	Sun Aug 16 16:43:54 2009
@@ -28,8 +28,6 @@
   * about query and retrieved result.
   */
  public class SearchResult {
-	// unique identifier of class - projectName+package name+class name
-	// private String id;
  	// project name/title
  	private String projectName;
  	// class name or filename
@@ -54,114 +52,188 @@
  	// package name
  	private String packageName;

+	/**
+	 * @return package name of source code file
+	 */
  	public String getPackageName() {
  		return packageName;
  	}

+	/**
+	 * Set package name of source code file
+	 *
+	 * @param packageName
+	 *            package name of source code file
+	 */
  	public void setPackageName(String packageName) {
  		this.packageName = packageName;
  	}

-	public String getId() {
-		return this.projectName + ":" + this.packageName + this.className;
-	}
-
+	/**
+	 * Set license of source code
+	 *
+	 * @param licence
+	 *            license of source code
+	 */
  	public void setLicence(String licence) {
  		this.licence = licence;
  	}

+	/**
+	 * @return license of source code
+	 */
  	public String getLicence() {
  		return licence;
  	}

+	/**
+	 * Set programming language of source code
+	 *
+	 * @param language
+	 *            programming language of source code
+	 */
  	public void setLanguage(String language) {
  		this.language = language;
  	}

+	/**
+	 * @return programming language of source code
+	 */
  	public String getLanguage() {
  		return language;
  	}

+	/**
+	 * Set name of code owner
+	 *
+	 * @param owner
+	 *            name of code owner
+	 */
  	public void setOwner(String owner) {
  		this.owner = owner;
  	}

+	/**
+	 * @return name of code owner
+	 */
  	public String getOwner() {
  		return owner;
  	}

+	/**
+	 * Set project name
+	 *
+	 * @param projectName
+	 */
  	public void setProjectName(String projectName) {
  		this.projectName = projectName;
  	}

+	/**
+	 * @return project name
+	 */
  	public String getProjectName() {
  		return projectName;
  	}

+	/**
+	 * Set possible plagiarised code
+	 *
+	 * @param codeForQuery
+	 */
  	public void setCodeForQuery(String codeForQuery) {
  		this.codeForQuery = codeForQuery;
  	}

+	/**
+	 * @return possible plagiarised code
+	 */
  	public String getCodeForQuery() {
  		return codeForQuery;
  	}

+	/**
+	 * Set list of matched code parts
+	 *
+	 * @param matchedCode
+	 *            list of matched code parts
+	 */
  	public void setMatchedCode(List<String> matchedCode) {
  		this.matchedCode = matchedCode;
  	}

+	/**
+	 * @return list of matched code parts
+	 */
  	public List<String> getMatchedCode() {
  		return matchedCode;
  	}

+	/**
+	 * Set source file name
+	 *
+	 * @param className
+	 *            source file name
+	 */
  	public void setClassName(String className) {
  		this.className = className;
  	}

+	/**
+	 * @return source file name
+	 */
  	public String getClassName() {
  		return className;
  	}

+	/**
+	 * Set code in class
+	 *
+	 * @param codeInClass
+	 *            raw code in class
+	 */
  	public void setCodeInClass(String codeInClass) {
  		this.codeInClass = codeInClass;
  	}

+	/**
+	 * @return code in class
+	 */
  	public String getCodeInClass() {
  		return codeInClass;
  	}

+	/**
+	 * Set URL to site where code is found
+	 *
+	 * @param link
+	 *            URL to site where code is found
+	 */
  	public void setLink(String link) {
  		this.link = link;
  	}

+	/**
+	 * @return URL to site where code is found
+	 */
  	public String getLink() {
  		return link;
  	}

+	/**
+	 * Set name of CodeSearchEngine
+	 *
+	 * @param engine
+	 *            name of CodeSearchEngine
+	 */
  	public void setEngine(String engine) {
  		this.engine = engine;
  	}
-
-	public String getEngine() {
-		return engine;
-	}

  	/**
-	 * Idea is: when comparing results, result with more matching code is
-	 * greater that another one.
-	 *
-	 * @param o
-	 * @return
+	 * @return name of CodeSearchEngine
  	 */
-	public int compareTo(SearchResult o) {
-		int first = 0;
-		int second = 0;
-		for (String match : this.matchedCode) {
-			first += match.length();
-		}
-		for (String match : o.getMatchedCode()) {
-			second += match.length();
-		}
-		return Integer.valueOf(first).compareTo(Integer.valueOf(second));
+	public String getEngine() {
+		return engine;
  	}
  }
=======================================
--- /trunk/src/main/java/org/apache/rat/pd/engines/google/GoogleCodeSearchParser.java	Sat Aug  8 08:59:03 2009
+++ /trunk/src/main/java/org/apache/rat/pd/engines/google/GoogleCodeSearchParser.java	Sun Aug 16 16:43:54 2009
@@ -22,7 +22,6 @@

  import java.io.IOException;
  import java.io.PrintStream;
-import java.io.PrintWriter;
  import java.net.MalformedURLException;
  import java.net.URL;
  import java.net.URLEncoder;
@@ -45,8 +44,7 @@
  import com.google.gdata.util.ServiceException;

  /**
- * This class is for communication between search engine and this program It is
- * now very basic....
+ * This class is for communication between search engine and this program.
   *
   * @author maka:
   *
@@ -93,11 +91,14 @@

  		/**
  		 * Decide are strings similar using various algorithms for string
-		 * similarity
+		 * similarity. If similarity is found based on any type of checking,
+		 * results will be true.
  		 *
-		 * @param myFeed
+		 * @param searchResults
+		 *            list of results for our query
  		 * @param posibleCutAndPastedCode
-		 * @return
+		 *            code currently being checked
+		 * @return true if any match is found
  		 */
  		public boolean isMatchFound(List<SearchResult> searchResults,
  				String posibleCutAndPastedCode) {
@@ -125,8 +126,11 @@
  		 * Make decision are strings similar by calculation of string distance
  		 *
  		 * @param source
+		 *            source code
  		 * @param toMatch
-		 * @return
+		 *            code to compare against
+		 * @return true if the code parts are similar according to the
+		 *         Levenshtein distance algorithm
  		 */
  		boolean isMatchingByDistance(String source, String toMatch) {

@@ -144,8 +148,11 @@
  		 * Make decision are strings similar by matching by regular expression
  		 *
  		 * @param source
+		 *            source code
  		 * @param toMatch
-		 * @return
+		 *            code to compare against
+		 * @return true if the code parts are similar according to
+		 *         regular expression matching
  		 */
  		boolean isMatchingByRegex(String source, String toMatch) {

@@ -169,8 +176,7 @@
  	public GoogleCodeSearchParser(String language, PrintStream out) {
  		this.language = language;
  		this.out = out;
-		this.retryManager = new RetryManager(this, NUMBER_OF_RETRY, WAIT_TIME,
-				out);
+		this.retryManager = new RetryManager(this, NUMBER_OF_RETRY, WAIT_TIME);
  	}

  	private String getLanguageQuery() {
@@ -181,11 +187,23 @@
  		return toret;
  	}

+	/*
+	 * (non-Javadoc)
+	 *
+	 * @see org.apache.rat.pd.engines.ISearchEngine#getSearchResults()
+	 */
  	@Override
  	public List<SearchResult> getSearchResults() {
  		return searchResults;
  	}

+	/**
+	 * @param posibleCutAndPastedCode
+	 *            code for checking
+	 * @return proper URL for gdata-codesearch API
+	 * @throws MalformedURLException
+	 */
+	@SuppressWarnings("deprecation")
  	// FIXME URLEncoding is system dependent now
  	private URL createUrl(String posibleCutAndPastedCode)
  			throws MalformedURLException {
@@ -195,17 +213,29 @@
  				+ URLEncoder.encode(regexGenerator
  						.stringToRegex(posibleCutAndPastedCode))
  				+ "&max-results=" + RESULT_NUMBER);
+
  		out.println(posibleCutAndPastedCode + " -> "
  				+ regexGenerator.stringToRegex(posibleCutAndPastedCode));
  		return toret;
  	}

+	/*
+	 * (non-Javadoc)
+	 *
+	 * @see
+	 * org.apache.rat.pd.engines.ISearchEngine#isCodeFound(java.lang.String)
+	 */
  	@Override
  	public boolean isCodeFound(String posibleCutAndPastedCode) {
  		searchResults.clear();
  		return retryManager.isCodeFound(posibleCutAndPastedCode);
  	}

+	/*
+	 * (non-Javadoc)
+	 *
+	 * @see org.apache.rat.pd.engines.Managable#gueryEngine(java.lang.String)
+	 */
  	@Override
  	public boolean gueryEngine(String posibleCutAndPastedCode)
  			throws IOException, ServiceException {
@@ -245,8 +275,10 @@
  	 * length. For GoogleCodeSearch length is 1024.
  	 *
  	 * @param posibleCutAndPastedCode
+	 *            code to be checked
  	 * @param length
-	 * @return
+	 *            maximum length of query which code search engine can manage
+	 * @return list of URLs not longer than {@code length}
  	 * @throws IOException
  	 */
  	List<URL> splitLongUrl(String posibleCutAndPastedCode, int length)
@@ -305,15 +337,19 @@
  	 * for debugging
  	 *
  	 * @param myFeed
+	 *            GoogleCodeSearch feed
  	 * @param entry
+	 *            Google CodeSearchEntry
  	 * @param out
+	 *            print stream for printing current information
  	 * @throws Exception
  	 */
  	private void printAdditionalInformation(CodeSearchFeed myFeed,
-			CodeSearchEntry entry, PrintStream out) {
+			CodeSearchEntry entry) {
  		out.println("\tgetEtag: " + entry.getEtag());
-		out.println("\tgetExtensionLocalName: "
-				+ entry.getExtensionLocalName());
+		out
+				.println("\tgetExtensionLocalName: "
+						+ entry.getExtensionLocalName());
  		out.println("\tgetId: " + entry.getId());
  		out.println("\tgetVersionId: " + entry.getVersionId());
  		out.println("\tgetCategories: " + entry.getCategories().size());
@@ -336,6 +372,16 @@
  		}
  	}

+	/**
+	 * Parse the GoogleCodeSearch feed and pack the retrieved information into
+	 * the form used by this program.
+	 *
+	 * @param myFeed
+	 *            GoogleCodeSearch feed
+	 * @param posibleCutAndPastedCode
+	 *            code to be checked
+	 * @return list of search results retrieved from GoogleCodeSearch feed
+	 */
  	private List<SearchResult> createSearchResutl(CodeSearchFeed myFeed,
  			String posibleCutAndPastedCode) {
  		List<SearchResult> toRet = new ArrayList<SearchResult>();
@@ -359,8 +405,8 @@
  						m.getLineText().getPlainText());
  			}
  			toRet.add(searchResult);
-
-			printAdditionalInformation(myFeed, entry, out);
+
+			printAdditionalInformation(myFeed, entry);
  		}

  		return toRet;
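
The FIXME on createUrl above notes that the one-argument URLEncoder.encode depends on the platform default encoding. A minimal sketch of one way to remove that dependency, assuming UTF-8 is acceptable to the search service. The class and method names (QueryEncoder, encodeQuery) are illustrative and are not part of this revision:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of the commit.
class QueryEncoder {
    /** Encodes a query fragment with an explicit charset instead of the platform default. */
    static String encodeQuery(String regex) {
        try {
            // The two-argument overload is not deprecated and gives the same
            // result on every platform.
            return URLEncoder.encode(regex, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a charset every Java platform is required to support.
            throw new IllegalStateException(e);
        }
    }
}
```

With an explicit charset the @SuppressWarnings("deprecation") annotation would presumably no longer be needed on createUrl.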
=======================================
--- /trunk/src/main/java/org/apache/rat/pd/engines/google/MultilineRegexGenerator.java	Tue Jul  7 17:06:24 2009
+++ /trunk/src/main/java/org/apache/rat/pd/engines/google/MultilineRegexGenerator.java	Sun Aug 16 16:43:54 2009
@@ -35,6 +35,9 @@
  	private static final String LINE_END_REGEX = "$";
  	private static final String BLANK_MARK_FOR_URL = " ";

+	/* (non-Javadoc)
+	 * @see org.apache.rat.pd.engines.google.RegexGenerator#stringToRegex(java.lang.String)
+	 */
  	@Override
  	public String stringToRegex(String sourceCode) {
  		String toret = "";
@@ -49,10 +52,6 @@
  				toret += ZERO_OR_MORE_BLANK_SPACE + LINE_END_REGEX
  						+ BLANK_MARK_FOR_URL;
  			}
-			// TODO BLANK_MARK_FOR_URL between each two line regex is only way
-			// to make
-			// GoogleCodeSearch to work.
-			// Maybe there is more elegant way to do it.
  		}
  		return toret;
  	}
@@ -60,8 +59,8 @@
  	/**
  	 * Exclude blank lines from array.
  	 *
-	 * @param lines
-	 * @return
+	 * @param lines list of lines of code
+	 * @return array of code lines without empty ones
  	 */
  	private String[] excludeBlankLines(String[] lines) {
  		int lentgh = lines.length;
=======================================
--- /trunk/src/main/java/org/apache/rat/pd/engines/google/RegexGenerator.java	Sun Jul  5 17:15:14 2009
+++ /trunk/src/main/java/org/apache/rat/pd/engines/google/RegexGenerator.java	Sun Aug 16 16:43:54 2009
@@ -20,6 +20,17 @@
   */
  package org.apache.rat.pd.engines.google;

+/**
+ * Generates regular expressions in POSIX extended regular expression syntax.
+ * This class produces regular expressions that GoogleCodeSearch understands
+ * from the provided source code. It replaces reserved symbols of POSIX
+ * extended regular expression syntax with proper regular expressions, and
+ * then replaces whitespace with a regex so that the search is insensitive to
+ * whitespace and new line characters.
+ *
+ * @author Maka
+ *
+ */
  public class RegexGenerator {
  	// special signs in regex
  	protected final static String L_PARENTHESIS = "\\(";
@@ -37,7 +48,7 @@
  	protected final static String MINUS_SIGN = "\\-";
  	protected final static String VERTICAL_LINE = "\\|";
  	protected final static String BACK_SLASH = "\\";
-	//non literals
+	// non literals
  	protected final static String LES_THEN = "<";
  	protected final static String GREAT_THEN = ">";
  	protected final static String SEMICOLOMN = ";";
@@ -60,51 +71,75 @@
  	protected final static String ZERO_OR_MORE_BLANK_SPACE_REGEX = "(\\\\s?)+";

  	protected final static String[] regExpresions = { L_PARENTHESIS,
-			R_PARENTHESIS, L_SQUARE_BRACKET, R_SQUARE_BRACKET, L_CURLY_BRACKET,
-			R_CURLY_BRACKET, DOT, QUESTION_MARK, ASTERIX, POWER, DOLLAR_SIGN,
-			PLUS_SIGN, MINUS_SIGN, VERTICAL_LINE, LES_THEN, GREAT_THEN,
-			SEMICOLOMN, COLON, EQUALS, SLASH, DOUBLE_QUOTES};
+			DOLLAR_SIGN, R_PARENTHESIS, L_SQUARE_BRACKET, R_SQUARE_BRACKET,
+			L_CURLY_BRACKET, R_CURLY_BRACKET, DOT, QUESTION_MARK, ASTERIX,
+			POWER, PLUS_SIGN, MINUS_SIGN, VERTICAL_LINE, LES_THEN, GREAT_THEN,
+			SEMICOLOMN, COLON, EQUALS, SLASH, DOUBLE_QUOTES };

  	/**
  	 * Returns regular expression of source code
  	 *
  	 * @param sourceCode
-	 * @return
+	 *            code from which regex are generated
+	 * @return regex generated from source code
  	 */
  	public String stringToRegex(String sourceCode) {
-		String toret = sourceCode;
-		// add whitespace before any non-character
-		toret = addWhitespaceAroundNonLiteral(toret);
-		toret = replaceNonLiteralWithProperRegex(toret);
-		toret = replaceWhitespaceWithRegex(toret);
-		return toret;
+		// add whitespace before and after any non-character
+		sourceCode = addWhitespaceAroundNonLiteral(sourceCode);
+		// replace non literal with proper regex
+		sourceCode = replaceNonLiteralWithProperRegex(sourceCode);
+		// replace blank spaces with proper regex
+		sourceCode = replaceWhitespaceWithRegex(sourceCode);
+		return sourceCode;
  	}

-	protected String replaceWhitespaceWithRegex(String toret) {
-		toret = toret.replaceAll(" ", ZERO_OR_MORE_BLANK_SPACE_REGEX);
-		return toret;
+	/**
+	 * Replace each blank space with the whitespace-matching regex
+	 * @param code
+	 *            string containing blank spaces
+	 * @return string with every blank space replaced by the whitespace regex
+	 */
+	protected String replaceWhitespaceWithRegex(String code) {
+		code = code.replaceAll(" ", ZERO_OR_MORE_BLANK_SPACE_REGEX);
+		return code;
  	}

-	protected String replaceNonLiteralWithProperRegex(String toret) {
+	/**
+	 * Replace reserved symbols in POSIX extended regular expression syntax
+	 * with proper regular expressions
+	 *
+	 * @param code
+	 *            string with symbols listed in RegexGenerator.regExpresions
+	 *            array
+	 * @return string with regular expressions
+	 */
+	protected String replaceNonLiteralWithProperRegex(String code) {
  		for (String regex : regExpresions) {
-			toret = toret.replaceAll(regex, BACK_SLASH + regex);
-		}
-		return toret;
+			if (!regex.equals(DOLLAR_SIGN)) {
+				code = code.replaceAll(regex, BACK_SLASH + regex);
+			} else {
+				// '$' starts a group reference in a replaceAll replacement
+				// string, so the literal String.replace is used here
+				code = code.replace("$", "\\$");
+			}
+		}
+		return code;
  	}

  	/**
-	 * add whitespace around non-literal
+	 * Add whitespace around non-literal
  	 * <p>
  	 * for example: "main(String[] args)" will be "main ( String [ ] args )"
  	 *
-	 * @param toret
-	 * @return
+	 * @param code
+	 *            string
+	 * @return string with whitespace added around non-literal
  	 */
-	protected String addWhitespaceAroundNonLiteral(String toret) {
+	protected String addWhitespaceAroundNonLiteral(String code) {
  		for (String regex : regExpresions) {
-			toret = toret.replaceAll(regex, BLANK_SPACE + regex + BLANK_SPACE);
-		}
-		toret = toret.replaceAll(ONE_OR_MORE_BLANK_SPACE, BLANK_SPACE);
-		return toret.trim();
+			code = code.replaceAll(regex, BLANK_SPACE + regex + BLANK_SPACE);
+		}
+		code = code.replaceAll(ONE_OR_MORE_BLANK_SPACE, BLANK_SPACE);
+		return code.trim();
  	}
  }
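
The special case for DOLLAR_SIGN in replaceNonLiteralWithProperRegex can be explained by how String.replaceAll reads its replacement argument: a backslash escapes the next character and '$' introduces a group reference. The sketch below assumes DOLLAR_SIGN is "\\$" (its definition is not shown in this excerpt) and BACK_SLASH is "\\", so that BACK_SLASH + regex ends in a dangling '$'. The class and method names are hypothetical and not part of this revision:

```java
import java.util.regex.Matcher;

// Hypothetical demo class, not part of the commit.
class DollarEscapeDemo {
    /** Reproduces the failure: the replacement is an escaped backslash followed by a bare '$'. */
    static boolean rawReplacementThrows(String code) {
        try {
            code.replaceAll("\\$", "\\\\$");
            return false; // no '$' in the input, so the replacement is never evaluated
        } catch (RuntimeException e) {
            return true; // the trailing '$' is parsed as an incomplete group reference
        }
    }

    /** Regex-based alternative: quoteReplacement makes the replacement fully literal. */
    static String escapeDollar(String code) {
        return code.replaceAll("\\$", Matcher.quoteReplacement("\\$"));
    }

    /** The literal (non-regex) replace used in this revision. */
    static String escapeDollarLiteral(String code) {
        return code.replace("$", "\\$");
    }
}
```

Both escapeDollar and escapeDollarLiteral put a single backslash in front of every '$'. The literal replace is the simpler of the two because neither argument is interpreted as a pattern.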
=======================================
--- /trunk/src/main/java/org/apache/rat/pd/heuristic/BruteForceHeuristicChecker.java	Sat Jul 25 03:08:02 2009
+++ /trunk/src/main/java/org/apache/rat/pd/heuristic/BruteForceHeuristicChecker.java	Sun Aug 16 16:43:54 2009
@@ -49,6 +49,9 @@
  				true, codeToBeChecked, this.getClass());
  	}

+	/* (non-Javadoc)
+	 * @see org.apache.rat.pd.heuristic.IHeuristicChecker#getLanguage()
+	 */
  	@Override
  	public String getLanguage() {
  		return this.language;
=======================================
--- /trunk/src/main/java/org/apache/rat/pd/heuristic/HeuristicCheckerResult.java	Sun Jul 12 17:28:44 2009
+++ /trunk/src/main/java/org/apache/rat/pd/heuristic/HeuristicCheckerResult.java	Sun Aug 16 16:43:54 2009
@@ -28,20 +28,6 @@
   *
   */
  public class HeuristicCheckerResult {
-
-	/**
-	 * @param canContinue
-	 * @param codeSugestedToCheck
-	 * @param foundBy
-	 */
-	public HeuristicCheckerResult(final boolean checkOnInternetNeaded,
-			final boolean shouldStretch, final String codeSugsestedToCheck,
-			final Class<? extends IHeuristicChecker> foundBy) {
-		this.checkOnInternetNeaded = checkOnInternetNeaded;
-		this.shouldStretch = shouldStretch;
-		this.codeSugsestedTobeChecked = codeSugsestedToCheck;
-		this.foundBy = foundBy;
-	}

  	/**
  	 * This parameter will be set as true if check on Internet is needed.
@@ -66,18 +52,49 @@
  	 */
  	private final String codeSugsestedTobeChecked;

+	/**
+	 * @param checkOnInternetNeaded
+	 *            whether a search on the Internet is needed
+	 * @param shouldStretch
+	 *            whether the sliding algorithm should continue to stretch
+	 * @param codeSugsestedToCheck
+	 *            code suggested for check by particular heuristic checker
+	 * @param foundBy
+	 *            class of the heuristic checker that found this code part
+	 */
+	public HeuristicCheckerResult(final boolean checkOnInternetNeaded,
+			final boolean shouldStretch, final String codeSugsestedToCheck,
+			final Class<? extends IHeuristicChecker> foundBy) {
+		this.checkOnInternetNeaded = checkOnInternetNeaded;
+		this.shouldStretch = shouldStretch;
+		this.codeSugsestedTobeChecked = codeSugsestedToCheck;
+		this.foundBy = foundBy;
+	}
+
+	/**
+	 * @return code suggested for check by particular heuristic checker
+	 */
  	public String getCodeSuggestedToBeChecked() {
  		return codeSugsestedTobeChecked;
  	}

+	/**
+	 * @return whether the sliding algorithm should continue to stretch
+	 */
  	public boolean isShouldStretch() {
  		return shouldStretch;
  	}

+	/**
+	 * @return class of the heuristic checker that found this code part
+	 */
  	public Class<? extends IHeuristicChecker> getFoundBy() {
  		return foundBy;
  	}

+	/**
+	 * @return whether a search on the Internet is needed
+	 */
  	public boolean isCheckOnInternetNeaded() {
  		return checkOnInternetNeaded;
  	}
=======================================
--- /trunk/src/main/java/org/apache/rat/pd/heuristic/IHeuristicChecker.java	Sat Jul 25 03:08:02 2009
+++ /trunk/src/main/java/org/apache/rat/pd/heuristic/IHeuristicChecker.java	Sun Aug 16 16:43:54 2009
@@ -36,8 +36,8 @@
  	 * This method knows is codeToBeChecked part of good-to-be-copied code. If
  	 * it is, method will return true in checkOnInternetNeaded property of HeuristicCheckerResult.
  	 *
-	 * @param codeToBeChecked
-	 * @return HeuristicCheckerResult
+	 * @param codeToBeChecked possible good-to-be-copied code
+	 * @return HeuristicCheckerResult information generated for sliding window algorithm
  	 */
  	HeuristicCheckerResult checkByHeuristic(String codeToBeChecked);

=======================================
--- /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/ActionScriptCommentHeuristicChecker.java	Sat Aug  1 05:56:47 2009
+++ /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/ActionScriptCommentHeuristicChecker.java	Sun Aug 16 16:43:54 2009
@@ -32,7 +32,7 @@
   * http://www.adobe.com/support/flash/action_scripts/actionscript_dictionary/
   * actionscript_dictionary023.html
   *
- * @author Maka
+ * @author maka
   *
   */
  public class ActionScriptCommentHeuristicChecker extends
@@ -41,8 +41,12 @@
  	/**
  	 * This regular expression match comments in ActionScript.
  	 */
-	private static final String ACTION_SCRIPT_COMMENT_REGEX = "(/\\*(?:[^*]|(?:\\*+[^*/]))*\\*+/)|(//.*[\\n\\r])";
-
+	private static final String ACTION_SCRIPT_COMMENT_REGEX = "(/\\*[\\s\\S]*\\*+/[\\n\\r]*)|(//.*[\\n\\r])";
+
+	/**
+	 * @param limit minimal length of comment which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public ActionScriptCommentHeuristicChecker(int limit, PrintStream out) {
  		super(ACTION_SCRIPT_COMMENT_REGEX, limit, "actionscript", out);
  	}
=======================================
--- /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/CCommentHeuristicChecker.java	Sat Aug  8 08:59:03 2009
+++ /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/CCommentHeuristicChecker.java	Sun Aug 16 16:43:54 2009
@@ -28,7 +28,7 @@
   * and ends with * /. More info on:
   * http://www.psgd.org/paul/docs/cstyle/cstyle03.htm
   *
- * @author maka:
+ * @author maka
   *
   */
  public class CCommentHeuristicChecker extends CommentHeuristicChecker {
@@ -38,12 +38,13 @@
  	 * http://ostermiller.org/findcomment.html
  	 *
  	 */
-//	private static final String C_COMMENT_REGEX = "/\\*(?:[^*]|(?:\\*+[^*/]))*\\*+/[\\n\\r]*";
  	private static final String C_COMMENT_REGEX = "/\\*[\\s\\S]*\\*+/[\\n\\r]*";
-
-
-	public CCommentHeuristicChecker(int limit,PrintStream out) {
-		super(C_COMMENT_REGEX, limit, "c",out);
+	/**
+	 * @param limit minimal length of comment which will be considered
+	 * @param out print stream for logging purposes
+	 */
+	public CCommentHeuristicChecker(int limit, PrintStream out) {
+		super(C_COMMENT_REGEX, limit, "c", out);

  	}

=======================================
--- /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/CPPCommentHeuristicChecker.java	Sat Aug  8 08:59:03 2009
+++ /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/CPPCommentHeuristicChecker.java	Sun Aug 16 16:43:54 2009
@@ -29,17 +29,19 @@
   * comments, without text before or after it. More info on:
   * http://www.functionx.com/cpp/references/comments.htm
   *
- * @author maka:
+ * @author maka
   */
  public class CPPCommentHeuristicChecker extends CommentHeuristicChecker {
  	/**
  	 * This regular expression match comments in C++. More info on:{@link}
  	 * http://ostermiller.org/findcomment.html
  	 */
-	// private static final String CPP_COMMENT_REGEX =
-	// "(/\\*(?:[^*]|(?:\\*+[^*/]))*\\*+/)|(//.*[\\n\\r])";
  	private static final String CPP_COMMENT_REGEX = "(/\\*[\\s\\S]*\\*+/[\\n\\r]*)|(//.*[\\n\\r])";

+	/**
+	 * @param limit minimal length of comment which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public CPPCommentHeuristicChecker(int limit, PrintStream out) {
  		super(CPP_COMMENT_REGEX, limit, "c++", out);
  	}
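
The block-comment patterns in this revision replace the nested alternation with a single [\s\S]* repetition. One property of that form may be worth keeping in mind: the quantifier is greedy, so when the input holds two block comments a single match runs from the first opening /* to the last closing */. The sketch below compares the pattern from this revision with a reluctant variant ([\s\S]*?). The reluctant variant is an assumption offered for comparison and is not part of the commit; CommentRegexDemo and findAll are hypothetical names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the commit.
class CommentRegexDemo {
    // Block-comment pattern used in this revision (greedy).
    static final String GREEDY = "/\\*[\\s\\S]*\\*+/[\\n\\r]*";
    // Assumed alternative: the reluctant quantifier stops at the first closing "*/".
    static final String RELUCTANT = "/\\*[\\s\\S]*?\\*+/[\\n\\r]*";

    /** Returns every match of regex in source, in order of appearance. */
    static List<String> findAll(String regex, String source) {
        List<String> found = new ArrayList<String>();
        Matcher m = Pattern.compile(regex).matcher(source);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }
}
```

For the input "/* a */\nint x;\n/* b */\n" the greedy pattern yields one match covering the whole string, including the code between the comments, while the reluctant pattern yields the two comments separately.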
=======================================
--- /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/CSharpCommentHeuristicChecker.java	Sat Aug  1 05:56:47 2009
+++ /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/CSharpCommentHeuristicChecker.java	Sun Aug 16 16:43:54 2009
@@ -27,7 +27,7 @@
   * line ends; second starts with /* and ends with * /. More info on:
   * http://www.csharpfriends.com/Articles/getArticle.aspx?articleID=336#4
   *
- * @author maka:
+ * @author maka
   */
  public class CSharpCommentHeuristicChecker extends CommentHeuristicChecker {

@@ -35,8 +35,12 @@
  	 * This regular expression match comments in C#. More info on: {@link}
  	 * http://ostermiller.org/findcomment.html
  	 */
-	private static final String C_SHARP_COMMENT_REGEX = "(/\\*(?:[^*]|(?:\\*+[^*/]))*\\*+/)|(//.*[\\n\\r])";
-
+	private static final String C_SHARP_COMMENT_REGEX = "(/\\*[\\s\\S]*\\*+/[\\n\\r]*)";
+
+	/**
+	 * @param limit minimal length of comment which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public CSharpCommentHeuristicChecker(int limit, PrintStream out) {
  		super(C_SHARP_COMMENT_REGEX, limit, "c#", out);
  	}
=======================================
--- /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/CobolCommentHeuristicChecker.java	Sat Aug  1 05:56:47 2009
+++ /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/CobolCommentHeuristicChecker.java	Sun Aug 16 16:43:54 2009
@@ -23,7 +23,6 @@

  import java.io.PrintStream;

-//TODO check if it really have digits for first 6 position
  /**
   * Comments in Cobol are defined with an asterisk at seventh character position
   * and ends when line ends, while the first six character positions are reserved
@@ -40,6 +39,10 @@
  	 */
  	private static final String COBOL_COMMENT_REGEX = "^\\d{6}\\*.*[\\n\\r]";

+	/**
+	 * @param limit minimal length of comment which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public CobolCommentHeuristicChecker(int limit, PrintStream out) {
  		super(COBOL_COMMENT_REGEX, limit, "cobol", out);

=======================================
--- /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/ColdFusionCommentsheuristicChecker.java	Sat Aug  1 05:56:47 2009
+++ /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/ColdFusionCommentsheuristicChecker.java	Sun Aug 16 16:43:54 2009
@@ -29,16 +29,20 @@
   * coldfusion/7/htmldocs/wwhelp/wwhimpl/common/html/wwhelp
   * .htm?context=ColdFusion_Documentation&file=00000868.htm
   *
- * @author Maka
+ * @author maka
   *
   */
  public class ColdFusionCommentsheuristicChecker extends CommentHeuristicChecker {
-	// TODO Should inner comments be found?
+
  	/**
  	 * This regular expression match comments in ColdFusion.
  	 */
  	private static final String COLD_FUSION_COMMENT_REGEX = "<![ \\r\\n\\t]*(---([^\\-]|[\\r\\n]|-[^\\-])*---[ \\r\\n\\t]*)>";

+	/**
+	 * @param limit minimal length of comment which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public ColdFusionCommentsheuristicChecker(int limit, PrintStream out) {
  		super(COLD_FUSION_COMMENT_REGEX, limit, "coldfusion", out);

=======================================
--- /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/CommentHeuristicChecker.java	Sat Aug  1 05:56:47 2009
+++ /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/CommentHeuristicChecker.java	Sun Aug 16 16:43:54 2009
@@ -40,6 +40,12 @@
  	private final String language;
  	private final PrintStream out;

+	/**
+	 * @param regex regular expression to match comment in current language
+	 * @param limit minimal length of comment which will be considered
+	 * @param language programming language
+	 * @param out print stream for logging purposes
+	 */
  	public CommentHeuristicChecker(String regex, int limit, String language, PrintStream out) {
  		this.COMMENT_REGEX = regex;
  		this.limit = limit;
@@ -77,6 +83,9 @@
  		return toret;
  	}

+	/* (non-Javadoc)
+	 * @see org.apache.rat.pd.heuristic.IHeuristicChecker#getLanguage()
+	 */
  	@Override
  	public String getLanguage() {
  		return this.language;
=======================================
--- /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/DelphiCommentHeuristicChecker.java	Sat Aug  1 05:56:47 2009
+++ /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/DelphiCommentHeuristicChecker.java	Sun Aug 16 16:43:54 2009
@@ -37,8 +37,12 @@
  	/**
  	 * This regular expression match comments in Delphi.
  	 */
-	private static final String DELPHI_COMMENT_REGEX = "(//.*[\\n\\r])|(\\(\\*(?:[^*]|(?:\\*+[^*\\(]))*\\*+\\))|(\\{(?:[^*]|(?:\\{+[^*]))*\\}+)";
-
+	private static final String DELPHI_COMMENT_REGEX = "(\\{[\\s\\S]*\\}[\\n\\r]*)|(\\(\\*[\\s\\S]*\\*\\))|(//.*[\\n\\r])";
+
+	/**
+	 * @param limit minimal length of comment which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public DelphiCommentHeuristicChecker(int limit, PrintStream out) {
  		super(DELPHI_COMMENT_REGEX, limit, "pascal", out);
  	}
=======================================
--- /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/FortranCommentHeuristicChecker.java	Sat Aug  1 05:56:47 2009
+++ /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/FortranCommentHeuristicChecker.java	Sun Aug 16 16:43:54 2009
@@ -35,10 +35,12 @@
  	/**
  	 * This regular expression match comments in Fortran.
  	 */
-	// TODO check it again if there has to be space between C and rest of the
-	// comment
  	private static final String FORTRAN_COMMENT_REGEX = "(!.*[\\n\\r])|(^C .*[\\n\\r])|(^c .*[\\n\\r])|(^\\* .*[\\n\\r])";

+	/**
+	 * @param limit minimal length of comment which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public FortranCommentHeuristicChecker(int limit, PrintStream out) {
  		super(FORTRAN_COMMENT_REGEX, limit, "fortran", out);
  	}
=======================================
--- /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/HTMLCommentHeuristicChecker.java	Sat Aug  1 05:56:47 2009
+++ /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/HTMLCommentHeuristicChecker.java	Sun Aug 16 16:43:54 2009
@@ -31,7 +31,7 @@
   * contain any occurrence of "--". More info about comments in HTML on:
   * http://htmlhelp.com/reference/wilbur/misc/comment.html
   *
- * @author maka:
+ * @author maka
   */
  public class HTMLCommentHeuristicChecker extends CommentHeuristicChecker {

@@ -42,6 +42,10 @@
  	 */
  	private static final String HTML_COMMENT_REGEX = "(<!((--([^\\-]|[\\r\\n]|-[^\\-])*--(\\s)*))*>)";

+	/**
+	 * @param limit minimal length of comment which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public HTMLCommentHeuristicChecker(int limit, PrintStream out) {
  		super(HTML_COMMENT_REGEX, limit, IHeuristicChecker.ALL_LANGUAGES, out);
  	}
=======================================
--- /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/JavaCommentHeuristicChecker.java	Sat Aug  8 08:59:03 2009
+++ /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/JavaCommentHeuristicChecker.java	Sun Aug 16 16:43:54 2009
@@ -27,21 +27,19 @@
   * ends; second starts with /* and ends with * /. More info on:
   * http://hanuska.blogspot.com/2006/08/comments-in-java-code.html
   *
- * @author maka:
+ * @author maka
   */
  public class JavaCommentHeuristicChecker extends CommentHeuristicChecker {

  	/**
-	 * This regular expression match comments in Java. More info on:{@link}
-	 * http://ostermiller.org/findcomment.html
-	 */
-	//private static final String JAVA_COMMENT_REGEX = "(/\\*(?:[^*]|(?:\\*+[^*/]))*\\*+/[\\n\\r]*)|(//.*[\\n\\r])";
-
-	/**
-	 * by using this second regular expression StackOverflowError is avoided
+	 * This regular expression match comments in Java.
  	 */
  	private static final String JAVA_COMMENT_REGEX = "(/\\*[\\s\\S]*\\*+/[\\n\\r]*)|(//.*[\\n\\r])";

+	/**
+	 * @param limit minimal length of comment which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public JavaCommentHeuristicChecker(int limit, PrintStream out) {
  		super(JAVA_COMMENT_REGEX, limit, "java", out);

=======================================
--- /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/JavaScriptCommentHeuristicChecker.java	Sat Aug  1 05:56:47 2009
+++ /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/JavaScriptCommentHeuristicChecker.java	Sun Aug 16 16:43:54 2009
@@ -27,15 +27,19 @@
   * ends; second starts with /* and ends with * /. More info on:
   * http://www.techotopia.com/index.php/Comments_in_JavaScript
   *
- * @author maka:
+ * @author maka
   */
  public class JavaScriptCommentHeuristicChecker extends CommentHeuristicChecker {

  	/**
  	 * This regular expression match comments in JavaScript.
  	 */
-	private static final String JAVA_SCRIPT_COMMENT_REGEX = "(/\\*(?:[^*]|(?:\\*+[^*/]))*\\*+/)|(//.*[\\n\\r])";
-
+	private static final String JAVA_SCRIPT_COMMENT_REGEX = "(/\\*[\\s\\S]*\\*+/[\\n\\r]*)|(//.*[\\n\\r])";
+
+	/**
+	 * @param limit minimal length of comment which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public JavaScriptCommentHeuristicChecker(int limit, PrintStream out) {
  		super(JAVA_SCRIPT_COMMENT_REGEX, limit, "javascript", out);
  	}
=======================================
--- /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/LispCommentHeuristicChecker.java	Sat Aug  1 05:56:47 2009
+++ /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/LispCommentHeuristicChecker.java	Sun Aug 16 16:43:54 2009
@@ -36,8 +36,12 @@
  	/**
  	 * This regular expression matches comments in Lisp.
  	 */
-	private static final String LISP_COMMENT_REGEX = "(#\\|(?:[^*]|(?:\\|+[^(\\|#)]))*\\|+#)|(;.*[\\n\\r])";
-
+	private static final String LISP_COMMENT_REGEX = "(#\\|[\\s\\S]*\\|#)|(;.*[\\n\\r])";
+
+	/**
+	 * @param limit minimal length of comment which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public LispCommentHeuristicChecker(int limit, PrintStream out) {
  		super(LISP_COMMENT_REGEX, limit, "lisp", out);
  	}
=======================================
--- /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/PHPCommentHeuristicChecker.java	Sat Aug  1 05:56:47 2009
+++ /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/PHPCommentHeuristicChecker.java	Sun Aug 16 16:43:54 2009
@@ -37,8 +37,12 @@
  	/**
  	 * This regular expression match comments in PHP.
  	 */
-	private static final String PHP_COMMENT_REGEX = "(#.*[\\n\\r])|((/\\*(?:[^*]|(?:\\*+[^*/]))*\\*+/)|(//.*[\\n\\r]))";
-
+	private static final String PHP_COMMENT_REGEX = "(#.*[\\n\\r])|(/\\*[\\s\\S]*\\*+/[\\n\\r]*)|(//.*[\\n\\r])";
+
+	/**
+	 * @param limit minimal length of comment which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public PHPCommentHeuristicChecker(int limit, PrintStream out) {
  		super(PHP_COMMENT_REGEX, limit, "php", out);
  	}
=======================================
--- /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/PascalCommentHeuristicChecker.java	Sat Aug  1 05:56:47 2009
+++ /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/PascalCommentHeuristicChecker.java	Sun Aug 16 16:43:54 2009
@@ -40,6 +40,10 @@
  	 */
  	private static final String PASCAL_COMMENT_REGEX = "(\\{.*\\})|(\\{.*\\*\\))|(\\(\\*.*\\*\\))|(\\(\\*.*\\})";

+	/**
+	 * @param limit minimal length of comment which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public PascalCommentHeuristicChecker(int limit, PrintStream out) {
  		super(PASCAL_COMMENT_REGEX, limit, "pascal", out);
  	}
=======================================
--- /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/PerlCommentHeuristicChecker.java	Sat Aug  1 05:56:47 2009
+++ /trunk/src/main/java/org/apache/rat/pd/heuristic/comment/PerlCommentHeuristicChecker.java	Sun Aug 16 16:43:54 2009
@@ -39,6 +39,10 @@
  	 */
  	private static final String PERL_COMMENT_REGEX = "(#.*[\\n\\r])";

+	/**
+	 * @param limit minimal length of comment which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public PerlCommentHeuristicChecker(int limit, PrintStream out) {
  		super(PERL_COMMENT_REGEX, limit, "perl", out);
  	}
=======================================
---  
/trunk/src/main/java/org/apache/rat/pd/heuristic/comment/PythonCommentHeuristicChecker.java	 
Sat Aug  1 05:56:47 2009
+++  
/trunk/src/main/java/org/apache/rat/pd/heuristic/comment/PythonCommentHeuristicChecker.java	 
Sun Aug 16 16:43:54 2009
@@ -28,15 +28,19 @@
   * block comments. More info on:
   * http://mail.python.org/pipermail/tutor/2004-February/028432.html
   *
- * @author maka:
+ * @author maka
   */
  public class PythonCommentHeuristicChecker extends CommentHeuristicChecker  
{

  	/**
  	 * This regular expression match comments in Python.
  	 */
-	private static final String PYTHON_COMMENT_REGEX = "(#.*[\\n\\r])|(\"\"\"[ [\\n\\r](\\w)]*\"\"\")";
-
+	private static final String PYTHON_COMMENT_REGEX = "(#.*[\\n\\r])|(\"\"\"[\\s\\S]*\"\"\")";
+
+	/**
+	 * @param limit minimal length of comment which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public PythonCommentHeuristicChecker(int limit, PrintStream out) {
  		super(PYTHON_COMMENT_REGEX, limit, "python", out);
  	}
=======================================
---  
/trunk/src/main/java/org/apache/rat/pd/heuristic/comment/RubyCommentHeuristicChecker.java	 
Sat Aug  1 05:56:47 2009
+++  
/trunk/src/main/java/org/apache/rat/pd/heuristic/comment/RubyCommentHeuristicChecker.java	 
Sun Aug 16 16:43:54 2009
@@ -26,7 +26,7 @@
   * There is only one type of comments in Ruby. Ruby uses comments that  
start
   * with a #, and continue till the end of the line.
   *
- * @author maka:
+ * @author maka
   */
  public class RubyCommentHeuristicChecker extends CommentHeuristicChecker {

@@ -37,6 +37,10 @@
  	 */
  	private static final String RUBY_COMMENT_REGEX = "(#.*[\\n\\r])";

+	/**
+	 * @param limit minimal length of comment which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public RubyCommentHeuristicChecker(int limit, PrintStream out) {
  		super(RUBY_COMMENT_REGEX, limit, "ruby", out);
  	}
=======================================
---  
/trunk/src/main/java/org/apache/rat/pd/heuristic/comment/SQLCommentHeuristicChecker.java	 
Sat Aug  1 05:56:47 2009
+++  
/trunk/src/main/java/org/apache/rat/pd/heuristic/comment/SQLCommentHeuristicChecker.java	 
Sun Aug 16 16:43:54 2009
@@ -27,15 +27,19 @@
   * with a / * and ends with * /. More info on: http://www.redbooks.ibm.
   * com/pubs/html/as400/v4r5/ic2924/index.htm?info/db2/rbafzmst67.htm
   *
- * @author Maka
+ * @author maka
   *
   */
  public class SQLCommentHeuristicChecker extends CommentHeuristicChecker {
  	/**
  	 * This regular expression match comments in SQL.
  	 */
-	private static final String SQL_COMMENT_REGEX = "(/\\*(?:[^*]|(?:\\*+[^*/]))*\\*+/)|(--.*[\\n\\r])";
-
+	private static final String SQL_COMMENT_REGEX = "(/\\*[\\s\\S]*\\*+/[\\n\\r]*)|(--.*[\\n\\r])";
+
+	/**
+	 * @param limit minimal length of comment which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public SQLCommentHeuristicChecker(int limit, PrintStream out) {
  		super(SQL_COMMENT_REGEX, limit, "sql", out);
  	}
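
A note on the block-comment patterns changed above: the log message says the old alternation-based patterns were replaced to avoid StackOverflowError on long comments. The replacement uses a greedy [\s\S]*, so when a fragment holds two block comments the match runs from the first opening delimiter to the last closing one. The sketch below compares the committed SQL pattern with a reluctant variant. RELUCTANT_REGEX, the class name and the helper are illustrative assumptions and are not part of this changeset.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BlockCommentDemo {
    // New SQL_COMMENT_REGEX value, copied from the diff above.
    static final String COMMIT_REGEX = "(/\\*[\\s\\S]*\\*+/[\\n\\r]*)|(--.*[\\n\\r])";

    // Hypothetical variant (not in the commit): a reluctant quantifier
    // stops at the first closing delimiter instead of the last one.
    static final String RELUCTANT_REGEX = "(/\\*[\\s\\S]*?\\*+/[\\n\\r]*)|(--.*[\\n\\r])";

    /** Returns the first match of regex in code, or null if there is none. */
    static String firstMatch(String regex, String code) {
        Matcher m = Pattern.compile(regex).matcher(code);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String code = "/* a */ x = 1; /* b */\n";
        // Greedy: both comments and the statement between them are one match.
        System.out.println("commit:    " + firstMatch(COMMIT_REGEX, code));
        // Reluctant: only the first comment is matched.
        System.out.println("reluctant: " + firstMatch(RELUCTANT_REGEX, code));
    }
}
```

Whether the merged match matters depends on how CommentHeuristicChecker uses the result, which is not shown in this part of the diff.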
=======================================
---  
/trunk/src/main/java/org/apache/rat/pd/heuristic/comment/ShellCommentHeuristicChecker.java	 
Sat Aug  1 05:56:47 2009
+++  
/trunk/src/main/java/org/apache/rat/pd/heuristic/comment/ShellCommentHeuristicChecker.java	 
Sun Aug 16 16:43:54 2009
@@ -26,7 +26,7 @@
   * There is only one type of comments in Shell. Shell uses comments that  
start
   * with a #, and continue till the end of the line.
   *
- * @author maka:
+ * @author maka
   */
  public class ShellCommentHeuristicChecker extends CommentHeuristicChecker {

@@ -37,6 +37,10 @@
  	 */
  	private static final String SHELL_COMMENT_REGEX = "(#.*[\\n\\r])";

+	/**
+	 * @param limit minimal length of comment which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public ShellCommentHeuristicChecker(int limit, PrintStream out) {
  		super(SHELL_COMMENT_REGEX, limit, "shell", out);
  	}
=======================================
---  
/trunk/src/main/java/org/apache/rat/pd/heuristic/comment/VisualBasicCommentHeuristicChecker.java	 
Sat Aug  1 05:56:47 2009
+++  
/trunk/src/main/java/org/apache/rat/pd/heuristic/comment/VisualBasicCommentHeuristicChecker.java	 
Sun Aug 16 16:43:54 2009
@@ -25,11 +25,11 @@
  import org.apache.rat.pd.heuristic.IHeuristicChecker;

  /**
- * There is onew type of comments in Visual Basic. It starts with a ' end  
last
+ * There is one type of comments in Visual Basic. It starts with a ' end  
last
   * till the end of a line. More info on:
   * http://www.aivosto.com/vbtips/codedoc.html
   *
- * @author Maka
+ * @author maka
   *
   */
  public class VisualBasicCommentHeuristicChecker extends  
CommentHeuristicChecker {
@@ -38,6 +38,10 @@
  	 */
  	private static final String VISUAL_BASIC_COMMENT_REGEX  
= "(\\s*\\'.*[\\n\\r])+";

+	/**
+	 * @param limit minimal length of comment which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public VisualBasicCommentHeuristicChecker(int limit, PrintStream out) {
  		super(VISUAL_BASIC_COMMENT_REGEX, limit,
  				IHeuristicChecker.ALL_LANGUAGES, out);
=======================================
---  
/trunk/src/main/java/org/apache/rat/pd/heuristic/functions/ActionScriptFunctionHeuristicChecker.java	 
Sat Aug  1 05:56:47 2009
+++  
/trunk/src/main/java/org/apache/rat/pd/heuristic/functions/ActionScriptFunctionHeuristicChecker.java	 
Sun Aug 16 16:43:54 2009
@@ -47,8 +47,12 @@
  	 * <p>
  	 * More info on {@link} http://en.wikipedia.org/wiki/ActionScript
  	 */
-	private static final String ACTION_SCRIPT_FUNCTION_REGEX = "function +\\w+ *\\([\\s\\S]*\\)(( *\\: *\\w+)){0,1}\\s*\\{[\\s\\S]*\\}[\n\r]*";
-
+	private static final String ACTION_SCRIPT_FUNCTION_REGEX = "function +\\w+ *\\([^()]*?\\)(( *\\: *\\w+)){0,1}\\s*\\{[\\s\\S]*\\}[\n\r]*";
+
+	/**
+	 * @param limit minimal length of function which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public ActionScriptFunctionHeuristicChecker(int limit, PrintStream out) {
  		super(limit, ACTION_SCRIPT_FUNCTION_REGEX,
  				ACTION_SCRIPT_OPENED_BRACKET,  
ACTION_SCRIPT_CLOSED_BRACKET, "actionscript", out);
=======================================
---  
/trunk/src/main/java/org/apache/rat/pd/heuristic/functions/CFunctionHeuristicChecker.java	 
Sat Aug  8 08:59:03 2009
+++  
/trunk/src/main/java/org/apache/rat/pd/heuristic/functions/CFunctionHeuristicChecker.java	 
Sun Aug 16 16:43:54 2009
@@ -48,7 +48,7 @@
  	 * <li>{@link} http://www2.its.strath.ac.uk/courses/c/ <li>{@link}
  	 * http://www.greenend.org.uk/rjk/2003/03/inline.html
  	 */
-	private static final String C_FUNCTION_REGEX = "^[\t ]*\\w+ +\\w+ *\\([\\s\\S]*\\)(\\s)*\\{[\\s\\S]*\\}[\n\r]*";
+	private static final String C_FUNCTION_REGEX = "\\w+\\s+[*&]?\\s*\\w+ *\\([^()]*?\\)(\\s)*\\{[\\s\\S]*\\}[\n\r]*";

  	/**
  	 * This regular expression match macro in C which are multilines. For
@@ -92,7 +92,7 @@
  	 * inline void Recycle( char* aBuffer) { nsMemory::Free(aBuffer); }
  	 */

-	private static final String C_INLINE_FUNCTION_1 = "inline(\\s)*\\w+(\\s*)(.*)\\([\\s\\S]*\\)(\\s)*\\{[\\s\\S]*\\}[\n\r]*";
+	private static final String C_INLINE_FUNCTION_1 = "inline(\\s)*\\w+(\\s*)(.*)\\([^()]*?\\)(\\s)*\\{[\\s\\S]*\\}[\n\r]*";
  	/**
  	 * This regular expression match inline functions in C which have no
  	 * parametrs.
@@ -112,6 +112,10 @@
  	private static final String FUNCTION_REGEX = "(" + C_FUNCTION_REGEX + ")"
  			+ OR_REGEX + C_MACRO_REGEX + OR_REGEX + C_INLINE_FUNCTION;

+	/**
+	 * @param limit minimal length of function which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public CFunctionHeuristicChecker(int limit, PrintStream out) {
  		super(limit, FUNCTION_REGEX, C_OPENED_BRACKET, C_CLOSED_BRACKET, "c",  
out);
  	}
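
To illustrate the parameter-list change made to the function patterns in this commit: with [\s\S]* between the parentheses, a function call earlier in the fragment can be glued to a later function definition, because the greedy class runs on to the last ")" that is followed by "{". Restricting the list to [^()]*? keeps the match inside one pair of parentheses. The sketch below runs the old and new C_FUNCTION_REGEX values on a small fragment. The class name, the helper and the sample input are illustrative assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyParameterListDemo {
    // Old and new C_FUNCTION_REGEX values, copied from the diff above.
    static final String OLD_REGEX =
            "^[\t ]*\\w+ +\\w+ *\\([\\s\\S]*\\)(\\s)*\\{[\\s\\S]*\\}[\n\r]*";
    static final String NEW_REGEX =
            "\\w+\\s+[*&]?\\s*\\w+ *\\([^()]*?\\)(\\s)*\\{[\\s\\S]*\\}[\n\r]*";

    /** Returns the first match of regex in code, or null if there is none. */
    static String firstMatch(String regex, String code) {
        // FunctionHeuristicChecker also compiles its pattern with MULTILINE.
        Matcher m = Pattern.compile(regex, Pattern.MULTILINE).matcher(code);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String code = "return foo(a);\nint bar(int b) {\n  return b;\n}\n";
        // Old pattern: the match starts at "return foo(" and swallows bar().
        System.out.println("old: " + firstMatch(OLD_REGEX, code));
        // New pattern: the match starts at "int bar(".
        System.out.println("new: " + firstMatch(NEW_REGEX, code));
    }
}
```

Two limits of the new form are worth keeping in mind: [^()] rejects parameter lists that themselves contain parentheses (function-pointer parameters, for example), and the body part \{[\s\S]*\} is still greedy, so separating adjacent functions still relies on the bracket handling elsewhere in the checker.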
=======================================
---  
/trunk/src/main/java/org/apache/rat/pd/heuristic/functions/CPPFunctionHeuristicChecker.java	 
Sat Aug  1 05:56:47 2009
+++  
/trunk/src/main/java/org/apache/rat/pd/heuristic/functions/CPPFunctionHeuristicChecker.java	 
Sun Aug 16 16:43:54 2009
@@ -38,7 +38,7 @@
  	/**
  	 * This regular expression match functions in C++.
  	 */
-	private static final String CPP_FUNCTION_REGEX_1 = "^[\t ]*([(\\w+)]|( +))+ +(\\w+) *\\([\\s\\S]*\\)(\\s)*"
+	private static final String CPP_FUNCTION_REGEX_1 = "^[\t ]*([(\\w+)]|(\\s+))+\\s+(\\w+) *\\([^()]*\\)(\\s)*"
  			+ "\\{[\\s\\S]*\\}[\n\r]*";

  	/**
@@ -48,11 +48,10 @@
  	 * MyType MyType::Add(const MyType & rhs) { return MyType(itsVal+
  	 * rhs.GetItsVal()); }
  	 */
-	private static final String CPP_FUNCTION_REGEX_2 = "^[\t ]*(\\w+ *){1,2}\\:\\:\\w+\\s*\\([\\s\\S]*\\)(\\s)*\\{[\\s\\S]*\\}[\n\r]*";
+	private static final String CPP_FUNCTION_REGEX_2 = "^[\t ]*(\\w+ *){1,2}\\:\\:\\w+\\s*\\([^()]*\\)(\\s)*\\{[\\s\\S]*\\}[\n\r]*";

  	// TODO add regular expression which will match functions with a reference
  	// to a string as a returning type for example.
-
  	private static final String CPP_FUNCTION_REGEX = "(" +  
CPP_FUNCTION_REGEX_1
  			+ ")|(" + CPP_FUNCTION_REGEX_2 + ")";

@@ -87,7 +86,7 @@
  	 * and have parametrs. For example: inline void Recycle( char* aBuffer) {
  	 * nsMemory::Free(aBuffer); }
  	 */
-	private static final String CPP_INLINE_FUNCTION_1 = "inline\\s+\\w+\\s+(.*)\\([\\s\\S]*\\)(\\s)*\\{[\\s\\S]*\\}[\n\r]*";
+	private static final String CPP_INLINE_FUNCTION_1 = "inline\\s+\\w+\\s+(.*)\\([^()]*\\)(\\s)*\\{[\\s\\S]*\\}[\n\r]*";

  	/**
  	 * This regular expression match inline functions in C++ which have no
@@ -101,14 +100,14 @@
  	 * return filename_.size(); }
  	 *
  	 */
-	private static final String CPP_INLINE_FUNCTION_3 = "inline\\s+\\w+\\s+\\w+\\:\\:\\w+\\s*\\([\\s\\S]*\\)(\\s)*\\{[\\s\\S]*\\}[\n\r]*";
+	private static final String CPP_INLINE_FUNCTION_3 = "inline\\s+\\w+\\s+\\w+\\:\\:\\w+\\s*\\([^()]*\\)(\\s)*\\{[\\s\\S]*\\}[\n\r]*";

  	/**
  	 * This regular expression matches inline functions in C++ which contains
  	 * mark "::" and,after parameters, key word const. For example: inline
  	 * size_t SearchResultMessageArg::nTried() const{ return  
filename_.size();}
  	 */
-	private static final String CPP_INLINE_FUNCTION_4 = "inline\\s+\\w+\\s+\\w+\\:\\:\\w+\\s*\\([\\s\\S]*\\) *const(\\s)*\\{[\\s\\S]*\\}[\n\r]*";
+	private static final String CPP_INLINE_FUNCTION_4 = "inline\\s+\\w+\\s+\\w+\\:\\:\\w+\\s*\\([^()]*\\) *const(\\s)*\\{[\\s\\S]*\\}[\n\r]*";

  	/**
  	 * This regular expression match inline functions in C++.
@@ -126,8 +125,13 @@
  	private static final String FUNCTION_REGEX = CPP_FUNCTION_REGEX + OR_REGEX
  			+ CPP_MACRO_REGEX + OR_REGEX + CPP_INLINE_FUNCTION;

+	/**
+	 * @param limit minimal length of function which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public CPPFunctionHeuristicChecker(int limit, PrintStream out) {
-		super(limit, FUNCTION_REGEX, CPP_OPENED_BRACKET,  
CPP_CLOSED_BRACKET, "c++", out);
+		super(limit, FUNCTION_REGEX, CPP_OPENED_BRACKET, CPP_CLOSED_BRACKET,
+				"c++", out);
  	}

  }
=======================================
---  
/trunk/src/main/java/org/apache/rat/pd/heuristic/functions/CSharpFunctionHeuristicChecker.java	 
Sat Aug  1 05:56:47 2009
+++  
/trunk/src/main/java/org/apache/rat/pd/heuristic/functions/CSharpFunctionHeuristicChecker.java	 
Sun Aug 16 16:43:54 2009
@@ -23,23 +23,27 @@
  import java.io.PrintStream;

  /**
- * This class can match C# functions.
- * More info on: {@link}  
http://en.wikipedia.org/wiki/C_Sharp_(programming_language)
+ * This class can match C# functions. More info on: {@link}
+ * http://en.wikipedia.org/wiki/C_Sharp_(programming_language)
   *
   * @author maka
- *
+ *
   */
  public class CSharpFunctionHeuristicChecker extends  
FunctionHeuristicChecker {
  	private static final String C_SHARP_CLOSED_BRACKET = "\\}";
  	private static final String C_SHARP_OPENED_BRACKET = "\\{";

  	/**
-	 * This is regular expression for C# function matching .
-	 * More info on: {@link}  
http://en.wikipedia.org/wiki/C_Sharp_(programming_language)
+	 * This is regular expression for C# function matching . More info on:
+	 * {@link} http://en.wikipedia.org/wiki/C_Sharp_(programming_language)
  	 */
-	private static final String C_SHARP_FUNCTION_REGEX = "^[\t  
]*[(public)(protected)(private)(static)(void)(abstract)\\w+]*  
+(.*)\\([\\s\\S]*\\)(\\s)*"
+	private static final String C_SHARP_FUNCTION_REGEX = "^[\t  
]*[(public)(protected)(private)(static)(void)(abstract)\\w+]*  
+(.*)\\([^()]*?\\)(\\s)*"
  			+ "\\{[\\s\\S]*\\}[\n\r]*";
-
+
+	/**
+	 * @param limit minimal length of function which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public CSharpFunctionHeuristicChecker(int limit, PrintStream out) {
  		super(limit, C_SHARP_FUNCTION_REGEX, C_SHARP_OPENED_BRACKET,
  				C_SHARP_CLOSED_BRACKET, "c#", out);
=======================================
---  
/trunk/src/main/java/org/apache/rat/pd/heuristic/functions/DelphiFunctionHeuristicChecker.java	 
Sat Aug  1 05:56:47 2009
+++  
/trunk/src/main/java/org/apache/rat/pd/heuristic/functions/DelphiFunctionHeuristicChecker.java	 
Sun Aug 16 16:43:54 2009
@@ -39,12 +39,12 @@
  	/**
  	 * This regular expression matches functions in Delphi.
  	 */
-	private static final String DELPHI_FUNCTION_REGEX = "(?i:function) +(.*)(\\([\\s\\S]*\\))*(\\s)*\\: *\\w+ *;(\\s)*(?i:BEGIN)[\\s\\S]*(?i:END;)[\n\r]*";
+	private static final String DELPHI_FUNCTION_REGEX = "(?i:function) +(.*)(\\([^()]*?\\))*(\\s)*\\: *\\w+ *;(\\s)*(?i:BEGIN)[\\s\\S]*(?i:END;)[\n\r]*";

  	/**
  	 * This regular expression matches procedures in Delphi.
  	 */
-	private static final String DELPHI_PROCEDURE_REGEX = "(?i:procedure) +(.*)(\\([\\s\\S]*\\))*(\\s)*;(\\s)*(?i:BEGIN)[\\s\\S]*(?i:END;)[\n\r]*";
+	private static final String DELPHI_PROCEDURE_REGEX = "(?i:procedure) +(.*)(\\([^()]*?\\))*(\\s)*;(\\s)*(?i:BEGIN)[\\s\\S]*(?i:END;)[\n\r]*";

  	/**
  	 * This regular expression matches subroutines in Delphi-both functions  
and
@@ -53,6 +53,10 @@
  	private static final String DELPHI_SUBROUTINE_REGEX = "("
  			+ DELPHI_FUNCTION_REGEX + ")|(" + DELPHI_PROCEDURE_REGEX + ")";

+	/**
+	 * @param limit minimal length of function which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public DelphiFunctionHeuristicChecker(int limit, PrintStream out) {
  		super(limit, DELPHI_SUBROUTINE_REGEX, DELPHI_OPENED_BRACKET,
  				DELPHI_CLOSED_BRACKET, "pascal", out);
=======================================
---  
/trunk/src/main/java/org/apache/rat/pd/heuristic/functions/FortranFunctionHeuristicChecker.java	 
Sat Aug  1 05:56:47 2009
+++  
/trunk/src/main/java/org/apache/rat/pd/heuristic/functions/FortranFunctionHeuristicChecker.java	 
Sun Aug 16 16:43:54 2009
@@ -45,25 +45,30 @@
  	 * insensitive followed by one or more blanks.
  	 */
  	public static final String FUNCTION_KEYWORD = "(?i:FUNCTION) +";
+
  	/**
  	 * This regular expression matches key word SUBROUTINE which is case
  	 * insensitive followed by one or more blanks.
  	 */
  	public static final String SUBROUTINE_KEYWORD = "(?i:SUBROUTINE) +";
+
  	/**
  	 * This regular expression matches name of a functions starting with a key
  	 * word FUNCTION
  	 */
  	public static final String FUNCTION_NAME_REGEX = "(?i:FUNCTION) +\\w+";
+
  	/**
  	 * This regular expression matches name of a subroutine starting with a  
key
  	 * word SUBROUTINE.
  	 */
  	public static final String SUBROUTINE_NAME_REGEX = "(?i:SUBROUTINE)  
+\\w+";
+
  	/**
  	 * This regular expression matches functions header in Fortran.
  	 */
  	public static final String FUNCTION_HEADER_REGEX = "\\w+ +(?i:FUNCTION)  
+\\w+ *\\(.*\\).*\\n";
+
  	/**
  	 * This regular expression matches subroutines header in Fortran.
  	 */
@@ -82,8 +87,8 @@

  	@Override
  	public HeuristicCheckerResult checkByHeuristic(String codeToBeChecked) {
-		HeuristicCheckerResult toret = new HeuristicCheckerResult(false,
-				true, codeToBeChecked, this.getClass());
+		HeuristicCheckerResult toret = new HeuristicCheckerResult(false, true,
+				codeToBeChecked, this.getClass());
  		boolean isSearchNeaded = false;
  		String functionHeader = getFunctionHeader(codeToBeChecked,
  				FUNCTION_HEADER_REGEX);
@@ -91,11 +96,11 @@
  		if (functionHeader != null) {
  			functionName = getFunctionName(functionHeader, FUNCTION_KEYWORD,
  					FUNCTION_NAME_REGEX);
-			// TODO this regular expression hangs. Check why or write a better one
+			// TODO This regular expression can hang the application.
  			String functionEndRegex = // "(?i:END) +(?i:FUNCTION) +" +
  			"" + functionName;
-			isSearchNeaded = isFortranFunction(codeToBeChecked,  
FUNCTION_HEADER_REGEX,
-					functionEndRegex);
+			isSearchNeaded = isFortranFunction(codeToBeChecked,
+					FUNCTION_HEADER_REGEX, functionEndRegex);
  		} else {

  			functionHeader = getFunctionHeader(codeToBeChecked,
@@ -108,8 +113,8 @@
  						SUBROUTINE_HEADER_REGEX, functionName);
  			}
  		}
-		toret = new HeuristicCheckerResult(isSearchNeaded,
-				false, codeToBeChecked, this.getClass());
+		toret = new HeuristicCheckerResult(isSearchNeaded, false,
+				codeToBeChecked, this.getClass());
  		return toret;
  	}

@@ -126,8 +131,7 @@
  	boolean isFortranFunction(String codeToBeChecked, String functionHeader,
  			String functionEndRegex) {
  		boolean toret = false;
-		String functionRegex = functionHeader + "[\\s\\S]*"
-				+ functionEndRegex;
+		String functionRegex = functionHeader + "[\\s\\S]*" + functionEndRegex;
  		out.println(functionRegex);
  		Pattern p = Pattern.compile(functionRegex, Pattern.MULTILINE);
  		// Run matches
=======================================
---  
/trunk/src/main/java/org/apache/rat/pd/heuristic/functions/FunctionHeuristicChecker.java	 
Sat Aug  1 05:56:47 2009
+++  
/trunk/src/main/java/org/apache/rat/pd/heuristic/functions/FunctionHeuristicChecker.java	 
Sun Aug 16 16:43:54 2009
@@ -42,8 +42,25 @@
  	private final String language;
  	private final PrintStream out;

+	/**
+	 * @param limit
+	 *            minimal length of function which will be considered
+	 * @param regex
+	 *            regular expression to match function in current language
+	 * @param openedBrackets
+	 *            opened brackets in particular programming language( for
+	 *            example "{" in c)
+	 * @param closedBrackets
+	 *            closed brackets in particular programming language( for
+	 *            example "}" in c)
+	 * @param language
+	 *            programming language
+	 * @param out
+	 *            print stream for logging purposes
+	 */
  	public FunctionHeuristicChecker(int limit, String regex,
-			String openedBrackets, String closedBrackets, String language,  
PrintStream out) {
+			String openedBrackets, String closedBrackets, String language,
+			PrintStream out) {
  		this.limit = limit;
  		this.FUNCTION_REGEX = regex;
  		this.CLOSED_BRACKET = closedBrackets;
@@ -52,10 +69,13 @@
  		this.out = out;
  	}

+	/* (non-Javadoc)
+	 * @see  
org.apache.rat.pd.heuristic.IHeuristicChecker#checkByHeuristic(java.lang.String)
+	 */
  	public HeuristicCheckerResult checkByHeuristic(String codeToBeChecked) {
-		HeuristicCheckerResult toret = new HeuristicCheckerResult(false,
-				true, codeToBeChecked, this.getClass());
-
+		HeuristicCheckerResult toret = new HeuristicCheckerResult(false, true,
+				codeToBeChecked, this.getClass());
+
  		Pattern p = Pattern.compile(FUNCTION_REGEX, Pattern.MULTILINE);
  		// Run matches
  		Matcher m = p.matcher(codeToBeChecked);
@@ -65,8 +85,8 @@
  			if (codeToBeChecked.endsWith(found)) {
  				boolean isSearchNeaded = found.length() > limit
  						&& filterWrongMatches(found) ? true : false;
-				toret = new HeuristicCheckerResult(isSearchNeaded,
-						false, found, this.getClass());
+				toret = new HeuristicCheckerResult(isSearchNeaded, false,
+						found, this.getClass());
  				out.println("Found function: " + found);
  			}
  		}
=======================================
---  
/trunk/src/main/java/org/apache/rat/pd/heuristic/functions/JavaFunctionHeuristicChecker.java	 
Sat Aug  1 05:56:47 2009
+++  
/trunk/src/main/java/org/apache/rat/pd/heuristic/functions/JavaFunctionHeuristicChecker.java	 
Sun Aug 16 16:43:54 2009
@@ -42,9 +42,13 @@
  	 * http://java.sun.com/docs/books/tutorial/java/javaOO/methods.html
  	 *
  	 */
-	private static final String JAVA_FUNCTION_REGEX = "^[\t ]*([(\\w+)<>\\[\\]]|( +))+ +(\\w+) *\\([\\s\\S]*\\) *(throws +\\w+)*\\s*"
+	private static final String JAVA_FUNCTION_REGEX = "^[\t ]*([(\\w+)<>\\[\\],\\?]|( +))+ +(\\w+) *\\([^()]*?\\) *(throws +\\w+([\\s,\\w+]*?))*?\\s*"
  			+ "\\{[\\s\\S]*\\}[\n\r]*";

+	/**
+	 * @param limit minimal length of function which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public JavaFunctionHeuristicChecker(int limit, PrintStream out) {
  		super(limit, JAVA_FUNCTION_REGEX, JAVA_OPENED_BRACKET,
  				JAVA_CLOSED_BRACKET, "java", out);
=======================================
---  
/trunk/src/main/java/org/apache/rat/pd/heuristic/functions/JavaScriptFunctionHeuristicChecker.java	 
Sat Aug  1 05:56:47 2009
+++  
/trunk/src/main/java/org/apache/rat/pd/heuristic/functions/JavaScriptFunctionHeuristicChecker.java	 
Sun Aug 16 16:43:54 2009
@@ -40,8 +40,12 @@
  	 * http://www.w3schools.com/jS/js_functions.asp
  	 *
  	 */
-	private static final String JAVA_SCRIPT_FUNCTION_REGEX = "function +\\w+ *\\([\\s\\S]*\\)\\s*\\{[\\s\\S]*\\}[\n\r]*";
-
+	private static final String JAVA_SCRIPT_FUNCTION_REGEX = "function +\\w+ *\\([^()]*?\\)\\s*\\{[\\s\\S]*\\}[\n\r]*";
+
+	/**
+	 * @param limit minimal length of function which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public JavaScriptFunctionHeuristicChecker(int limit, PrintStream out) {
  		super(limit, JAVA_SCRIPT_FUNCTION_REGEX, JAVA_SCRIPT_OPENED_BRACKET,
  				JAVA_SCRIPT_CLOSED_BRACKET, "javascript", out);
=======================================
---  
/trunk/src/main/java/org/apache/rat/pd/heuristic/functions/PHPFunctionHeuristicChecker.java	 
Sat Aug  1 05:56:47 2009
+++  
/trunk/src/main/java/org/apache/rat/pd/heuristic/functions/PHPFunctionHeuristicChecker.java	 
Sun Aug 16 16:43:54 2009
@@ -37,8 +37,12 @@
  	 * This regular expression match PHP functions. More info about functions  
in PHP on:
  	 * {@link} http://www.w3schools.com/PHP/php_functions.asp
  	 */
-	private static final String PHP_FUNCTION_REGEX = "function +\\w+ *\\([\\s\\S]*\\)\\s*\\{[\\s\\S]*\\}[\n\r]*";
-
+	private static final String PHP_FUNCTION_REGEX = "function +\\w+ *\\([^()]*?\\)\\s*\\{[\\s\\S]*\\}[\n\r]*";
+
+	/**
+	 * @param limit minimal length of function which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public PHPFunctionHeuristicChecker(int limit, PrintStream out) {
  		super(limit, PHP_FUNCTION_REGEX, PHP_OPENED_BRACKET,  
PHP_CLOSED_BRACKET, "php", out);
  	}
=======================================
---  
/trunk/src/main/java/org/apache/rat/pd/heuristic/functions/PascalFunctionHeuristicChecker.java	 
Sat Aug  1 05:56:47 2009
+++  
/trunk/src/main/java/org/apache/rat/pd/heuristic/functions/PascalFunctionHeuristicChecker.java	 
Sun Aug 16 16:43:54 2009
@@ -37,12 +37,12 @@
  	/**
  	 * This regular expression match functions in Pascal.
  	 */
-	private static final String PASCAL_FUNCTION_REGEX = "(?i:function) +(.*)\\([\\s\\S]*\\)(\\s)*\\: *\\w+ *;(\\s)*(?i:BEGIN)[\\s\\S]*(?i:END;)";
+	private static final String PASCAL_FUNCTION_REGEX = "(?i:function) +(.*)\\([^()]*?\\)(\\s)*\\: *\\w+ *;(\\s)*(?i:BEGIN)[\\s\\S]*(?i:END;)";

  	/**
  	 * This regular expression match procedures in Pascal.
  	 */
-	private static final String PASCAL_PROCEDURE_REGEX = "(?i:procedure) +(.*)\\([\\s\\S]*\\)(\\s)*;(\\s)*(?i:BEGIN)[\\s\\S]*(?i:END;)";
+	private static final String PASCAL_PROCEDURE_REGEX = "(?i:procedure) +(.*)\\([^()]*?\\)(\\s)*;(\\s)*(?i:BEGIN)[\\s\\S]*(?i:END;)";

  	/**
  	 * This regular expression matches procedures and functions in Pascal.  
Since
@@ -54,6 +54,10 @@
  	private static final String PASCAL_SUBPROGRAM_REGEX = "("
  			+ PASCAL_FUNCTION_REGEX + ")|(" + PASCAL_PROCEDURE_REGEX + ")";

+	/**
+	 * @param limit minimal length of function which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public PascalFunctionHeuristicChecker(int limit, PrintStream out) {
  		super(limit, PASCAL_SUBPROGRAM_REGEX, PASCAL_OPENED_BRACKET,
  				PASCAL_CLOSED_BRACKET, "pascal", out);
=======================================
---  
/trunk/src/main/java/org/apache/rat/pd/heuristic/functions/VisualBasicFunctionHeuristicChecker.java	 
Sat Aug  1 05:56:47 2009
+++  
/trunk/src/main/java/org/apache/rat/pd/heuristic/functions/VisualBasicFunctionHeuristicChecker.java	 
Sun Aug 16 16:43:54 2009
@@ -48,7 +48,7 @@
  	 * <p>
  	 * End Function
  	 */
-	private static final String VISUAL_BASIC_FUNCTION_REGEX_1 = "((?i:private)|(?i:public)) +Function +\\w+ *\\([\\s\\S]*\\) *(As) +\\w+[\\s\\S]*End +Function[\n\r]*";
+	private static final String VISUAL_BASIC_FUNCTION_REGEX_1 = "((?i:private)|(?i:public)) +Function +\\w+ *\\([^()]*?\\) *(As) +\\w+[\\s\\S]*End +Function[\n\r]*";

  	/**
  	 * This regular expression match functions in Visual Basic which don't  
have
@@ -60,7 +60,7 @@
  	 * <p>
  	 * End Function
  	 */
-	private static final String VISUAL_BASIC_FUNCTION_REGEX_2 = "Function +\\w+ *\\([\\s\\S]*\\) *(As) +\\w+[\\s\\S]*End +Function[\n\r]*";
+	private static final String VISUAL_BASIC_FUNCTION_REGEX_2 = "Function +\\w+ *\\([^()]*?\\) *(As) +\\w+[\\s\\S]*End +Function[\n\r]*";

  	/**
  	 * This regular expression matches functions in Visual Basic.
@@ -79,7 +79,7 @@
  	 * <p>
  	 * End Sub
  	 */
-	private static final String VISUAL_BASIC_SUBROUTINES_REGEX_1 = "((?i:private)|(?i:public)) +Sub +\\w+ *\\([\\s\\S]*\\)[\\s\\S]*End +Sub[\n\r]*";
+	private static final String VISUAL_BASIC_SUBROUTINES_REGEX_1 = "((?i:private)|(?i:public)) +Sub +\\w+ *\\([^()]*?\\)[\\s\\S]*End +Sub[\n\r]*";

  	/**
  	 * This regular expression match subroutines in Visual Basic which do not
@@ -91,7 +91,7 @@
  	 * <p>
  	 * End Sub
  	 */
-	private static final String VISUAL_BASIC_SUBROUTINES_REGEX_2 = "Sub +\\w+ *\\([\\s\\S]*\\)[\\s\\S]*End +Sub[\n\r]*";
+	private static final String VISUAL_BASIC_SUBROUTINES_REGEX_2 = "Sub +\\w+ *\\([^()]*?\\)[\\s\\S]*End +Sub[\n\r]*";

  	/**
  	 * This regular expression matches subroutines in Visual Basic.
@@ -107,6 +107,10 @@
  	private static final String VISUAL_BASIC_SUBPROGRAMS_REGEX =  
VISUAL_BASIC_FUNCTION_REGEX
  			+ "|" + VISUAL_BASIC_SUBROUTINES_REGEX;

+	/**
+	 * @param limit minimal length of function which will be considered
+	 * @param out print stream for logging purposes
+	 */
  	public VisualBasicFunctionHeuristicChecker(int limit, PrintStream out) {
  		super(limit, VISUAL_BASIC_SUBPROGRAMS_REGEX,
  				VISUAL_BASIC_OPENED_BRACKET, VISUAL_BASIC_CLOSED_BRACKET,  
IHeuristicChecker.ALL_LANGUAGES, out);
=======================================
---  
/trunk/src/main/java/org/apache/rat/pd/heuristic/misspellings/Dictionary.java	 
Mon Jun 29 16:57:33 2009
+++  
/trunk/src/main/java/org/apache/rat/pd/heuristic/misspellings/Dictionary.java	 
Sun Aug 16 16:43:54 2009
@@ -41,6 +41,10 @@
  	 */
  	private TreeSet<String> dictionary = new TreeSet<String>();

+	/**
+	 * @param is input stream for dictionary
+	 * @throws IOException
+	 */
  	public Dictionary(InputStream is) throws IOException {
  		initializeBasicSet();
  		readDictionary(is);
=======================================
---  
/trunk/src/main/java/org/apache/rat/pd/heuristic/misspellings/MisspellingsHeuristicChecker.java	 
Sat Aug  1 05:56:47 2009
+++  
/trunk/src/main/java/org/apache/rat/pd/heuristic/misspellings/MisspellingsHeuristicChecker.java	 
Sun Aug 16 16:43:54 2009
@@ -43,6 +43,11 @@
  	private final String language;
  	private final PrintStream out;

+	/**
+	 * @param dictionary dictionary
+	 * @param language programming language
+	 * @param out print stream for logging purposes
+	 */
  	public MisspellingsHeuristicChecker(IDictionary dictionary, String  
language, PrintStream out) {
  		this.spellChecker = dictionary;
  		this.language = language;
=======================================
--- /trunk/src/main/java/org/apache/rat/pd/report/HtmlReportGenerator.java	 
Sat Aug  8 08:59:03 2009
+++ /trunk/src/main/java/org/apache/rat/pd/report/HtmlReportGenerator.java	 
Sun Aug 16 16:43:54 2009
@@ -20,10 +20,10 @@
   */
  package org.apache.rat.pd.report;

-import java.io.IOException;
-import java.io.InputStream;
+import java.util.List;

  import org.apache.rat.pd.engines.SearchResult;
+import org.apache.rat.pd.engines.google.RegexGenerator;
  import org.apache.rat.pd.util.FileManipulator;

  /**
@@ -52,13 +52,13 @@
  	private final String CODE_TO_BE_CHECKED = "CODE_TO_BE_CHECKED";
  	private final String SEARCH_RESULTS_LIST = "SEARCH_RESULTS_LIST";

+	private RegexGenerator regexGenerator = new RegexGenerator();
+
  	/**
  	 * Can generate explanation about plagiarized code in html format.
  	 */
  	public HtmlReportGenerator() {
  		super();
-		InputStream is = ClassLoader.getSystemClassLoader()
-				.getResourceAsStream("report/reportEntryTemplate.html");
  		this.reportEntryTemplate = FileManipulator
  				.convertStreamToString(ClassLoader.getSystemClassLoader()
  						.getResourceAsStream("report/reportEntryTemplate.html"));
@@ -92,7 +92,9 @@
  					reportEntry.getFileName());

  			current = current.replace(this.CODE_TO_BE_CHECKED, reportEntry
-					.getCode().replaceAll("\\n", "<BR>"));
+					.getCode().replaceAll("\\n", "<BR>").replaceAll("\\t",
+							"&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;")
+					.replaceAll(" ", "&nbsp;"));

  			current = current.replace(this.SEARCH_RESULTS_LIST,
  					createSearchResultList(reportEntry));
@@ -116,8 +118,9 @@
  					searchResult.getPackageName());
  			current = current.replace(this.SEARCH_RESULT_LICENSE, searchResult
  					.getLicence());
-			current = current.replace(this.SEARCH_RESULT_CODE, searchResult
-					.getCodeInClass());
+			current = current.replace(this.SEARCH_RESULT_CODE, formatBetter(
+					searchResult.getMatchedCode(),
+					 
searchResult.getCodeForQuery()).replaceAll("\\n", "<BR>").replaceAll("\\t", "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"));
  			current = current.replace(this.SEARCH_RESULT_OWNER, searchResult
  					.getOwner());
  			current = current.replace(this.SEARCH_RESULT_LANGUAGE, searchResult
@@ -130,5 +133,16 @@
  		}
  		return toRet;
  	}
+
+	protected String formatBetter(List<String> matchedCode, String codeInClass) {
+		String toRet = codeInClass;
+//		for (int i = 0; i < matchedCode.size(); i++) {
+//			toRet = toRet.replaceAll("(?i:("
+//					+ regexGenerator.stringToRegex(matchedCode.get(i)) + "))",
+//					"<b><span style=\"color: #800000\">" + matchedCode.get(i)
+//							+ "</span></b>");
+//		}
+		return toRet;
+	}

  }
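
On the "$" bug mentioned in the log message: String.replaceAll treats "$" in the replacement string as a group reference, so the commented-out highlighting loop in formatBetter would throw if matched code contained a "$" (PHP or shell variables, for example). A minimal sketch of one way to guard against that follows. The class and method names are hypothetical, and Pattern.quote stands in for regexGenerator.stringToRegex, whose behaviour is not shown in this diff.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HighlightDemo {
    /** Wraps every literal occurrence of matched in code with bold tags. */
    static String highlight(String code, String matched) {
        // Pattern.quote is an assumption standing in for stringToRegex.
        String regex = Pattern.quote(matched);
        // quoteReplacement escapes '$' and '\' so they are inserted literally.
        String replacement = "<b>" + Matcher.quoteReplacement(matched) + "</b>";
        return code.replaceAll(regex, replacement);
    }

    /** Returns true if the unescaped replacement makes replaceAll throw. */
    static boolean naiveThrows(String code, String matched) {
        try {
            code.replaceAll(Pattern.quote(matched), "<b>" + matched + "</b>");
            return false;
        } catch (IllegalArgumentException | IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(highlight("echo $x;", "$x"));
        System.out.println(naiveThrows("echo $x;", "$x"));
    }
}
```

When both the search text and the replacement are literal, String.replace(CharSequence, CharSequence) avoids the regex machinery altogether. It would not cover the case-insensitive matching that the commented-out loop attempts with (?i:...).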
=======================================
***Additional files exist in this changeset.***
