Hi.
In short, and in my opinion, you are re-inventing the wheel.
Numerous open-source programs already exist which analyse web logs and generally
produce nice-looking graphics etc. from them. They also do the splitting-up work
properly, as long as you feed them the correct log format. Their documentation indicates
how to do that.
Look up webalizer, awstats, etc.
Also, since these programs are open-source, you can look inside at how they do things, if you
really want to write your own code.
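If you do end up writing your own parser anyway, here is a minimal sketch of one way to go about it. It is not how webalizer or awstats do it; it is only an illustration, and it assumes that each entry sits on a single line in exactly the format produced by your pattern '%t "%r" %s "%U"'. The class name AccessLogLine and the extension list RESOURCE_EXTENSIONS are made up for this example, and the extension list is surely incomplete for a real site. The idea is to match the whole line with one regular expression instead of split(" "), so that spaces inside the quoted request do not matter, and then to guess "page or resource" from the file extension of the request path. It needs Java 9 or later because of Set.of.

```java
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: one parsed line of a log written with '%t "%r" %s "%U"'.
class AccessLogLine {
    // [timestamp] "METHOD URI PROTOCOL" status "path"
    private static final Pattern LINE = Pattern.compile(
        "^\\[([^\\]]+)\\] \"(\\S+) (\\S+) (\\S+)\" (\\d{3}) \"([^\"]*)\"$");

    // Assumption: these extensions count as static resources; adjust to your site.
    private static final Set<String> RESOURCE_EXTENSIONS =
        Set.of("jpg", "jpeg", "png", "gif", "css", "js", "ico");

    final String timestamp;
    final String method;
    final String uri;
    final String protocol;
    final int status;
    final String path;

    private AccessLogLine(Matcher m) {
        timestamp = m.group(1);
        method = m.group(2);
        uri = m.group(3);
        protocol = m.group(4);
        status = Integer.parseInt(m.group(5));
        path = m.group(6);
    }

    // Returns the parsed line, or null if the line does not match the expected format.
    static AccessLogLine parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        return m.matches() ? new AccessLogLine(m) : null;
    }

    // Heuristic: a request is a "resource" if its path ends in a known static extension.
    // Paths without an extension (e.g. a servlet mapping) are treated as pages.
    boolean isResource() {
        String p = uri;
        int query = p.indexOf('?');
        if (query >= 0) {
            p = p.substring(0, query); // drop the query string
        }
        int slash = p.lastIndexOf('/');
        int dot = p.lastIndexOf('.');
        if (dot <= slash) {
            return false; // no extension in the last path segment
        }
        String ext = p.substring(dot + 1).toLowerCase(Locale.ROOT);
        return RESOURCE_EXTENSIONS.contains(ext);
    }
}
```

With your three example entries, AccessLogLine.parse(line).isResource() would be true only for the test.jpg request. Lines that do not match the format come back as null, so you can count or skip them rather than mis-parse them.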
yang Yang wrote:
> Hi:
> I am trying to develop a web-based tool to track page hit counts, user
> session activity, etc. for our own sites.
>
> I have run into some problems:
>
> 1) How can I tell whether a request target is a page or a resource?
>
> For example, take the following three log entries (some parts removed):
>
> #1-> [17/Sep/2010:11:38:26 +0800] "POST /test.jsp?name=test HTTP/1.1" 200
> "test.jsp"
> #2-> [17/Sep/2010:11:40:11 +0800] "POST /example/test.jpg HTTP/1.1" 200
> "/example/test.jpg"
> #3-> [17/Sep/2010:11:44:26 +0800] "POST /example/testServlet HTTP/1.1" 200
> "test.jsp"
> The pattern used for the above log is: '%t "%r" %s "%U"'.
>
> Log #1 shows a page request with a parameter; it can be used to calculate
> the most frequently visited pages.
>
> Log #2 shows a resource request (an image here); it can be used to
> calculate the most frequently visited files.
>
> Log #3 shows a request with no file extension (it is a servlet); in fact it is a page.
>
> That is to say, they are different request types, so how do I distinguish them
> in my code?
>
> 2) Log parser.
> I can read the log file line by line. But how do I extract the value of each
> attribute?
> They are all on one line. Should I split them using the string.split() method? But
> what if the value itself contains the separator?
>
> For example, if I use split(" ") to split log #1, the value "POST
> /example/test.jpg HTTP/1.1" will also be split, and this may be
> inefficient, so I wonder if there is a tool that lets me do this easily?
>