tomcat-users mailing list archives

From Peter Lin <tcw00l...@yahoo.com>
Subject Re: JDBC & ORACLE implementation !
Date Thu, 20 Feb 2003 14:07:19 GMT

 
I think I have a better idea of what you are trying to do. If your goal is to data mine, your
best option may not be Oracle. That might sound counter-intuitive, but here are some reasons.
Oracle will handle large multi-megabyte files just fine. In fact, GIS (geographic information
systems) use Oracle, though applications with extreme storage needs like GIS tend to use
specialized databases and implement specialized search algorithms. A simple example of this might
be, "find all routes from point A to B that use side streets that are not one-way." If you
had to do this kind of query in Oracle, it would mean doing some very complex joins and would
most likely require additional processing outside of Oracle in an app server.
 
When you talk about data mining, there's a wide variety. Not everyone uses "data mining" to
mean the same thing. If you're talking about KDD (knowledge discovery in databases), then there
are two routes: buy a data mining package, or write your own. Again, if your needs are fairly
simple, like statistical analysis of the text, then you're better off storing those files locally
and having some other daemon process mine the data in the background. For example, you wouldn't
want to write a stored procedure to mine the data. You could, but that might cause you to
pull your hair out.
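
To make the background-process idea concrete, here is a minimal sketch in Java. Everything named in it is an assumption for illustration: the class name UploadMiner, the "uploads" directory, the "*.txt" filter, the 60-second interval, and the use of a plain word count as the "summary". It rescans every file on each pass, which a real daemon would want to avoid by tracking what it has already processed.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.io.IOException;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: summarizes uploaded text files outside the servlet and the database.
final class UploadMiner {

    // One pass over the directory: maps each *.txt file name to its word count.
    static Map<String, Integer> scan(Path dir) throws IOException {
        Map<String, Integer> summary = new TreeMap<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "*.txt")) {
            for (Path file : files) {
                String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
                int words = text.isEmpty() ? 0 : text.split("\\s+").length;
                summary.put(file.getFileName().toString(), words);
            }
        }
        return summary;
    }

    public static void main(String[] args) {
        // Assumed upload location; pass a different path as the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : "uploads");
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                System.out.println(scan(dir));
            } catch (Exception e) {
                // Catch everything: an uncaught exception would cancel all later runs.
                e.printStackTrace();
            }
        }, 0, 60, TimeUnit.SECONDS);
    }
}
```

The servlet only has to write each file to disk; the mining runs on its own schedule and never holds a database connection while it reads text.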
 
Again, if you're dealing with essays or papers, you're better off processing those in the
background and storing the summaries in Oracle. Storing the entire text in Oracle won't
make your life easier. A common practice in AI for text handling is statistical analysis.
The basic idea is to filter out all the words that aren't important, like verbs, adverbs and
so on, count the frequency of the nouns, and store those summaries in the database. I hope
that helps. You're going to have to do more research on this to get a good understanding of
mining techniques.
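
As a rough illustration of the frequency-count idea, here is a small Java sketch. Note the substitution: picking out nouns properly needs a part-of-speech tagger, so this version only drops words from a short, assumed stop-word list and counts everything else. The class name TermFrequency and the stop-word list are made up for the example, and it needs Java 8 or later for Map.merge.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper: counts how often each non-stop-word appears in a text.
final class TermFrequency {

    // Assumed, deliberately tiny stop-word list; a real one would be much longer.
    private static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
            "a", "an", "and", "are", "in", "is", "of", "on", "the", "to"));

    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        // Split on anything that is not a letter, so punctuation and digits are dropped.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (token.isEmpty() || STOP_WORDS.contains(token)) {
                continue;
            }
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints {cat=2, dog=1, sleeps=1}
        System.out.println(count("The cat and the dog. The cat sleeps."));
    }
}
```

The resulting map is small enough to store as one row per term and document, which is the kind of summary that is cheap to index and query.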
 
peter
 
Swapneel Dange <s_dange@hotmail.com> wrote:

hey peter!

you're right that there is no transaction involved in this process here. the only 
thing i will be doing is receiving files on the server using the servlets. 
now maybe it was too much thinking on my part to say that i will use 
ORACLE. what do you say: for at least 7200 files a day, of max size 1MB, 
shouldn't i use ORACLE? should i try some other options, and if YES, then what 
kind of database can i implement?

right now i have the FILE SYSTEM implemented here, but i think it has 
limited my ability to do pattern searching and data mining. that's why i was 
trying to move to something more stable and robust, such as a database which 
can support TOUGHER queries.

awaiting reply!

Swapneel Dange
505-642-4126
http://www.cs.nmsu.edu/~sdange

>From: Peter Lin 
>Reply-To: "Tomcat Users List" 
>To: Tomcat Users List 
>Subject: Re: JDBC & ORACLE implementation !
>Date: Wed, 19 Feb 2003 18:09:36 -0800 (PST)
>
>
>
>First off, you probably should use Oracle enterprise edition, unless 
>you're on a box with less than 128 MB of memory.
>
>Oracle personal edition for 8i and 9i is really designed for simple uses. 
>The scenario you've described will probably mean storing the text as a CLOB 
>or in multiple columns. Keep in mind that if you store it as a CLOB, it limits 
>your search ability and performance. Breaking the text into columns will 
>allow you to index the content more easily. If query time is important, you may 
>want to generate summaries of the text and use those for your indexes 
>instead.
>
>As far as connecting to Oracle, it's fairly straightforward. Databases are 
>handy, but take care with how you implement the application. If you don't 
>need to index the content, or do not need transaction capabilities, you're 
>better off using the file system to store the text. RDBMSs are designed 
>specifically to handle relational data. If your data isn't relational, 
>using Oracle is a bit of overkill. Using the right tool will make your life 
>easier in the long run.
>
>peter

