Need help with web crawling... does anyone know how? From browser to database.
Please provide a proper scope...
Browser to database? Or web content to database?
Data mining? I made a web crawler before using VBA!
See this:
Apache Lucene - Overview
I'm not really sure what you're trying to do, but if this is about scanning web content and indexing it, you can use Apache Lucene.
1. Spider the website using an appropriate tool, like wget.
2. Parse the text using a scripting language of your choice (e.g., PHP). Use regular expressions to extract patterns of text from that data.
3. Once the required text has been extracted, it's a simple matter of inserting it into the DB.
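Steps 2 and 3 above could be sketched roughly like this. It is in Python rather than PHP, purely as an illustration, and it assumes the pages have already been downloaded (step 1) and that the patterns of interest are URLs and plain numbers. The regex patterns, the `found` table, and the function names `extract` and `store` are all made up for this example, and SQLite stands in for whatever database you actually use.

```python
import re
import sqlite3

# Hypothetical patterns; adjust them to the text you actually need.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")
NUMBER_RE = re.compile(r"\b\d+(?:\.\d+)?\b")


def extract(text):
    """Return (urls, numbers) found in the raw text of one page."""
    urls = URL_RE.findall(text)
    # Blank out URLs first so digits inside them are not counted as numbers.
    numbers = NUMBER_RE.findall(URL_RE.sub(" ", text))
    return urls, numbers


def store(conn, source, urls, numbers):
    """Insert the extracted values into an assumed 'found' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS found (source TEXT, kind TEXT, value TEXT)"
    )
    rows = [(source, "url", u) for u in urls]
    rows += [(source, "number", n) for n in numbers]
    # Parameterized inserts keep scraped text from being run as SQL.
    conn.executemany("INSERT INTO found VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    page = '<a href="http://example.com/page1">Item 42</a> costs 19.99'
    conn = sqlite3.connect(":memory:")  # swap in a file path or your own DB
    urls, numbers = extract(page)
    store(conn, "sample.html", urls, numbers)
    for row in conn.execute("SELECT kind, value FROM found"):
        print(row)
```

Regular expressions are fine for simple patterns like these, but they get fragile on messy HTML, so a real HTML parser may be the better choice once the extraction rules grow.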
Web content to database... like numbers and URLs. Data mining.
Last edited by lestat1116; 08-23-2011 at 08:03 PM.