Customisable Site Search Facility
|
Pamela Teipel |
|
2/11/2007 11:27:36 PM |
I suppose you could index all the pages into a single text file, but depending on the size of the site, the file could become very large and make the search pretty slow. I can't think of anything else, but I'm no search engine expert...
Re: Customisable Site Search Facility
|
Dan D'Souza |
|
2/11/2007 11:30:18 PM |
At the moment I'll consider anything.
The site currently has in the region of 300 pages, and I think about 100 are added per year. Would this be too much for the text file solution?
If not, could you tell me a bit more about how it works and how I might go about setting it up?
Thanks
Re: Customisable Site Search Facility
|
Pamela Teipel |
|
2/11/2007 11:32:06 PM |
I have never written anything like that, so I was only really implying that it's possible in theory; you could take a college course in search engines and still not know everything there is to know.
If I were to write something with my current knowledge, it would have no concept of keyword relevance or ranking. It would simply spit out a list of URLs and maybe page titles in no particular order. I don't think a client would be very happy with it, not to mention their customers, but of course, given the limitations you have, they should be ecstatic to get anything at all.
This definitely isn't the best way (or possibly even a good way) to do it, but you could make an indexing script that opens all your HTML files and strips out all of the HTML, CSS, etc., leaving only the page text. Then it adds that text to an indexing text file in a way that makes it easily retrievable. Something like:
Quote:
#$ [URL]
#*[Pagetitle]
pagetext pagetext, etc, etc
yadda yadda
more words, words, words
#~
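For what it's worth, here is a rough sketch of an indexing script along those lines. It is written in Python rather than the classic ASP used elsewhere in this thread, and the names (TextExtractor, make_entry, build_index) are made up for illustration. It assumes the pages are plain HTML files on disk and that you supply the URL for each one yourself.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the <title> and the visible text, skipping <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.title_parts = []
        self.chunks = []
        self._skip = 0          # depth inside <script>/<style>
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._skip:
            return
        if self._in_title:
            self.title_parts.append(data)
        elif data.strip():
            # collapse runs of whitespace so each chunk sits on one line
            self.chunks.append(" ".join(data.split()))


def make_entry(url, html):
    """Return one index entry in the #$ / #* / #~ layout quoted above."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    title = " ".join(" ".join(parser.title_parts).split())
    # pad any text line starting with "#" so it cannot be mistaken for a marker
    lines = [(" " + c) if c.startswith("#") else c for c in parser.chunks]
    return "\n".join(["#$ " + url, "#*" + title, *lines, "#~"]) + "\n"


def build_index(pages, index_path):
    """pages maps each URL to the path of its HTML file (both supplied by you)."""
    with open(index_path, "w", encoding="utf-8") as out:
        for url, path in pages.items():
            with open(path, encoding="utf-8", errors="replace") as f:
                out.write(make_entry(url, f.read()))
```

You would call build_index with a dictionary of URL-to-file pairs and the name of the index file, and rerun it whenever pages change.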
Each page would have an entry like that. Then, whenever a word is searched for, it would open the index text file, look for the first #$, store the URL in a stringvar, read the next line and store the page title in a stringvar. From the next line until it reaches a #~ on a single line, it stores all that into a single stringvar as well. Then it searches the variable text for the keyword(s), and if found, it's a positive match:
response.write("" & pgtitlevar & " ")
Then loop: read the next #$ into the URL stringvar, and so on, until you reach the end of the file.
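The same read-and-match loop might look roughly like this in Python (again, not the ASP from the line above, just a sketch). The function names are hypothetical, the match is a plain case-insensitive substring test, and requiring every keyword to appear is my own assumption; there is still no ranking.

```python
def read_entries(lines):
    """Yield (url, title, text) for each #$ ... #~ entry in the index."""
    url = title = None
    body = []
    for raw in lines:
        line = raw.rstrip("\n")
        if url is None:
            # skip anything until the next entry starts
            if line.startswith("#$"):
                url, title, body = line[2:].strip(), None, []
        elif title is None:
            # the line right after #$ holds the page title
            title = line[2:].strip() if line.startswith("#*") else line.strip()
        elif line.strip() == "#~":
            yield url, title, " ".join(body)
            url = None
        else:
            body.append(line)


def search(lines, query):
    """Return (url, title) pairs whose page text contains every keyword."""
    words = query.lower().split()
    hits = []
    for url, title, text in read_entries(lines):
        haystack = text.lower()
        if words and all(w in haystack for w in words):
            hits.append((url, title))
    return hits
```

An open file works as the lines argument, so something like search(open("index.txt"), "keyword") would scan the whole index, where index.txt is whatever name you gave the index file.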
It wouldn't take too much extra work to display [x] characters of text on either side of the keyword(s) either, sorta like Google does.
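A minimal Python sketch of that excerpt idea, assuming the page text is already in one string. The function name snippet and the default width of 40 characters are just placeholders.

```python
def snippet(text, keyword, width=40):
    """Return the first match with up to `width` characters either side, or None."""
    if not keyword:
        return None
    # assumes lower() keeps the string length, which holds for ASCII text
    pos = text.lower().find(keyword.lower())
    if pos == -1:
        return None
    start = max(0, pos - width)
    end = min(len(text), pos + len(keyword) + width)
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + text[start:end] + suffix
```

It only shows the first occurrence; looping over later matches would be a small extension.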
Like I said, it would be slow: it would have to search the entire file every time the search is used. And every time the slightest change was made to any of the pages, you would need to rerun the indexing script to keep it updated, which would be rather slow as well.
Re: Customisable Site Search Facility
|
Dan D'Souza |
|
2/11/2007 11:34:17 PM |
Thanks for this. It really does help.
Cheers
Dan