Gaudiya Discussions Archive » TECH ISSUES
PC problems, recommended software, tips and tricks, coding and so forth. Things that make your life in the cyberspace easier.

Towards A Better Search -



braja - Wed, 25 May 2005 06:50:53 +0530
Has anyone given any thought to making a search interface for Google that tries to cover common variants, HK, common fonts, etc.? I mean a script of some kind that will dynamically convert a search for "krishna dasa" into a search for "krsnadas", "krsnadasa", "kåñëadäs" (Balaram), etc.

It seems Greasemonkey might be the place to start, but I'm wondering whether the better approach would be a list of words with variants or whether, more simply, it could be based upon some form of logic, e.g. adding a final 'a' after a consonant, or mapping an HK 'J' to both 'n' and the Balaramic 'ï' (~n).

Madhava - Fri, 27 May 2005 05:58:31 +0530
First of all, one would have to settle on the formula by which the variants are produced. Basically, the variants would then be fed to Google as OR-separated terms. Of course, you then couldn't really run complex queries, such as searches for five-word sentences, because the number of variant combinations multiplies with every word you add.
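
To make the OR-separated idea and the blow-up concrete, here is a small Python sketch. The function names build_query and count_phrases are invented for this example, and the variant lists are assumed to come from whatever formula is settled on.

```python
from itertools import product
from math import prod


def build_query(variants_per_word):
    """Join every combination of word variants into one OR-separated query."""
    phrases = [" ".join(combo) for combo in product(*variants_per_word)]
    return " OR ".join(f'"{phrase}"' for phrase in phrases)


def count_phrases(variants_per_word):
    """Number of quoted phrases the query will contain."""
    return prod(len(variants) for variants in variants_per_word)


two_words = [["krsna", "krishna"], ["das", "dasa"]]
print(build_query(two_words))

# Five words with three variants each already need 3**5 phrases.
five_words = [["a1", "a2", "a3"]] * 5
print(count_phrases(five_words))
```

With two variants per word the two-word query has four phrases; the five-word case shows why long sentences quickly become impractical.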

Mapping non-diacritic words to Balaram encoding would require a large dictionary of Sanskrit words. I don't think that's a feasible approach for a browser-end application, though it would work server-end.

Developing a logic for sane conversions may end up being a mighty task. Converting from H-K and Balaram to plain text, and between the two, isn't an issue; that's simple mechanics. However, guessing the H-K and Balaram equivalents of plain text terms by logic alone is a mighty task. As a database-based solution, using the aforesaid dictionary with plain text equivalents, it wouldn't be that hard.
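
The "simple mechanics" direction can be sketched as a plain character substitution. This is only an illustration in Python: the H-K table follows the usual Harvard-Kyoto convention as I understand it, the Balaram table contains only the five characters that appear in this thread, and the function names are made up for the example.

```python
# Assumed H-K capitals and 'z' mapped to their undecorated letters.
HK_TO_PLAIN = str.maketrans({
    "A": "a", "I": "i", "U": "u", "R": "r",
    "M": "m", "H": "h",
    "T": "t", "D": "d", "N": "n", "G": "n", "J": "n",
    "z": "s", "S": "s",
})

# Only the Balaram characters seen in this thread; a full table is larger.
BALARAM_TO_PLAIN = str.maketrans({
    "ä": "a", "å": "r", "ñ": "s", "ë": "n", "ï": "n",
})


def hk_to_plain(text):
    """Strip H-K markings by mapping each character to a plain letter."""
    return text.translate(HK_TO_PLAIN)


def balaram_to_plain(text):
    """Strip Balaram diacritic characters the same way."""
    return text.translate(BALARAM_TO_PLAIN)


print(hk_to_plain("kRSNadAsa"))
print(balaram_to_plain("kåñëadäs"))
```

Going the other way is where the guessing starts: plain "krsna" gives no hint which of its letters carried a diacritic, which is why the reverse direction needs either heuristics or the dictionary.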

Then again, cross-referencing large dictionaries in an application such as a front-end to a search engine may turn out to be rather resource intensive for a server.

Off the top of my head.