Search in contents
Added by Ulrik Kautsky 1296 days ago
I cannot search on contents. I managed to index a repository and search for *.doc files, and was impressed. Then I tried searching for a known word in a .txt file and it didn't work. This is what I entered (PC, Windows):
Query one:
C:\temp\SupoSE-0.4.0.0RC2>supose.cmd search --index index.Tes2t --fields revision messsage --query +filename:/*.doc
Query: ' +filename:/*.doc'
Field[0]=revision Field[1]=messsage
1. revision: 3 F:/Test revisonnr.doc K:A
2. revision: 4 F:/Test revisonnr.doc K:M
3. revision: 5 F:/Test revisonnr.doc K:M
4. revision: 6 F:/Test revisonnr.doc K:M
5. revision: 7 F:/Test revisonnr.doc K:M
6. revision: 8 F:/Test revisonnr.doc K:M
which worked fine.
Then I tested query two, a search for the word BIOTRAC, which is in a plain text file in the repository:
C:\temp\SupoSE-0.4.0.0RC2>supose.cmd search --index index.Tes2t --fields revision messsage --query contents:BIOTRAC
Query: ' contents:BIOTRAC'
Field[0]=revision Field[1]=messsage
As you can see, nothing was found. I fiddled around with the query because I am uncertain about the syntax, and tested other variations, e.g.
--query +contents:BIOTRAC
--query +contents: BIOTRAC
--query +contents:BIOTRAC*
--query +contents:"BIOTRAC"
with no hits. I also tested with the word "The". I have no clue: either the syntax is wrong (where can I look up the syntax?), or is it something else?
Ulrik
Replies
RE: Search in contents - Added by Karl Heinz Marbaise 1294 days ago
Hi,
I have added the code highlighter to your message.
RE: Search in contents - Added by Karl Heinz Marbaise 1294 days ago
Hi Ulrik,
Then I tested query two, a search for the word BIOTRAC, which is in a plain text file in the repository:
Is the word separated from the surrounding text by whitespace before and after it?
As you can see, nothing was found. I fiddled around with the query because I am uncertain about the syntax, and tested other variations, e.g.
Oh, sorry, I should have mentioned that more clearly:
See http://redmine.soebes.de/wiki/supose/RepositoryQuestions; there you will find a link to the Lucene page where the query syntax is described in detail.
with no hits. I also tested with the word "The". I have no clue: either the syntax is wrong (where can I look up the syntax?), or is it something else?
It seems to me that this is a bug.
Kind regards
Karl Heinz Marbaise
RE: Search in contents - Added by Ulrik Kautsky 1294 days ago
Hi
I tested further once I had at least become familiar with the query syntax. I was surprised to discover that the search works fine when the word is entered in lowercase, but not in uppercase as it is written in the document. I also noticed that another word, "Press", at the beginning of a sentence was not found unless I entered "press". OK, I thought, only lowercase works, but strangely I could find "Tortoise*" as well as "tortoise*". I am a little confused. Anyhow, I am impressed by its ability to find a lot of words in my checked-in documents (doc, xls, pdf).
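One possible explanation for this pattern, not confirmed anywhere in this thread, is that the indexer lowercases every token while plain query terms are looked up exactly as typed and only wildcard terms are lowercased first. The toy index below is a minimal sketch of that assumption. It is not SupoSE or Lucene code, and the names tokenize, build_index, term_search and prefix_search are made up for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split on non-word characters and lowercase every token (assumed indexing step)."""
    return [token.lower() for token in re.findall(r"\w+", text)]


def build_index(docs):
    """Map each lowercased term to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for term in tokenize(text):
            index[term].add(name)
    return index


def term_search(index, term):
    """Exact lookup: the query term is NOT lowercased (assumed behaviour)."""
    return sorted(index.get(term, set()))


def prefix_search(index, pattern):
    """Trailing-* wildcard: the prefix IS lowercased before matching (assumed behaviour)."""
    prefix = pattern.rstrip("*").lower()
    hits = set()
    for term, names in index.items():
        if term.startswith(prefix):
            hits |= names
    return sorted(hits)


# Hypothetical one-file repository.
docs = {"notes.txt": "Press the BIOTRAC button in TortoiseSVN"}
index = build_index(docs)

print(term_search(index, "BIOTRAC"))      # []
print(term_search(index, "biotrac"))      # ['notes.txt']
print(prefix_search(index, "Tortoise*"))  # ['notes.txt']
```

If that assumption holds, typing plain terms in lowercase is the practical workaround, which matches what I saw with "press" and "biotrac".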
- 1. Swedish, like many other languages, has special letters, e.g. å, ä, ö, which are encoded differently on different platforms; maybe there is a standard solution for that.
- 2. Zip files contain a lot of information; I have not tested whether SupoSE is able to find that.
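On point 1, one standard approach for text that is already Unicode is to normalize it to a single form (for example NFC) both when indexing and when querying. The sketch below only illustrates the idea with Python's standard unicodedata module. Whether SupoSE does anything like this is not stated in this thread, and normalize_term is a hypothetical helper name.

```python
import unicodedata


def normalize_term(text):
    """Bring text to composed form (NFC) and lowercase it, so equivalent spellings compare equal."""
    return unicodedata.normalize("NFC", text).lower()


composed = "\u00e5"     # å stored as a single code point
decomposed = "a\u030a"  # a followed by a combining ring above

print(composed == decomposed)                                  # False
print(normalize_term(composed) == normalize_term(decomposed))  # True
```

Normalization does not help with files stored in legacy code pages; those would first have to be decoded with the right encoding.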
Ulrik