search - Lucene (.NET) Document structure and performance suggestions

I am indexing around 100M documents, each consisting of a few string identifiers and a hundred or so numeric terms. I won't be doing range queries, so I haven't dug very far into numeric fields, but I'm thinking they are not the right choice here.

My problem is that query performance degrades when I start adding OR criteria to my query. All my queries are on specific numeric terms. So a document looks like StringField:[someString] plus N DataField:[someNumber] entries. I then query it with something like DataField:((+1 +(2 3)) (+75 +(3 52 52)) (+99 +88 +(102 155 199)))

At present, these queries take about 7 to 16 seconds to run on my laptop. I want to make sure that's really the best they can do. I'm open to suggestions on field structure and query structure :-).

thanks

Josh

PS: I have already read the other Lucene performance discussions on here, and the material on the Lucene wiki and at Lucid Imagination. I'm a little further down the rabbit hole than that...

I would suggest taking a look at the really fast numeric range queries in Lucene 3.0, but since you mentioned that you are querying specific numbers and not ranges, I won't suggest that.

Based on your description, I think scoring is causing the problem. When you have many nested boolean queries, scoring gets complicated, and the floating-point arithmetic it involves is slow. If you don't care about scores, writing a custom Collector is a good idea. You can look at the example in the Javadoc I linked to in order to write your own.
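To make the scoring point concrete, here is a minimal sketch in plain Java. It is a toy model and not Lucene's API: the class NoScoreMatch, its methods, and the sample documents are all made up for illustration. It assumes each numeric term maps to a bit set of matching document IDs, so required (+) clauses become intersections and optional clauses become unions, with no floating-point work at all. The query shape follows two of the OR groups from the question.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/** Toy model (hypothetical, not Lucene): boolean matching with no scoring. */
class NoScoreMatch {
    // term value -> set of document IDs that contain it
    private final Map<Integer, BitSet> postings = new HashMap<>();

    /** Index one document with its numeric terms. */
    void add(int docId, int... values) {
        for (int v : values) {
            postings.computeIfAbsent(v, k -> new BitSet()).set(docId);
        }
    }

    /** Copy of the posting list for a term (empty if the term is unknown). */
    BitSet term(int value) {
        BitSet b = postings.get(value);
        return b == null ? new BitSet() : (BitSet) b.clone();
    }

    /** All clauses required: intersect the sets. */
    static BitSet and(BitSet first, BitSet... rest) {
        BitSet out = (BitSet) first.clone();
        for (BitSet s : rest) {
            out.and(s);
        }
        return out;
    }

    /** Any clause may match: union the sets. */
    static BitSet or(BitSet... sets) {
        BitSet out = new BitSet();
        for (BitSet s : sets) {
            out.or(s);
        }
        return out;
    }

    /** Evaluates (+1 +(2 3)) (+99 +88 +(102 155 199)) over five sample docs. */
    static BitSet demo() {
        NoScoreMatch idx = new NoScoreMatch();
        idx.add(0, 1, 2);               // matches the first group
        idx.add(1, 1, 7);               // has 1 but neither 2 nor 3
        idx.add(2, 99, 88, 155);        // matches the second group
        idx.add(3, 99, 155);            // missing the required 88
        idx.add(4, 1, 3, 99, 88, 199);  // matches both groups

        BitSet g1 = and(idx.term(1), or(idx.term(2), idx.term(3)));
        BitSet g2 = and(idx.term(99), idx.term(88),
                or(idx.term(102), idx.term(155), idx.term(199)));
        return or(g1, g2);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints {0, 2, 4}
    }
}
```

In Lucene itself the analogous move would be to collect matching doc IDs into a bit set and ignore the scorer, rather than to reimplement matching like this; the sketch only shows why pure matching is cheaper than matching plus scoring.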
