Lucene Search Engine: Optimization Techniques

  • The most obvious optimization: always call writer.optimize() after an index update. This merges the index into a single segment, so a search reads one segment instead of opening a reader for every segment.
  • Use a combined search field for all text fields instead of (or in addition to) indexing them separately and searching with a complex query such as field1:query OR field2:query ... OR fieldN:query.
  • Reducing the number of fields makes both indexing and searching much faster. Use a combined field instead of, or in addition to, separate fields where needed.
  • Do not use the compound file format during indexing: writer.setUseCompoundFile(false). If you want to keep the index in a single file, turn it back on when indexing is complete.
  • Do not create a new Searcher and Analyzer for each search. When a Searcher is created, it reads a small dictionary of terms into memory to speed up term lookups, and that takes time. Keep the Searcher open and reuse it for every search. When the index changes, make sure to recreate your Searcher.
  • Increase the heap size of the Java virtual machine (for example, with the -Xmx option).
  • Use quotes ("") only for exact phrase searches. field:term performs much better than field:"term".
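
The advice about reusing a Searcher can be sketched without any Lucene dependency. The following is a minimal illustration, not Lucene API: CachedSearcher is a hypothetical helper, the opener stands in for whatever code constructs your Searcher, and the indexVersion supplier stands in for however your application detects that the index has changed.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical helper (not part of Lucene): keeps one expensive object open
// and recreates it only when the reported index version changes.
final class CachedSearcher<T> {
    private final Supplier<T> opener;        // assumption: stands in for "open a Searcher"
    private final LongSupplier indexVersion; // assumption: stands in for "detect an index change"
    private T current;
    private long openedVersion;
    private int openCount;

    CachedSearcher(Supplier<T> opener, LongSupplier indexVersion) {
        this.opener = opener;
        this.indexVersion = indexVersion;
    }

    // Returns the cached instance, reopening it if the index changed since the last open.
    synchronized T get() {
        long version = indexVersion.getAsLong();
        if (current == null || version != openedVersion) {
            current = opener.get();
            openedVersion = version;
            openCount++;
        }
        return current;
    }

    // Number of times the opener has been invoked so far.
    synchronized int openCount() {
        return openCount;
    }
}

class CachedSearcherDemo {
    public static void main(String[] args) {
        long[] version = {1L};
        CachedSearcher<Object> cache = new CachedSearcher<>(Object::new, () -> version[0]);

        Object first = cache.get();
        Object second = cache.get();  // index unchanged: same instance is reused
        version[0] = 2L;              // simulate an index update
        Object third = cache.get();   // index changed: a new instance is opened

        System.out.println(first == second);   // prints "true"
        System.out.println(first == third);    // prints "false"
        System.out.println(cache.openCount()); // prints "2"
    }
}
```

A real version would also need to close the old Searcher once no in-flight search is still using it; that bookkeeping is left out here to keep the sketch short.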

Merge Factor

Lucene’s IndexWriter has a parameter called Merge Factor. It controls how often segments are merged: once this many segments of similar size have accumulated, the writer merges them into one larger segment. The Lucene documentation claims that if you want to speed up indexing, you should increase the Merge Factor. Though this sounds logical, there is not much to gain in practice. The drawback of a large Merge Factor is that more segments exist at any one time, which can lead to too many open files. Keeping it at the default of 10, or raising it to 20, turns out to be the most reasonable choice.
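
To see why a large Merge Factor leads to more files, here is a rough model of level-based merging. It is a simplified sketch, not Lucene's actual merge code: MergeFactorModel and segmentsAfter are hypothetical names, every flush is assumed to produce one equally sized segment, and a merge is assumed to happen as soon as mergeFactor segments exist at the same level.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of level-based segment merging (an assumption for
// illustration, not Lucene's real merge policy implementation).
final class MergeFactorModel {

    // Returns how many segments remain after `flushes` new segments were written,
    // merging whenever `mergeFactor` segments accumulate at the same level.
    static int segmentsAfter(int flushes, int mergeFactor) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be >= 2");
        }
        List<Integer> perLevel = new ArrayList<>(); // perLevel.get(i) = segments at level i
        for (int f = 0; f < flushes; f++) {
            int level = 0;
            while (true) {
                if (perLevel.size() <= level) {
                    perLevel.add(0);
                }
                int count = perLevel.get(level) + 1;
                if (count < mergeFactor) {
                    perLevel.set(level, count);
                    break;
                }
                // mergeFactor segments reached: they merge into one segment at the next level
                perLevel.set(level, 0);
                level++;
            }
        }
        int total = 0;
        for (int c : perLevel) {
            total += c;
        }
        return total;
    }

    public static void main(String[] args) {
        for (int mergeFactor : new int[] {2, 10, 100}) {
            System.out.println("mergeFactor=" + mergeFactor + " -> "
                    + segmentsAfter(999, mergeFactor) + " segments");
        }
        // prints 8, 27 and 108 segments respectively
    }
}
```

In this model, 999 flushes leave 27 segments with a Merge Factor of 10 but 108 with a Merge Factor of 100. Each segment consists of several files when the compound format is off, so the number of open files grows correspondingly.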