2.2.1. Search configuration

2.2.1.1. Global search index
2.2.1.2. Indexing tuning

Search is a core function of JCR, so it is worth knowing how to configure it. Before going deeper into JCR Search, review the XML configuration file and its parameters described below.

XML Configuration

The JCR index is configured in the repository-configuration.xml file, which can be found in various locations.


<repository-service default-repository="db1">
  <repositories>
    <repository name="db1" system-workspace="ws" default-workspace="ws">
      ....
      <workspaces>
        <workspace name="ws">
          ....
          <query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
            <properties>
              <property name="index-dir" value="${java.io.tmpdir}/temp/index/db1/ws" />
              <property name="synonymprovider-class" value="org.exoplatform.services.jcr.impl.core.query.lucene.PropertiesSynonymProvider" />
              <property name="synonymprovider-config-path" value="/synonyms.properties" />
              <property name="indexing-configuration-path" value="/indexing-configuration.xml" />
              <property name="query-class" value="org.exoplatform.services.jcr.impl.core.query.QueryImpl" />
            </properties>
          </query-handler>
          ...
        </workspace>
      </workspaces>
    </repository>
  </repositories>
</repository-service>

Configuration parameters

The JCR index configuration supports the following parameters:

| Parameter | Default | Description | Since |
|---|---|---|---|
| index-dir | none | The location of the index directory. This parameter is mandatory. | 1.0 |
| use-compoundfile | true | Advises Lucene to use compound files for the index files. | 1.9 |
| min-merge-docs | 100 | Minimum number of nodes in an index until segments are merged. | 1.9 |
| volatile-idle-time | 3 | Idle time in seconds until the volatile index part is moved to a persistent index, even though min-merge-docs is not reached. | 1.9 |
| max-merge-docs | Integer.MAX_VALUE | Maximum number of nodes in segments that will be merged. | 1.9 |
| merge-factor | 10 | Determines how often segment indices are merged. | 1.9 |
| max-field-length | 10000 | The maximum number of words per property that are fulltext indexed. | 1.9 |
| cache-size | 1000 | Size of the document number cache. This cache maps UUIDs to Lucene document numbers. | 1.9 |
| force-consistencycheck | false | Runs a consistency check on every startup. If false, a consistency check is only performed when the search index detects a prior forced shutdown. | 1.9 |
| auto-repair | true | Errors detected by a consistency check are automatically repaired. If false, errors are only written to the log. | 1.9 |
| query-class | QueryImpl | Class name that implements the javax.jcr.query.Query interface. This class must also extend the org.exoplatform.services.jcr.impl.core.query.AbstractQueryImpl class. | 1.9 |
| document-order | true | If set to true and the query does not contain an 'order by' clause, result nodes are returned in document order. For better performance when queries return many nodes, set this parameter to false. | 1.9 |
| result-fetch-size | Integer.MAX_VALUE | The number of results fetched when a query is executed. | 1.9 |
| excerptprovider-class | DefaultXMLExcerpt | The name of the class that implements org.exoplatform.services.jcr.impl.core.query.lucene.ExcerptProvider and should be used for the rep:excerpt() function in a query. | 1.9 |
| support-highlighting | false | If set to true, additional information is stored in the index to support highlighting using the rep:excerpt() function. | 1.9 |
| synonymprovider-class | none | The name of a class that implements org.exoplatform.services.jcr.impl.core.query.lucene.SynonymProvider. Not set by default. | 1.9 |
| synonymprovider-config-path | none | The path to the synonym provider configuration file. This path is interpreted relative to the path parameter. If there is a path element inside the SearchIndex element, this path is interpreted relative to the root path of that path element. Whether this parameter is mandatory depends on the synonym provider implementation. Not set by default. | |
| indexing-configuration-path | none | The path to the indexing configuration file. | 1.9 |
| indexing-configuration-class | IndexingConfigurationImpl | The name of the class that implements org.exoplatform.services.jcr.impl.core.query.lucene.IndexingConfiguration. | 1.9 |
| force-consistencycheck | false | If set to true, a consistency check is performed, depending on the forceConsistencyCheck parameter. If set to false, no consistency check is performed on startup, even if a redo log had been applied. | 1.9 |
| spellchecker-class | none | The name of a class that implements org.exoplatform.services.jcr.impl.core.query.lucene.SpellChecker. | 1.9 |
| spellchecker-more-popular | true | If set to true, the spellchecker returns only suggestions that are as frequent as or more frequent than the checked word. If set to false, the spellchecker returns null if the checked word exists in the dictionary; otherwise it returns the closest suggestion. | 1.10 |
| spellchecker-min-distance | 0.55f | Minimal distance between the checked word and the proposed suggestion. | 1.10 |
| errorlog-size | 50 (Kb) | The default size of the error log file in Kb. | 1.9 |
| upgrade-index | false | Allows JCR to convert an existing index into the new format. To run the automatic migration, start JCR with -Dupgrade-index=true. The old index format is then converted into the new index format, which is used from then on. On the next start, this option is no longer needed. Because the old index is replaced and cannot be converted back, back up the index before the conversion. | 1.12 |
| analyzer | org.apache.lucene.analysis.standard.StandardAnalyzer | Class name of a Lucene analyzer to use for fulltext indexing of text. | 1.12 |
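
As a rough illustration of how the parameters above are applied, the following sketch adds a few of them to the query-handler element from the XML configuration shown earlier. The parameter names come from the list above; the values chosen here (20000, 100, and so on) are assumptions for demonstration only, not recommended settings.

```xml
<!-- Illustrative sketch only: all values except index-dir are assumed, not recommended. -->
<query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
  <properties>
    <!-- Mandatory: location of the index directory -->
    <property name="index-dir" value="${java.io.tmpdir}/temp/index/db1/ws" />
    <!-- Skip document ordering for queries that have no 'order by' clause -->
    <property name="document-order" value="false" />
    <!-- Store additional information in the index so rep:excerpt() can highlight matches -->
    <property name="support-highlighting" value="true" />
    <!-- Fulltext-index at most 20000 words per property (the default is 10000) -->
    <property name="max-field-length" value="20000" />
    <!-- Number of results for a query (the default is Integer.MAX_VALUE) -->
    <property name="result-fetch-size" value="100" />
  </properties>
</query-handler>
```

Any parameter that is omitted keeps the default listed above, so only index-dir and the values you want to override need to appear.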

Note

The maximum number of clauses permitted per BooleanQuery can be changed via the org.apache.lucene.maxClauseCount system property. The default value of this property is Integer.MAX_VALUE.

Copyright ©. All rights reserved. eXo Platform SAS