Search is an important function in JCR, so you need to know how to configure the JCR Search tool.
Before going deeper into the JCR Search tool, you need to learn about the XML
configuration file and its parameters, which are described below.
The following is the JCR index configuration in the repository-configuration.xml
file, which can be found in various locations.
<repository-service default-repository="db1">
  <repositories>
    <repository name="db1" system-workspace="ws" default-workspace="ws">
      ....
      <workspaces>
        <workspace name="ws">
          ....
          <query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
            <properties>
              <property name="index-dir" value="${java.io.tmpdir}/temp/index/db1/ws" />
              <property name="synonymprovider-class" value="org.exoplatform.services.jcr.impl.core.query.lucene.PropertiesSynonymProvider" />
              <property name="synonymprovider-config-path" value="/synonyms.properties" />
              <property name="indexing-configuration-path" value="/indexing-configuration.xml" />
              <property name="query-class" value="org.exoplatform.services.jcr.impl.core.query.QueryImpl" />
            </properties>
          </query-handler>
          ...
        </workspace>
      </workspaces>
    </repository>
  </repositories>
</repository-service>
The following are the parameters of the JCR index configuration:
Parameter | Default | Description | Since |
---|---|---|---|
index-dir | none | The location of the index directory. This parameter is mandatory. | 1.0 |
use-compoundfile | true | Advises Lucene to use compound files for the index files. | 1.9 |
min-merge-docs | 100 | Minimum number of nodes in an index until segments are merged. | 1.9 |
volatile-idle-time | 3 | Idle time in seconds until the volatile index part is moved to a persistent index even though minMergeDocs is not reached. | 1.9 |
max-merge-docs | Integer.MAX_VALUE | Maximum number of nodes in segments that will be merged. | 1.9 |
merge-factor | 10 | Determines how often segment indices are merged. | 1.9 |
max-field-length | 10000 | The maximum number of words that are full-text indexed per property. | 1.9 |
cache-size | 1000 | Size of the document number cache. This cache maps UUIDs to Lucene document numbers. | 1.9 |
force-consistencycheck | false | Run a consistency check on every startup. If false, a consistency check is only performed when the search index detects a prior forced shutdown. | 1.9 |
auto-repair | true | Errors detected by a consistency check are automatically repaired. If false, errors are only written to the log. | 1.9 |
query-class | QueryImpl | Class name that implements the javax.jcr.query.Query interface. This class must also extend from the org.exoplatform.services.jcr.impl.core.query.AbstractQueryImpl class. | 1.9 |
document-order | true | If 'true' is set and the query does not contain an 'order by' clause, result nodes will be in 'document order'. For better performance when queries return a lot of nodes, set this parameter to 'false'. | 1.9 |
result-fetch-size | Integer.MAX_VALUE | The number of results fetched when a query is executed. | 1.9 |
excerptprovider-class | DefaultXMLExcerpt | The name of the class that implements org.exoplatform.services.jcr.impl.core.query.lucene.ExcerptProvider and should be used for the rep:excerpt() function in a query. | 1.9 |
support-highlighting | false | If set to true, additional information is stored in the index to support highlighting using the rep:excerpt() function. | 1.9 |
synonymprovider-class | none | The name of a class that implements org.exoplatform.services.jcr.impl.core.query.lucene.SynonymProvider. The default value is null (not set). | 1.9 |
synonymprovider-config-path | none | The path to the synonym provider configuration file. This path is interpreted relative to the path parameter. If there is a path element inside the SearchIndex element, this path is interpreted relative to the root path of that element. Whether this parameter is mandatory depends on the synonym provider implementation. The default value is null. | |
indexing-configuration-path | none | The path to the indexing configuration file. | 1.9 |
indexing-configuration-class | IndexingConfigurationImpl | The name of the class that implements org.exoplatform.services.jcr.impl.core.query.lucene.IndexingConfiguration. | 1.9 |
force-consistencycheck | false | If set to "true", a consistency check is performed, depending on the forceConsistencyCheck parameter. If set to "false", no consistency check is performed on startup, even if a redo log has been applied. | 1.9 |
spellchecker-class | none | The name of a class that implements org.exoplatform.services.jcr.impl.core.query.lucene.SpellChecker. | 1.9 |
spellchecker-more-popular | true | If set to "true", the spellchecker returns only suggested words that are as frequent as or more frequent than the checked word. If set to "false", the spellchecker returns null if the checked word exists in the dictionary; otherwise, it returns the closest suggested word. | 1.10 |
spellchecker-min-distance | 0.55f | Minimal distance between checked word and the proposed suggested word. | 1.10 |
errorlog-size | 50(Kb) | The default size of error log file in Kb. | 1.9 |
upgrade-index | false | Allows JCR to convert an existing index into the new format. To run the automatic migration, start JCR with -Dupgrade-index=true. The old index format is then converted into the new index format, and the new format is used after the conversion. On the next start, this option is no longer needed. Because the old index is replaced and cannot be converted back, back up the index before upgrading. | 1.12 |
analyzer | org.apache.lucene.analysis.standard.StandardAnalyzer | Class name of the Lucene analyzer to use for full-text indexing of text. | 1.12 |
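
As an illustration of how the parameters in the table map onto the configuration file, the fragment below is a minimal sketch of a query-handler element that overrides a few defaults. The parameter names and the handler class come from this section; the chosen values (for example, 20000 for max-field-length) and the index path are assumptions made for the example, not recommended settings.

```xml
<query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
  <properties>
    <!-- Mandatory: location of the index directory (illustrative path) -->
    <property name="index-dir" value="${java.io.tmpdir}/temp/index/db1/ws" />
    <!-- Store additional information so that rep:excerpt() can highlight matches -->
    <property name="support-highlighting" value="true" />
    <!-- Do not sort results in document order when a query has no 'order by' clause -->
    <property name="document-order" value="false" />
    <!-- Full-text index up to 20000 words per property (assumed value; the default is 10000) -->
    <property name="max-field-length" value="20000" />
    <!-- Lucene analyzer used for full-text indexing (this is the documented default) -->
    <property name="analyzer" value="org.apache.lucene.analysis.standard.StandardAnalyzer" />
  </properties>
</query-handler>
```

Any parameter that is left out of the properties element keeps the default listed in the table.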
The maximum number of clauses permitted per BooleanQuery
can be changed via the org.apache.lucene.maxClauseCount
system property.
The default value of this parameter is Integer.MAX_VALUE.