It is also called "Excerpt" (see Excerpt configuration in the Search Configuration section and in the Searching Repository).
The goal of this query is to find the words "eXo" and "implementation" with a full-text search and to highlight these words in the result value.
Highlighting is not enabled by default, so you must enable it in jcr-config.xml and also define an excerpt provider:
<query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
<properties>
...
<property name="support-highlighting" value="true" />
<property name="excerptprovider-class" value="org.exoplatform.services.jcr.impl.core.query.lucene.WeightedHTMLExcerpt"/>
...
</properties>
</query-handler>
Also, remember that you can define indexing rules, as in the example below.
Write a rule for all nodes with the 'nt:unstructured' primary node type where the 'rule' property equals the "excerpt" string. For those nodes, exclude the "title" property from highlighting and make the "text" property highlightable. The indexing-configuration.xml file must contain the following rule:
<index-rule nodeType="nt:unstructured" condition="@rule='excerpt'">
<property useInExcerpt="false">title</property>
<property>text</property>
</index-rule>
You have a single node with the 'nt:unstructured' primary type.
document (nt:unstructured)
rule = "excerpt"
title = "eXoJCR"
text = "eXo is a JCR implementation"
SQL
// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT rep:excerpt() FROM nt:unstructured WHERE CONTAINS(*, 'eXo implementation')";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();
XPath
// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*,nt:unstructured)[jcr:contains(., 'eXo implementation')]/rep:excerpt(.)";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();
Now, look at the result table:
String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
Row row = rit.nextRow();
// get values of the row
Value[] values = row.getValues();
}
The table content is:
rep:excerpt() | jcr:path | jcr:score |
---|---|---|
<div><span><strong>eXo</strong> is a JCR <strong>implementation</strong></span></div> | /testroot/node1 | 335 |
As you can see, the words "eXo" and "implementation" are highlighted.
You can also get the "rep:excerpt" value directly:
RowIterator rows = result.getRows();
Value excerpt = rows.nextRow().getValue("rep:excerpt(.)");
// excerpt will be equal to "<div><span><strong>eXo</strong> is a JCR <strong>implementation</strong></span></div>"
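Once the excerpt string is available, client code often needs the highlighted terms themselves. The following is a minimal sketch using only the JDK, assuming the excerpt provider wraps each hit in `<strong>` tags exactly as in the sample value above. The class and method names are hypothetical and not part of the eXo JCR API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ExcerptTerms {
    // Assumption: highlighted terms are wrapped in <strong>...</strong>,
    // as in the sample excerpt value of this example.
    private static final Pattern HIGHLIGHT = Pattern.compile("<strong>(.*?)</strong>");

    /** Returns the highlighted terms in the order they appear in the excerpt. */
    static List<String> highlightedTerms(String excerptHtml) {
        List<String> terms = new ArrayList<>();
        Matcher matcher = HIGHLIGHT.matcher(excerptHtml);
        while (matcher.find()) {
            terms.add(matcher.group(1));
        }
        return terms;
    }

    public static void main(String[] args) {
        String excerpt = "<div><span><strong>eXo</strong> is a JCR "
            + "<strong>implementation</strong></span></div>";
        System.out.println(highlightedTerms(excerpt)); // prints [eXo, implementation]
    }
}
```

In a real client, the input string would come from `excerpt.getString()` on the `Value` fetched above.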
In this example, you will set different boost values for predefined nodes and check the effect by selecting those nodes and ordering them by jcr:score.
The default boost value is 1.0. Higher boost values (a reasonable range is 1.0 - 5.0) will yield a higher score value and appear as more relevant.
See Search configuration.
In the indexing-config.xml file, set boost values for the 'text' property of nt:unstructured nodes.
<!--
This rule actually does nothing. The 'text' property keeps the default boost value.
-->
<index-rule nodeType="nt:unstructured" condition="@rule='boost1'">
<!-- default boost: 1.0 -->
<property>text</property>
</index-rule>
<!--
Set the boost value to 2.0 for the 'text' property of nt:unstructured nodes where the 'rule' property equals 'boost2'
-->
<index-rule nodeType="nt:unstructured" condition="@rule='boost2'">
<!-- boost: 2.0 -->
<property boost="2.0">text</property>
</index-rule>
<!--
Set the boost value to 3.0 for the 'text' property of nt:unstructured nodes where the 'rule' property equals 'boost3'
-->
<index-rule nodeType="nt:unstructured" condition="@rule='boost3'">
<!-- boost: 3.0 -->
<property boost="3.0">text</property>
</index-rule>
The repository contains many nodes with the "nt:unstructured" primary type. Each node contains the 'text' property and the 'rule' property with different values.
root
node1(nt:unstructured) rule='boost1' text='The quick brown fox jump...'
node2(nt:unstructured) rule='boost2' text='The quick brown fox jump...'
node3(nt:unstructured) rule='boost3' text='The quick brown fox jump...'
SQL
// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM nt:unstructured WHERE CONTAINS(text, 'quick') ORDER BY jcr:score() DESC";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();
XPath
// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*,nt:unstructured)[jcr:contains(@text, 'quick')] order by @jcr:score descending";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();
Let's get nodes:
NodeIterator it = result.getNodes();
while (it.hasNext())
{
   Node foundNode = it.nextNode();
}
The NodeIterator will return the nodes in the following order: "node3", "node2", "node1".
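To see why this order is expected, consider a deliberately simplified model in which the three nodes get the same unboosted score (their 'text' values are identical) and the final score is that value multiplied by the configured boost. This is a toy sketch, not the actual Lucene scoring formula, and the `Hit` class and the base score of 1.0 are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class BoostOrderingSketch {
    /** A search hit with a hypothetical unboosted score and the boost from its index rule. */
    static final class Hit {
        final String name;
        final double baseScore;
        final double boost;

        Hit(String name, double baseScore, double boost) {
            this.name = name;
            this.baseScore = baseScore;
            this.boost = boost;
        }

        // Toy model: the boost simply scales the score.
        double score() {
            return baseScore * boost;
        }
    }

    /** Returns the hit names sorted by score, highest first. */
    static List<String> orderByScoreDescending(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort((a, b) -> Double.compare(b.score(), a.score()));
        List<String> names = new ArrayList<>();
        for (Hit hit : sorted) {
            names.add(hit.name);
        }
        return names;
    }

    /** The three sample nodes with boosts 1.0, 2.0 and 3.0. */
    static List<Hit> sampleHits() {
        double base = 1.0; // assumed identical, since the text is the same
        return List.of(
            new Hit("node1", base, 1.0),
            new Hit("node2", base, 2.0),
            new Hit("node3", base, 3.0));
    }

    public static void main(String[] args) {
        System.out.println(orderByScoreDescending(sampleHits())); // prints [node3, node2, node1]
    }
}
```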
This example excludes the 'text' property of some nt:unstructured nodes from the node scope index. Therefore, such a node will not be found by the content of this property, even if it satisfies all other constraints.
First of all, add rules to the indexing-configuration.xml file:
<index-rule nodeType="nt:unstructured" condition="@rule='nsiTrue'">
<!-- default value for nodeScopeIndex is true -->
<property>text</property>
</index-rule>
<index-rule nodeType="nt:unstructured" condition="@rule='nsiFalse'">
<!-- do not include text in node scope index -->
<property nodeScopeIndex="false">text</property>
</index-rule>
The repository contains "nt:unstructured" nodes with the same 'text' property and different 'rule' properties (even null).
root
node1 (nt:unstructured) rule="nsiTrue" text="The quick brown fox ..."
node2 (nt:unstructured) rule="nsiFalse" text="The quick brown fox ..."
node3 (nt:unstructured) text="The quick brown fox ..." // as you can see, this node is not mentioned in indexing-configuration
SQL
// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM nt:unstructured WHERE CONTAINS(*,'quick')";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();
XPath
// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*,nt:unstructured)[jcr:contains(., 'quick')]";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();
Get nodes:
NodeIterator it = result.getNodes();
while (it.hasNext())
{
   Node foundNode = it.nextNode();
}
The NodeIterator will return "node1" and "node3". As you can see, "node2" is not in the result set.
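The outcome can be reproduced with a small in-memory model of the two rules. This is only a sketch of the rule semantics described in this example, assuming that a node matching no rule keeps the default (its 'text' property stays in the node scope index). `SampleNode` is a hypothetical stand-in, not a `javax.jcr.Node`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class NodeScopeIndexSketch {
    /** Minimal stand-in for a repository node. */
    static final class SampleNode {
        final String name;
        final String rule; // null when the node has no 'rule' property
        final String text;

        SampleNode(String name, String rule, String text) {
            this.name = name;
            this.rule = rule;
            this.text = text;
        }
    }

    // Mirrors the two index rules: value of 'rule' -> nodeScopeIndex flag of 'text'.
    private static final Map<String, Boolean> TEXT_IN_NODE_SCOPE =
        Map.of("nsiTrue", true, "nsiFalse", false);

    /** Nodes that match no rule are assumed to keep the default, which is true. */
    static boolean textInNodeScope(SampleNode node) {
        if (node.rule == null) {
            return true;
        }
        return TEXT_IN_NODE_SCOPE.getOrDefault(node.rule, true);
    }

    /** Emulates CONTAINS(*, word): only text in the node scope index can match. */
    static List<String> findByNodeScope(List<SampleNode> nodes, String word) {
        List<String> names = new ArrayList<>();
        for (SampleNode node : nodes) {
            if (textInNodeScope(node) && node.text.toLowerCase().contains(word.toLowerCase())) {
                names.add(node.name);
            }
        }
        return names;
    }

    static List<SampleNode> sampleNodes() {
        String text = "The quick brown fox ...";
        return List.of(
            new SampleNode("node1", "nsiTrue", text),
            new SampleNode("node2", "nsiFalse", text),
            new SampleNode("node3", null, text));
    }

    public static void main(String[] args) {
        System.out.println(findByNodeScope(sampleNodes(), "quick")); // prints [node1, node3]
    }
}
```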
Also, you can get a table:
String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
Row row = rit.nextRow();
// get values of the row
Value[] values = row.getValues();
}
Table content is:
This example explains how to configure indexing so that all properties of nt:unstructured nodes are excluded from search, except properties whose names end with the 'Text' string. First of all, add rules to the indexing-configuration.xml file:
<index-rule nodeType="nt:unstructured">
<property isRegexp="true">.*Text</property>
</index-rule>
Now, check this rule with a simple query that selects all nodes with the 'nt:unstructured' primary type that contain the 'quick' string (a full-text search over the whole node).
The repository contains "nt:unstructured" nodes with different 'text'-like named properties.
root
node1 (nt:unstructured) Text="The quick brown fox ..."
node2 (nt:unstructured) OtherText="The quick brown fox ..."
node3 (nt:unstructured) Textle="The quick brown fox ..."
SQL
// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM nt:unstructured WHERE CONTAINS(*,'quick')";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();
XPath
// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*,nt:unstructured)[jcr:contains(., 'quick')]";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();
Get nodes:
NodeIterator it = result.getNodes();
while (it.hasNext())
{
   Node foundNode = it.nextNode();
}
The NodeIterator will return "node1" and "node2". As you can see, "node3" is not in the result set.
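The result follows from how the `.*Text` pattern behaves on the three property names. The sketch below checks this with `java.util.regex`, assuming the rule is matched against the entire property name (which is consistent with "Textle" being excluded). The class name is hypothetical.

```java
import java.util.regex.Pattern;

final class PropertyNameRule {
    // The same pattern as in the index rule of this example.
    private static final Pattern RULE = Pattern.compile(".*Text");

    /** True when the whole property name matches the rule, so the name must end with "Text". */
    static boolean isIndexed(String propertyName) {
        return RULE.matcher(propertyName).matches();
    }

    public static void main(String[] args) {
        for (String name : new String[] {"Text", "OtherText", "Textle"}) {
            System.out.println(name + " -> " + isIndexed(name));
        }
        // prints:
        // Text -> true
        // OtherText -> true
        // Textle -> false
    }
}
```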
Also, you can get a table:
String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
Row row = rit.nextRow();
// get values of the row
Value[] values = row.getValues();
}
Table content is:
Find all mix:title nodes whose title contains synonyms of the word 'fast'.
See also the synonym provider configuration in Searching for repository content.
The synonym provider must be configured in the indexing-configuration.xml file:
<query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
<properties>
...
<property name="synonymprovider-class" value="org.exoplatform.services.jcr.impl.core.query.lucene.PropertiesSynonymProvider" />
<property name="synonymprovider-config-path" value="../../synonyms.properties" />
...
</properties>
</query-handler>
The synonyms.properties file contains the following synonyms list:
ASF=Apache Software Foundation
quick=fast
sluggish=lazy
The repository contains mix:title nodes, where jcr:title has different values.
root
document1 (mix:title) jcr:title="The quick brown fox jumps over the lazy dog."
SQL
// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM mix:title WHERE CONTAINS(jcr:title, '~fast')";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();
XPath
// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*,mix:title)[jcr:contains(@jcr:title, '~fast')]";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();
Get nodes:
NodeIterator it = result.getNodes();
while (it.hasNext())
{
   Node foundNode = it.nextNode();
}
The NodeIterator will return the expected document1. This is the purpose of synonym providers: you search by a specified word, and nodes containing any of its synonyms are returned.
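The lookup can be illustrated with plain `java.util.Properties`. This is a minimal sketch, not the `PropertiesSynonymProvider` implementation; it assumes each entry applies in both directions, which is what the result above implies (searching for '~fast' finds a title containing "quick"). The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

final class SynonymLookupSketch {
    /** Parses properties-style text into a lower-cased, two-way synonym map. */
    static Map<String, Set<String>> load(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, Set<String>> synonyms = new HashMap<>();
        for (String key : props.stringPropertyNames()) {
            String value = props.getProperty(key);
            add(synonyms, key, value); // quick -> fast
            add(synonyms, value, key); // fast -> quick (assumed two-way)
        }
        return synonyms;
    }

    private static void add(Map<String, Set<String>> map, String term, String synonym) {
        map.computeIfAbsent(term.toLowerCase(), k -> new TreeSet<>()).add(synonym.toLowerCase());
    }

    /** Returns the known synonyms of a term, or an empty set. */
    static Set<String> synonymsOf(Map<String, Set<String>> synonyms, String term) {
        return synonyms.getOrDefault(term.toLowerCase(), Set.of());
    }

    public static void main(String[] args) {
        String text = "ASF=Apache Software Foundation\nquick=fast\nsluggish=lazy\n";
        Map<String, Set<String>> synonyms = load(text);
        System.out.println(synonymsOf(synonyms, "fast")); // prints [quick]
    }
}
```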
Check the correct spelling of the phrase 'quik OR (-foo bar)' according to the data already stored in the index.
See also SpellChecker configuration in Searching for repository content.
The SpellChecker must be set in the query-handler configuration.
See the test-jcr-config.xml file below:
<query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
<properties>
...
<property name="spellchecker-class" value="org.exoplatform.services.jcr.impl.core.query.lucene.spell.LuceneSpellChecker$FiveSecondsRefreshInterval" />
...
</properties>
</query-handler>
The repository contains a node with the "The quick brown fox jumps over the lazy dog" string property.
root
node1 property="The quick brown fox jumps over the lazy dog."
The query looks for the root node only, because the spell checker looks for suggestions in the full index. A more complicated query is therefore redundant.
SQL
// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT rep:spellcheck() FROM nt:base WHERE jcr:path = '/' AND SPELLCHECK('quik OR (-foo bar)')";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();
XPath
// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "/jcr:root[rep:spellcheck('quik OR (-foo bar)')]/(rep:spellcheck())";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();
Get suggestion of the correct spelling as follows:
RowIterator rows = result.getRows();
Row r = rows.nextRow();
Value v = r.getValue("rep:spellcheck()");
String correctPhrase = v.getString();
So, the correct spelling for the phrase "quik OR (-foo bar)" is "quick OR (-fox bar)".
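The suggestion can be understood through a toy model: each lower-case word is replaced by the closest indexed word if one lies within a single edit, and everything else (operators, parentheses, words with no close match) is left alone. The sketch below is not the LuceneSpellChecker algorithm. The vocabulary is assumed to be the words of the sample property, and the one-edit threshold is an arbitrary choice for illustration.

```java
import java.util.List;
import java.util.regex.Pattern;

final class SpellSuggestSketch {
    // Assumption: the index contains exactly the words of the sample property.
    static final List<String> VOCABULARY =
        List.of("the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog");

    /** Levenshtein edit distance computed with two rolling rows. */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Returns the closest vocabulary word within one edit, else the word unchanged. */
    static String suggest(String word) {
        String best = word;
        int bestDistance = 2; // accept only distance 0 or 1
        for (String candidate : VOCABULARY) {
            int d = distance(word, candidate);
            if (d < bestDistance) {
                best = candidate;
                bestDistance = d;
            }
        }
        return best;
    }

    /** Corrects lower-case words only; OR, '-' and parentheses are left untouched. */
    static String correct(String phrase) {
        return Pattern.compile("[a-z]+").matcher(phrase).replaceAll(m -> suggest(m.group()));
    }

    public static void main(String[] args) {
        System.out.println(correct("quik OR (-foo bar)")); // prints quick OR (-fox bar)
    }
}
```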
Find nodes similar to the node at the '/baseFile/jcr:content' path.
In this example, the baseFile node contains text in which the word "terms" occurs many times. For this reason, the presence of this word is used as a criterion of node similarity (for the baseFile node).
See also the similarity configuration in Searching for repository content.
Highlighting support must be added to the test-jcr-config.xml configuration file:
<query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
<properties>
...
<property name="support-highlighting" value="true" />
...
</properties>
</query-handler>
The repository contains many "nt:file" nodes:
root
baseFile (nt:file)
jcr:content (nt:resource) jcr:data="Similarity is determined by looking up terms that are common to nodes. There are some conditions that must be met for a term to be considered. This is required to limit the number possibly relevant terms. Only terms with at least 4 characters are considered. Only terms that occur at least 2 times in the source node are considered. Only terms that occur in at least 5 nodes are considered."
target1 (nt:file)
jcr:content (nt:resource) jcr:data="Similarity is determined by looking up terms that are common to nodes."
target2 (nt:file)
jcr:content (nt:resource) jcr:data="There is no you know what"
target3 (nt:file)
jcr:content (nt:resource) jcr:data=" Terms occur here"
SQL
// make SQL query
QueryManager queryManager = workspace.getQueryManager();
// create query
String sqlStatement = "SELECT * FROM nt:resource WHERE SIMILAR(.,'/baseFile/jcr:content')";
Query query = queryManager.createQuery(sqlStatement, Query.SQL);
// execute query and fetch result
QueryResult result = query.execute();
XPath
// make XPath query
QueryManager queryManager = workspace.getQueryManager();
// create query
String xpathStatement = "//element(*, nt:resource)[rep:similar(., '/baseFile/jcr:content')]";
Query query = queryManager.createQuery(xpathStatement, Query.XPATH);
// execute query and fetch result
QueryResult result = query.execute();
Let's get nodes:
NodeIterator it = result.getNodes();
while (it.hasNext())
{
   Node foundNode = it.nextNode();
}
The NodeIterator will return "/baseFile/jcr:content", "/target1/jcr:content" and "/target3/jcr:content".
As you can see, the base node is also in the result set.
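The sample text stored in baseFile lists conditions a term must meet to be considered. The sketch below applies the first two of them (at least 4 characters, at least 2 occurrences in the source node) to that text, assuming those conditions describe how candidate terms are selected. The third condition (occurrence in at least 5 nodes) needs the whole index and is left out. The tokenization and class name are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class SimilarityTermsSketch {
    static final String BASE_TEXT =
        "Similarity is determined by looking up terms that are common to nodes. "
        + "There are some conditions that must be met for a term to be considered. "
        + "This is required to limit the number possibly relevant terms. "
        + "Only terms with at least 4 characters are considered. "
        + "Only terms that occur at least 2 times in the source node are considered. "
        + "Only terms that occur in at least 5 nodes are considered.";

    /** Returns terms with at least 4 characters that occur at least twice in the text. */
    static Set<String> candidateTerms(String sourceText) {
        Map<String, Integer> counts = new HashMap<>();
        // Naive tokenization: lower-case, split on anything that is not a letter or digit.
        for (String token : sourceText.toLowerCase().split("[^a-z0-9]+")) {
            if (token.length() >= 4) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        Set<String> candidates = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() >= 2) {
                candidates.add(entry.getKey());
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        Set<String> candidates = candidateTerms(BASE_TEXT);
        System.out.println(candidates.contains("terms"));      // prints true
        System.out.println(candidates.contains("similarity")); // prints false
    }
}
```

Under this model, "terms" qualifies as a candidate, which matches the expectation that target1 and target3 (both containing "terms") are returned and target2 is not.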
You can also get a table:
String[] columnNames = result.getColumnNames();
RowIterator rit = result.getRows();
while (rit.hasNext())
{
Row row = rit.nextRow();
// get values of the row
Value[] values = row.getValues();
}
The table content is:
jcr:path | ... | jcr:score |
---|---|---|
/baseFile/jcr:content | ... | 2674 |
/target1/jcr:content | ... | 2674 |
/target3/jcr:content | ... | 2674 |