4.3.2. Consistency checker

Warning

It is highly recommended to back up your data before repairing inconsistencies (either automatically or manually). It is also recommended to store the results of queries that check the data consistency. This may be useful for the support team in case of deeper restoration process.

Production and any systems may have faults in some days. They may be caused by hardware and/or software problems, human faults during updates and in many other circumstances. It is important to check integrity and consistency of the system if it is not backed up or stale, or it takes the recovery process much time. The eXo JCR implementation offers an innovative JMX-based complex checking tool. Running inspection, this tool checks every major JCR component, such as persistent data layer and index. The persistent layer includes JDBC Data Container and Value Storage if they are configured. The database is verified using the set of complex specialized domain-specific queries. The Value Storage tool checks the existence and access to each file. Index verification contains two-way pass cycle, existence of each node in the index checks on persistent layer along with opposite direction, when each node from Data Container is validated in the index. Access to the checking tool is exposed via the JMX interface (RepositoryCheckController MBean) with the following operations available:

OperationDescription
checkAll() Inspects the full repository data (database, value storage and search indexes).
checkDataBase() Inspects only the DB.
checkValueStorage() Inspects only the value storage.
checkIndex() Inspects only the search indexes.

Among the list of known inconsistencies described in the next section, see below what can be checked and repaired automatically:

Note

The only inconsistency that cannot be fixed automatically is Corrupted VALUE records. Both STORAGE_DESC and DATA fields contain not null value. Since there is no way to determinate which value is valid: either on the file system or in the database.

The list of ValueStorage inconsistencies which can be checked and repaired automatically:

The following is the list of SearchIndex inconsistencies which can be checked. To repair them, you need to reindex the content completely, what also can be done using JMX:

OperationDescription
repairDataBase()Repairs DB inconsistencies declared above.
repairValueStorage()Repairs value storage inconsistencies declared above.

All tool activities are stored into a file, which can be found in the app directory. The syntax of the file name is report-<repository name>-dd-MMM-yy-HH-mm.txt.

Note

You can use the nThreads JMX parameter to set the number of threads used for checking and repairing repository (the RepositoryCheckController uses a single thread by default).

Warning

When the multi-threaded mode is used, the RepositoryCheckController uses more memories. So, it is recommended to avoid setting a large number of threads.

OperationDescription
getCheckingThreadPoolSize()Gets the number of threads used for checking and repairing the repository.
setCheckingThreadPoolSize()Sets the number of threads used for checking and repairing the repository.

Note

By default, the RepositoryCheckController uses only one thread to check and repair your repository.

Recommendations on how to fix corrupted JCR

Here are examples of corrupted JCR and ways to eliminate them:

Note

  • It is assumed that queries for single and multiple database configurations differ only by the JCR_xITEM table name, otherwise queries will be explicitly introduced.

  • In some examples, you will be asked to replace some identificators with their corresponding values. This basically means that you need to insert values, from each row result of query executed during the issue detection stage, to the corresponding place. Explicit explanation of what to do will be introduced in case replacing is needed to be fulfilled in other way.

  1. Items have no parent

    • To detect this issue, you need to execute the following query:

      select * from JCR_SITEM I where NOT EXISTS(select * from JCR_SITEM P where P.ID = I.PARENT_ID)
    • Fix description: Assign root as parent node to be able to delete this node later if the node is not needed anymore.

    • To fix this problem, do the following:

      • For all query results rows containing items belonging to I_CLASS = 1 (nodes):

        Execute the next query by replacing ${ID} and ${CONTAINER_NAME} with corresponding values:

        • Single DB

          update JCR_SITEM set PARENT_ID='${CONTAINER_NAME}00exo0jcr0root0uuid0000000000000' where ID = '${ID}'
        • Multiple DB

          update JCR_MITEM set PARENT_ID='00exo0jcr0root0uuid0000000000000' where ID = '${ID}'
      • For all query results rows containing items belonging to the I_CLASS = 2 (property):

        delete from JCR_SREF where PROPERTY_ID = '${ID}'
        delete from JCR_SVALUE where PROPERTY_ID = '${ID}'
        delete from JCR_SITEM where PARENT_ID = '${ID}' or ID='${ID}' 
  2. A node has a single valued property with no declaration in the VALUE table.

    • To detect this issue, you need to execute the following query:

      select * from JCR_SITEM P where P.I_CLASS=2 and P.P_MULTIVALUED=0 and NOT EXISTS (select * from JCR_SVALUE V where V.PROPERTY_ID=P.ID)

      Note

      P_MULTIVALUED=0 should be replaced by P_MULTIVALUED='f' for PostgreSQL.

    • Fix description: Simply remove corrupted properties.

    • To fix every row, execute next queries by replacing ${ID} with a corresponding value:

      delete from JCR_SREF where PROPERTY_ID = '${ID}'
      delete from JCR_SITEM where ID = '${ID}'
  3. Nodes have no primary type property.

    • To detect this issue, you need to execute the following query:

      select * from JCR_SITEM N where N.I_CLASS=1 and NOT EXISTS (select * from JCR_SITEM P where P.I_CLASS=2 and P.PARENT_ID=N.ID and P.NAME='[http://www.jcp.org/jcr/1.0]primaryType')
      
    • Fix description: Remove node, all its children, properties, values and reference records.

    • To fix this problem, do the following:

      • Recursively traver to the bottom of the tree until query results are in empty value:

        select * from JCR_SITEM where PARENT_ID='${ID}' and I_CLASS=1

        You will receive a tree structure containing a node, its children and properties.

      • Execute the following steps with tree structure elements in reverse order (from leaves to head).

        Execute a query for tree element's ${ID}.

        select * from JCR_SITEM where PARENT_ID='${ID}'

        Execute queries for each ${ID} received during the query execution mentioned above.

        delete from JCR_SREF where PROPERTY_ID = '${ID}'
        delete from JCR_SVALUE where PROPERTY_ID = '${ID}'
        delete from JCR_SITEM where PARENT_ID = '${ID}' or ID='${ID}' 
  4. All value records have no related property record.

    • To detect this issue, you need to execute the following query:

      select * from JCR_SVALUE V where NOT EXISTS(select * from JCR_SITEM P where V.PROPERTY_ID = P.ID and P.I_CLASS=2)
    • Fix description: Remove these unnecessary records from the JCR_SVALUE table.

    • To fix this problem, execute next queries by replacing ${ID} with a corresponding value as below for every row.

      delete from JCR_SVALUE where ID = '${ID}'
  5. Corrupted VALUE records. Both STORAGE_DESC and STORAGE_DESC fields contain non-null value.

    • To detect this issue, you need to execute the following query:

      select * from JCR_SVALUE where (STORAGE_DESC is not null and DATA is not null)
    • Fix description: Set null for the STORAGE_DESC field by assuming that the value stored in database is valid.

    • To fix this problem, execute next queries by replacing ${ID} with the corresponding value as below for every row.

      update JCR_SVALUE set STORAGE_DESC = null where ID = '${ID}'

    Note

    For Sybase DB, "DATA is not null" must be replaced with "not DATA like null".

  6. Item is its own parent.

    • To detect this issue, you need to execute the following query:

      select * from JCR_SITEM I where I.ID = I.PARENT_ID and I.NAME <> '__root_parent'
    • Fix description: Assign root as parent node to be able to delete later if node is not needed to use anymore.

    • To fix this problem, do the following:

      • For all query results rows containing items belonging to I_CLASS = 1 (nodes):

        Execute the next query by replacing ${ID} and ${CONTAINER_NAME} with corresponding values:

        • Single DB

          update JCR_SITEM set PARENT_ID='${CONTAINER_NAME}00exo0jcr0root0uuid0000000000000' where ID = '${ID}'
        • Multiple DB

          update JCR_MITEM set PARENT_ID='00exo0jcr0root0uuid0000000000000' where ID = '${ID}'
      • For all query results rows containing items belonging to I_CLASS = 2 (property):

        delete from JCR_SREF where PROPERTY_ID = '${ID}'
        delete from JCR_SVALUE where PROPERTY_ID = '${ID}'
        delete from JCR_SITEM where PARENT_ID = '${ID}' or ID='${ID}' 
  7. Several versions of the same item.

    • To detect this issue, you need to execute the following query:

      select * from JCR_SITEM I where EXISTS (select * from JCR_SITEM J WHERE I.CONTAINER_NAME = J.CONTAINER_NAME and I.PARENT_ID = J.PARENT_ID AND I.NAME = J.NAME and I.I_INDEX = J.I_INDEX and I.I_CLASS = J.I_CLASS and I.VERSION != J.VERSION)
    • Fix description: Keep the newest version and remove the others.

    • To fix this problem, do the following:

      • Grouping

        select max(VERSION) as MAX_VERSION, PARENT_ID, NAME, CONTAINER_NAME, I_CLASS, I_INDEX  from JCR_SITEM WHERE I_CLASS=2 GROUP BY PARENT_ID, CONTAINER_NAME, NAME, I_CLASS, I_INDEX HAVING count(VERSION) > 1
      • Execute the following query by replacing ${PARENT_ID} and ${CONTAINER_NAME}, ${NAME}, ${I_CLASS}, ${I_INDEX}, ${MAX_VERSION} with corresponding values contained in results of the query mentioned above:

        • Single DB:

          select * from JCR_SITEM where I.CONTAINER_NAME='${CONTAINER_NAME}' and PARENT_ID='${PARENT_ID}' and NAME='${NAME}' and I_CLASS='${I_CLASS}' and I_INDEX='${I_INDEX}' and VERSION < ${MAX_VERSION}
        • Multiple DB:

          select * from JCR_SITEM where PARENT_ID='${PARENT_ID}' and NAME='${NAME}' and I_CLASS='${I_CLASS}' and I_INDEX='${I_INDEX}' and VERSION < ${MAX_VERSION}

        Execute the following queries by replacing ${ID} with corresponding values of newly obtained results.

        delete from JCR_SREF where PROPERTY_ID = '${ID}'
        delete from JCR_SVALUE where PROPERTY_ID = '${ID}'
        delete from JCR_SITEM where ID='${ID}'
  8. Reference properties without reference records.

    • To detect this issue, you need to execute the following query:

      select * from JCR_SITEM P, JCR_SVALUE V where P.ID = V.PROPERTY_ID and P.P_TYPE=9 and NOT EXISTS (select * from JCR_SREF R where P.ID=R.PROPERTY_ID)
    • Fix description: Remove broken reference properties.

    • To fix this problem, do the following:

      Execute the query replacing ${ID} with a corresponding value.

      delete from JCR_SVALUE where PROPERTY_ID = '${ID}'
      delete from JCR_SITEM where ID = '${ID}'
  9. A node which is considered to be locked in the lockmanager data is not locked according to the JCR data or the opposite situation.

    • To detect this issue, you need:

      First, get all locked nodes IDs in repository, mentioned in the JCR_xITEM table, by executing a query:

      select distinct PARENT_ID from JCR_SITEM where I_CLASS=2 and
                      (NAME='[http://www.jcp.org/jcr/1.0]lockOwner' or NAME='[http://www.jcp.org/jcr/1.0]lockIsDeep')
                    

      Then, compare it to nodes IDs from LockManager's table:

      • JBC

        Note

        • During comparing results, be aware that for single DB configurations, you need to cut off the ID prefix representing the workspace name for results obtained from the JCR_xITEM table.

        • Though a single lock table is usually used for the whole repository, it is possible to configure separate db lock tables for each workspace. In this case, to obtain information over the repository, you need to execute queries for each table.

        • Non shareable

          select fqn from ${LOCK_TABLE} where parent='/$LOCKS'
        • Shareable

          Replace ${REPOSITORY_NAME} with its corresponding value.

          select fqn from ${LOCK_TABLE} where parent like '/${REPOSITORY_NAME}%/$LOCKS/' 
      • ISPN

        Note

        For ISPN lock tables which are defined for each workspace separately, you must execute queries for all lock tables to obtain information over repository.

        To get all set of locked node IDs in the repository, you must execute the following query for each workspace.

        select id from ${LOCK_TABLE}
    • Fix description: Remove inconsistent lock entries and properties. Remove entries in LOCK_TABLE that have no corresponding properties in the JCR_xITEM table and remove the JCR_xITEM properties that have no corresponding entries in the LOCK_TABLE table.

    • To fix this problem, do the following:

      First, remove property values, replace ${ID} with a corresponding node ID:

      delete from JCR_SVALUE where PROPERTY_ID in (select ID from JCR_SITEM where PARENT_ID='${ID}' and (NAME = '[http://www.jcp.org/jcr/1.0]lockIsDeep' or NAME = '[http://www.jcp.org/jcr/1.0]lockOwner'))

      Then, remove property items themselves, replace ${ID} with a corresponding node ID:

      delete from JCR_SITEM where PARENT_ID='${ID}' and (NAME = '[http://www.jcp.org/jcr/1.0]lockIsDeep' or NAME = '[http://www.jcp.org/jcr/1.0]lockOwner')

      Replace ${ID} and ${FQN} with the corresponding node ID and FQN.

      • JBC

        delete from ${LOCK_TABLE} where fqn = '${FQN}'
      • ISPN

        Execute the following query for each workspace:

        delete from ${LOCK_TABLE} where id = '${ID}'
  10. A property's value is stored in the file system, but its content is missing.

    This cannot be checked via simple SQL queries.

Copyright ©. All rights reserved. eXo Platform SAS
blog comments powered byDisqus