2.2.7.3. Content Addressable Value Storage (CAS) support

JCR supports content-addressable storage (CAS) for storing Values.

Note

Content-addressable storage, also referred to as associative storage and abbreviated CAS, is a mechanism for storing information that can be retrieved based on its content, not its storage location. It is typically used for high-speed storage and retrieval of fixed content, such as documents stored for compliance with government regulations.

Content Addressable Value Storage stores each unique content only once. Different properties (Values) with the same content are stored as one data file shared between those Values. In other words, the Value content is shared across Values in the storage and kept in a single physical file.

This reduces storage size for applications that manage potentially duplicated data.

Note

For example, if you have 100 different properties containing the same data (such as a mail attachment), the storage keeps only a single file. That file is shared by all referencing properties.

If a property Value changes, its new content is stored in an additional file. Alternatively, if a file with the same content already exists, the Value shares that file with the other Values pointing to the same content.
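The sharing behavior described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the class `CasDedupSketch` and its methods are hypothetical and are not part of the eXo JCR API, the "files" are byte arrays held in a map rather than files on disk, and MD5 is used simply because it is one of the tested algorithms.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of content-addressed Value storage.
// The real storage keeps files on disk and CAS metadata in a database.
final class CasDedupSketch {
    // content address -> stored content (stands in for one physical file)
    private final Map<String, byte[]> filesByAddress = new HashMap<>();
    // property id -> content address of its current Value
    private final Map<String, String> addressByProperty = new HashMap<>();

    /** Stores the content once per unique address and points the property at it. */
    void write(String propertyId, byte[] content) {
        String address = md5Hex(content);
        filesByAddress.putIfAbsent(address, content.clone());
        addressByProperty.put(propertyId, address);
    }

    /** Returns a copy of the content the property currently points to, or null. */
    byte[] read(String propertyId) {
        byte[] stored = filesByAddress.get(addressByProperty.get(propertyId));
        return stored == null ? null : stored.clone();
    }

    /** Number of distinct "files" held by the storage. */
    int fileCount() {
        return filesByAddress.size();
    }

    private static String md5Hex(byte[] content) {
        try {
            byte[] hash = MessageDigest.getInstance("MD5").digest(content);
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    public static void main(String[] args) {
        CasDedupSketch storage = new CasDedupSketch();
        byte[] attachment = "mail attachment".getBytes(StandardCharsets.UTF_8);
        byte[] edited = "edited attachment".getBytes(StandardCharsets.UTF_8);

        // 100 properties with the same content share one stored file.
        for (int i = 0; i < 100; i++) {
            storage.write("prop-" + i, attachment);
        }
        System.out.println(storage.fileCount()); // prints 1

        // A changed Value is stored in an additional file.
        storage.write("prop-0", edited);
        System.out.println(storage.fileCount()); // prints 2

        // Another Value changed to the same new content shares that file.
        storage.write("prop-1", edited);
        System.out.println(storage.fileCount()); // prints 2
    }
}
```

The model never removes a file once it is stored; how unreferenced files are handled by the real storage is not covered here.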

The storage calculates the Value content address each time the property is changed, so CAS write operations are much more expensive than writes to non-CAS storages.

Content address calculation is based on java.security.MessageDigest hash computation and has been tested with the MD5 and SHA1 algorithms.
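As an illustration of how a content address can be derived with java.security.MessageDigest, here is a minimal sketch. The class `ContentAddressSketch` and the method `addressOf` are hypothetical names chosen for this example, and the hex encoding of the digest is an assumption; the actual address format used by the storage is not described here. Note that the standard Java algorithm name for SHA1 is "SHA-1".

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of the eXo JCR API.
final class ContentAddressSketch {

    /** Returns the lowercase hex digest of the content for the given algorithm. */
    static String addressOf(byte[] content, String algorithm) {
        try {
            MessageDigest digest = MessageDigest.getInstance(algorithm);
            byte[] hash = digest.digest(content);
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                // Mask to 0..255 so negative bytes are rendered as two hex digits.
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("Unsupported digest algorithm: " + algorithm, e);
        }
    }

    public static void main(String[] args) {
        byte[] first = "same attachment".getBytes(StandardCharsets.UTF_8);
        byte[] second = "same attachment".getBytes(StandardCharsets.UTF_8);
        byte[] changed = "changed attachment".getBytes(StandardCharsets.UTF_8);

        // Equal content yields the same address; the whole content must be hashed on each write.
        System.out.println(addressOf(first, "MD5").equals(addressOf(second, "MD5")));   // prints true
        System.out.println(addressOf(first, "MD5").equals(addressOf(changed, "MD5"))); // prints false

        // MD5 produces 16 bytes (32 hex characters), SHA-1 produces 20 bytes (40 hex characters).
        System.out.println(addressOf(first, "MD5").length());   // prints 32
        System.out.println(addressOf(first, "SHA-1").length()); // prints 40
    }
}
```

Because the digest covers every byte of the Value, the cost of computing the address grows with the content size, which is consistent with CAS writes being more expensive than non-CAS writes.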

Note

CAS storage works most efficiently on data that does not change often. For data that changes frequently, CAS is not as efficient as location-based addressing.

CAS support can be enabled for Tree and Simple File Value Storage types.

To enable CAS support, configure it in the JCR repository configuration as you would for other Value Storages.


<workspaces>
    <workspace name="ws">
        <container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
            <properties>
                <property name="source-name" value="jdbcjcr"/>
                <property name="dialect" value="oracle"/>
                <property name="multi-db" value="false"/>
                <property name="update-storage" value="false"/>
                <property name="max-buffer-size" value="200k"/>
                <property name="swap-directory" value="target/temp/swap/ws"/>
            </properties>
            <value-storages>
                <!-- CAS-enabled value storage is configured here -->
                <value-storage id="ws" class="org.exoplatform.services.jcr.impl.storage.value.fs.CASableTreeFileValueStorage">
                    <properties>
                        <property name="path" value="target/temp/values/ws"/>
                        <property name="digest-algo" value="MD5"/> <!-- (1) -->
                        <property name="vcas-type" value="org.exoplatform.services.jcr.impl.storage.value.cas.JDBCValueContentAddressStorageImpl"/> <!-- (2) -->
                        <property name="jdbc-source-name" value="jdbcjcr"/> <!-- (3) -->
                        <property name="jdbc-dialect" value="oracle"/> <!-- (4) -->
                    </properties>
                    <filters>
                        <filter property-type="Binary"/>
                    </filters>
                </value-storage>
            </value-storages>
        </container>
    </workspace>
</workspaces>

(1) digest-algo: the digest hash algorithm (MD5 and SHA1 were tested).

(2) vcas-type: the Value CAS internal data type. A JDBC-backed implementation, org.exoplatform.services.jcr.impl.storage.value.cas.JDBCValueContentAddressStorageImpl, is currently available.

(3) jdbc-source-name: a JDBCValueContentAddressStorageImpl-specific parameter naming the data source of the database used to save CAS metadata. The simplest choice is to use the same data source as the workspace container.

(4) jdbc-dialect: a JDBCValueContentAddressStorageImpl-specific parameter giving the database dialect. The simplest choice is to use the same dialect as the workspace container.

Copyright ©. All rights reserved. eXo Platform SAS