Documentum, Docbrokers and NAT

This note is for my future reference. Each time I set up docbroker translation rules I get confused about how it should be done correctly. This note will help me do it quickly in the future 🙂 Perhaps it will also be useful for others…

So, first of all, if you want to have a docbase behind NAT you need two docbrokers. The first one is an internal one which serves all requests from the internal network (behind the firewall/NAT). The second docbroker serves all external requests coming from, for example, VirtualBox/VMware host clients or other computers on the external network.

This means that the external docbroker needs some extra translation configuration.

Configuration for the external docbroker will be:

[DOCBROKER_CONFIGURATION]
host=10.0.2.15
port=1491

[TRANSLATION]
port=<CS service port>=<CS service port>
host=<external IP>=10.0.2.15

where <external IP> is the IP of the host on which VirtualBox/VMware is running (or the public IP). The port number is the port on which the docbase is listening (the CS service port; you can find it by checking /etc/services on Linux hosts).
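As a quick illustration of the layout above, here is a minimal Python sketch that renders the external docbroker configuration from its four inputs. The function name, the external IP 192.168.1.50 and the service port 10000 are made-up values for the example, not something Documentum provides.

```python
def render_external_docbroker_ini(internal_ip, external_ip, docbroker_port, cs_service_port):
    """Return the text of the external docbroker configuration with NAT translation."""
    lines = [
        "[DOCBROKER_CONFIGURATION]",
        "host=%s" % internal_ip,
        "port=%d" % docbroker_port,
        "",
        "[TRANSLATION]",
        # The CS service port is the same on both sides of the NAT in this setup.
        "port=%d=%d" % (cs_service_port, cs_service_port),
        # Same order as in the note above: <external IP>=<internal IP>.
        "host=%s=%s" % (external_ip, internal_ip),
    ]
    return "\n".join(lines) + "\n"


# Hypothetical values: external IP and CS service port are assumptions.
print(render_external_docbroker_ini("10.0.2.15", "192.168.1.50", 1491, 10000))
```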

The internal docbroker has a standard configuration; the only difference is that it must (obviously) run on a different port than the external docbroker.

[DOCBROKER_CONFIGURATION]
host=10.0.2.15
port=1489

Extremely important bit: each docbroker needs TWO consecutive ports. For example, if the internal docbroker is on port 1489 it also occupies 1490, so the external docbroker CANNOT be configured on 1490; the next free port is 1491. If you get this wrong you will see an error like:


Documentum Docbroker: dispatcher failed (104) on line 1070 -- non-fatal

This is described in http://support.emc.com/kb/110976
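The two-ports rule is easy to check mechanically. The sketch below assumes, as described above, that a docbroker configured on port N also occupies N+1; the function name is made up for the example.

```python
def docbroker_port_conflicts(ports):
    """Return pairs of docbroker ports whose two-port ranges overlap."""
    conflicts = []
    ordered = sorted(ports)
    for index, first in enumerate(ordered):
        for second in ordered[index + 1:]:
            # Each docbroker occupies its configured port and the next one,
            # so two docbrokers need at least a distance of 2 between them.
            if second - first < 2:
                conflicts.append((first, second))
    return conflicts


print(docbroker_port_conflicts([1489, 1490]))  # [(1489, 1490)]
print(docbroker_port_conflicts([1489, 1491]))  # []
```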

It is also important to remember about configuration of the projection targets in server.ini. Both docbrokers must be defined like this:


[DOCBROKER_PROJECTION_TARGET]
host = server
port = 1489
[DOCBROKER_PROJECTION_TARGET_1]
host = server
port = 1491

When configuring a docbase on a VirtualBox VM some additional NAT configuration is needed. Usually a VirtualBox guest has two interfaces, localhost and the NAT one (ethX, with an IP such as 10.0.2.15). It is important to configure Content Server and the docbrokers to listen on the NAT interface. To do this for Content Server, edit server.ini and add “host=10.0.2.15”.

The last piece of configuration is the port forwarding:
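With VirtualBox, NAT port forwarding can be set up from the command line with VBoxManage modifyvm and its --natpf option. The Python sketch below only builds the commands. The VM name and the list of ports are assumptions: here I forward the two ports of the external docbroker plus a hypothetical CS service port pair, so adjust them to your own setup.

```python
def natpf_commands(vm_name, ports, nic=1):
    """Build VBoxManage commands forwarding each host TCP port to the same guest port."""
    commands = []
    for port in ports:
        # Rule format: <name>,<protocol>,<host ip>,<host port>,<guest ip>,<guest port>
        # Empty host/guest IPs mean "any host interface" and "the guest's NAT address".
        rule = "documentum-%d,tcp,,%d,,%d" % (port, port, port)
        commands.append('VBoxManage modifyvm "%s" --natpf%d "%s"' % (vm_name, nic, rule))
    return commands


# Hypothetical VM name and ports: 1491/1492 for the external docbroker,
# 10000/10001 as an assumed CS service port pair.
for command in natpf_commands("documentum-vm", [1491, 1492, 10000, 10001]):
    print(command)
```

Note that modifyvm only works while the VM is powered off.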

After setting it up, restart the system and hopefully everything should work fine 🙂

 

cmis-in-batch & Documentum 6.7

I have just pushed a changeset which enables Documentum 6.7 support in cmis-in-batch. Generally speaking, OpenCMIS supports Documentum 6.7 out of the box, but cmis-in-batch required some additional modifications because of the way Documentum handles paths to objects. Documentum allows multiple objects with the same name in a single folder, so it is impossible to uniquely address an object using a typical path. To solve this problem Documentum uses different path segments; each object name is represented in the following form:

with name: 0900000b80001234_docname
without name: 0900000b80001234_
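To make the segment format concrete, here is a small Python sketch that builds and splits such path segments. The helper names are made up, and it assumes that the part before the first underscore is the object ID and that the ID itself never contains an underscore.

```python
def cmis_path_segment(object_id, name=""):
    """Build a path segment in the '<object id>_<name>' form shown above."""
    return "%s_%s" % (object_id, name)


def split_path_segment(segment):
    """Split a path segment into (object id, name); the name may be empty."""
    # Split on the first underscore only, since object names may contain underscores.
    object_id, _, name = segment.partition("_")
    return object_id, name


print(cmis_path_segment("0900000b80001234", "docname"))
print(split_path_segment("0900000b80001234_"))
```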

Another limitation of Documentum 6.7 CMIS is that it exposes only the standard CMIS object types (document, folder and relationship) and not custom ones, so the fail-when-type function will not work as expected.

You can get the latest source code using Mercurial like this:


hg clone https://cmis-in-batch.googlecode.com/hg/ cmis-in-batch 

cmis-in-batch

I have just uploaded an initial version of a tool which tries to simplify data import and manipulation in CMIS-enabled repositories.

More details as well as source code (under Apache License 2.0) are available here:
https://code.google.com/p/cmis-in-batch/

It is still a work in progress, but it works, at least for now, with Alfresco. I have just finished installing Documentum 6.7 with its newly added CMIS support and I am going to test it as soon as possible.

Groovy is great!

It is not only a great mix of useful and convenient language constructs from Python, Ruby and other languages, but it also turned out to be a very good code obfuscator 🙂 Here is an example. The code below:


    public static IDfSession getSession(String user,
        String password, String docbase) throws DfException {

        IDfClientX clientx = new DfClientX();
        IDfClient client = clientx.getLocalClient();
        IDfLoginInfo li = new DfLoginInfo();
        li.setUser(user);
        li.setPassword(password);
        return client.newSession(docbase, li);
    }

was translated to:


    public static IDfSession getSession(String user, String password, String docbase)
        throws DfException
    {
        CallSite acallsite[] = $getCallSiteArray();
        com.documentum.com.IDfClientX clientx = ((com.documentum.com.IDfClientX) (acallsite[0].callConstructor($get$$class$com$documentum$com$DfClientX())));
        IDfClient client = (IDfClient)ScriptBytecodeAdapter.castToType(acallsite[1].call(clientx), $get$$class$com$documentum$fc$client$IDfClient());
        com.documentum.fc.common.IDfLoginInfo li = ((com.documentum.fc.common.IDfLoginInfo) (acallsite[2].callConstructor($get$$class$com$documentum$fc$common$DfLoginInfo())));
        acallsite[3].call(li, user);
        acallsite[4].call(li, password);
        return (IDfSession)ScriptBytecodeAdapter.castToType(acallsite[5].call(client, docbase, li), $get$$class$com$documentum$fc$client$IDfSession());
    }

Awesome 🙂

Data importer for CMIS repositories

When developing Stamper I missed a tool which would allow me to easily import some files into a repository. Alfresco provides import/export functionality through ACP files which can be used for that purpose, but what about other repositories? What about Nuxeo or Documentum? Documentum has Composer/Application Builder, and Nuxeo perhaps also has its own mechanisms, but I wanted one tool which would work with all those repositories without the need to create a separate installation package for each system.

Fortunately there is the CMIS standard, which can make this happen.

CMISetuper (working name :-)) will be a tool which will be able to:

  • connect to any repository which supports CMIS. This will be possible thanks to the OpenCMIS library.
  • import files
  • create folders
  • validate presence of types, objects, folders
  • modify content (replace, version etc.)
  • link and unlink objects from folders

Those actions will be described by a declarative language named SDL – Simple Declarative Language. Below is a fragment of a script:


execute {
    import-files "/myFiles" {
        "image1.jpg" cm_name="Some title" metasys_taken_on_date=2009/10/05
        "image2.jpg" metasys_color=false metasys_taken_on_date=2011/01/05
    }

    import-file "/Documents" "someotherfile.doc"
    delete-file "/Old document.doc" all-versions="true"

    replace-content "/Pictures/picture.jpg" "/tmp/newContent.jpg"
    update-properties "/Documents/someotherfile.doc" cm_title="another title"
    link-to-folder "/myHouse.jpg" "/myFiles"
    unlink-from-folder "myHouse.jpg" "/myFiles"
}

Basically the tool will execute scripts in a few stages:

  • pre-validate stage – this stage can check presence of types, files, folders etc. in repository and fail when they are present (or not).
  • prepare stage – at this stage some preparations can take place, for example creation of folders for which objects will be imported.
  • execute stage – this is the main stage which takes care of actual data importing, linking etc.
  • validation stage – the last stage which checks whether everything was imported correctly

Some global settings will be available which will control overwrite mode as well as whether to stop on errors.
I also plan to implement a rollback mode, but perhaps not in the first version 🙂
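The staged execution with a stop-on-error setting can be sketched in a few lines of Python. This is only an illustration of the idea, not the tool's actual code: the stage labels, the run_script function and the dictionary of actions are all made up for the example.

```python
# Stage labels are assumptions based on the list of stages above.
STAGES = ["pre-validate", "prepare", "execute", "validate"]


def run_script(actions_by_stage, stop_on_error=True):
    """Run callables grouped by stage, in stage order; return a list of (stage, message) errors."""
    errors = []
    for stage in STAGES:
        for action in actions_by_stage.get(stage, []):
            try:
                action()
            except Exception as exc:
                errors.append((stage, str(exc)))
                if stop_on_error:
                    # Abort the whole run on the first failure.
                    return errors
    return errors


def check_type_present():
    # Stand-in for a real validation against the repository.
    raise ValueError("type metasys_picture is missing")


imported = []
script = {
    "pre-validate": [check_type_present],
    "execute": [lambda: imported.append("image1.jpg")],
}
print(run_script(script))                       # stops before the execute stage
print(run_script(script, stop_on_error=False))  # records the error and carries on
```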

At this moment I would say that around 60% of the above functionality is implemented, and I hope to have the rest ready by the end of this week, so stay tuned for updates!

If you have some new ideas please share them with me.

Disabling TBO (BOF v2) in Documentum 5.3+

Sometimes during development it is useful to disable a Type-Based Object (TBO). In Documentum releases before 5.3, when the Business Object Framework was at version 1, it was easy: all that had to be done was to comment out the TBO definition line in $DOCUMENTUM/config/dbor.properties. From then on the TBO was disabled in that particular DFC instance.

In BOF version 2 things got more complicated because dbor.properties is no longer used (or at least no longer recommended); instead, all BOF objects are defined in the repository as dmc_module object instances. There is an easy way to disable a TBO globally, by renaming the dmc_module object in /System/Modules/TBO, but that disables the TBO for all users, and sometimes, for example when troubleshooting on a production system, it is desirable to disable the TBO on one client only.

In order to disable a TBO on a single client machine we have to use a trick. Each DFC instance has its own BOF cache, which is usually in $DOCUMENTUM/cache. We have to locate the dmc_jar implementing the TBO and replace it with a dummy one which does not contain any business logic. How to locate the right dmc_jar? Well, it requires some investigation: you can either use DQL to look for the r_object_id of the dmc_jar, or just check the files in the cache folder and look for the JAR which contains the class implementing the given TBO.
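Searching the cache folder by hand gets tedious, so here is a small Python sketch which walks a folder and lists every archive containing a given class. The function name is made up, and the class name in the usage part is just the one from the example in this post. It checks file content rather than extension, because I am not assuming anything about how the cached files are named.

```python
import os
import zipfile


def find_jars_with_class(cache_dir, class_name):
    """Return sorted paths of JAR (zip) files under cache_dir that contain class_name."""
    entry = class_name.replace(".", "/") + ".class"
    matches = []
    for root, _dirs, files in os.walk(cache_dir):
        for file_name in files:
            path = os.path.join(root, file_name)
            # Skip anything that is not a zip archive, whatever its extension is.
            if not zipfile.is_zipfile(path):
                continue
            with zipfile.ZipFile(path) as jar:
                if entry in jar.namelist():
                    matches.append(path)
    return sorted(matches)


if __name__ == "__main__":
    cache = os.path.expandvars("$DOCUMENTUM/cache")
    for jar_path in find_jars_with_class(cache, "com.documentum.cvp.common.CvpDocument"):
        print(jar_path)
```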

The next step is to prepare a dummy TBO. A dummy TBO is almost the same in all cases; what changes is the class name and the package name. Different TBOs sometimes implement different interfaces (for example, IDfDynamicInheritance is not a required one), so in the worst case you can look it up in the sources (or decompile the original TBO JAR file) and see how the class was declared.

Typical Java code looks like this:


package com.documentum.cvp.common;

import com.documentum.fc.client.DfDocument;
import com.documentum.fc.client.IDfBusinessObject;
import com.documentum.fc.common.DfException;
import com.documentum.fc.common.IDfDynamicInheritance;

public class CvpDocument extends DfDocument
    implements IDfBusinessObject, IDfDynamicInheritance
{

    public CvpDocument()
    {
    }

    public String getVersion()
    {
        return "1.0";
    }

    public String getVendorString()
    {
        return "Documentum";
    }

    public boolean isCompatible(String version)
    {
        return version.compareTo("1.0") == 0;
    }

    public boolean supportsFeature(String feature)
    {
        return false;
    }
}

When dealing with DfFolder just change DfDocument to DfFolder and it should be fine.

After compiling the Java class and replacing the old JAR in the cache, the client application has to be restarted. It is also worth mentioning that the cache is periodically refreshed, so it is a good idea to check that our dummy TBO implementation has not been overwritten by the real one. From my experience I can say that this happens very rarely.

I have tested this approach and I can confirm that it works; it saved me a lot of trouble during data migration between systems using FirstDoc.
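One simple way to check whether the dummy JAR is still in place is to compare checksums. Below is a minimal Python sketch; the function names are made up and both paths are whatever you used when replacing the cached file.

```python
import hashlib


def file_sha256(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dummy_still_in_place(cached_jar, dummy_jar):
    """True if the JAR in the BOF cache is byte-for-byte identical to our dummy JAR."""
    return file_sha256(cached_jar) == file_sha256(dummy_jar)
```

If this returns False after a while, the cache was refreshed and the real TBO is back.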

Writing Mavenized JUnit tests in Alfresco.

Writing JUnit tests in Alfresco projects using Maven is not very straightforward. The main problem is the lack of a “parent POM” which would gather all Alfresco dependencies into one convenient package. I will show you how to create such a parent POM as well as how to create a simple JUnit test project.

Prerequisites are:

  • Alfresco (I have used 3.3g)
  • Python
  • Maven (of course :))

We will use dependencies from the Alfresco WAR file. Perhaps the right way is to get them from the Alfresco SDK, but I haven’t experienced any problems with the JARs from alfresco.war, and it is also easier to build pom.xml from alfresco.war, so this is what I recommend.

So, to create our parent pom.xml with all Alfresco dependencies we will need a small script:


import os
import sys

# Folder with the Alfresco JARs (WEB-INF/lib of alfresco.war)
lib_dir = sys.argv[1]
# Only JAR files can be installed as Maven artifacts
files = sorted(f for f in os.listdir(lib_dir) if f.endswith(".jar"))

groupId = "alfresco-sdk"
version = "3.3g"
classifier = "community"

print("""
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.metasys</groupId>
<artifactId>alfresco-sdk-parent</artifactId>
<version>1.0-SNAPSHOT</version>
<packaging>pom</packaging>
<name>alfresco-sdk-parent</name>
<dependencies>
""")

for file in files:
    # Strip the ".jar" extension to get the artifact id
    artifactId = file[:-4]
    c = "mvn install:install-file -Dfile=" + os.path.join(lib_dir, file) + \
        " -DgroupId=" + groupId + \
        " -DartifactId=" + artifactId + \
        " -Dversion=" + version + \
        " -Dpackaging=jar -Dclassifier=" + classifier + \
        " -DgeneratePom=true -DcreateChecksum=true"
    s = "\t<dependency>\n\t<scope>test</scope>\n\t<groupId>" + groupId + \
        "</groupId>\n\t<artifactId>" + artifactId + \
        "</artifactId>\n\t<version>" + version + \
        "</version>\n\t<classifier>" + classifier + "</classifier>\n\t</dependency>\n"
    print(s)
    # With 'install' as the second argument, also install the JAR in the local repository
    if len(sys.argv) > 2 and sys.argv[2] == 'install':
        os.system(c)

print("""
</dependencies>
</project>
""")

There are some hardcoded values in the script. Change them if you want, for example if you’re working with Alfresco 3.4 then change the version variable.
We will invoke it like this:


mkdir alfresco-parent-pom
cd alfresco-parent-pom
python alfresco-2-maven.py <Path to Alfresco's WEB-INF/lib folder> >pom.xml

Now install the generated pom.xml (it is already in the newly created alfresco-parent-pom folder) using:


mvn install

At this moment we have a Maven project ready to use, but we still don’t have the dependencies in the Maven repository. To install them run the script again, this time with the ‘install’ parameter:


python alfresco-2-maven.py <Path to Alfresco's WEB-INF/lib folder> install

This will take some time to finish but eventually you will end up with a local repository with all Alfresco dependencies.

Now, let’s move on to a real project with some JUnit tests. Our test will have to start an Alfresco repository, and each Alfresco repository needs configuration in addition to a database and a filestore. You can find the Alfresco configuration in the alfresco/WEB-INF/classes/alfresco folder. To start the tests we need either that folder on our CLASSPATH or a JAR file with the configuration files. I prefer the second solution, so to prepare the configuration artifact follow these steps:


cd Alfresco/tomcat/webapps/alfresco/WEB-INF/classes
zip -r /tmp/config.jar .
mvn install:install-file -Dfile=/tmp/config.jar -DgroupId=alfresco-sdk -DartifactId=config -Dversion=3.3g -Dpackaging=jar -Dclassifier=community -DgeneratePom=true -DcreateChecksum=true

You can clean up config.jar before installing it in the local Maven repository. If you want to have multiple configurations, just modify the classifier parameter or simply use a different version.

Now, finally, we can move on to our test class. We start with the pom.xml.


<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.metasys</groupId>
    <artifactId>test-suite</artifactId>
    <version>1.0-SNAPSHOT</version>
    <packaging>jar</packaging>
    <name>test-suite</name>

    <parent>
        <groupId>com.metasys</groupId>
        <artifactId>alfresco-sdk-parent</artifactId>
        <version>1.0-SNAPSHOT</version>
    </parent>
    <build>
        <resources>
            <resource>
                <filtering>false</filtering>
                <directory>src/test/resources</directory>
            </resource>
        </resources>
        <plugins>
            <plugin>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>2.3.2</version>
                <configuration>
                    <source>1.5</source>
                    <target>1.5</target>
                </configuration>
            </plugin>
        </plugins>
    </build>
    <dependencies>
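        <dependency>
            <scope>test</scope>
            <groupId>alfresco-sdk</groupId>
            <artifactId>config</artifactId>
            <version>3.3g</version>
            <!-- Must match the classifier used when config.jar was installed -->
            <classifier>community</classifier>
        </dependency>
        <dependency>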
            <groupId>javax</groupId>
            <artifactId>servlet</artifactId>
            <version>1.2</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>5.1.6</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>3.8.1</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
</project>

As you can see, apart from referencing the parent POM I have also added the MySQL JDBC driver and servlet-api as dependencies. These two JARs are present in the Alfresco SDK but not in alfresco.war, and that’s why they were not picked up by the Python script. Of course there is also the Alfresco configuration dependency we created before.

Now, since we have the pom.xml let’s move on to creating the Java class, we will put it in a package named com.metasys.tests, so let’s create the folder structure first:


mkdir -p JUnitTest/src/test/java/com/metasys/tests
mkdir -p JUnitTest/src/test/resources/alfresco/extension
mkdir -p JUnitTest/src/test/resources/alfresco/desktop

Change folder to src/test/java/com/metasys/tests and create a new Java class:


package com.metasys.tests;

import org.alfresco.model.ContentModel;
import org.alfresco.repo.security.authentication.AuthenticationComponent;
import org.alfresco.service.ServiceRegistry;
import org.alfresco.service.cmr.action.ActionService;
import org.alfresco.service.cmr.repository.ContentWriter;
import org.alfresco.service.cmr.repository.NodeRef;
import org.alfresco.service.cmr.repository.StoreRef;
import org.alfresco.service.cmr.search.ResultSet;
import org.alfresco.service.cmr.search.SearchService;
import org.alfresco.util.BaseAlfrescoTestCase;
import org.junit.Test;

public class SimpleTest extends BaseAlfrescoTestCase {

    protected NodeRef companyHomeRef;
    protected NodeRef rootFolderTestRef;

    @Override
    protected void setUp() throws Exception {
        setUpContext();

        this.serviceRegistry = (ServiceRegistry) ctx.getBean(ServiceRegistry.SERVICE_REGISTRY);
        this.nodeService = serviceRegistry.getNodeService();
        this.contentService = serviceRegistry.getContentService();
        this.authenticationComponent = (AuthenticationComponent) ctx.getBean("authenticationComponent");
        this.actionService = (ActionService) ctx.getBean("actionService");
        this.transactionService = serviceRegistry.getTransactionService();

        authenticationComponent.setCurrentUser("admin");
        SearchService searchService = this.serviceRegistry.getSearchService();
        ResultSet rs = searchService.query(StoreRef.STORE_REF_WORKSPACE_SPACESSTORE, SearchService.LANGUAGE_XPATH,
                "/app:company_home");
        if (rs.length() != 1) {
            fail("Could not find company home");
        }
        companyHomeRef = rs.getNodeRef(0);
    }

    @Test
    public void testSimpleWatermarking() throws Throwable {
        rootFolderTestRef = serviceRegistry.getNodeService().createNode(
                companyHomeRef, ContentModel.ASSOC_CONTAINS,
                ContentModel.TYPE_FOLDER, ContentModel.TYPE_FOLDER).getChildRef();
        // Use JUnit assertions: plain Java asserts are skipped when the JVM runs without -ea
        assertNotNull(rootFolderTestRef);
        nodeService.setProperty(rootFolderTestRef, ContentModel.PROP_NAME, "TestObject");
        assertEquals("TestObject", nodeService.getProperty(rootFolderTestRef, ContentModel.PROP_NAME));
    }
}

We will also need bootstrap configuration, normally these files are somewhere in Tomcat folder tree, since we’re not using Tomcat for starting up the repository we will have to find a place for them.

The first file is dev-context.xml; it has to be saved in src/test/resources/alfresco/extension:


<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE beans PUBLIC '-//SPRING//DTD BEAN//EN' 'http://www.springframework.org/dtd/spring-beans.dtd'>

<beans>
    <bean id="global-properties" class="org.alfresco.config.JndiPropertiesFactoryBean">
        <property name="locations">
            <list>
                <value>classpath:alfresco/repository.properties</value>
                <value>classpath:alfresco/domain/transaction.properties</value>
                <!-- <value>classpath:alfresco/jndi.properties</value> -->
                <!--  Overrides supplied by modules -->
                <value>classpath*:alfresco/module/*/alfresco-global.properties</value>
                <!--  Installer or user-provided defaults -->
                <value>classpath*:alfresco-global.properties</value>
                <value>classpath:alfresco/extension/dev.properties</value>
            </list>
        </property>
        <property name="systemPropertiesModeName">
            <value>SYSTEM_PROPERTIES_MODE_OVERRIDE</value>
        </property>
        <!-- Extra properties that have no defaults that we allow to be defined through JNDI or System properties -->
        <property name="systemProperties">
            <list>
                <value>hibernate.dialect</value>
                <value>hibernate.query.substitutions</value>
                <value>hibernate.jdbc.use_get_generated_keys</value>
                <value>hibernate.default_schema</value>
            </list>
        </property>
    </bean>
</beans>

The next one is dev.properties; save it to the same folder as dev-context.xml:


dir.root=/home/kbryd/AlfrescoT2/alf_data
index.recovery.mode=AUTO
integrity.failOnError=true
db.name=alfrescoT2
db.username=alfrescoT2
db.password=alfrescoT2
db.host=localhost
db.port=3306
db.driver=org.gjt.mm.mysql.Driver
db.url=jdbc:mysql://${db.host}:${db.port}/${db.name}
hibernate.dialect=org.hibernate.dialect.MySQLInnoDBDialect

Obviously change the database name, username, password and path to the Alfresco filestore. The last (really!) missing bit is a package of Alfresco desktop files required during startup, just copy them from alfresco/WEB-INF/classes/alfresco/desktop to your project’s src/test/resources/alfresco/desktop folder.

Now, go to the root folder of the test project and use Maven to start it:


mvn test

It will start up the repository and then execute tests, you should see something like this:


Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 39.947 sec

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

[INFO] [jar:jar {execution: default-jar}]
[INFO] Building jar: /tmp/test-project/target/test-suite-1.0-SNAPSHOT.jar
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESSFUL
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 44 seconds
[INFO] Finished at: Sat Feb 26 14:45:26 CET 2011
[INFO] Final Memory: 30M/258M
[INFO] ------------------------------------------------------------------------

One closing note. You may wonder why I am not calling super.setUp() in the Java class. The only reason is that the default implementation of setUp() creates a separate store for each test. I don’t like this default behavior, and that’s why I am calling only setUpContext() and then looking up all required beans in my test class.

I hope that this is useful, please email me or leave comments if you have any questions!

Log4j madness.

Some things will never change, like log4j configuration madness. I have lost so much time trying to fix log4j configuration in not-so-trivial setups (involving Documentum’s JBoss server or Alfresco running in Tomcat) that it is almost impossible to believe I had never stumbled on this extremely useful advice. So guys, remember this: if you ever have a problem with log4j, add this JVM option:

-Dlog4j.debug=true

This will tell you, first of all, where log4j.properties is located, and what the appenders, categories and so on are. Now working with log4j is a piece of cake 😉