cmis-in-batch released with data generation mode

I have just pushed a proper 1.0 release of the cmis-in-batch tool to GitHub and Bintray:

https://github.com/karolbe/cmis-in-batch

https://bintray.com/karolbe/metasys/cmis-in-batch

Here is a quote from the README.md:

Data generation is a useful feature that allows bulk importing of test documents into a CMIS-compatible repository. Additionally, it can populate the metadata of documents with values coming from predefined dictionaries.

A sample script for generating thousands of documents can look like the one below.

Here is a brief description of what the script does:

* It loads three dictionaries from the files /tmp/disciplines, /tmp/types and /tmp/subtypes. The dictionaries are plain text files with one value per line. The Cartesian product of the dictionary values is then calculated, so, for example, given three dictionaries:

1. level1A, level1B
2. level2A, level2B
3. level3A, level3B

the following combinations will be generated:


[level1A, level2A, level3A]
[level1A, level2A, level3B]
[level1A, level2B, level3A]
[level1A, level2B, level3B]
[level1B, level2A, level3A]
[level1B, level2A, level3B]
[level1B, level2B, level3A]
[level1B, level2B, level3B]

* It imports each file found in the content-path location (“/media/kbryd/Media/work/sample_data/department”) into the repository location defined by linking-rule: /Repository/${discipline}/static/${doctype}/sub/${docsubtype}. Each ${} variable is replaced by a value coming from the corresponding dictionary.
* naming-rule defines what the object name should be. It can use variables from the dictionaries plus a few additional ones: ${file_name}, ${file_size}, ${file_path}, ${file_ext}, ${file_mime}.
* mapping defines how the metadata of each document is populated, e.g. in this case the discipline attribute will be populated with the value of the discipline variable.


generate-random-data "set1" {
   doc-type "cara_document"
   linking-rule "/Repository/${discipline}/static/${doctype}/sub/${docsubtype}"
   naming-rule "${file_name} - ${doctype}"
   content-path "/media/kbryd/Media/work/sample_data/department"

   mapping {
      discipline {
         "${discipline}"
      }
      doc_type {
         "${doctype}"
      }
      doc_subtype {
         "${docsubtype}"
      }
   }

   dictionaries {
      discipline "/tmp/disciplines"
      doctype "/tmp/types"
      docsubtype "/tmp/subtypes"
   }
}
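
To make the expansion concrete, here is a minimal Python sketch of how dictionary values could be combined and substituted into the linking rule from the script above. It is only an illustration, not the tool's actual implementation; the function name expand_rule and the in-memory sample dictionaries are assumptions made for this example.

```python
from itertools import product
from string import Template


def expand_rule(rule, dictionaries):
    """Return one expanded string per combination of dictionary values.

    `dictionaries` maps a variable name to its list of values; in the tool
    these values would come from newline-separated text files.
    """
    names = list(dictionaries)
    template = Template(rule)
    results = []
    # product() yields the Cartesian product with the last dictionary
    # varying fastest, the same order as the listing in the description.
    for combination in product(*(dictionaries[name] for name in names)):
        results.append(template.substitute(dict(zip(names, combination))))
    return results


# Hypothetical dictionary contents, mirroring the level1/level2/level3 example.
sample = {
    "discipline": ["level1A", "level1B"],
    "doctype": ["level2A", "level2B"],
    "docsubtype": ["level3A", "level3B"],
}
paths = expand_rule(
    "/Repository/${discipline}/static/${doctype}/sub/${docsubtype}", sample
)
print(len(paths))  # 8
print(paths[0])    # /Repository/level1A/static/level2A/sub/level3A
```

With three dictionaries of two values each, every imported file ends up linked under eight folder paths, so the number of generated documents grows quickly with the size of the dictionaries.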

And that’s all! Have fun using it! 🙂

Simple CMIS Export tool

It is hard to believe that there was no basic (well, even extremely basic) tool that would allow exporting folders and documents from a CMIS repository (Alfresco in my case) to a file system in a hassle-free way. Thanks to the OpenCMIS library, writing such a tool took around one hour, and here is the result:

https://github.com/karolbe/cmis-export-tool

Currently the tool accepts the following arguments:

usage: com.metasys.CMISExportTool
 -h                           Print help for this application
 -f <arg>                     Destination folder location
 -u <arg>                     User login
 -p <arg>                     Password
 -levels <number of levels>   Number of levels
 -s,--starting-path <arg>     Start path

The -levels argument is not supported yet, but it will be soon.

This tool can be very useful when used together with cmis-upload-maven-plugin (https://github.com/karolbe/cmis-upload-maven-plugin) for writing unit tests. For example, you can export the files (e.g. configuration) that your unit tests require from a repository, put them in your AMP project and then use cmis-upload-maven-plugin to upload them automatically to the test repository that is started while your unit tests run (mvn test).

So here is an example. Let’s assume that you have a project that needs some configuration in the repository; let’s name it ‘stamper’. To make the bootstrap process more convenient, you need to add the following section to your pom.xml file:


        <profile>
            <id>upload-config</id>
            <activation>
                <property>
                    <name>action</name>
                    <value>upload-config</value>
                </property>
            </activation>

            <build>
                <plugins>
                    <plugin>
                        <groupId>com.metasys</groupId>
                        <artifactId>cmis-upload-maven-plugin</artifactId>
                        <version>1.0-SNAPSHOT</version>
                        <configuration>
                            <localPath>${project.build.directory}/stamper-resources</localPath>
                            <destPath>/</destPath>
                            <overwrite>true</overwrite>
                            <username>admin</username>
                            <password>admin</password>
                            <url>http://localhost:8080/alfresco/cmisatom</url>
                        </configuration>
                        <executions>
                            <execution>
                                <phase>package</phase>
                                <goals>
                                    <goal>cmis-upload</goal>
                                </goals>
                            </execution>
                        </executions>
                    </plugin>
                </plugins>
            </build>
        </profile>

This will upload all files from the ${project.build.directory}/stamper-resources folder to the root folder of your repository. You could copy some files there using maven-resources-plugin, so let’s do it…

The following configuration will copy files from src/test/config to target/stamper-resources.


            <plugin>
                <artifactId>maven-resources-plugin</artifactId>
                <version>2.7</version>
                <executions>
                    <execution>
                        <id>copy-cara-resources</id>
                        <phase>validate</phase>
                        <goals>
                            <goal>copy-resources</goal>
                        </goals>
                        <configuration>
                            <outputDirectory>${project.build.directory}/stamper-resources</outputDirectory>
                            <resources>
                                <resource>
                                    <directory>src/test/config</directory>
                                    <filtering>false</filtering>
                                </resource>
                            </resources>
                        </configuration>
                    </execution>
                </executions>
            </plugin>

And now all you have to do to upload the files is type:

mvn -Dmaven.test.skip=true -Daction=upload-config package

Enjoy! 🙂

cmis-in-batch & Documentum 6.7

I have just pushed a changeset which enables Documentum 6.7 support in cmis-in-batch. Generally speaking, OpenCMIS supports Documentum 6.7 out of the box, but cmis-in-batch required some additional modifications because of the way Documentum handles paths to objects. Documentum allows multiple objects with the same name in a single folder, and therefore it is impossible to uniquely address an object using a typical path. To solve this problem Documentum uses different path segments, in which each object name is represented in the following form:

with name: 0900000b80001234_docname
without name: 0900000b80001234_
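
As a small illustration of that format, the Python sketch below builds and splits such path segments. The helper names build_segment and split_segment are hypothetical and are not part of cmis-in-batch or OpenCMIS; the sketch also assumes that the object ID itself never contains an underscore, which holds for IDs like the one shown above.

```python
def build_segment(object_id, name=""):
    """Join an object ID and an optional object name into a path segment."""
    return f"{object_id}_{name}"


def split_segment(segment):
    """Split a path segment back into (object_id, name).

    Only the first underscore separates the ID from the name, so names
    that contain underscores themselves are preserved.
    """
    object_id, _, name = segment.partition("_")
    return object_id, name


print(build_segment("0900000b80001234", "docname"))  # 0900000b80001234_docname
print(build_segment("0900000b80001234"))             # 0900000b80001234_
print(split_segment("0900000b80001234_my_doc"))      # ('0900000b80001234', 'my_doc')
```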

Another limitation of Documentum 6.7 CMIS is that it exposes only the base CMIS object types (document, folder and relationship) and not custom ones, so the fail-when-type function will not work as expected.

You can get the latest source code using Mercurial like this:


hg clone https://cmis-in-batch.googlecode.com/hg/ cmis-in-batch 

cmis-in-batch

I have just uploaded an initial version of a tool which tries to simplify data import and manipulation in CMIS-enabled repositories.

More details as well as source code (under Apache License 2.0) are available here:
https://code.google.com/p/cmis-in-batch/

It is still a work in progress, but it already works, at least with Alfresco. I have just finished installing Documentum 6.7 with its newly added CMIS support, and I am going to test the tool against it as soon as possible.

Data importer for CMIS repositories

When developing Stamper I missed a tool which would allow me to easily import some files into a repository. Alfresco provides import/export functionality through ACP files, which can be used for that purpose, but what about other repositories? What about Nuxeo or Documentum? Documentum has Composer/Application Builder, and Nuxeo perhaps also has its own mechanisms, but I wanted one tool which would work with all those repositories without the need to create a separate installation package for each system.

Fortunately, there is the CMIS standard, which can make this happen.

CMISetuper (working name :-)) will be a tool which will be able to:

  • connect to any repository which supports CMIS. This will be possible thanks to the OpenCMIS library.
  • import files
  • create folders
  • validate presence of types, objects, folders
  • modify content (replace, version etc.)
  • link and unlink objects from folders

These actions will be described in a declarative language named SDL (Simple Declarative Language). Below is a fragment of a script:


execute {
    import-files "/myFiles" {
        "image1.jpg" cm_name="Some title" metasys_taken_on_date=2009/10/05
        "image2.jpg" metasys_color=false metasys_taken_on_date=2011/01/05
    }

    import-file "/Documents" "someotherfile.doc"
    delete-file "/Old document.doc" all-versions="true"

    replace-content "/Pictures/picture.jpg" "/tmp/newContent.jpg"
    update-properties "/Documents/someotherfile.doc" cm_title="another title"
    link-to-folder "/myHouse.jpg" "/myFiles"
    unlink-from-folder "myHouse.jpg" "/myFiles"
}

Basically the tool will execute scripts in a few stages:

  • pre-validate stage – this stage can check for the presence of types, files, folders etc. in the repository and fail when they are present (or not).
  • prepare stage – at this stage some preparations can take place, for example creation of the folders into which objects will be imported.
  • execute stage – this is the main stage, which takes care of the actual data importing, linking etc.
  • validation stage – the last stage, which checks whether everything was imported correctly.

Some global settings will be available to control the overwrite mode as well as whether to stop on errors.
I also plan to implement a rollback mode, but perhaps not in the first version 🙂
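
The staged execution described above could be sketched in Python roughly as follows. This is only a sketch of the idea, not the tool's code: the stage names in STAGES, the run_script function and the stop_on_error flag are assumptions made for illustration, and actions are modelled as plain callables.

```python
STAGES = ["pre-validate", "prepare", "execute", "validate"]


def run_script(actions, stop_on_error=True):
    """Run actions stage by stage and return a list of (stage, error) pairs.

    `actions` maps a stage name to a list of zero-argument callables.
    """
    errors = []
    for stage in STAGES:
        for action in actions.get(stage, []):
            try:
                action()
            except Exception as exc:
                # Record the failure; abort the whole run if requested.
                errors.append((stage, str(exc)))
                if stop_on_error:
                    return errors
    return errors


log = []


def fail():
    raise RuntimeError("type missing")


actions = {
    "pre-validate": [fail],
    "execute": [lambda: log.append("imported")],
}
print(run_script(actions))  # [('pre-validate', 'type missing')]
print(log)                  # []
```

Because pre-validation fails and stop_on_error is set, the execute stage never runs and nothing is imported; with stop_on_error=False the error is still reported, but the later stages are executed anyway.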

At the moment I would say that around 60% of the above functionality is implemented, and I hope to have the rest ready by the end of this week, so stay tuned for updates!

If you have some new ideas please share them with me.