Where are the documents (contents) stored in a repository? Is it possible to find them by navigating the file system? This question had been bothering me for quite some time. I had heard many answers to it, but none of them convinced me. Until a few weeks ago, at my previous organization, I was so loaded with work that I never got to explore it myself.
A couple of weeks ago, in an interview, one of my colleagues asked the candidate about dmr_content. When I asked him about it later, he told me that the contents of renditions are stored as dmr_content. In another incident, which happened yesterday, another colleague asked me about dmr_content while I was giving a presentation on Objects and Types. I recalled that a friend had once told me that the content of a deleted document can be retrieved using dmr_content. She was not satisfied with that response.
These incidents made me curious enough about dmr_content that it was now my turn to explore it myself.
I checked the Object Relational Model. The i_contents_id of dm_sysobject (or its subtypes, such as dm_document) is linked to the r_object_id of dmr_content. dmr_content also has a repeating attribute called parent_id, which is linked to the r_object_id of dm_sysobject.
With that, I was able to establish the relation between dm_document and dmr_content. The actual document, that is, the content, is stored as dmr_content, and it is linked to the dm_document object, which stores the metadata. In fact, all contents are stored as objects of dmr_content. That is why a new dmr_content object is created when we create a rendition.
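The two links described above can be sketched as a toy in-memory model. This is only an illustration in plain Python, not Documentum API code: the class names, the is_rendition flag, and all object IDs are hypothetical and invented for the example. Only i_contents_id, parent_id, and r_object_id come from the model discussed here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SysObject:
    """Stand-in for a dm_sysobject / dm_document row (metadata only)."""
    r_object_id: str
    object_name: str
    i_contents_id: str  # r_object_id of the linked dmr_content object


@dataclass
class ContentObject:
    """Stand-in for a dmr_content row."""
    r_object_id: str
    is_rendition: bool = False  # hypothetical flag, for illustration only
    parent_id: List[str] = field(default_factory=list)  # repeating attribute


def contents_of(doc: SysObject, contents: List[ContentObject]) -> List[ContentObject]:
    """Mimic 'WHERE ANY parent_id = <document id>' over the in-memory list."""
    return [c for c in contents if doc.r_object_id in c.parent_id]


# Hypothetical IDs, invented for this example.
doc = SysObject("09000000000000a1", "object_type.jpg", "06000000000000b1")
contents = [
    ContentObject("06000000000000b1", parent_id=["09000000000000a1"]),
    ContentObject("06000000000000b2", is_rendition=True,
                  parent_id=["09000000000000a1"]),
    ContentObject("06000000000000b3", parent_id=["09000000000000a2"]),
]

linked = contents_of(doc, contents)
print([c.r_object_id for c in linked])
```

In this sketch the document reaches its primary content through i_contents_id, while both the primary content and the rendition point back to the document through parent_id.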
But why is the parent_id a repeating attribute? It should mean that many dm_document objects can be linked to a single dmr_content object. Does that make sense? I performed the following operations on a document in order to check that.
- Check out a document and check it back in as a new version without making any changes to it.
- Create a copy of a document.
In both cases a new dmr_content object was created and the new document or version was linked to it. Interestingly, even when I checked in the document as the same version, a new dmr_content object was created. Deleting the dm_document did not delete the associated dmr_content object either. Both of these operations resulted in orphan dmr_content objects. I read on a forum that the dm_clean job removes all such orphan objects. So, until the dm_clean job runs, it is possible to retrieve the content of a deleted document. Further, the content files on the host file system that do not have a referring dmr_content object are deleted by the dm_filescan job.
Here is the most interesting part, and the answer to the introductory question. I found this DQL query in the Powerlink discussion forum.
EXECUTE GET_PATH FOR '06XXXXXXXXXXXXXX'
This was exactly what I was looking for. To verify it, I imported an image file named object_type.jpg into my repository using DA. I used the following query to find the r_object_id of the dmr_content object it was associated with.
SELECT r_object_id
FROM dmr_content
WHERE ANY parent_id IN
(SELECT r_object_id
FROM dm_document
WHERE object_name = 'object_type.jpg')
Alternatively, the following query can also be used.
SELECT i_contents_id AS DMR_CONTENT_ID
FROM dm_document
WHERE object_name = 'object_type.jpg'
The query returned '060003f08000c93e'. I used this value as the input for the GET_PATH query.
EXECUTE GET_PATH FOR '060003f08000c93e'
Any guesses? The query returned a content path on the host file system, ending in a file named e2.jpg.
The file e2.jpg was the same as the object_type.jpg I had imported. In other words, the file was stored in the host file system, but it had been renamed. GET_PATH is an administration method and can be found under the Administration >> Job Management >> Administration Method node in Documentum Administrator.
I am sorry, but I don't want to indulge in any more puzzles now. If you want to understand the logic of resolving the content path of a dmr_content object, you can get some help in Robin's Post. According to the Object Relational Diagram, the storage_id and data_ticket attributes are used to refer to the content stored in the host file system.
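For readers who do want a hint, here is a minimal sketch of how data_ticket is commonly described as mapping to the tail of the path in a file store. The mapping is an assumption that does not come from this post and may differ by version or store type: the signed 32-bit ticket is read as an unsigned value, written as eight hex digits, and split into four pairs. The function name and the ticket value below are hypothetical.

```python
def data_ticket_to_path_tail(data_ticket: int, extension: str = "") -> str:
    """Turn a signed 32-bit data_ticket into 'aa/bb/cc/dd[.ext]'.

    Assumption (not from the post): the ticket is read as an unsigned
    32-bit value, rendered as 8 hex digits, and split into 4 pairs.
    The storage root and repository directory are NOT included here.
    """
    unsigned = data_ticket & 0xFFFFFFFF       # same as adding 2**32 to a negative ticket
    hex_digits = f"{unsigned:08x}"            # always 8 lowercase hex digits
    pairs = [hex_digits[i:i + 2] for i in range(0, 8, 2)]
    tail = "/".join(pairs)
    return f"{tail}.{extension}" if extension else tail


# Hypothetical ticket value, invented for this example.
example_tail = data_ticket_to_path_tail(-2147474649, "jpg")
print(example_tail)  # 80/00/23/27.jpg
```

The last pair becomes the file name, which would explain why an imported file ends up with a short hexadecimal name rather than its original one. GET_PATH remains the reliable way to obtain the full path.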
In the end, I got the answer to my long-standing question. But one question remains unanswered: why is parent_id a repeating attribute? Any thoughts on that?