Posts Tagged ‘Enterprise Content Management’

There are a lot of discussions on various forums and threads about the Content Server High Availability (HA) environment, but I have not come across any documentation that gives the precise steps to implement it. This is an attempt to list the steps that I have been using. It is basically a collection of bits and pieces from various sources, combined with my own experience, put together to give a clear picture. These steps may not be exactly what EMC suggests or supports.
The procedure listed below is specific to Content Server 6.5 SP2 on Linux with Oracle.

Prerequisites:

  • As it is an HA environment, the content files should reside in a File Store that is shared across the Content Servers.
  • The Installation Owner and the Installation Path should be the same on each Content Server.
  • A Database Server must be available and reachable from each Content Server Host through the Oracle Client.
  • Update the /etc/hosts file on the Content Server Hosts so that they can resolve each other's hostnames and IP addresses.

Once the above prerequisites are satisfied, the steps below can be used to establish the HA environment.

  • Install the Primary Content Server as per the standard procedure in the Installation Guide.
  • Install the docbroker on the Secondary CS host.
  • Create a Secondary Server Config object using Documentum Administrator.
  • Copy server.ini, aek.key, dbpasswd.txt, dm_start_docbase and dm_shutdown_docbase from Primary CS Host to Secondary CS Host.
  • Update the server.ini on both Hosts so that each server projects to both docbrokers.
    server.ini on the Primary CS:

    [DOCBROKER_PROJECTION_TARGET]
    host = <primary docbroker>
    port = 1489
    [DOCBROKER_PROJECTION_TARGET_1]
    host = <secondary docbroker>
    port = 1489

    server.ini on the Secondary CS:

    [DOCBROKER_PROJECTION_TARGET]
    host = <secondary docbroker>
    port = 1489
    [DOCBROKER_PROJECTION_TARGET_1]
    host = <primary docbroker>
    port = 1489
  • Update dm_shutdown_docbase as follows:
      Change the line preceding “shutdown,c,T,T”:

    • Original:
      ./iapi <docbase> -U$DM_DMADMIN_USER -P -e << EOF 
    • Updated:
      ./iapi <docbase>.<secondary server config object name> -U$DM_DMADMIN_USER  -P -e << EOF
  • Update the dfc.properties of the Web Application as well as of both Content Server Hosts so that they point to both docbrokers.
  • Create an ACS Config object using the command below:
      dmbasic -f dm_acs_install.ebs -e Install -- <docbase name> <installation owner> <password> <new acs config name> <secondary server config name> <JMS Port> <JMS protocol> <output log location>
  • Update the acs.properties accordingly.
  • Once the above steps are complete, shut down the Primary CS and test the Secondary CS for the expected functionality. There may be minor issues that need a few basic fixes. One such fix is creating the sysadmin directory at $DOCUMENTUM/dba/logs/<repository id>/ to resolve issues with jobs running on the Secondary CS.

Start the Primary CS. Both Content Servers should now be running in HA.
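
The projection and dfc.properties entries follow a simple pattern, so here is a minimal Python sketch that generates them for both hosts. The host names cs-primary and cs-secondary and the helper names are made up for illustration; the port 1489 is the one used above, and the dfc.docbroker.host[n] / dfc.docbroker.port[n] keys are what I would expect in a 6.x dfc.properties, so check them against your own files before copying anything.

```python
def projection_stanzas(local_broker, other_brokers, port=1489):
    """Build server.ini projection sections, listing the local docbroker first."""
    lines = []
    for index, host in enumerate([local_broker, *other_brokers]):
        # The first target has no suffix; later targets are numbered _1, _2, ...
        suffix = "" if index == 0 else f"_{index}"
        lines.append(f"[DOCBROKER_PROJECTION_TARGET{suffix}]")
        lines.append(f"host = {host}")
        lines.append(f"port = {port}")
    return "\n".join(lines)


def dfc_docbroker_lines(brokers, port=1489):
    """Build dfc.properties entries that point a client at every docbroker."""
    lines = []
    for index, host in enumerate(brokers):
        lines.append(f"dfc.docbroker.host[{index}]={host}")
        lines.append(f"dfc.docbroker.port[{index}]={port}")
    return "\n".join(lines)


# Hypothetical host names, used only for this example.
primary_ini = projection_stanzas("cs-primary", ["cs-secondary"])
secondary_ini = projection_stanzas("cs-secondary", ["cs-primary"])
dfc_props = dfc_docbroker_lines(["cs-primary", "cs-secondary"])

print(primary_ini)
print()
print(secondary_ini)
print()
print(dfc_props)
```

The only difference between the two server.ini outputs is the order of the hosts, which mirrors the two listings in the steps above.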


I came up with this junk when I was trying to clean my repository by finding a DQL that I could use to delete more than 150,000 junk user objects. Ignore that statement, as it is again junk. What I came up with is not very new. It's just the weirdness in the behavior of the DQLs. Is this weirdness only with the repeating attributes? I would like the DQLs to do much of the talking, as I feel they can explain their pain much better. So here it goes:

A1)SELECT COUNT(DISTINCT user_name) FROM dm_user >> 168241

A2)SELECT COUNT(DISTINCT users_names) FROM dm_group >> 2916

A3)SELECT COUNT(DISTINCT user_name) FROM dm_user
WHERE user_name NOT IN
(SELECT DISTINCT users_names FROM dm_group) >> 0

Isn’t that weird?

B1)SELECT COUNT(DISTINCT user_name) FROM dm_user, dm_group
WHERE ANY dm_group.users_names = dm_user.user_name >> 2915

A2 VS B1: again weird!

C1)SELECT COUNT(DISTINCT user_name) FROM dm_user
WHERE user_name NOT IN
(SELECT DISTINCT user_name FROM dm_user, dm_group
WHERE ANY dm_group.users_names = dm_user.user_name) >> 165326

(A2, A3) VS (B1, C1): any explanation?
But that’s a relief indeed. That’s the result I needed. A1 – B1 = C1. Under the current circumstances, that’s encouraging enough to get into some more meddling.

D1)SELECT COUNT(users_names) FROM dm_group
WHERE ANY users_names NOT IN
(SELECT user_name FROM dm_user) >> 5

What the hell!!!

D2)SELECT users_names FROM dm_group
WHERE ANY users_names NOT IN
(SELECT user_name FROM dm_user)
>> TestUser1 test3 test2 orts test1x

hmmmm…

D3)SELECT COUNT(*) FROM dm_user
WHERE user_name IN
(SELECT users_names FROM dm_group
WHERE ANY users_names NOT IN
(SELECT user_name FROM dm_user)) >> 4

This is insane. Isn’t D3 a contradiction in itself? Can I challenge EMC to explain that? Can someone come to my rescue?

D4)SELECT user_name FROM dm_user
WHERE user_name IN
(SELECT users_names FROM dm_group
WHERE ANY users_names NOT IN
(SELECT user_name FROM dm_user))
>> TestUser1 test3 test2 orts

Here is something that helps, though it doesn’t explain the insanity.

ConsistencyChecker Report:

Checking for users belonging to groups not in dm_user
WARNING CC-0002: User ‘test1x’ is referenced in dm_group with id ‘12000d808004a500’ but does not have a valid dm_user object
Rows Returned: 1

Summary:
Weirdness No. 1:     (A1, A2, A3)
Weirdness No. 2:     A2 VS B1: Explained by the ConsistencyChecker Report.
Weirdness No. 3:     (A2, A3) VS (B1, C1)
Weirdness No. 4:     D3: The Contradiction in itself.

I could get an explanation only for Weirdness No. 2, which I guess is not a weirdness at all. I hope someone reading this post will try explaining the other three. No. 1 and No. 3, I guess, are caused by repeating attributes, but they remain unexplained anyway. No. 4, the D3, is an absolute marvel. Is there a bug in the way DQL works?
C1 is the undisputed winner as it provides the expected result.
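
One possible explanation for A3, offered as a guess rather than a confirmed diagnosis: in plain SQL, NOT IN against a subquery that returns even a single NULL matches no rows at all, because the comparison with NULL is neither true nor false. If the users_names values behind dm_group include empty (NULL) entries, A3 would return 0 for exactly that reason, while the join in C1 would not be affected. The Python sketch below reproduces the effect in SQLite. The table names, the sample users, and the assumption that the repository really stores NULLs there are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (user_name TEXT);
    CREATE TABLE group_members (users_names TEXT);
    INSERT INTO users VALUES ('alice'), ('bob'), ('carol');
    -- One real member plus one NULL, standing in for an empty repeating value.
    INSERT INTO group_members VALUES ('alice'), (NULL);
    """
)


def count(sql):
    """Run a COUNT query and return the single number it produces."""
    return conn.execute(sql).fetchone()[0]


# Same shape as A3: the NULL in the subquery makes NOT IN match nothing.
plain_not_in = count(
    "SELECT COUNT(*) FROM users "
    "WHERE user_name NOT IN (SELECT users_names FROM group_members)"
)

# Filtering the NULLs out of the subquery gives the count one would expect.
null_safe_not_in = count(
    "SELECT COUNT(*) FROM users "
    "WHERE user_name NOT IN "
    "(SELECT users_names FROM group_members WHERE users_names IS NOT NULL)"
)

print(plain_not_in, null_safe_not_in)
```

If this is what happens in the repository, adding a condition that excludes empty users_names values to the subquery of A3 would be worth trying. It says nothing about D3, which stays a marvel.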


This question has been raised by many people who read my earlier posts on Aliases: why do I need more than one Alias Set? Let me try to answer it. As I understand Aliases, their use and benefit become clear when multiple Alias Sets are used. Most of the people who raised this question assumed that an Alias would make their work easier because it acts as a placeholder: they would replace the value and use the same Alias Set again and again. I tried the same thing, as mentioned in my first post on Aliases.
I had created an Alias Set and an ACL Template using the Alias. This ACL Template was applied on a document. The Alias was resolved and the permission defined in the ACL Template was granted to the user defined in the Alias. Later, when I updated the Alias value with a new user name and applied the ACL Template on a new document, the new user was granted the permissions in the ACL Template. The behavior was as expected. What about the permission on the first document? It turned out that the user name in the permission set of the first document was also updated as per the Alias Set. In other words, I was not able to grant permissions to two different users on two different documents using one Alias Set. The change in the Alias value was reflected even in the Permission Set on the old document. In this case there was no other option but to use more than one Alias Set. More precisely, I needed as many Alias Sets as the number of users.

Why?
If I don’t use Alias Sets and an ACL Template and try to achieve the same situation, then I need to create ten different ACLs for ten different users. While using Aliases I need one Permission Set Template and ten Alias Sets. So there is no reduction in effort as such. The benefit can be seen only while managing the ACLs. In the former case, if the permission has to be changed (let’s say from read to write), we need to update all ten ACLs. In the latter case, updating the ACL Template will give us the desired result. Note that ACLs cannot be updated through DQL, in case you thought that all ten ACLs could be updated with an appropriate WHERE condition. As the number of ACLs grows, the ease of managing permissions through an ACL Template becomes increasingly prominent.
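
To make the bookkeeping concrete, here is a toy model in plain Python. It is not the Documentum API and it does not talk to a repository; the class, the alias name doc_user and the user names are hypothetical, and the resolve function only imitates what the server does when it resolves an Alias in an ACL Template.

```python
from dataclasses import dataclass


@dataclass
class AclTemplate:
    alias_name: str  # placeholder in the template (hypothetical name)
    permit: str      # permission granted to whoever the alias resolves to


def resolve(template, alias_set):
    """Return the (user, permit) pair the template yields for one Alias Set."""
    return alias_set[template.alias_name], template.permit


# One template, ten Alias Sets: one per user.
template = AclTemplate(alias_name="doc_user", permit="read")
alias_sets = [{"doc_user": f"user{n}"} for n in range(1, 11)]

before = [resolve(template, alias_set) for alias_set in alias_sets]

# A single change to the template is seen by all ten users.
template.permit = "write"
after = [resolve(template, alias_set) for alias_set in alias_sets]

print(before[0], after[0])
```

With ten hand-made ACLs, the change from read to write would have to be repeated ten times. Here it is made once.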

When?
Does that mean I can use ACL Templates and Aliases in every case? Only on the condition that all the users need the same permission on their respective documents. In the above case all ten users have the same permission on their respective documents. ACL Templates and Aliases can be really helpful in granting permissions to specific users, who can be very large in number compared to the limited number of groups.

How?
In a recent implementation I used ACL Templates and Aliases to grant permissions to a user and his superiors in the organization hierarchy. I have one Alias Set per user, and it defines his superiors at various levels through different Aliases. The Alias Sets are created, associated and updated through a TBO. The same Aliases are also used in the workflow to decide the performers of various activities. In the absence of the Alias Set, I would have to fetch the user’s superiors and set them as performers of the various workflow activities using DFC code. As is evident, Aliases have helped me in more than one way.


A few days ago my friend asked me for a DQL query. He wanted a list of all the documents in the repository along with their folder paths.

It didn’t appear to be a tough one. I knew that the i_folder_id of dm_document maps to the r_object_id of the dm_folder to which the document is linked. A document can be linked to multiple folders, and that explains why i_folder_id is a repeating attribute. The dm_folder has another repeating attribute, r_folder_path, which stores the folder path of that particular folder. The reason for r_folder_path being repeating is the same as that for i_folder_id in the case of dm_document.

I gave him this DQL to get the folder paths of all the documents.

SELECT r_folder_path
FROM dm_folder
WHERE r_object_id in
(SELECT distinct i_folder_id
FROM dm_document)

He didn’t look happy enough. What he actually wanted was a list of documents with the folder path. I told him that it is not a problem; we can put a join and get the object_name from dm_document.

SELECT doc.object_name,fol.r_folder_path
FROM dm_document doc, dm_folder fol
WHERE any doc.i_folder_id = fol.r_object_id

The query looked very simple to me. He seemed to be happy and said he would try it out. I was happy that I had given him a solution. I was a bit surprised when he told me that he was getting an error while executing this query. I didn’t expect this. The error was:

    :[DM_QUERY2_E_REPEAT_TYPE_JOIN]error: “Your query is selecting repeating attributes and joining types.”

The above query works fine if I select only doc.object_name, but it did not allow me to select fol.r_folder_path, which is a repeating attribute. The conclusion was that a repeating attribute cannot be selected in a normal query that uses a repeating attribute in a join. It can be achieved by using the DQL Hints mentioned later. But this friend of mine needed both the document name and the folder path together. I asked him to write DFC code that gets all the documents, finds their folder paths and writes this information to a file. But my friend is a really lazy guy. The bad part is that I am no different.

I thought a lot and in the end I got an idea. I asked him to fire a query on the RDBMS tables. The object types are represented as _s and _r tables in the RDBMS for their single and repeating valued attributes respectively. The query has to be fired on the dm_document_s, dm_document_r, dm_folder_s and dm_folder_r tables.

So, the new query which I formed was:

SELECT doc_s.object_name,fol_r.r_folder_path
FROM dm_document_s doc_s, dm_document_r doc_r, dm_folder_r fol_r
WHERE doc_s.r_object_id = doc_r.r_object_id
AND doc_r.i_folder_id = fol_r.r_object_id

But when I fired this query on my docbase I again got an error.

    :[DM_QUERY2_E_TABLE_NOT_FOUND]error: “The database table or view was not found in the database. Error from the database was: ‘ — The database object is invalid — STATE=S0002, CODE=208, MSG=[Microsoft][ODBC SQL Server Driver][SQL Server]Invalid object name ‘dbo.dm_document_r’.'”

I realized that there is no table called dm_document_r, as dm_document doesn’t have any repeating attribute of its own. But in that case, where should I get the value of i_folder_id from? If i_folder_id is not an attribute of dm_document, then it must be inherited from dm_sysobject. Isn’t it so?

So I had my new dql:

SELECT doc_s.object_name,fol_r.r_folder_path
FROM dm_document_s doc_s, dm_sysobject_r doc_r, dm_folder_r fol_r
WHERE doc_s.r_object_id = doc_r.r_object_id
AND doc_r.i_folder_id = fol_r.r_object_id

I was so confident that this was the ultimate query which would get me the result. When I fired it I got the following:

    :[DM_QUERY_E_CURSOR_ERROR]error: “A database error has occurred during the creation of a cursor (‘ STATE=S0022, CODE=207, MSG=[Microsoft][ODBC SQL Server Driver][SQL Server]Invalid column name ‘object_name’.’).”

How could I forget that object_name is also an attribute of dm_sysobject? Actually, dm_document_s has just one attribute, and that is r_object_id. As a matter of fact, dm_document doesn’t have any attribute of its own. Then what is the use of having the object type dm_document? I leave this question to be answered by you. This had to be my final query.

SELECT doc_s.object_name,fol_r.r_folder_path
FROM dm_sysobject_s doc_s, dm_sysobject_r doc_r, dm_folder_r fol_r
WHERE doc_s.r_object_id = doc_r.r_object_id
AND doc_r.i_folder_id = fol_r.r_object_id

Ultimately I got the result without any errors. But I wasn’t too happy. Many records in object_name as well as r_folder_path were showing as empty.
I did a quick fix:

SELECT doc_s.object_name,fol_r.r_folder_path
FROM dm_sysobject_s doc_s, dm_sysobject_r doc_r, dm_folder_r fol_r
WHERE doc_s.r_object_id = doc_r.r_object_id
AND doc_r.i_folder_id = fol_r.r_object_id
AND fol_r.r_folder_path is not nullstring

The r_folder_path no longer showed any empty records, but that was not the case with object_name. Something clicked in my mind. I had found a flaw in the query which had looked fine to me until then. My friend was interested in documents, not sysobjects. I was actually giving him a lot of garbage.

And thus I got my next query:

SELECT doc_s.object_name,fol_r.r_folder_path
FROM dm_sysobject_s doc_s, dm_sysobject_r doc_r, dm_folder_r fol_r
WHERE doc_s.r_object_id = doc_r.r_object_id
AND doc_r.i_folder_id = fol_r.r_object_id
AND fol_r.r_folder_path is not nullstring
AND doc_s.r_object_type = 'dm_document'

Cool….. My friend looked like the happiest person on earth. He got the result he was looking for. But I still wanted to make one more small change.

SELECT doc_s.object_name,fol_r.r_folder_path
FROM dm_sysobject_s doc_s, dm_sysobject_r doc_r, dm_folder_r fol_r
WHERE doc_s.r_object_type = 'dm_document'
AND fol_r.r_folder_path is not nullstring
AND doc_s.r_object_id = doc_r.r_object_id
AND doc_r.i_folder_id = fol_r.r_object_id

The result is the same as that of the earlier query, but somehow I feel more satisfied with this one. I am not sure how good or how efficient this query is, but my friend got what he wanted and I was satisfied that I was able to help him out.
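
For anyone who wants to see the join work without a repository at hand, here is a toy reproduction in Python with SQLite. The three tables carry only the columns used in the query, the object ids and names are invented, and empty repeating values are stored as a single space so that a plain SQL comparison can stand in for "is not nullstring". That storage detail is an assumption of this sketch, not something SQLite or Documentum guarantees.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dm_sysobject_s (r_object_id TEXT, object_name TEXT, r_object_type TEXT);
    CREATE TABLE dm_sysobject_r (r_object_id TEXT, i_folder_id TEXT);
    CREATE TABLE dm_folder_r (r_object_id TEXT, r_folder_path TEXT);

    -- One document and one folder; a folder is a sysobject too.
    INSERT INTO dm_sysobject_s VALUES
        ('0901', 'report.doc', 'dm_document'),
        ('0b01', 'Reports', 'dm_folder');
    -- The document is linked to folder 0b01, the folder to cabinet 0c01.
    INSERT INTO dm_sysobject_r VALUES ('0901', '0b01'), ('0b01', '0c01');
    -- Folder 0b01 has one real path and one empty repeating row.
    INSERT INTO dm_folder_r VALUES
        ('0b01', '/Cabinet/Reports'),
        ('0b01', ' '),
        ('0c01', '/Cabinet');
    """
)

rows = conn.execute(
    """
    SELECT doc_s.object_name, fol_r.r_folder_path
    FROM dm_sysobject_s doc_s, dm_sysobject_r doc_r, dm_folder_r fol_r
    WHERE doc_s.r_object_type = 'dm_document'
    AND fol_r.r_folder_path != ' '
    AND doc_s.r_object_id = doc_r.r_object_id
    AND doc_r.i_folder_id = fol_r.r_object_id
    """
).fetchall()

print(rows)
```

Dropping the r_object_type condition brings the folder itself into the result, and dropping the path condition brings back the empty row, which are the two kinds of garbage I ran into on the way.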

The story is not over yet. I just checked the SQL equivalent of this query in the DQL component of Webtop.

Here is the result:

SELECT all doc_s.object_name, fol_r.r_folder_path
FROM dbo.dm_sysobject_s doc_s, dbo.dm_sysobject_r doc_r, dbo.dm_folder_r fol_r
WHERE ((doc_s.r_object_type=N'dm_document')
AND fol_r.r_folder_path != ' '
AND (doc_s.r_object_id=doc_r.r_object_id)
AND (doc_r.i_folder_id=fol_r.r_object_id))

Do you know what the most interesting part is?

After doing all this crap, I posted it on my blog. I had added tags like DQL, Documentum, etc. When I clicked on the DQL link (The tag which I had added) I got Rajendra’s blog in the result. He has posted a query there.

SELECT A.r_object_id, A.object_name, B.r_folder_path
FROM dm_document A, dm_folder_r B
WHERE any A.i_folder_id = B.r_object_id

Doesn’t that look much simpler? :D:D:D

Here is another query from inthewoods. It uses DQL Hint and appears to be a better option than the earlier two.

SELECT doc.r_object_id, doc.object_name, fld.r_folder_path
FROM dm_document doc, dm_folder fld
WHERE doc.i_folder_id = fld.r_object_id ENABLE(ROW_BASED)

So... that’s the whole story. Hope you enjoyed it.

*Guess it would be a nice idea to add ‘fld.r_folder_path is not nullstring’ and ‘order by r_object_id’ to the final query.
