Wednesday, June 27, 2012

SharePoint 2010: strange behavior of taxonomies during migration from one server to another

Microsoft.SharePoint.dll version 14.0.4762.1000

Microsoft.SharePoint.Taxonomy.dll  version 14.0.4756.1000

 

We migrate only published and approved content from staging to production. In one of our scenarios we want to introduce new terms into the system. Here is the strange behavior we are seeing, which might be worth reporting to Microsoft for improvement in future service packs:

  1. Create a new term on staging under the desired term set.

  2. Reproduce the same term, with the same GUID and parent term set, on production:

    • By using the TermSetItem.CreateTerm Method (String, Int32, Guid)

    • By stopping the Managed Metadata service and then replacing the database on production with the one from staging

    • By using the Export-SPMetadataWebServicePartitionData and Import-SPMetadataWebServicePartitionData PowerShell cmdlets



  3. Create an unpublished SPItem on staging that refers to the new term.

  4. Let the content deployment proceed. This migrates only the term entry in TaxonomyHiddenList.

  5. Publish any spDoc that refers to a few of the old terms and to the new term.

  6. Let the content deployment migrate the newly published item.

  7. On production, the newly migrated spDoc now shows only the new term.

  8. If we migrate an spDoc that has only old terms and not the new one, it migrates well and all the terms are visible in the display item form > SPField.


How you could help me in this:

I got tp_ListId and tp_DocId for my migrated target document from the [AllDocs] database table, using LeafName.

Using tp_ListId and tp_DocId on [AllUserData], I was able to see that content deployment has actually created the entry correctly there.

The entry under ntext2 looks like: “OldTerm1|OldTerm1GUID;NewTerm1|NewTerm1GUID;OLDTerm2|OldTerm2GUID”

Through all the SharePoint APIs, U2U, etc., my migrated item returns only the new term. It looks as if the old terms have been wiped out, but they have not: they are still there in the database.

The output of U2U is:

NewTerm1|NewTerm1GUID
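
To make the mismatch concrete, here is a minimal Python sketch that compares the raw ntext2 value with what the API returns and lists the terms that got dropped on the way. It only works on the two strings quoted above; the function names are hypothetical helpers, and the "Label|GUID;Label|GUID" layout is assumed from the placeholder values shown in this post, not from any documented storage format.

```python
def parse_taxonomy_value(raw):
    """Split a 'Label|GUID;Label|GUID' string into (label, guid) pairs.

    The layout is an assumption based on the ntext2 sample in this post.
    """
    pairs = []
    for chunk in raw.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue  # tolerate trailing or doubled separators
        label, _, guid = chunk.partition("|")
        pairs.append((label.strip(), guid.strip()))
    return pairs


def missing_from_api(db_value, api_value):
    """Return the pairs stored in the database but absent from the API result."""
    returned = {guid.lower() for _, guid in parse_taxonomy_value(api_value)}
    return [
        (label, guid)
        for label, guid in parse_taxonomy_value(db_value)
        if guid.lower() not in returned
    ]


# Placeholder values copied from the post, not real GUIDs.
db_value = "OldTerm1|OldTerm1GUID;NewTerm1|NewTerm1GUID;OLDTerm2|OldTerm2GUID"
api_value = "NewTerm1|NewTerm1GUID"

dropped = missing_from_api(db_value, api_value)
for label, guid in dropped:
    print(f"In content DB but not returned by the API: {label} ({guid})")
```

With the sample values, the two old terms are reported as dropped, which matches what we see on production.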

Now I have a question for you: what else, other than the Taxonomy Update Scheduler timer job, could be the culprit? Is it that the account under which the SharePoint Timer Job runs must have full control on the Managed Metadata database and the site collection database?

 

How Microsoft should help us in this:

When an item refers to a new term, the internal migration/deployment APIs should ensure that:

  • The new term is created under the same term set as on the source. Right now, if the term is missing (that is, if I skip step 2 above), it tends to be created under the System > Keywords term set.

  • No multiple entries are allowed in TaxonomyHiddenList with the same title, GUID, parent term set, etc. If we migrate an item that was published after step 4 above, two entries are created in TaxonomyHiddenList.

  • The Taxonomy Update Scheduler copes efficiently with multiple entries for the same term that external processes have created in TaxonomyHiddenList.
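
As an illustration of the duplicate-entry problem, the following Python sketch groups TaxonomyHiddenList rows by term set and term GUID and reports any term that appears more than once. It assumes the rows have already been exported into plain dictionaries; the keys `ID`, `Title`, `IdForTermSet` and `IdForTerm`, and all sample values, are assumptions made for this example and should be checked against your own list.

```python
from collections import defaultdict


def find_duplicate_terms(rows):
    """Map (term set id, term id) to the list item IDs that share it.

    Only keys with more than one list item are returned. GUIDs are
    compared case-insensitively.
    """
    by_term = defaultdict(list)
    for row in rows:
        key = (row["IdForTermSet"].lower(), row["IdForTerm"].lower())
        by_term[key].append(row["ID"])
    return {key: ids for key, ids in by_term.items() if len(ids) > 1}


# Hypothetical rows standing in for an export of TaxonomyHiddenList.
rows = [
    {"ID": 1, "Title": "OldTerm1", "IdForTermSet": "set-a", "IdForTerm": "guid-old-1"},
    {"ID": 2, "Title": "NewTerm1", "IdForTermSet": "set-a", "IdForTerm": "guid-new-1"},
    {"ID": 3, "Title": "NewTerm1", "IdForTermSet": "set-a", "IdForTerm": "GUID-NEW-1"},
]

duplicates = find_duplicate_terms(rows)
for (term_set, term), item_ids in duplicates.items():
    print(f"Term {term} in term set {term_set} has list items {item_ids}")
```

A check like this could be run after each deployment to see whether the import has created a second entry for a term that already existed.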


 Reply 1


A.

 

After doing all the above steps once:

The Taxonomy Hidden List has all the terms as desired, the term store is up to date, and the SQL content DB has the entry as it should be.

This means the Taxonomy Update Scheduler is not able to update the content DB so that the right values are displayed.

 

Do we need some kind of service pack here? We have:

Microsoft.SharePoint.dll version 14.0.4762.1000

Microsoft.SharePoint.Taxonomy.dll  version 14.0.4756.1000

 

I tried TaxonomySession.SyncHiddenList(mySiteCollection); but it did not help.

 

 

B.

If I run my Managed Metadata service, web application pool and SharePoint Timer Job under an "Admin Everywhere" account on a production replica and follow the steps above, together with TaxonomySession.SyncHiddenList(mySiteCollection);, it still does not help.

C.

If I run all the app pools and Windows services on the server under an account that has admin rights everywhere, and also call TaxonomySession.SyncHiddenList(mySiteCollection);, it helps. I had to insert the SyncHiddenList call, together with a manual run of the Taxonomy Migration Job, between steps 4 and 5 above, i.e. after the Taxonomy Hidden List item has migrated and before the actual content comes in.

 

 

 

 

 

 

Is there a shorter way to avoid all the mess described above? Or could you at least point out what else, besides my Managed Metadata service app pool, web application pool and SharePoint Timer Job, is involved in taxonomies?

 

Reply 2


Temporary Solution:
Everywhere on MSDN and on blogs, people suggest sharing a common taxonomy store between the staging and production environments. In our case we cannot do this. So, with our version of SharePoint (Microsoft.SharePoint.dll version 14.0.4762.1000, Microsoft.SharePoint.Taxonomy.dll version 14.0.4756.1000), we have tested the following procedure and found it to work as an alternative:

 

0. On staging and production, go to Central Admin > Security > Configure Service Accounts.

a. Select Farm Account and press OK. The account you previously chose as the farm account is then granted full control in many places in the databases. If the databases were backed up and restored offline, this important coupling between the farm account and the databases is generally lost. This step makes sure that the timer service, which runs under the farm account, can do its work across the whole farm without errors.

b. Do the same for the app pool accounts of the Managed Metadata service and of your target web applications.

1. Run an export and import so that staging and production are in sync. Halt changes on staging.
2. Create the new term on staging under the desired term set. Extract the GUID of this new term and keep it safe.
3. Create an unpublished SPItem on staging that refers to the new term (to create an entry in the Taxonomy Hidden List).
4. Call TaxonomySession.SyncHiddenList(mySiteCollectionStaging). Before the sync, make sure your SharePoint Timer Job is running under an account that has full rights on the site collection and on the taxonomy database. This can be done by making that account an administrator of the site collection and of the taxonomy service. Afterwards, wait for "Enterprise Metadata site data update" and "Taxonomy Update Scheduler" to run once as scheduled.

5. Delete the unpublished item from step 3 if you wish.

6. Run the export process and discard this export package, because it contains all the terms, since they were updated by the sync process.
A. Reproduce the same term, with the same GUID and parent term set, on production by using the TermSetItem.CreateTerm Method (String, Int32, Guid). Take the GUID and name from step 2 above.
B. Create an unpublished dummy item on production that uses the new term.
C. Call TaxonomySession.SyncHiddenList(mySiteCollectionProd). Afterwards, wait for "Enterprise Metadata site data update" and "Taxonomy Update Scheduler" to run once as scheduled.
D. Delete the unpublished item if you wish.

The servers are now ready for normal imports and exports without errors, until you introduce new terms again.

 

Reply 3


Permanent Solution:

 

How Microsoft should help us in this:

1. Make the content deployment APIs smart enough that the source Taxonomy Hidden List is not marked for migration when a site collection is exported. When an SPItem that refers to a term is created on the target, the right (in sync) term should then be created automatically in the target Taxonomy Hidden List.

2. If the target term store is not up to date, there should be explicit messages in the import log saying that a referenced term may be missing from the target term store, that the import was unsuccessful, and that the term store should be updated and the import run again. Alternatively, let the SharePoint Timer Job create the term in the right place and raise an error if the parent term set is missing.
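
Until something like point 2 exists, a pre-flight check of our own can give the same kind of message before the import runs. The Python sketch below is only an illustration: it assumes you can obtain, by whatever means, the term GUIDs referenced by the items in the export package and the term GUIDs present in the target term store. The function name, the message wording and the sample values are all hypothetical.

```python
def preflight_report(referenced_terms, target_term_guids):
    """Return one warning per referenced term missing from the target store.

    referenced_terms: dict mapping term GUID -> label, taken from the export.
    target_term_guids: iterable of term GUIDs present in the target term store.
    GUIDs are compared case-insensitively.
    """
    target = {guid.lower() for guid in target_term_guids}
    missing = sorted(
        (label, guid)
        for guid, label in referenced_terms.items()
        if guid.lower() not in target
    )
    return [
        f"Term '{label}' ({guid}) is referenced but missing from the target "
        "term store. Update the term store and run the import again."
        for label, guid in missing
    ]


# Placeholder identifiers, not real GUIDs.
referenced = {
    "OldTerm1GUID": "OldTerm1",
    "NewTerm1GUID": "NewTerm1",
}
target_store = ["oldterm1guid"]

messages = preflight_report(referenced, target_store)
for message in messages:
    print(message)
```

If the report is not empty, the term should first be created on the target with the same GUID (steps A to D of the temporary solution) before the import is started.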
