Wednesday, October 9, 2013

The Mysterious Case Of The Nested Groups & Orphaned SCOM Gateway SiteNames

UPDATE 11-21-2013
With the release of SCOM 2012 R2, Microsoft has fixed this issue and also published an article on how to deal with it. It's better to try their methods FIRST because they're officially supported. Read this posting of mine for all the details.

WARNING!!!
This posting contains UNSUPPORTED methods to solve issues. So be careful. Also good to know: all code samples are provided 'AS IS' without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.

OUCH!
Some weeks ago I had a strange situation in my SCOM 2012 SP1 UR#3 MG: ALL Groups were nested!

I was really puzzled about it. Normally the SCOM MG had a long list of Groups but now it looked like this:
[Screenshot: the Groups view]

On top of it all, the properties of these two Groups couldn’t be checked at all:
[Screenshot: the Group properties]

The names of these two Groups puzzled me as well. Then I got a bad feeling: they were the values of the SiteName switch I had used for a demonstration of SCOM Gateway Server functionality.

For that demonstration I had built myself two SCOM Gateway Servers, both using the SiteName switch. Since it’s a demo environment, I removed these two SCOM Gateway Servers the quick & dirty way. Two VMs they were, running on a trial license. So I simply removed both VMs without giving it a second thought. Yes, also removed them from AD and the required SCOM actions (removing both Gateway Servers) were planned for another day…

The situation gets even worse…
However, this approach turned out badly and came back to bite me. And yes, in the SCOM Console both SCOM Gateway Servers were still listed. So I decided to remove those two Gateway Servers by running the Gateway Approval Tool to delete them.

I know, it’s the wrong order of things but as it turned out, removing a SCOM Gateway Server which uses the SiteName switch isn’t that easy at all, even without my own wrong actions (the quick & dirty removal of the SCOM Gateway Servers).

And now I got other errors to deal with as well:
[Screenshot: the errors]

And this puzzled me since the SCOM Gateway Servers were gone. All objects reporting to those SCOM Gateway Servers were removed as well, so there was simply NOTHING reporting anything to these non-existent SCOM Gateway Servers.

Now what? Thanks to the help of Daniele Grandini this puzzle got solved, and this posting exists because of him. So I want to say a BIG thanks to my Italian MVP buddy, Daniele.

Italy to the rescue :-)
First I had to run a query against the OperationsManager database in order to obtain information for the second query.

Query 1:

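    -- Read-only: makes no changes to the OperationsManager database.
    DECLARE @NodeHS nvarchar(255)
    SET @NodeHS = N'Microsoft.SystemCenter.HealthService:FQDN of GatewayServer'

    -- The Health Service entity of the Gateway Server
    SELECT BaseManagedEntityId FROM BaseManagedEntity WHERE FullName = @NodeHS

    -- The "should manage entity" relationships that still have this Health Service as source,
    -- together with the discovery source each one belongs to.
    -- DSR.RelationshipId identifies the relationship itself; RV.RelationshipId is its type.
    SELECT DSR.DiscoverySourceId, DSR.RelationshipId, RV.SourceObjectFullName,
           RV.TargetObjectFullName, RV.RelationshipId AS RelationshipTypeId
    FROM dbo.RelationshipGenericView RV
    INNER JOIN dbo.DiscoverySourceToRelationship DSR ON DSR.RelationshipId = RV.Id
    WHERE RV.RelationshipId = dbo.fn_ManagedTypeId_MicrosoftSystemCenterHealthServiceShouldManageEntity()
      AND RV.IsDeleted = 0
      AND RV.SourceObjectId IN (SELECT BaseManagedEntityId FROM BaseManagedEntity WHERE FullName = @NodeHS)

In the SET statement, replace FQDN of GatewayServer with the FQDN of the Gateway Server you want to remove. This is the output I got: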
[Screenshot: output of Query 1]
For the second query the DiscoverySourceId and RelationshipId are required.


Warning
Up to this point, no modifications have been made to the SCOM database. The second query, however, WILL modify the SCOM database, which is UNSUPPORTED. So think twice before proceeding.


Query 2:



    -- UNSUPPORTED: this modifies the OperationsManager database.
    DECLARE @Utc datetime
    SET @Utc = GETUTCDATE()

    -- Remove the relationship from the scope of its discovery source
    EXEC dbo.p_RemoveRelationshipFromDiscoverySourceScope
        @RelationshipID = 'RelationshipID returned from previous query',
        @DiscoverySourceId = 'Discovery Source GUID returned from previous query',
        @TimeGenerated = @Utc

In the EXEC statement, replace RelationshipID returned from previous query and Discovery Source GUID returned from previous query with the values returned by Query 1. This is the output I got:
[Screenshot: output of Query 2]
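
Before going back to the Gateway Approval Tool, it can be reassuring to check whether anything is still left. The query below is only a sketch and was not part of the original troubleshooting: it reuses the views, function and columns from Query 1 and simply counts the matching rows. My assumption is that the count drops to 0 once every relationship listed by Query 1 has been handled by Query 2, but I haven't verified that on every SCOM version. It is read-only and makes no changes.

```sql
-- Read-only check (assumption: same schema as used in Query 1).
DECLARE @NodeHS nvarchar(255)
SET @NodeHS = N'Microsoft.SystemCenter.HealthService:FQDN of GatewayServer'

-- Count the "should manage entity" relationships still tied to the Gateway Server
SELECT COUNT(*) AS RemainingRelationships
FROM dbo.RelationshipGenericView RV
INNER JOIN dbo.DiscoverySourceToRelationship DSR ON DSR.RelationshipId = RV.Id
WHERE RV.RelationshipId = dbo.fn_ManagedTypeId_MicrosoftSystemCenterHealthServiceShouldManageEntity()
  AND RV.IsDeleted = 0
  AND RV.SourceObjectId IN (SELECT BaseManagedEntityId FROM dbo.BaseManagedEntity WHERE FullName = @NodeHS)
```

As with Query 1, replace FQDN of GatewayServer with the FQDN of the Gateway Server in question. If the count is not 0, Query 1 shows which relationships remain.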


So now it was time to run the Gateway Approval Tool again in order to remove the non-existent Gateway Server:
[Screenshot: the Gateway Approval Tool]

Yes! The Gateway Server is removed now.


I repeated all steps for the other orphaned Site Name and non-existent Gateway Server as well. And now the Group view was okay again:
[Screenshot: the Groups view]


So thanks to Daniele I solved this issue. Awesome!


When you want to know more about SCOM 2012 Gateway Server and Site Names, read Daniele’s posting: The road to Operations Manager 2012 – Sites and gateways. It covers many details and gives you a better understanding about this posting as well.


Credits
Again, this posting couldn’t have been written without the help of Daniele Grandini. So all credits go to him.
