Wednesday, January 28, 2015

OM12x: Changing the Passwords of OM12x Service Accounts

Why this posting?
Normally I advise my customers NOT to change the passwords of the OM12x Service Accounts. Instead, create the required OM12x Service Accounts with passwords which are impossible to remember and store them in a dedicated account/password application, like KeePass. Set the accounts so their passwords never expire and be done with it.
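For the "impossible to remember" part, here is a minimal Python sketch that generates such a password. The length and the symbol set are my own assumptions, not an official policy, so adjust them to what your AD password policy allows.

```python
import secrets
import string

# Hypothetical policy values for illustration; adjust to your own rules.
PASSWORD_LENGTH = 32
ALPHABET = string.ascii_letters + string.digits + "!#$%&*+-=?@_"


def generate_service_password(length: int = PASSWORD_LENGTH) -> str:
    """Return a random password with a lower, upper, digit and symbol character."""
    if length < 4:
        raise ValueError("length must be at least 4")
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        # Retry until all four character classes are present.
        if (any(c.islower() for c in candidate)
                and any(c.isupper() for c in candidate)
                and any(c.isdigit() for c in candidate)
                and any(not c.isalnum() for c in candidate)):
            return candidate
```

The secrets module is used instead of random because it is meant for security-sensitive values. Paste the result straight into your password application, not into a document.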

In many situations this works like a charm. However, there are organizations where this approach is simply forbidden. They have to adhere to all kinds of laws, regulations, rules and so on. In worlds like this, service accounts with a ‘set & forget’ approach aren’t allowed at all and will make them fail their security audits.

In a world like that the passwords used by the OM12x service accounts must be renewed, once per 100 days for instance. And yes, this can be done, and Microsoft even provides procedures for it. But like many other procedures, they aren’t spot on. Therefore I’ve decided to write this posting, all about changing the passwords of the OM12x service accounts AND having a fully functional SCOM 2012x environment afterwards.

What to do?
There is a lot to do, both in preparing everything and in executing it. Therefore I’ve divided it into three parts: Preparation, Execution and Aftercare.

A: Preparation
During this phase you’re not going to change anything. You just make sure you’ve got everything in place and ready for the change itself. This isn’t about technology only, but more about processes, people & organization.

01: Change Management
Yes. I know. Change Management can be a pain in the backside. But still many companies use it (thankfully!) and it helps to prevent outages and other not so nice things. So simply follow the rules of your company. Or be toast…

Modification of the passwords of the SCOM 2012x Service Accounts IS a change. So treat it as such and plan to do it outside the regular production time. Or plan it during a time frame when a short outage of the SCOM environment is acceptable.

Since it’s something which comes back on a regular basis, make a template out of the Change Request. This way it’s far easier to follow the change management procedure. Store that document in your SharePoint portal and notify your team about the existence of this document.

During this phase you allocate resources (who’s going to do what) and also plan a date and a time (who’s going to do it when). It also helps you to see whether this change coincides with other changes/actions which don’t allow the monitoring solution to be unavailable for some time.
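To help with picking the date, a small Python sketch that calculates when the next password change is due. It assumes the 100-day interval mentioned earlier as an example; the function names are hypothetical.

```python
from datetime import date, timedelta

# Example interval from this posting; use the value your policy dictates.
ROTATION_INTERVAL_DAYS = 100


def next_rotation_due(last_changed: date,
                      interval_days: int = ROTATION_INTERVAL_DAYS) -> date:
    """Return the latest date on which the passwords must be changed again."""
    return last_changed + timedelta(days=interval_days)


def days_until_due(last_changed: date, today: date,
                   interval_days: int = ROTATION_INTERVAL_DAYS) -> int:
    """Return the number of days left; negative means the change is overdue."""
    return (next_rotation_due(last_changed, interval_days) - today).days
```

Plan the change request well before the due date, so there is room to move it when it clashes with another change.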

02: Run an inventory
Know what you have and where. So make an inventory of your SCOM 2012x MG:

  1. What are the SCOM 2012x Management Servers (MS) ?
  2. What SCOM 2012x MS servers host the SCOM Web Console?
  3. Are there any SCOM 2012x Gateway Servers?
  4. Which SQL servers/instances host the Operational SQL database?
  5. Which SQL servers/instances host the Data Warehouse database?
  6. Which SQL server/instance hosts the Reporting Services for SCOM 2012x Reporting?
  7. What is the Active Directory SCOM Agent Action account?
  8. What is the Active Directory SCOM SDK/Data Access/ Configuration account?
  9. What is the Active Directory SCOM DW Read account?
  10. What is the Active Directory SCOM DW Write account?
  11. Are there any 3rd party service accounts (e.g. Veeam) which require password modifications as well?
  12. Are there any 3rd party servers used for extended monitoring (e.g. Veeam) which require attention as well?

Since this kind of information is required every time when a password change is due, simply DOCUMENT it, use good versioning, also store it on your SharePoint portal (DON’T write the passwords down in the same document!!!) and notify your team about this document. And when something changes (like newly added Gateway or MS Servers) update this document accordingly.

This way you’ve got a single place where you can find this information.
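The inventory above could also be kept in a structured form, so it is easy to see which questions are still unanswered. A minimal Python sketch, with field names I made up for this example (they are not SCOM terms), and without any passwords in it:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ScomInventory:
    """One record per Management Group; all field names are illustrative."""
    management_servers: List[str] = field(default_factory=list)
    web_console_servers: List[str] = field(default_factory=list)   # may be empty
    gateway_servers: List[str] = field(default_factory=list)       # may be empty
    operational_db_instances: List[str] = field(default_factory=list)
    data_warehouse_instances: List[str] = field(default_factory=list)
    reporting_instance: str = ""
    action_account: str = ""
    sdk_account: str = ""
    dw_read_account: str = ""
    dw_write_account: str = ""
    third_party_accounts: List[str] = field(default_factory=list)  # may be empty
    third_party_servers: List[str] = field(default_factory=list)   # may be empty


# Items assumed to need an answer in every Management Group.
REQUIRED = (
    "management_servers", "operational_db_instances",
    "data_warehouse_instances", "reporting_instance",
    "action_account", "sdk_account", "dw_read_account", "dw_write_account",
)


def missing_items(inventory: ScomInventory) -> List[str]:
    """Return the required inventory items that are still empty."""
    return [name for name in REQUIRED if not getattr(inventory, name)]
```

Run missing_items() before the change window; an empty list means the required part of the inventory is complete.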

03: Write a procedure for the password change
Make a document, describing at a high level the required steps and their order. In the same document describe every step in detail – with screenshots of the most ‘exciting’ parts – so this knowledge/experience is transferable to members of your team.

And when executing the change, use this document and modify it as required. Also store this document on your SharePoint portal and notify your team about the existence of this document.

04: Read, read and read
There are many resources available on the internet about this password change. So READ it. This helps you not to wander into the unknown.

05: Test yourself before you wreck yourself
When you’re lucky to have a SCOM 2012x test environment in place, you can test the procedures there before using them in your production environment. But please be aware that a SCOM 2012x test environment often does not represent your SCOM 2012x production environment. Things can be missing or less distributed, like a single SQL Server instance hosting BOTH SCOM 2012x databases and running the SQL Server Reporting Services for SCOM Reporting as well, while in production this is hosted by many more SQL Server instances and perhaps even in a highly available scenario.

06: Garbage in, garbage out…
Meaning: when your SCOM environment is already having issues, DON’T change the passwords!!! First FIX those issues so your SCOM environment is healthy BEFORE you change the passwords. Might sound stupid to mention it, but you would be surprised…

Recap phase A – Preparation
As you can see – except for Step 05 that is – you haven’t done anything technical. No geek stuff. Writing documents, filing a request for change and stuff like that. But this is VERY important. So do it for a full 100% since it will be the foundation for the actual password change itself.

B: Execution
In this phase you’re going to change the passwords of the SCOM 2012x service accounts. High level overview of the steps involved:

  1. Change the passwords of the SCOM 2012x service accounts in AD;
  2. Change the passwords of the SCOM 2012x service accounts for the related services on the SCOM 2012x MS servers;
  3. Change the passwords of the SCOM 2012x service accounts in the SCOM Console (Run-As-Accounts);
  4. Change the passwords for SCOM Reporting in Reporting Services Configuration Manager Console;
  5. Change the passwords in IIS for the related Application Pools (when affected);
  6. Change the passwords in 3rd party SCOM 2012x tooling/MPs/Add-ons (when affected).

Before you start, empty the Operations Manager event logs on the SCOM 2012x MS servers so it’s easier to see what’s happening after the password change in AD. Also open multiple services.msc consoles, each one connected to one SCOM 2012x MS server. This makes it easier to change the passwords for the related SCOM 2012x services in a single RDP session, instead of switching to different RDP sessions.
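Since the order of the six steps matters, the overview above can be turned into a small checklist that refuses a step when an earlier one is still open. This is only a Python sketch to track progress during the change; the class and its names are hypothetical and it doesn’t touch AD or SCOM itself.

```python
EXECUTION_STEPS = (
    "Change the passwords in AD",
    "Change the service passwords on all MS servers",
    "Change the Run As Accounts in the SCOM Console",
    "Change the passwords in Reporting Services Configuration Manager",
    "Change the IIS Application Pool identities (when affected)",
    "Change the passwords in 3rd party tooling (when affected)",
)


class ChangeRun:
    """Tracks progress and refuses steps that are done out of order."""

    def __init__(self, steps=EXECUTION_STEPS):
        self.steps = list(steps)
        self.completed = 0  # number of steps finished so far

    def complete(self, step):
        if self.completed == len(self.steps):
            raise ValueError("all steps are already completed")
        expected = self.steps[self.completed]
        if step != expected:
            raise ValueError(f"expected {expected!r}, got {step!r}")
        self.completed += 1

    def remaining(self):
        """Return the steps that still have to be done, in order."""
        return self.steps[self.completed:]
```

When remaining() is not empty at the end of the change window, you know exactly where you stopped and what is still broken.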

01: Change the passwords in AD for the SCOM 2012x service accounts
This one is straightforward. I am NOT going to explain this one.
image

02: Change the passwords for the SCOM 2012x services on the SCOM 2012x MS servers
In the services.msc console, modify the passwords as required for the services System Center Data Access Service and System Center Management Configuration. Do this for ALL SCOM 2012x MS servers!

  1. Open the properties of the service, go to the tab Log on, and change the password
    image
  2. Apply it and restart the service.
  3. Repeat Steps 1 and 2 for the other service. Repeat this per SCOM 2012x MS server.
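With several MS servers it’s easy to forget one service on one server. A minimal Python sketch that builds the full list of server/service combinations from the steps above and shows what is still pending. The server names in the usage note are made up.

```python
from itertools import product

# The two services named in this step.
SCOM_SERVICES = (
    "System Center Data Access Service",
    "System Center Management Configuration",
)


def build_tasks(management_servers):
    """Return one (server, service) task per combination, all still pending."""
    return {task: False for task in product(management_servers, SCOM_SERVICES)}


def pending(tasks):
    """Return the (server, service) pairs that have not been updated yet."""
    return [task for task, done in tasks.items() if not done]
```

For example, build_tasks(["MS01", "MS02"]) gives four tasks; set a task to True once the password is changed and the service is restarted.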

03: Change the passwords in the SCOM 2012x Console (Run-As-Accounts)
Now you need to open the SCOM 2012x Console, log on with SCOM admin permissions.

  1. Administration pane > Administration > Run As Configuration > Accounts
  2. Select Action Account > your SCOM Action Account (in this example SC\OM12Action) > open it > tab Credentials > modify the passwords > Apply
    image
  3. Repeat this for all SCOM 2012x accounts in your Console like Windows > Data Warehouse Action Account and Windows > Data Warehouse Report Deployment Account.
  4. Check ALL the Run As Accounts since this configuration can be different per MG!
  5. When done, RESTART ALL SCOM 2012x MS Services on your SCOM 2012x MS servers, Health Service (Microsoft Monitoring Agent) included. This will make SCOM process the changes faster. Also some errors will go away since SCOM will apply the password changes.

04: Change the passwords for SCOM Reporting in Reporting Services Configuration Manager Console
Now it’s time to open Reporting Services Configuration Manager Console for the SQL Server instance hosting this service for SCOM 2012x Reporting and to change the passwords there as well.

  1. Open Reporting Services Configuration Manager Console and connect to the SQL Server instance hosting SSRS for SCOM 2012x Reporting;
  2. Go to Service Account > modify the password > Apply
    image
  3. Check the Results pane in order to see all went well:
    image
  4. This is a beast when you forget it because Reporting won’t run anymore. You’ll get errors like these when trying to connect to Report Manager:
    image

    And when connecting to Reporting Services:
    image

    So go to Database > Change Credentials
    image

    Hit the button Test Connection and when you get an all is okay message > OK > Next
    image

    Enter the new password > Next
    image

    Summary page > Next > Check the status when finished, it MUST be Success > Finish.
    image
  5. Go to Execution Account > modify the password > Apply
    image
  6. Check the Results pane in order to see all went well:
    image
  7. Open the services.msc console and open the properties for the SQL Server Reporting Services (YOUR SQL SERVER INSTANCE), in this example SQL Server Reporting Services (MSSQLSERVER). Go to the tab Log on and modify the password > Apply. Don’t forget to restart that service!
    image
  8. All should be okay now. So simply test it in the Reporting Services Configuration Manager Console at the Web Service URL entry and Report Manager URL entry by clicking on their URLs:
    image
    Result (when all is okay):
    image

    And:
    image
    Result (when all is okay):
    image

    When something isn’t okay, check all previously mentioned steps again. This approach should work.
  9. Close the Reporting Services Configuration Manager Console.

05: Change the passwords in IIS for the related Application Pools
Only when affected, for instance when you’re running 3rd party add-ons like the Remote Maintenance Mode Scheduler.

  1. Open IIS Manager and connect to the server which is hosting the Application Pool which requires the password modification > Application Pools. Now you see a complete list of Application Pools and under what account they run. Mostly it’s ApplicationPoolIdentity but sometimes a SCOM 2012x service account is being used:
    image
  2. Right click on that Application Pool using the SCOM 2012x service account credentials > Advanced Settings > Process Model > Identity > click on the button with the three dots > Set and enter the correct credentials
    image
  3. > OK > OK > OK. Restart this Application Pool.
  4. Test it in the IIS Admin console: Go to Sites > Default Web Site (or any other main web site under which the SCOM 2012x web applications/sites are placed) > select the Application Pool you just modified > right click > Manage Application > Browse. IE will open the URL now and when all is okay, the website will be properly loaded.
  5. Repeat Step 4 for ALL applications in order to see all is okay:
    image
  6. When you’re having issues with the SCOM 2012x Web Console: relax. That website is a pain but easy to fix:
    - Open an elevated cmd-prompt and type IISRESET. Wait until IIS is running again. Test the SCOM Web Console again. When not okay, go to this posting of mine, run the commands as discussed AND run an IISRESET afterwards. Then all will be okay again. When the Web Console is still broken, go to this blog posting of Bob Cornelissen and follow his advice. And again, relax. The SCOM 2012x Web Console is a pain and breaks very easily.

06: Change the passwords in 3rd party SCOM 2012x tooling/MPs/Add-ons
This one is straightforward. Simply follow the guidelines outlined by the 3rd party about how to do that. When you can’t find those guidelines, call their support team and let them send you the required manual.

Recap phase B – Execution
Wow! That was some serious work. As you can see, changing the passwords of the SCOM 2012x service accounts isn’t like a walk in the park. It needs some serious attention. In the last phase you’re going to test whether SCOM is operating like it was before the password change.

C: Aftercare
In this phase you’re going to check AND double check in order to see EVERYTHING is all right before you sign it off. You don’t want to tell the team the change has gone well while some stuff is broken… Ouch! Bad for your reputation and perhaps even your career.

So: CHECK YOURSELF BEFORE YOU WRECK YOURSELF. (Pete Zerger and Anders Bengtsson taught me this slogan during a MMS conference, and I love it).

01: OperationsManager event logs on the SCOM 2012x MS servers
Those logs? I love them since they tell me so much about the current state and health of the SCOM environment. No need to open the SCOM Console (just yet!).

So check the event log and see whether there are any Warnings or Criticals requiring immediate attention, like not being able to store data in the Data Warehouse. Run this check on ALL your SCOM MS servers.
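When you export the events from several MS servers, a few lines of Python can do the first triage. This is a sketch only: it assumes every event is a dict with the keys "server", "level" and "message", which is a format I chose for this example and not what any SCOM tool exports by default.

```python
from collections import defaultdict

# Windows event levels that deserve a look after the password change.
ATTENTION_LEVELS = {"Critical", "Error", "Warning"}


def events_needing_attention(events):
    """Group events at Warning level or worse by server name."""
    by_server = defaultdict(list)
    for event in events:
        if event["level"] in ATTENTION_LEVELS:
            by_server[event["server"]].append(event)
    return dict(by_server)
```

A server that doesn’t show up in the result had nothing but Information events, which is what you want to see.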

02: SCOM Console
Open the SCOM Console. Does it work? No SDK errors? Phew! At least you got that part right. But this is only the start of all your checks, there is much more:

  1. Monitoring > Active Alerts
    Check whether new Alerts do come in. How old are they? Can you open them? Can you modify the resolution state? Can you reset them (Monitors), can you close them (Rules)?
  2. Monitoring > Operations Manager > Management Group Health.
    Is everything green? And when not, open Health Explorer and investigate AND fix it.
  3. Reporting > Reporting
    Give the reporting tree a refresh (F5). Does it show the available reports? Can you run the reports? No errors?
  4. Administration > Administration > Settings > Web Address
    Can you open the SCOM Web Console? No errors? Can you browse in it? Can you open the Web Console based Health Explorer? Can you open the properties screen of an Alert?
  5. Administration > Administration > Settings > Reporting
    Open it, copy and paste the URL in IE. Can you open it in IE? Can you browse it?

03: IIS Admin console
Even though you have already tested during the execution phase, test it AGAIN. It’s always better to know for a full 100% that all is okay than to have someone else tell you it isn’t… Ouch!

  1. Go to Sites > Default Web Site (or any other main web site under which the SCOM 2012x web applications/sites are placed) > right click the first application on top of the list > Manage Application > Browse. IE will open the URL now and when all is okay, the website will be properly loaded.
  2. Repeat Step 1 for ALL applications in order to see all is okay:
    image

04: 3rd party add-ons, consoles, MPs, tooling and whatever you’ve got
Test those too, so you know they are okay as well.

Recap phase C – Aftercare
Congrats! Time to sign it off. The password change of the SCOM 2012x service accounts didn’t wreck your SCOM environment. Awesome! Now you can tell your team SCOM is up & running again AND fully functional.

Total recap
As you can see, modifying the passwords of the SCOM 2012x service accounts isn’t rocket science. Yet it requires thorough preparation, and once started you need to follow a certain order and not stop halfway through, since that will result in a (temporarily) broken SCOM environment.

The first time is always the biggest challenge. But when you make good documentation, keep it up to date, store it in your company’s SharePoint portal AND notify your team about the existence of those documents, you’ll see that the next time it will be easier to do.

Monday, January 26, 2015

OM12 Quick Trick: Overview Of Clean & Dirty Server Reboots


Update 01-26-2015
A regular reader of my blog pointed out a BIG mistake in this posting. You can’t use Event Collection Rules to send out Notifications. Simply because Notifications REQUIRE Alerts to work. And guess what? Event Collection Rules DON’T create Alerts, so NO Notifications are sent out. Ever! Now I come to think about it, I wonder why I made such a mistake. Like I am a total newbie to SCOM… Therefore I decided to pull this posting, update it and repost it. A BIG word of thanks to Keith Kleiman who pointed this out to me.

Why this posting?
Many times I am asked whether SCOM can show which Windows Servers got a reboot, and whether it was a clean restart (e.g. initiated by an administrator or a PS script, whether planned or unplanned) or a dirty reboot (unexpected, e.g. a power off or a system failure).

The funny thing is, SCOM already collects that kind of information. However, it’s not shown by default in the SCOM Console. All you have to do is build two Views and – when required – adjust the Notification Model so those server reboots (only the dirty ones or the clean ones as well) are sent out by SCOM as an SMS or e-mail message.

Another trick I want to share with you is to build such a View first in the ‘My Workspace’ area of the SCOM Console. For multiple reasons, but these are the two main ones:

  1. All the Views you create here aren’t stored in a MP like all the other Views. So you get to see the end results much faster, also allowing you to tweak it until it fits the requirements of your organization. This way you’re able to build & test the View much faster without impacting other users;
  2. Whenever you build a View in the Monitoring pane it’s stored in a MP which costs time and uses CPU time on the SCOM MS servers. Sometimes those CPU resources can’t be spared for it, especially in the really big SCOM environments. I’ve seen SCOM environments having issues with updating MPs, stalling the SCOM Consoles. In SCOM 2012x issues like these are mostly gone, but in SCOM 2007x it happened regularly.

Procedure
Open the SCOM Console with Admin permissions and go to the My Workspace area.

Clean Server Restarts View

  1. Right click on Favorite Views > New > Event View;
  2. Name: Clean Server Restarts. Give it a proper Description > Under the header Select conditions, select the option with a specific event number. Now you’ve got a screen like this one:
    image
  3. Click on the blue link ‘specific’ and you’ll get this screen:
    image
    You need EventID 6006
  4. OK > OK. The View is now created and will be SOON available:
    image
  5. However, the Computer Names are missing. Right click on the new View > Properties > tab Display and select the Column Logging Computer and move it to the required position using the arrow button:
    image
  6. Now the View is ready:
    image

Dirty Server Restarts View

  1. Right click on Favorite Views > New > Event View;
  2. Name: Dirty Server Restarts. Give it a proper Description > Under the header Select conditions, select the option with a specific event number. Now you’ve got a screen like this one:
    image
  3. Click on the blue link ‘specific’ and you’ll get this screen:
    image
    You need EventID 6008
  4. OK > OK. The View is now created and will be SOON available:
    image
  5. However, the Computer Names are missing. Right click on the new View > Properties > tab Display and select the Column Logging Computer and move it to the required position using the arrow button:
    image
  6. > OK. Now the View is ready:
    image

When both these Views are okay you can rebuild them in the Monitoring pane. Remember to build a MP FIRST and not to add these new Views to the root since they’ll end up in the Default MP and that’s a No Go Area!
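What the two Views do comes down to filtering on two Event IDs: 6006 for a clean restart and 6008 for a dirty one. The same logic as a Python sketch, in case you want to count reboots per server from an export. The dict keys "event_id" and "logging_computer" are assumptions I made for this example.

```python
from collections import Counter

# Event IDs used by the two Views in this posting.
REBOOT_KINDS = {6006: "clean", 6008: "dirty"}


def count_reboots(events):
    """Return {computer: Counter({'clean': n, 'dirty': m})} for reboot events."""
    counts = {}
    for event in events:
        kind = REBOOT_KINDS.get(event["event_id"])
        if kind is None:
            continue  # not a reboot-related event
        per_computer = counts.setdefault(event["logging_computer"], Counter())
        per_computer[kind] += 1
    return counts
```

A Counter returns 0 for a kind that never occurred, so a server with only clean restarts simply reports zero dirty ones.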

Okay that’s nice. But how about sending out Notifications?
A bit more work is required here. The reason is that the Rules feeding the Views above can’t be used to send out Notifications. Why? Notifications only work for Alerts being shown in the Console. But those Rules are EVENT COLLECTION Rules, so they’ll NEVER create an Alert. As a result a Notification won’t be sent out based on those Event Collection Rules.

This means you’ve to create two new Alert Generating Rules; one for clean server restarts, and another one for dirty server reboots. Disable them by default and enable them for a special Group of servers you want to keep track of, like the production servers for instance. Of course, you can also enable these two Rules for the Groups containing the different Windows Server versions, like Windows Server 2003, 2008x and 2012x. By creating two Rules, disabling them by default and enabling them through an Override targeted at Groups of your choice, you circumvent the situation where you would have to create these Rules more than once.
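The idea of ‘disabled by default, enabled through an Override for a Group’ can be sketched in a few lines of Python. Be aware that this is a heavily simplified model I wrote for illustration: real SCOM Override precedence has more rules than this, and the function and group contents are hypothetical.

```python
def rule_enabled_for(server, groups, overrides, default_enabled=False):
    """Return whether a Rule is effectively enabled for a server.

    groups:    dict of group name -> set of member server names
    overrides: dict of group name -> True/False (the Override value)
    Simplification: the first Override whose group contains the server wins.
    """
    for group, enabled in overrides.items():
        if server in groups.get(group, set()):
            return enabled
    return default_enabled
```

So one Rule, disabled by default, is enough: every server outside the overridden Groups keeps the default and never raises the Alert.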

A: Alert Generating Rule for Clean server restarts
What we need is Event ID 6006 from the System event log. This is easy and straight forward as well:

  1. SCOM Console > Authoring > Management Pack Objects > Rules > right click > Create New Rule > Alert Generating Rules > Event Based > NT Event Log (Alert) > Choose a proper Unsealed MP or create a new one (e.g. _Server Reboots) > Next
    image
  2. Give the Rule a proper Name (e.g. Clean Server Restart Alert) and Description. Choose as Rule target Windows Computer. I know this doesn’t adhere to the Best Practices of MP authoring since the target Windows Computer is way too generic, like aiming with buckshot. But in this case we DISABLE the Rule and enable it later on through one or more Overrides, aimed at specific Groups, containing a special set of Windows Servers. So in this case it’s not that bad. > Next
    image
  3. Type: System > Next
    image
  4. Type 6006 as Value for the Parameter Name Event ID.
    image
    Select the line Event Source by clicking on the left side of it and click on Delete. Now you’ve got this:
    image
    > Next
  5. Give it a proper Alert Name (e.g. A Windows Server Got a Clean Reboot), remove the entry in the Alert Description field and type Server[space] > hit the button with the three dots and select Target > DNS Name > and type now: [space] got a clean restart. Now you’ve got something like this:
    image
    > OK. Since it’s a CLEAN server restart, I expect it to be planned. So a Critical Alert is a bit too much so I’ve set it to be a WARNING Alert with Priority High > Create
    image
  6. The Rule is being created now. When it’s in place, enable it through an Override for a Group containing those Servers you want to be alerted about when they get a clean restart. In this example I’ve enabled this Rule for the Group Windows Server 2012 R2 Computer Group:
    image
  7. Let’s test it and gracefully restart a Windows Server 2012 R2 server, monitored by SCOM:
    image
    Bingo! So now you can use this newly created Rule (Clean Server Restart Alert) for your Notification Model.

B: Alert Generating Rule for Dirty server reboots
In this case we need to create an Alert Generating Rule for Event ID 6008 from the System event log.

Basically the steps for authoring this Rule are almost the same as described in the steps for generating Alerts for Clean server restarts. Therefore I’ll only highlight the differences:

  1. No differences;
  2. Name is like: Dirty Server Reboot Alert and a good description, like:
    image
  3. No differences;
  4. Capture Event ID 6008:
    image
  5. Change the Alert Name to A Windows Server Got A Dirty Reboot! and modify the Alert Description as well so it displays that the server got a dirty reboot and additional checks are required, like this for example:
    image
    Since it’s a DIRTY reboot I set the Severity to Critical and the Priority to High:
    image
  6. Create the Rule and Override it as desired. In this example I Override it for the Group Windows Server 2012 R2 Computer Group.
  7. Let’s test it and give a monitored WS 2012 R2 server a dirty reboot:
    image
    Bingo! And now you can use this newly created Rule for your Notification Model as well.
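To summarize both Rules side by side, here is a Python sketch that models them as data and builds the Alert each one would raise. The names, Severity and Priority come from the steps above; the exact wording of the dirty reboot description is my own assumption, since the original only shows it in a screenshot.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AlertRule:
    event_id: int
    alert_name: str
    description_template: str
    severity: str
    priority: str


CLEAN_RULE = AlertRule(
    6006, "A Windows Server Got a Clean Reboot",
    "Server {dns_name} got a clean restart.", "Warning", "High")
DIRTY_RULE = AlertRule(
    6008, "A Windows Server Got A Dirty Reboot!",
    # Hypothetical wording; the original description is only in a screenshot.
    "Server {dns_name} got a dirty reboot. Additional checks are required.",
    "Critical", "High")
RULES = {rule.event_id: rule for rule in (CLEAN_RULE, DIRTY_RULE)}


def build_alert(event_id: int, dns_name: str) -> Optional[dict]:
    """Return the Alert for a System log event, or None when no Rule matches."""
    rule = RULES.get(event_id)
    if rule is None:
        return None
    return {
        "name": rule.alert_name,
        "description": rule.description_template.format(dns_name=dns_name),
        "severity": rule.severity,
        "priority": rule.priority,
    }
```

The {dns_name} placeholder plays the role of the Target > DNS Name variable picked in Step 5.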

Wow! And how about Reports?
That’s easy as well. Seriously.

  1. Reporting > Microsoft Generic Report Library > Custom Event > Open;
  2. Be patient > for From, choose an offset like a month or so (Advanced > Today > Minus > 1 > Month). Now you’ve got something like this:
    image
  3. Add Group > Windows Server Computer Group > Add. Now it looks like this:
    image
  4. Under Report Fields select these items in the SAME order(!):
    - Logging Computer
    - Object
    - Event ID
    - Level
    - Date
    By default the order is wrong. Use the arrow buttons to sort that out. It looks like this now:
    image
  5. Select Report Field Event ID > hit the button Filter > enter Event ID 6008
    image
  6. Run the Report once in order to test it.
    image
  7. When all is okay > File > Publish (or save to MP, go here for the differences) > give it a proper name like Dirty Server Reboots the last month.
  8. Repeat Steps 1 to 7 for another report about the clean server reboots. Only in Step 5 use Event ID 6006 and in Step 7 use another name like Clean Server Reboots the last Month.
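What the report does with its settings (a From offset of ‘Today minus 1 month’, a filter on Event ID and a fixed column order) can be sketched in Python like this. It assumes every event is a dict keyed by the column names used above, with a datetime.date in the Date column; that input format is my assumption and not something the report itself exports.

```python
import calendar
from datetime import date

# Column order as selected under Report Fields.
REPORT_COLUMNS = ("Logging Computer", "Object", "Event ID", "Level", "Date")


def minus_one_month(day: date) -> date:
    """Return the same day one month earlier, clamped to that month's length."""
    if day.month > 1:
        year, month = day.year, day.month - 1
    else:
        year, month = day.year - 1, 12
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def report_rows(events, event_id, today):
    """Return rows for one Event ID from the last month, oldest first."""
    start = minus_one_month(today)
    rows = [tuple(event[column] for column in REPORT_COLUMNS)
            for event in events
            if event["Event ID"] == event_id and start <= event["Date"] <= today]
    return sorted(rows, key=lambda row: row[-1])
```

Call it once with 6008 for the dirty reboots and once with 6006 for the clean ones, just like the two reports.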

Recap
As you can see, it’s quite easy to have SCOM monitoring for server reboots, whether those are clean restarts or dirty reboots. Some of this monitoring is already available out of the box whereas for Notifications some additional (very basic and easy!) authoring is required.