Thursday, October 06, 2016

SQL Server Configuration Manager - Invalid class [0x80041010]

This is the second blog post that I wrote about troubleshooting of SQL Server Configuration Manager. This time, or a forgotten number of times, I start Configuration Manager but it returns an error like below:

Cannot connect to WMI provider. You do not have permission or the server is unreachable. Note that you can only manage SQL Server 2005 and later servers with SQL Server Configuration Manager. Invalid class [0x80041010]

Sounds like the code [0x80041010] is a good starting point. Did a bit of research, and found Michael Aspengren's blog post about it (see reference below).

So first, open the command prompt with Run as administrator and navigate to the C:\Program Files(x86)\Microsoft SQL Server\100\Shared\

then run the following command: mofcomp sqlmgmproviderxpsp2up.mof

The output should look like these:

Microsoft (R) MOF Compiler Version 6.1.7600.16385
Copyright (c) Microsoft Corp. 1997-2006. All rights reserved.
Parsing MOF file: sqlmgmproviderxpsp2up.mof
MOF file has been successfully parsed
Storing data in the repository…
Done!

And Wola!! SQL Server Configuration Manager should starts without a problem!! Thanks again, Mike!! :)

Reference
"SQL Server Configuration Manager" gives "Invalid class [0x80041010]” when starting.

Friday, July 03, 2015

Cheatsheet - Time related query tricks

A cheat sheet for me to collect the wisdom in query date time.... (and this is on-going updated when I find other clever method over time)

Convert UTC time into local time

Wednesday, February 11, 2015

Troubleshooting - Failed to write file data on cluster disk 1 partition 1, failure reason: The disk structure is corrupted and unreadable

One of the initial task in setting up a windows cluster for SQL failover cluster, is to validate if the servers fulfill all the network, storage, hardware and software requirements. It helps you to work through the setting up and straights things up before the really deal.

Sometimes, it is a step with full of surprises, especially when you don't have the control over how the server was set up. To be fair, it is quite a list of items and it is fair to say my network/system team missed on one or two things, which is still acceptable. Validation is like a QC check against the built servers to match the standard. On the other hands, some issue would just a pure kind of jerk that shows up to make your hard day harder.

I have the following issue returned, on the storage section, which make me and my network/SAN dude scratch our head a bit. The "Validate a configuration" check shows:

Interestingly, new shared drive created at the SAN shows up as corrupted. I tested the drive and I can read and write to it. It just not pass the validation. First come to my mind, is to reformat the drive 1 from the disk management msc. And that doesn't help. Indeed, I realise that the approach was wrong.

I researched on a few forums to look for the solution. First everyone suggested that (potential) Cluster Disk 1 doesn't corresponding to the physical disks which listed in the disk management msc. To determine what disk is the troublemaker, we need to check the List Potential Cluster Disks at the storage section in the validation report, then take a note of the disk ID. For example, I have an issue with Cluster Disk 1, and look up the list of potential cluster disks give me the ID c188f5ac:

Then, from the List All Disks, which lists any disk attached to the servers, we can then identify the disk in trouble with the disk ID mentioned above.

Now that we know which disk requires our attention, so let's get the solution out. There is a few potential reason that may contribute to the symptom that "The disk structure is corrupted and unreadable". I have a disk drive issue, what should I do? It sounds like one of my old days job interview question.

It would be the old school "chkdsk /f". The truth is, if you found 1 drive shows up disk drive issue, I would run chkdsk against all the drives, just in case. In my case, there are 3 drives out of 8 reported with drive issue, and "chkdsk /f" just fixed that.

Monday, February 02, 2015

SQL Trick KB: Algorithm - First Date of Financial Year

I was trying to implement a stored procedure with some function working based on the first date of the financial year for any given date. In Australia, our financial year start on the 1st of July. So, for example, if today is '2015-06-30', then it should return the '2014-07-01' as the first date of FY15, while if the date '2015-07-02' would return the next financial year FY16 with the start date as '2015-07-01'.

Sounds easy, huh? I was working on some CASE...WHEN statement, until I found this piece of brilliant, clean-cut idea, and thought that this is just so simple. It is from damien-the-unbeliever's answer at the Most efficient way to calculate the first day of the current Financial Year? at stackoverflow.com.

Monday, September 15, 2014

Maps database user 'dbo' to a server login instead of sa

Once upon a time, when human being was so naive and innocent, everyone wrote their application starting with the statement:

"It needs to connect to the database as the 'dbo' database user, or the application would not work at all."

That was the golden rule of thumb in software development. Years and eras gone by and a bunch of developers keep telling the whole world that it is necessary to stay that way, or the application would not work at all. Full stop.... The bunch of developers said.

Let's face the now-a-day facts:

  1. Human being is not so naive and innocent now-a-day. We couldn't trust people like how it was used to be.
  2. There are ways for how applications connect to the database, other than solely connecting as the 'dbo' database user.
  3. Based on point 1, the world is asking for more to be done to keep their data secure.
  4. Based on point 2, if one could not code without solely connecting as the user 'dbo', it is more likely an issue with capability or indolence.
  5. Based on point 1 & 2, the argument that "an application required to be the DBO user and the database owners" is simply invalid.

Given there are now-a-day facts, there are still out-dated people thus applications. In our case, we have limited choices for a number of reason to stuck with a suboptimal application. At certain point, you can only work with it.

As a DBA, one would usually find that this happens when a database needs to be migrated from an old SQL Server 2000 instance to a newer version on SQL server. Once you restore the database and found that the database owner was empty, in general a DBA will assign the database ownership to 'SA' and then create a database user for the App's service account - here is the point that when the application vendor would told you that it is not 'correctly' set up.

What you can do to compromise the database to comply with their so called system requirement? The following command would help you to associate a server login to the database dbo user:

Thursday, September 04, 2014

Tracing Deprecated Features SQL Server during database migration or upgrade

Imagine. You have an in-house developed application which you spent so much time and effort on to enable it to close to perfection. Users love you for the state of art build and you feel you have achieved something to be proud of.

2 years down the track, it is time for SQL server major upgrade. Let's bump it up a version and take advantage of the new features and enhancement that Microsoft charge you for. After that hard work on SQL server upgrade, you found that you great perfect application - shows some show stopper issue. You know, it is kind of frustrating if you have developed a database object and one day it just stops working because Microsoft decide to make the world better by retiring existing build-in object and function and replace with a better new one.

Users are asking why your perfect application doesn't work, and suggesting that let's abandon the upgrade project as everything is working at the moment, why change? (the usual normal people...). Time is clicking and you are stress....

I don't want to be there, at that timing. So let's be proactive!! You can found out what deprecated function your application is currently using and how frequent that they are being use with the following query:

And this is a list of link from Microsoft to show you what is required to look out:

Deprecated Database Engine Features in SQL Server 2008 R2

Discontinued Database Engine Functionality in SQL Server 2008 R2

Discontinued Database Engine Functionality in SQL Server 2008

Discontinued Database Engine Functionality in SQL Server 2005

Deprecated Commands in SQL Server 2005

Friday, July 11, 2014

To find out which domain controller you are authenticated to on ther server ?

I was helping my friend in finance to check out some issue in test environment, and of course, which starts as "hey, we got some database issue since we can't connect to to backend...". Great pick up line as usual.

So my friend shows me his database issue, and I can point it out straightaway that it is an authentication plus DNS issue rather than database. It rings the bell that my OP team friend told me about a project that they are looking at upgrading the domain controllers and it is reasonable that they are doing a rehearsal for the big date.

There is 2 active directory servers online and I am not sure which one the application server is authenticated from. Well, I found the following handy environment variable for just the job:

Thursday, July 03, 2014

Start / Stop a service at remote Windows server

There is an useful Windows command I learn recently which I thought why I did come across this earlier. It helps a lot when it needs to stop a service on an remote app server prior to run a database script.

The command is sc.

Task Command syntax
To stop a Service sc \\{server} stop {service}
To start a Service sc \\{server} start {service}
To restart a Service sc \\{server} restart {service}

Event ID to identify a system shutdown, startup, reboot and System uptime

How many times that you are surprised by automated Windows patching task which restarts your database server in the middle of a job runs? Of course, you should discuss that with you system admin team to work out how to schedule and deal with Windows patching on database servers. To start a discussion like that, you always need to find if your server does restart in the first place and what time and by who/what.

So, what are you looking for? In the event viewer, you can filter the system logs with the following Event IDs as explain below:

Event ID Description
6005 / 6009 OS started
6006 OS shutdown
6013 OS daily update report. that's the OS has been running for more than a day, measure in seconds.
1074 This sweet event shows you which process and on behalf of which user a reboot was initiated.

Saturday, October 12, 2013

Log shipping 'Could not find a log backup file that could be applied to secondary database'

"I hate Friday and Monday."

Statistically, these 2 dates have demonstrated a much higher chance of system failure while compares to the rest of the other weekdays. Needless to mention that a majority of PMs love to ask for additional enhancement on Friday, in between 3pm and 5pm...

It was started at Thursday night, and I have a mini change to deploy. While I found that I can't remote in with some weird system error, my heart can sense that something had gone wrong. Unfortunately, data center is too far away to me, so there is nothing I can do.

Next day, reading the email from the operation team sent at late 4am. The team had worked around the clock to turn around the situation. They are later told that building maintenance was conducted some work and an "unforeseeable" event took place, and as a result air conditioning's power got cut off. Server room "virtually moved" to Africa with a temperature at high 40°C plus, servers shut itself down to survive the heat wave. Everything comes back online. A few issue here and there, but we can claim that we make it through. Check my SQL servers, most of the servers and databases are fine... except one pair of servers their log shipping is not working on all databases. After all my work is done to recover these log shipping, I think this is rather a rare scenario that a DBA can face. Given that, I really wanted to share here and hope it may help someone else who hates Friday like me.

Back to the topic. In general, log shipping would recover from most of the unexpected outage event very well. Usually, it just needs to catch up. Since we all know that the log backup are done in the order of transaction, and so does the restore job will only restore each file in the same order to ensure proper roll forward, there is really not much room for anything to go wrong. And if restore failed, one would assume that there could be a log backup file gone missing or corrupted. At worst, it could simply fix it by recreating the log shipping from scratch. In my case, I can't. I have a bunch of large SharePoint content databases and it will incur a big deal work and time to reinitialize them one by one.

Through investigation, what I found is:

  1. The log backup job and log copy job keep going without an issue, but the log restore job is just stuck and claims that it could not restore a log which is too recent, with a error message similar to this:
    • Error: The file 'S:\DEFAULT INSTANCE\LogShip\WHATEVERDB_20131010094501.trn is too recent to apply to the secondary database 'WHATEVERDB'.(Microsoft.SqlServer.Management.LogShipping)
    • Error: The log in this backup set begins at LSN 813434000023423100001, which is too recent to apply to the database. An earlier log backup that includes LSN 813226000014464300001 can be restored. Searching for an older log backup file. Secondary Database: 'WHATEVERDB'
    • Error: Could not find a log backup file that could be applied to secondary database 'WHATEVERDB'.(Microsoft.SqlServer.Management.LogShipping)
  2. I checked the list of log backup files and found that they are identical on both the primary and secondary server. I even delete those on the secondary and copy them over again. Therefore, I didn't miss any log backup that need to be restored. Still, restore job stuck at a single log backup file.
  3. By testing to manually restore the log backups, I can cross out the possibility that if the log backup is corrupted, as I believed that if the failure resulted of a corrupted log backup, the error message would be different.
  4. I noted that the system time on the server was gone strange for a short while, but it seems normal when I checked again.

I did some research and I found that there are 2 system tables in msdb, dbo.log_shipping_secondary and dbo.log_shipping_secondary_databases, can help to identify the last log backup file that it restored at the secondary database, Put them together in a query like the following:

So, I identified the last restored log backup and start to work from that point. I disabled the copy and restore jobs, then I tried to manually restore each log backup individually to the database. This is the list of files I got at the time:

First, I attempted to restore the trn file#1, which is done successfully. Next, I have a go with the trn file#2, and it complaint that it is too recent to apply to the database, same as reported from the restore job. Thought I have nothing to lose, so I tried to restore the trn file#3. I was expected a "too recent to apply" error again, however it surprises me with successfully done!! WTH? So, I retry on trn file#2, and it reports it's done successfully as well. After this point, I can restore in the normal order. So all I have done in order to fix it is to swap the order of restoring the trn file#2 and trn file#3. Strange, isn't it?

So, what is the "Lesson Learnt" here? I give it a thought on why it could be happened in the first place. Log backup file are named as [dbname_yyyymmddhhmiss.trn] format and log shipping job run based on the order of how the filename sorts in ascend. In our case, something gone wrong with the order when the 2 log backup files were written and named. Sounds non-sense, but it is possible. Remember, I have mentioned that the system time has gone strange for a short time. My guess is that time was incorrect on the server, and the time service doesn't able to correct it when the file was written. Therefore, when the log backup job first kicks in with a wrong system time, it named the file incorrectly. and once the time service fixed the time, you may found that SQL server would miss the right file because of the incorrect file sorting. That would explain why all the database on the server suffers the same symptom.

So, next time before you restart the log shipping from scratch with no choice but in pain, try to check the current situation. Sometimes, you may found that it is not as bad as you thought...