Summary
Backups using Volume Shadow Copy is not run fine.
Error
Environment
Cell Server: HP-UX DP 6.11
Clients W2003/R2 and W2008/R2
Error
[Critical] From: VBDA@server
"server [/D]" Time: 2010-11-17 09:57:10
Volume Shadow Copy functionality could not be initialized.
[Critical] From: VBDA@server
"server [/D]" Time: 2010-11-17 09:57:10
Fallback to legacy filesystem backup was not allowed. Aborting the backup.
[Critical] From: VBDA@server
"server[/D]" Time: 2010-11-17 09:57:10
[81:52] /D
Not a valid mount point => aborting.
Cause
VBDA debugs show:
[ 12] 2010-11-17 09:57:10 ("da/bda/win32.c /main/blr_dp61/hsl_hpit2_2/blr_hpit2_2/2":390) A.06.11 bDPWIN_00441
[ 81] Win32CheckVol: volname: \D
[ 81] Win32CheckVol: root: \\?\D:
path: D:
[ 81] Win32CheckVol: volname: \D
[ 81] Win32CheckVol: root: \\?\D:
path: D:
[ 12]
Volume Info:
root: \\?\D:\
volname: Data
max filename len: 255
filesystem flags: 459007
filesystem: NTFS
Volume Info:
root: \\?\D:\
volname: Data
max filename len: 255
filesystem flags: 459007
filesystem: NTFS
[ 23] Disk support file compresion
[ 99] 2010-11-17 09:57:10 ("da/win32/vss_back.cpp /main/hsl_dp61/blr_dp61/5":195) A.06.11 bDPWIN_00441
[ 99] ===>> (0) VSSInit() {
[ 99]
[ 50] SetContext() failed with return code 8000ffff <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ERROR to FIND out of DP, GOOGLE for example.
[ 99] ===>> (0) VSSInit() {
[ 99]
[ 50] SetContext() failed with return code 8000ffff <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ERROR to FIND out of DP, GOOGLE for example.
[ 99] 2010-11-17 09:58:20 ("da/win32/vss_back.cpp /main/hsl_dp61/blr_dp61/5":271) A.06.11 bDPWIN_00441
[ 99] <<=== (0) } /* VSSInit() */
[ 99]
[ 99] <<=== (0) } /* VSSInit() */
[ 99]
[ -2] 2010-11-17 09:58:20 ("da/bda/win32.c /main/blr_dp61/hsl_hpit2_2/blr_hpit2_2/2":927) A.06.11 bDPWIN_00441
[ -2] Cannot init Volume Shadow Copy Service <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ERROR to find in the Application Event Viewer
[ -2] Cannot init Volume Shadow Copy Service <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ERROR to find in the Application Event Viewer
[[100] *************** SKPrintf [START] ***************
[100] Tokenize the string...
[100] token [0] = [%1] From: %3@%4 "%6" Time: %2
%5EVENT VIEWER APPLICATION ERROR ON THE CLIENT
[100] token [1] = Critical
[100] token [2] = 17/11/2010 09:58:20
[100] token [3] = VBDA
[100] token [4] = server
[100] token [5] = Volume Shadow Copy functionality could not be initialized.
[100] token [2] = 17/11/2010 09:58:20
[100] token [3] = VBDA
[100] token [4] = server
[100] token [5] = Volume Shadow Copy functionality could not be initialized.
[100] token [6] = server[/D]
[100] *************** SKPrintf [END] ***************
[100] *************** SKPrintf [END] ***************
[ 5] 2010-11-17 09:58:20 ("da/bda/win32.c /main/blr_dp61/hsl_hpit2_2/blr_hpit2_2/2":933) A.06.11 bDPWIN_00441
[ 5] [REPORT 3] in Win32CheckVol
Marked: ("":Line:0 in '')
Code is:0
[Critical] From: VBDA@server "server [/D]"
[ 5] [REPORT 3] in Win32CheckVol
Marked: ("":Line:0 in '')
Code is:0
[Critical] From: VBDA@server "server [/D]"
Time: 17/11/2010 09:58:20
Volume Shadow Copy functionality could not be initialized.
Volume Shadow Copy functionality could not be initialized.
07/11/2010 09:57:50 Service Control Manager Error None 7000 N/A SERVER1 "The Volume Shadow Copy service failed to start due to the following error:
The service did not respond to the start or control request in a timely fashion. "
07/11/2010 09:57:50 Service Control Manager Error None 7009 N/A SERVER1 Timeout (30000 milliseconds) waiting for the Volume Shadow Copy service to connect.
07/11/2010 09:57:50 DCOM Error None 10005 N/A SERVER1 "DCOM got error ""The service did not respond to the start or control request in a timely fashion. "" attempting to start the service VSS with arguments """" in order to run the server:
{E579AB5F-1CC4-44B4-BED9-DE0991FF0623}"
The service did not respond to the start or control request in a timely fashion. "
07/11/2010 09:57:50 Service Control Manager Error None 7009 N/A SERVER1 Timeout (30000 milliseconds) waiting for the Volume Shadow Copy service to connect.
07/11/2010 09:57:50 DCOM Error None 10005 N/A SERVER1 "DCOM got error ""The service did not respond to the start or control request in a timely fashion. "" attempting to start the service VSS with arguments """" in order to run the server:
{E579AB5F-1CC4-44B4-BED9-DE0991FF0623}"
Check if writers reports any error:
C:\VSSADMIN LIST WRITERS
vssadmin 1.1 - Volume Shadow Copy Service administrative command-line tool
(C) Copyright 2001 Microsoft Corp.
(C) Copyright 2001 Microsoft Corp.
Writer name: 'System Writer'
Writer Id: {e8132975-6f93-4464-a53e-1050253ae220}
Writer Instance Id: {92f67625-8e13-49dd-97ec-77acc8d9104c}
State: [5] Waiting for completion
Last error: No error
Writer Id: {e8132975-6f93-4464-a53e-1050253ae220}
Writer Instance Id: {92f67625-8e13-49dd-97ec-77acc8d9104c}
State: [5] Waiting for completion
Last error: No error
Writer name: 'MSDEWriter'
Writer Id: {f8544ac1-0611-4fa5-b04b-f7ee00b03277}
Writer Instance Id: {d748d06c-07f8-4785-a5cb-bcf351dbad2b}
State: [1] Stable
Last error: No error
Writer Id: {f8544ac1-0611-4fa5-b04b-f7ee00b03277}
Writer Instance Id: {d748d06c-07f8-4785-a5cb-bcf351dbad2b}
State: [1] Stable
Last error: No error
Writer name: 'Event Log Writer'
Writer Id: {eee8c692-67ed-4250-8d86-390603070d00}
Writer Instance Id: {28da5d2e-db99-4e20-a89b-8f150499386d}
State: [1] Stable
Last error: No error
Writer Id: {eee8c692-67ed-4250-8d86-390603070d00}
Writer Instance Id: {28da5d2e-db99-4e20-a89b-8f150499386d}
State: [1] Stable
Last error: No error
Writer name: 'Registry Writer'
Writer Id: {afbab4a2-367d-4d15-a586-71dbb18f8485}
Writer Instance Id: {48fefa43-0940-4000-ab18-33cb2a09b988}
State: [1] Stable
Last error: No error
Writer Id: {afbab4a2-367d-4d15-a586-71dbb18f8485}
Writer Instance Id: {48fefa43-0940-4000-ab18-33cb2a09b988}
State: [1] Stable
Last error: No error
Writer name: 'WMI Writer'
Writer Id: {a6ad56c2-b509-4e6c-bb19-49d8f43532f0}
Writer Instance Id: {549dfeea-6480-4553-a9d9-6236696764a9}
State: [5] Waiting for completion
Last error: No error
Writer Id: {a6ad56c2-b509-4e6c-bb19-49d8f43532f0}
Writer Instance Id: {549dfeea-6480-4553-a9d9-6236696764a9}
State: [5] Waiting for completion
Last error: No error
Writer name: 'COM+ REGDB Writer'
Writer Id: {542da469-d3e1-473c-9f4f-7847f01fc64f}
Writer Instance Id: {b6936b6c-c96c-460f-a825-323e488f05fb}
State: [1] Stable
Last error: No error
Writer Id: {542da469-d3e1-473c-9f4f-7847f01fc64f}
Writer Instance Id: {b6936b6c-c96c-460f-a825-323e488f05fb}
State: [1] Stable
Last error: No error
After we have checked all info, we opened a case with Microsoft's Team for work on this issue, because apparently the problem is out of Data Protector.
Below some data reported by Microsoft Team.
==================================================================
I believe there to now be a timing issue preventing VSS from communicating with the SCM in the 30 seconds we allow.
From your failed services.exe trace.
We see the start request:
0:000> g
Breakpoint 1 hit
Time Travel Position: 1C4AB00000024.
eax=00def844 ebx=0101c040 ecx=00000000 edx=00000000 esi=0064cc20 edi=7c82f9e7
eip=01008a59 esp=00def810 ebp=00def84c iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
services!ScLogonAndStartImage:
01008a59 8bff mov edi,edi
0:000> kP 1
ChildEBP RetAddr
00def80c 01008a4c services!ScLogonAndStartImage(
struct _SERVICE_RECORD * ServiceRecord = 0x0064cc20,
unsigned short * ImageName = 0x00114df8,
struct _IMAGE_RECORD ** ImageRecordPtr = 0x00def844) [d:\nt\base\screg\sc\server\start.cxx @ 828]
0:000> du 0x00114df8
00114df8 "C:\WINDOWS\System32\vssvc.exe"
We then see the pipe created:
0:000> p
Time Travel Position: 1C58540000031.
eax=00def6ec ebx=00def7f4 ecx=7c829f79 edx=38a20007 esi=00000000 edi=00def736
eip=01009104 esp=00def6a0 ebp=00def73c iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
services!ScCreateControlInstance+0xb2:
01009104 ff15cc110001 call dword ptr [services!_imp__CreateNamedPipeW (010011cc)] ds:0023:010011cc={kernel32!CreateNamedPipeW (77e5c9a6)}
0:000> dv
PipeHandlePtr = 0x00def7f4
dwCurrentService = 0x14a
pAccountSid = 0x00087808
wszPipeName = unsigned short [37]
SecurityAttr = struct _SECURITY_ATTRIBUTES
AceData = struct SC_ACE_DATA [1]
SecurityDescriptor = 0x000ebc80
0:000> du wszPipeName
00def6ec "\\.\pipe\net\NtControlPipe330"
We then timeout waiting for VSS to connect:
0:000>
Time Travel Position: 1C5884000000D.
eax=00def7c8 ebx=00000000 ecx=00def734 edx=5ea012a0 esi=0064cc20 edi=00000000
eip=01008c45 esp=00def740 ebp=00def80c iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
services!ScLogonAndStartImage+0x2b4:
01008c45 e801fdffff call services!ScWaitForConnect (0100894b)
0:000> p
Time Travel Position: 1CDFE00000020.
eax=0000041d ebx=00000000 ecx=00defce8 edx=5ea012a0 esi=0064cc20 edi=00000000
eip=01008c4a esp=00def750 ebp=00def80c iopl=0 nv up ei pl nz ac po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000212
services!ScLogonAndStartImage+0x2b9:
01008c4a 3bc3 cmp eax,ebx
0:000> !error 0000041d
Error code: (Win32) 0x41d (1053) - The service did not respond to the start or control request in a timely fashion.
Comparing a procmon from working and non-working we see an big delay in the time it takes for time it takes for VSS to query the registry for the pipe ID:
Broken
12:31:52.1588015 services.exe 452 RegSetValue HKLM\System\CurrentControlSet\Control\ServiceCurrent\(Default) SUCCESS Type: REG_DWORD, Length: 4, Data: 314
12:32:13.1196370 vssvc.exe 3912 RegQueryValue HKLM\System\CurrentControlSet\Control\ServiceCurrent\(Default) SUCCESS Type: REG_DWORD, Length: 4, Data: 314
21 seconds.
Working
11:34:36.0977900 vssvc.exe 2356 Process Start SUCCESS Parent PID: 452
11:34:36.0989083 services.exe 452 RegSetValue HKLM\System\CurrentControlSet\Control\ServiceCurrent\(Default) SUCCESS Type: REG_DWORD, Length: 4, Data: 307
11:34:36.3899567 vssvc.exe 2356 RegQueryValue HKLM\System\CurrentControlSet\Control\ServiceCurrent\(Default) SUCCESS Type: REG_DWORD, Length: 4, Data: 307
Fix
Add the following registry entry onto the problem server. It will instruct the SCM to wait longer than 30 seconds for a service process to respond. If something is delaying the startup of VSS then this may well be a solution to the issue.
1. In Registry Editor, locate, and then right-click the following registry subkey:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control
2. Point to New, and then click DWORD Value.
3. Type ServicesPipeTimeout to replace the New Value #1, and then press ENTER.
4. Right-click the ServicesPipeTimeout registry entry that you created in step 3, and then click Modify. The Edit DWORD Value dialog box appears.
5. In the Value data text box select Decimal, enter 90000 in the Value Data: and click OK
6. Restart the server
=======================================================================================
Even servers that had all of the VSS fixes still had the issue until we made the registry change.
CASE SOLVED.