The crash happens when trying to open the SVG file with the below content. Note, that CDATA (stands for charachter data) doesn't have any element content leading to crash.
<?xml version="1.0" encoding="utf-8"?>
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
<![CDATA[]]>
</svg>
These are the registers and the faulting instruction when the read access violation occurs.
0:005> r
rax=0000000000000000 rbx=00000000042bc2b0 rcx=00000000ffffffff
rdx=0000000000000000 rsi=0000000000000002 rdi=00000000042bbdc0
rip=000000005bd9aff5 rsp=00000000042bc4d0 rbp=00000000002844b0
r8=0000000000000000 r9=0000000000000084 r10=000000005c7c1b00
r11=000000000eb1d7a0 r12=00000000042bc570 r13=0000000000000000
r14=00000000002844b0 r15=0000000000000001
iopl=0 nv up ei pl zr na po nc
cs=0033 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010244
MSHTML!CHtmSpaceParseCtx::AddNonspaces+0x134:
00000000`5bd9aff5 410fb70c4c movzx ecx,word ptr [r12+rcx*2] ds:00000002`042bc56e=????
This is the call stack. It's not used to support any of claims but it's pasted for information only.
0:005> k
Child-SP RetAddr Call Site
00000000`042bc4d0 00000000`5bc7c249 MSHTML!CHtmSpaceParseCtx::AddNonspaces+0x134
00000000`042bc530 00000000`5bc7c153 MSHTML!CHtmCrlfParseCtx::AddTextIE9+0x146
00000000`042bc7c0 00000000`5bd2bd17 MSHTML!CHtmCrlfParseCtx::AddText+0x24
00000000`042bc800 00000000`5bd7daa5 MSHTML!CHtmParse::ParseText+0x349
00000000`042bc9c0 00000000`5bd2d36e MSHTML!CHtmParse::ParseToken+0x2d2
00000000`042bca00 00000000`5bd20980 MSHTML!CHtmPost::ProcessTokens+0x49b
00000000`042bcd00 00000000`5bd20f7b MSHTML!CHtmPost::Exec+0x274
00000000`042bcf10 00000000`5bd20e81 MSHTML!CHtmPost::Run+0x5a
00000000`042bcf50 00000000`5bd206eb MSHTML!PostManExecute+0x1ae
00000000`042bcfa0 00000000`5bd1f4e0 MSHTML!CDwnChan::OnMethodCall+0x1b
00000000`042bcfd0 00000000`5bdeb468 MSHTML!GlobalWndOnMethodCall+0x18b
00000000`042bd060 00000000`771d9bd1 MSHTML!GlobalWndProc+0x36c
00000000`042bd0e0 00000000`771d98da USER32!UserCallWinProcCheckWow+0x1ad
00000000`042bd1a0 000007fe`ebfdaf5e USER32!DispatchMessageWorker+0x3b5
00000000`042bd220 000007fe`ebf87754 IEFRAME!CTabWindow::_TabWindowThreadProc+0x9c1
00000000`042bf680 00000000`76d153b3 IEFRAME!LCIETab_ThreadProc+0x39f
00000000`042bf820 000007fe`ebf68dcb iertutil!CIsoScope::RegisterThread+0x10f
00000000`042bf850 00000000`76bf652d IEFRAME!Detour_DefWindowProcA+0x97
00000000`042bf890 00000000`772ec521 kernel32!BaseThreadInitThunk+0xd
00000000`042bf8c0 00000000`00000000 ntdll!RtlUserThreadStart+0x1d
After restarting Internet Explorer and the debugger (leading different memory layout), I set a breakpoint at the address of the faulting instruction.
0:005> g
Breakpoint 0 hit
MSHTML!CHtmSpaceParseCtx::AddNonspaces+0x134:
00000000`69f0aff5 410fb70c4c movzx ecx,word ptr [r12+rcx*2] ds:00000002`0421cb2e=????
r12
is the base address of a structure. The instruction reads a word value from the structure using a delta value that is RCX*2
. In our case, RCX
is 0`ffffffff
which cannot be a valid delta value as the size of the memory region of the structure is only 4000
bytes as seen below. Therefore by executing the instruction it reads out of the bounds of the structure, that is likely an invalid memory address.
0:005> !vprot r12
BaseAddress: 000000000421c000
AllocationBase: 0000000004020000
AllocationProtect: 00000004 PAGE_READWRITE
RegionSize: 0000000000004000
State: 00001000 MEM_COMMIT
Protect: 00000004 PAGE_READWRITE
Type: 00020000 MEM_PRIVATE
By allocating memory at the address of the invalid read we can execute the instruction without causing exception.
MSHTML!CHtmSpaceParseCtx::AddNonspaces+0x134:
00000000`69f0aff5 410fb70c4c movzx ecx,word ptr [r12+rcx*2] ds:00000002`0421cb2e=????
0:005> .dvalloc /b 2`0421c000 1000
Allocated d000 bytes starting at 00000002`04210000
As seen below, movzx
now tries to access to a valid memory region recently allocated by .dvalloc
, filled with zeroes. If we let the program execute now it doesn't crash. Even if we refresh the SVG file it doesn't cause a read access violation because now the memory movzx
tries to access is a valid address in the virtual address space.
0:005> r
rax=0000000000000000 rbx=0000000000000000 rcx=00000000ffffffff
rdx=0000000000000000 rsi=0000000000000000 rdi=000000000033bd50
rip=0000000069f0aff5 rsp=000000000421ca90 rbp=000000000df58d90
r8=0000000000000000 r9=000000000e03fee0 r10=000000000dfe4401
r11=0000000000000001 r12=000000000421cb30 r13=0000000000000000
r14=000000000df58d90 r15=0000000000000001
iopl=0 nv up ei pl zr na po nc
cs=0033 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
MSHTML!CHtmSpaceParseCtx::AddNonspaces+0x134:
00000000`69f0aff5 410fb70c4c movzx ecx,word ptr [r12+rcx*2] ds:00000002`0421cb2e=0000
When the instruction is executed it reads the tainted word value to ECX
. As seen below, the word value is copied to a second structure. At this point the tainted data is copied to a second structure.
0:005> p
MSHTML!CHtmSpaceParseCtx::AddNonspaces+0x13c:
00000000`69f0affd 66894f7c mov word ptr [rdi+7Ch],cx ds:00000000`0033bdcc=0020
Using the PoC, later on the execution the program doesn't read the tainted data from the second structure that is verified by processor breakpoint.The tainted word value was zero as
.dvalloc
filled the memory with zeroes. Since we know the address of the word value we can try all the 65536
possibilities and watch what happens when processing the different word values. To achieve this I wrote a wrapper HTML which calls the SVG file periodically. Also, I set up the below commands which does the job automatically, so in every iteration we provide different tainted input in range of 0
to 65535
.
0:005> bc 0
0:005> r $t0 = 0;
0:005> bp 69f0aff5 "j @ecx=ffffffff 'ew 2`0421cb2e @$t0; r $t0 = @$t0 + 1; .if (@$t0 <= 0xffff) { gc }'; gc"
0:005> g
When the command completed I concluded that we tried all the word values but neither of them seem to trigger unusual code paths using the PoC.Further analyzing the data flow there is a possible place when the tainted data could be used to make decision on the execution flow, and it's the following.
MSHTML!CHtmSpaceParseCtx::AddNonspaces+0x87:
00000000`69f4823e 0fb74f7c movzx ecx,word ptr [rdi+7Ch]
00000000`69f48242 6685c9 test cx,cx
00000000`69f48245 0f8511b8d2ff jne MSHTML!CHtmSpaceParseCtx::AddNonspaces+0x90 (00000000`69c73a5c)
If this code can be reached when [rdi+7c]
is the tainted data that might have consequence due to the complexity of the code paths in this function. I however was unable to reach this code path in my experience.The address of invalid read is predictable by the attacker as it's always
2 x 0`ffffffff
plus something which can be filled with attacker controlled data. But this doesn't seem to be relevant as Microsoft just confirmed that they had finished their investigation into this crash and as they say "[we] can confirm that this crash is caused by a read AV which is not exploitable." I believe them and I no longer investigate this but will focus on other crashes, but thought to share this.