TL;DR: I found an XML bug in Apache Xerces by fuzzing, but it turned out that a fix was already pending, just not yet committed.

In my spare time, after trying expat in another post, I decided to switch to Apache Xerces, another XML parser, this one written in C++.

I reported the UAF (use-after-free) on oss-security, but Gustavo Grieco (kudos) pointed out that he had already disclosed the same bug (CVE-2016-2099) and that a fix was pending but not yet committed.

ASAN Crash

Here you can see the crash (StdInParse is just a sample tool that reads the XML from stdin):

➜  xml cat xerces_uaf | xerces-c-3.1.3/samples/StdInParse 
=================================================================
==16010==ERROR: AddressSanitizer: heap-use-after-free on address 0xf4a0dfcc at pc 0x0836c7f4 bp 0xfff9a198 sp 0xfff9a188
READ of size 1 at 0xf4a0dfcc thread T0
    #0 0x836c7f3 in xercesc_3_1::ReaderMgr::getLastExtEntityInfo(xercesc_3_1::ReaderMgr::LastExtEntityInfo&) const xercesc/internal/ReaderMgr.cpp:833
    #1 0x83a42d4 in xercesc_3_1::XMLScanner::emitError(xercesc_3_1::XMLErrs::Codes, xercesc_3_1::XMLExcepts::Codes, unsigned short const*, unsigned short const*, unsigned short const*, unsigned short const*) xercesc/internal/XMLScanner.cpp:927
    #2 0x8e40963 in xercesc_3_1::IGXMLScanner::scanDocument(xercesc_3_1::InputSource const&) xercesc/internal/IGXMLScanner.cpp:276
    #3 0x84b4cca in xercesc_3_1::SAXParser::parse(xercesc_3_1::InputSource const&) xercesc/parsers/SAXParser.cpp:575
    #4 0x80533d6 in main src/StdInParse/StdInParse.cpp:186
    #5 0xf6dd5636 in __libc_start_main (/lib32/libc.so.6+0x18636)
    #6 0x80624f1  (/home/bob/VulnResearch/misc/xml/xerces-c-3.1.3/samples/StdInParse+0x80624f1)

0xf4a0dfcc is located 44 bytes inside of 56-byte region [0xf4a0dfa0,0xf4a0dfd8)
freed by thread T0 here:
    #0 0xf7228034 in operator delete(void*) (/usr/lib32/libasan.so.3+0xc5034)
    #1 0x80992df in xercesc_3_1::XMemory::operator delete(void*) xercesc/util/XMemory.cpp:89

previously allocated by thread T0 here:
    #0 0xf72279b4 in operator new(unsigned int) (/usr/lib32/libasan.so.3+0xc49b4)
    #1 0x8357ad9 in xercesc_3_1::MemoryManagerImpl::allocate(unsigned int) xercesc/internal/MemoryManagerImpl.cpp:40
    #2 0x8099042 in xercesc_3_1::XMemory::operator new(unsigned int, xercesc_3_1::MemoryManager*) xercesc/util/XMemory.cpp:68

SUMMARY: AddressSanitizer: heap-use-after-free xercesc/internal/ReaderMgr.cpp:833 in xercesc_3_1::ReaderMgr::getLastExtEntityInfo(xercesc_3_1::ReaderMgr::LastExtEntityInfo&) const
Shadow bytes around the buggy address:
  0x3e941ba0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x3e941bb0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x3e941bc0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x3e941bd0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x3e941be0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x3e941bf0: fa fa fa fa fd fd fd fd fd[fd]fd fa fa fa fa fa
  0x3e941c00: fd fd fd fd fd fd fd fa fa fa fa fa 00 00 00 00
  0x3e941c10: 00 00 00 fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x3e941c20: fa fa fa fa 00 00 00 00 00 00 00 00 fa fa fa fa
  0x3e941c30: 00 00 00 00 00 00 00 00 fa fa fa fa 00 00 00 00
  0x3e941c40: 00 00 04 fa fa fa fa fa 00 00 00 00 00 00 04 fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==16010==ABORTING

And you can download the minimized test case here