2019-04-08 23:22:26 +08:00


<div class="body" role="main"><div class="section" id="module-xml"><h1><span class="yiyi-st" id="yiyi-10">20.4. </span><span class="yiyi-st" id="yiyi-11">XML Processing Modules</span></h1><p><span class="yiyi-st" id="yiyi-12"><strong>Source code:</strong> <a class="reference external" href="https://hg.python.org/cpython/file/3.5/Lib/xml/">Lib/xml/</a></span></p><p><span class="yiyi-st" id="yiyi-13">Python's interfaces for processing XML are grouped in the <code class="docutils literal"><span class="pre">xml</span></code> package.</span></p><div class="admonition warning"><p class="first admonition-title"><span class="yiyi-st" id="yiyi-14">Warning</span></p><p class="last"><span class="yiyi-st" id="yiyi-15">The XML modules are not secure against erroneous or maliciously constructed data. </span><span class="yiyi-st" id="yiyi-16">If you need to parse untrusted or unauthenticated data, see the <a class="reference internal" href="#xml-vulnerabilities"><span>XML vulnerabilities</span></a> and <a class="reference internal" href="#defused-packages"><span>The defusedxml and defusedexpat Packages</span></a> sections.</span></p></div><p><span class="yiyi-st" id="yiyi-17">It is important to note that modules in the <a class="reference internal" href="#module-xml" title="xml: Package containing XML processing modules"><code class="xref py py-mod docutils literal"><span class="pre">xml</span></code></a> package require at least one SAX-compliant XML parser to be available. </span><span class="yiyi-st" id="yiyi-18">The Expat parser is included with Python, so the <a class="reference internal" href="pyexpat.html#module-xml.parsers.expat" title="xml.parsers.expat: An interface to the Expat non-validating XML parser."><code class="xref py py-mod docutils literal"><span class="pre">xml.parsers.expat</span></code></a> module will always be available.</span></p><p><span class="yiyi-st" id="yiyi-19">The <a class="reference internal" href="xml.dom.html#module-xml.dom" title="xml.dom: Document Object Model API for Python."><code class="xref py py-mod docutils literal"><span class="pre">xml.dom</span></code></a> and <a class="reference internal" href="xml.sax.html#module-xml.sax" title="xml.sax: Package containing SAX2 base classes and convenience functions."><code class="xref py py-mod docutils literal"><span 
class="pre">xml.sax</span></code></a> package documentation defines the Python bindings for the DOM and SAX interfaces.</span></p><p><span class="yiyi-st" id="yiyi-20">The XML processing submodules are:</span></p><ul class="simple"><li><span class="yiyi-st" id="yiyi-21"><a class="reference internal" href="xml.etree.elementtree.html#module-xml.etree.ElementTree" title="xml.etree.ElementTree: Implementation of the ElementTree API."><code class="xref py py-mod docutils literal"><span class="pre">xml.etree.ElementTree</span></code></a>: the ElementTree API, a simple and lightweight XML processor</span></li></ul><ul class="simple"><li><span class="yiyi-st" id="yiyi-22"><a class="reference internal" href="xml.dom.html#module-xml.dom" title="xml.dom: Document Object Model API for Python."><code class="xref py py-mod docutils literal"><span class="pre">xml.dom</span></code></a>: the DOM API definition</span></li><li><span class="yiyi-st" id="yiyi-23"><a class="reference internal" href="xml.dom.minidom.html#module-xml.dom.minidom" title="xml.dom.minidom: Minimal Document Object Model (DOM) implementation."><code class="xref py py-mod docutils literal"><span class="pre">xml.dom.minidom</span></code></a>: a minimal DOM implementation</span></li><li><span class="yiyi-st" id="yiyi-24"><a class="reference internal" href="xml.dom.pulldom.html#module-xml.dom.pulldom" title="xml.dom.pulldom: Support for building partial DOM trees from SAX events."><code class="xref py py-mod docutils literal"><span class="pre">xml.dom.pulldom</span></code></a>: support for building partial DOM trees</span></li></ul><ul class="simple"><li><span class="yiyi-st" id="yiyi-25"><a class="reference internal" href="xml.sax.html#module-xml.sax" title="xml.sax: Package containing SAX2 base classes and convenience functions."><code class="xref py py-mod docutils literal"><span class="pre">xml.sax</span></code></a>: SAX2 base classes and convenience functions</span></li><li><span class="yiyi-st" id="yiyi-26"><a class="reference internal" href="pyexpat.html#module-xml.parsers.expat" title="xml.parsers.expat: An interface to the Expat non-validating XML parser."><code class="xref py py-mod docutils literal"><span 
class="pre">xml.parsers.expat</span></code></a>: the Expat parser binding</span></li></ul><div class="section" id="xml-vulnerabilities"><h2><span class="yiyi-st" id="yiyi-27">20.4.1. </span><span class="yiyi-st" id="yiyi-28">XML vulnerabilities</span></h2><p><span class="yiyi-st" id="yiyi-29">The XML processing modules are not secure against maliciously constructed data. </span><span class="yiyi-st" id="yiyi-30">An attacker can abuse XML features to carry out denial-of-service attacks, access local files, generate network connections to other machines, or circumvent firewalls.</span></p><p><span class="yiyi-st" id="yiyi-31">The following table gives an overview of the known attacks and whether the various modules are vulnerable to them.</span></p><table border="1" class="docutils"><thead valign="bottom"><tr class="row-odd"><th class="head"><span class="yiyi-st" id="yiyi-32">Attack</span></th><th class="head"><span class="yiyi-st" id="yiyi-33">sax</span></th><th class="head"><span class="yiyi-st" id="yiyi-34">etree</span></th><th class="head"><span class="yiyi-st" id="yiyi-35">minidom</span></th><th class="head"><span class="yiyi-st" id="yiyi-36">pulldom</span></th><th class="head"><span class="yiyi-st" id="yiyi-37">xmlrpc</span></th></tr></thead><tbody valign="top"><tr class="row-even"><td><span class="yiyi-st" id="yiyi-38">billion laughs</span></td><td><span class="yiyi-st" id="yiyi-39"><strong>Yes</strong></span></td><td><span class="yiyi-st" id="yiyi-40"><strong>Yes</strong></span></td><td><span class="yiyi-st" id="yiyi-41"><strong>Yes</strong></span></td><td><span class="yiyi-st" id="yiyi-42"><strong>Yes</strong></span></td><td><span class="yiyi-st" id="yiyi-43"><strong>Yes</strong></span></td></tr><tr class="row-odd"><td><span class="yiyi-st" id="yiyi-44">quadratic blowup</span></td><td><span class="yiyi-st" id="yiyi-45"><strong>Yes</strong></span></td><td><span class="yiyi-st" id="yiyi-46"><strong>Yes</strong></span></td><td><span class="yiyi-st" id="yiyi-47"><strong>Yes</strong></span></td><td><span class="yiyi-st" id="yiyi-48"><strong>Yes</strong></span></td><td><span class="yiyi-st" id="yiyi-49"><strong>Yes</strong></span></td></tr><tr class="row-even"><td><span class="yiyi-st" id="yiyi-50">external entity expansion</span></td><td><span class="yiyi-st" id="yiyi-51"><strong>Yes</strong></span></td><td><span class="yiyi-st" 
id="yiyi-52">No (1)</span></td><td><span class="yiyi-st" id="yiyi-53">No (2)</span></td><td><span class="yiyi-st" id="yiyi-54"><strong>Yes</strong></span></td><td><span class="yiyi-st" id="yiyi-55">No (3)</span></td></tr><tr class="row-odd"><td><span class="yiyi-st" id="yiyi-56"><a class="reference external" href="https://en.wikipedia.org/wiki/Document_type_definition">DTD</a> retrieval</span></td><td><span class="yiyi-st" id="yiyi-57"><strong>Yes</strong></span></td><td><span class="yiyi-st" id="yiyi-58">No</span></td><td><span class="yiyi-st" id="yiyi-59">No</span></td><td><span class="yiyi-st" id="yiyi-60"><strong>Yes</strong></span></td><td><span class="yiyi-st" id="yiyi-61">No</span></td></tr><tr class="row-even"><td><span class="yiyi-st" id="yiyi-62">decompression bomb</span></td><td><span class="yiyi-st" id="yiyi-63">No</span></td><td><span class="yiyi-st" id="yiyi-64">No</span></td><td><span class="yiyi-st" id="yiyi-65">No</span></td><td><span class="yiyi-st" id="yiyi-66">No</span></td><td><span class="yiyi-st" id="yiyi-67"><strong>Yes</strong></span></td></tr></tbody></table><ol class="arabic simple"><li><span class="yiyi-st" id="yiyi-68"><a class="reference internal" href="xml.etree.elementtree.html#module-xml.etree.ElementTree" title="xml.etree.ElementTree: Implementation of the ElementTree API."><code class="xref py py-mod docutils literal"><span class="pre">xml.etree.ElementTree</span></code></a> does not expand external entities and raises a <code class="xref py py-exc docutils literal"><span class="pre">ParseError</span></code> when an entity occurs.</span></li><li><span class="yiyi-st" id="yiyi-69"><a class="reference internal" href="xml.dom.minidom.html#module-xml.dom.minidom" title="xml.dom.minidom: Minimal Document Object Model (DOM) implementation."><code class="xref py py-mod docutils literal"><span class="pre">xml.dom.minidom</span></code></a> does not expand external entities and simply returns the unexpanded entity verbatim.</span></li><li><span class="yiyi-st" id="yiyi-70"><code class="xref py py-mod docutils literal"><span class="pre">xmlrpc.client</span></code> does not expand external entities and omits them.</span></li></ol><dl 
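class="docutils"><dt><span class="yiyi-st" id="yiyi-71">billion laughs / exponential entity expansion</span></dt><dd><span class="yiyi-st" id="yiyi-72">The <a class="reference external" href="https://en.wikipedia.org/wiki/Billion_laughs">Billion Laughs</a> attack, also known as exponential entity expansion, uses multiple levels of nested entities. </span><span class="yiyi-st" id="yiyi-73">Each entity refers to another entity several times, and the final entity definition contains a small string. </span><span class="yiyi-st" id="yiyi-74">The exponential expansion produces several gigabytes of text and consumes large amounts of memory and CPU time.</span></dd><dt><span class="yiyi-st" id="yiyi-75">quadratic blowup entity expansion</span></dt><dd><span class="yiyi-st" id="yiyi-76">A quadratic blowup attack is similar to a <a class="reference external" href="https://en.wikipedia.org/wiki/Billion_laughs">Billion Laughs</a> attack; it also abuses entity expansion. </span><span class="yiyi-st" id="yiyi-77">Instead of nesting entities, it repeats one large entity of a few thousand characters over and over again. </span><span class="yiyi-st" id="yiyi-78">The attack is not as efficient as the exponential case, but it avoids triggering parser countermeasures that forbid deeply nested entities.</span></dd><dt><span class="yiyi-st" id="yiyi-79">external entity expansion</span></dt><dd><span class="yiyi-st" id="yiyi-80">Entity declarations can contain more than just replacement text. </span><span class="yiyi-st" id="yiyi-81">They can also point to external resources or local files. </span><span class="yiyi-st" id="yiyi-82">The XML parser accesses the resource and embeds its content into the XML document.</span></dd><dt><span class="yiyi-st" id="yiyi-83"><a class="reference external" href="https://en.wikipedia.org/wiki/Document_type_definition">DTD</a> retrieval</span></dt><dd><span class="yiyi-st" id="yiyi-84">Some XML libraries (such as Python's <a class="reference internal" href="xml.dom.pulldom.html#module-xml.dom.pulldom" title="xml.dom.pulldom: Support for building partial DOM trees from SAX events."><code class="xref py py-mod docutils literal"><span class="pre">xml.dom.pulldom</span></code></a>) retrieve document type definitions from remote or local locations. </span><span class="yiyi-st" id="yiyi-85">This feature has implications similar to those of the external entity expansion issue.</span></dd><dt><span class="yiyi-st" id="yiyi-86">decompression bomb</span></dt><dd><span class="yiyi-st" id="yiyi-87">Decompression bombs (also known as <a class="reference external" href="https://en.wikipedia.org/wiki/Zip_bomb">ZIP bombs</a>) apply to all XML libraries that can parse compressed XML streams, such as gzip-compressed HTTP streams or LZMA-compressed files. </span><span class="yiyi-st" id="yiyi-88">For an attacker, compression can reduce the amount of data that has to be transmitted by three orders of magnitude or more.</span></dd></dl><p><span class="yiyi-st" id="yiyi-89">On PyPI, the <a class="reference external" 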
href="https://pypi.python.org/pypi/defusedxml/">defusedxml</a> documentation has more information about all known attack vectors, including examples and references.</span></p></div><div class="section" id="the-defusedxml-and-defusedexpat-packages"><h2><span class="yiyi-st" id="yiyi-90">20.4.2. </span><span class="yiyi-st" id="yiyi-91">The <code class="xref py py-mod docutils literal"><span class="pre">defusedxml</span></code> and <code class="xref py py-mod docutils literal"><span class="pre">defusedexpat</span></code> Packages</span></h2><p><span class="yiyi-st" id="yiyi-92"><a class="reference external" href="https://pypi.python.org/pypi/defusedxml/">defusedxml</a> is a pure Python package with modified subclasses of all stdlib XML parsers that prevent any potentially malicious operation. </span><span class="yiyi-st" id="yiyi-93">Use of this package is recommended for any server code that parses untrusted XML data. </span><span class="yiyi-st" id="yiyi-94">The package also ships with example exploits and extended documentation on further XML exploits such as XPath injection.</span></p><p><span class="yiyi-st" id="yiyi-95"><a class="reference external" href="https://pypi.python.org/pypi/defusedexpat/">defusedexpat</a> provides a modified libexpat and a patched <code class="xref py py-mod docutils literal"><span class="pre">pyexpat</span></code> module that have countermeasures against entity expansion DoS attacks. </span><span class="yiyi-st" id="yiyi-96">The <code class="xref py py-mod docutils literal"><span class="pre">defusedexpat</span></code> module still allows a sane, configurable amount of entity expansion. </span><span class="yiyi-st" id="yiyi-97">These modifications may be included in some future release of Python, but they will not be included in any bugfix release of Python because they break backward compatibility.</span></p></div></div></div>
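<p>Note (1) under the vulnerability table says that <code>xml.etree.ElementTree</code> does not expand external entities and raises an error instead. The following is a minimal sketch of that behaviour using only the standard library. The sample document, the entity name <code>ext</code>, the file path, and the helper <code>parse_untrusted</code> are all hypothetical and exist only for illustration. The exception class caught here is <code>xml.etree.ElementTree.ParseError</code>.</p>

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the entity name and the file path are made up.
# The parser never opens the file; it rejects the reference instead.
UNTRUSTED = """<?xml version="1.0"?>
<!DOCTYPE root [
  <!ENTITY ext SYSTEM "file:///hypothetical/secret.txt">
]>
<root>&ext;</root>"""


def parse_untrusted(text):
    """Return the root element, or None if ElementTree rejects the input."""
    try:
        return ET.fromstring(text)
    except ET.ParseError:
        # Raised when the document references an entity that
        # ElementTree refuses to expand (or is otherwise malformed).
        return None


rejected = parse_untrusted(UNTRUSTED)                  # external entity reference
accepted = parse_untrusted("<root>plain text</root>")  # ordinary document
```

<p>This only illustrates the external entity row of the table. It offers no protection against the entity expansion attacks, for which the text above recommends <code>defusedxml</code>.</p>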