mirror of
https://github.com/fofolee/uTools-Manuals.git
synced 2025-06-08 23:14:06 +08:00
<div class="body" role="main"><div class="section" id="module-difflib"><h1><span class="yiyi-st" id="yiyi-10">6.3. <a class="reference internal" href="#module-difflib" title="difflib: Helpers for computing differences between objects."><code class="xref py py-mod docutils literal"><span class="pre">difflib</span></code></a> - Helpers for computing deltas</span></h1><p><span class="yiyi-st" id="yiyi-11"><strong>Source code:</strong> <a class="reference external" href="https://hg.python.org/cpython/file/3.5/Lib/difflib.py">Lib/difflib.py</a></span></p><p><span class="yiyi-st" id="yiyi-12">This module provides classes and functions for comparing sequences.</span> <span class="yiyi-st" id="yiyi-13">It can be used, for example, to compare files, and can produce difference information in various formats, including HTML and context and unified diffs.</span> <span class="yiyi-st" id="yiyi-14">For comparing directories and files, see also the <a class="reference internal" href="filecmp.html#module-filecmp" title="filecmp: Compare files efficiently."><code class="xref py py-mod docutils literal"><span class="pre">filecmp</span></code></a> module.</span></p><dl class="class"><dt id="difflib.SequenceMatcher"><span class="yiyi-st" id="yiyi-15"><em class="property">class </em><code class="descclassname">difflib.</code><code class="descname">SequenceMatcher</code></span></dt><dd><p><span class="yiyi-st" id="yiyi-16">This is a flexible class for comparing pairs of sequences of any type, so long as the sequence elements are <a class="reference internal" href="../glossary.html#term-hashable"><span class="xref std std-term">hashable</span></a>.</span> <span class="yiyi-st" id="yiyi-17">The basic algorithm predates, and is a little fancier than, an algorithm published in the late 1980s by Ratcliff and Obershelp under the hyperbolic name "gestalt pattern matching". The idea is to find the longest contiguous matching subsequence that contains no "junk" elements; these "junk" elements are ones that are uninteresting in some sense, such as blank lines or whitespace.</span> <span class="yiyi-st" id="yiyi-18">(Handling junk is an extension to the Ratcliff and Obershelp algorithm.)</span> <span class="yiyi-st" id="yiyi-19">The same idea is then applied recursively to the pieces of the sequences to the left and to the right of the matching subsequence.</span> <span class="yiyi-st" id="yiyi-20">This does not yield minimal edit sequences, but does tend to yield matches that "look right" to people.</span></p><p><span class="yiyi-st" id="yiyi-21"><strong>Timing:</strong> The basic Ratcliff-Obershelp algorithm is cubic time in the worst case and quadratic time in the expected case.</span> <span class="yiyi-st" id="yiyi-22"><a class="reference internal" href="#difflib.SequenceMatcher" title="difflib.SequenceMatcher"><code
class="xref py py-class docutils literal"><span class="pre">SequenceMatcher</span></code></a> is quadratic time in the worst case, and its expected-case behavior depends in a complicated way on how many elements the sequences have in common; best-case time is linear.</span></p><p><span class="yiyi-st" id="yiyi-23"><strong>Automatic junk heuristic:</strong> <a class="reference internal" href="#difflib.SequenceMatcher" title="difflib.SequenceMatcher"><code class="xref py py-class docutils literal"><span class="pre">SequenceMatcher</span></code></a> supports a heuristic that automatically treats certain sequence items as junk.</span> <span class="yiyi-st" id="yiyi-24">The heuristic counts how many times each individual item appears in the sequence.</span> <span class="yiyi-st" id="yiyi-25">If an item's duplicates (after the first one) account for more than 1% of the sequence and the sequence is at least 200 items long, this item is marked as "popular" and is treated as junk for the purpose of sequence matching.</span> <span class="yiyi-st" id="yiyi-26">This heuristic can be turned off by setting the <code class="docutils literal"><span class="pre">autojunk</span></code> argument to <code class="docutils literal"><span class="pre">False</span></code> when creating the <a class="reference internal" href="#difflib.SequenceMatcher" title="difflib.SequenceMatcher"><code class="xref py py-class docutils literal"><span class="pre">SequenceMatcher</span></code></a>.</span></p><div class="versionadded"><p><span class="yiyi-st" id="yiyi-27"><span class="versionmodified">New in version 3.2: </span>The <em>autojunk</em> parameter.</span></p></div></dd></dl><dl class="class"><dt id="difflib.Differ"><span class="yiyi-st" id="yiyi-28"><em class="property">class </em><code class="descclassname">difflib.</code><code class="descname">Differ</code></span></dt><dd><p><span class="yiyi-st" id="yiyi-29">This is a class for comparing sequences of lines of text, and producing human-readable differences or deltas.</span> <span class="yiyi-st" id="yiyi-30">Differ uses <a class="reference internal" href="#difflib.SequenceMatcher" title="difflib.SequenceMatcher"><code class="xref py py-class docutils literal"><span class="pre">SequenceMatcher</span></code></a> both to compare sequences of lines, and to compare sequences of characters within similar (near-matching) lines.</span></p><p><span class="yiyi-st" id="yiyi-31">Each line of a <a class="reference internal" href="#difflib.Differ" title="difflib.Differ"><code class="xref py py-class docutils literal"><span class="pre">Differ</span></code></a> delta begins with a two-letter code:</span></p><table border="1"
class="docutils"><thead valign="bottom"><tr class="row-odd"><th class="head"><span class="yiyi-st" id="yiyi-32">Code</span></th><th class="head"><span class="yiyi-st" id="yiyi-33">Meaning</span></th></tr></thead><tbody valign="top"><tr class="row-even"><td><span class="yiyi-st" id="yiyi-34"><code class="docutils literal"><span class="pre">'-</span> <span class="pre">'</span></code></span></td><td><span class="yiyi-st" id="yiyi-35">line unique to sequence 1</span></td></tr><tr class="row-odd"><td><span class="yiyi-st" id="yiyi-36"><code class="docutils literal"><span class="pre">'+</span> <span class="pre">'</span></code></span></td><td><span class="yiyi-st" id="yiyi-37">line unique to sequence 2</span></td></tr><tr class="row-even"><td><span class="yiyi-st" id="yiyi-38"><code class="docutils literal"><span class="pre">'</span> <span class="pre">'</span></code></span></td><td><span class="yiyi-st" id="yiyi-39">line common to both sequences</span></td></tr><tr class="row-odd"><td><span class="yiyi-st" id="yiyi-40"><code class="docutils literal"><span class="pre">'?</span> <span class="pre">'</span></code></span></td><td><span class="yiyi-st" id="yiyi-41">line not present in either input sequence</span></td></tr></tbody></table><p><span class="yiyi-st" id="yiyi-42">Lines beginning with '<code class="docutils literal"><span class="pre">?</span></code>' attempt to guide the eye to intraline differences, and were not present in either input sequence.</span> <span class="yiyi-st" id="yiyi-43">These lines can be confusing if the sequences contain tab characters.</span></p></dd></dl><dl class="class"><dt id="difflib.HtmlDiff"><span class="yiyi-st" id="yiyi-44"><em class="property">class </em><code class="descclassname">difflib.</code><code class="descname">HtmlDiff</code></span></dt><dd><p><span class="yiyi-st" id="yiyi-45">This class can be used to create an HTML table (or a complete HTML file containing the table) showing a side-by-side, line-by-line comparison of text with inter-line and intra-line change highlights.</span> <span class="yiyi-st" id="yiyi-46">The table can be generated in either full or contextual difference mode.</span></p><p><span class="yiyi-st" id="yiyi-47">The constructor for this class is:</span></p><dl class="method"><dt id="difflib.HtmlDiff.__init__"><span class="yiyi-st" id="yiyi-48"> <code class="descname">__init__</code><span class="sig-paren">(</span><em>tabsize=8</em>,
<em>wrapcolumn=None</em>, <em>linejunk=None</em>, <em>charjunk=IS_CHARACTER_JUNK</em><span class="sig-paren">)</span></span></dt><dd><p><span class="yiyi-st" id="yiyi-49">Initializes an instance of <a class="reference internal" href="#difflib.HtmlDiff" title="difflib.HtmlDiff"><code class="xref py py-class docutils literal"><span class="pre">HtmlDiff</span></code></a>.</span></p><p><span class="yiyi-st" id="yiyi-50"><em>tabsize</em> is an optional keyword argument to specify tab stop spacing and defaults to <code class="docutils literal"><span class="pre">8</span></code>.</span></p><p><span class="yiyi-st" id="yiyi-51"><em>wrapcolumn</em> is an optional keyword argument to specify the column number where lines are broken and wrapped; it defaults to <code class="docutils literal"><span class="pre">None</span></code>, in which case lines are not wrapped.</span></p><p><span class="yiyi-st" id="yiyi-52"><em>linejunk</em> and <em>charjunk</em> are optional keyword arguments passed into <a class="reference internal" href="#difflib.ndiff" title="difflib.ndiff"><code class="xref py py-func docutils literal"><span class="pre">ndiff()</span></code></a> (used by <a class="reference internal" href="#difflib.HtmlDiff" title="difflib.HtmlDiff"><code class="xref py py-class docutils literal"><span class="pre">HtmlDiff</span></code></a> to generate the side-by-side HTML differences).</span> <span class="yiyi-st" id="yiyi-53">See the <a class="reference internal" href="#difflib.ndiff" title="difflib.ndiff"><code class="xref py py-func docutils literal"><span class="pre">ndiff()</span></code></a> documentation for argument default values and descriptions.</span></p></dd></dl><p><span class="yiyi-st" id="yiyi-54">The following methods are public:</span></p><dl class="method"><dt id="difflib.HtmlDiff.make_file"><span class="yiyi-st" id="yiyi-55"><code class="descname">make_file</code><span class="sig-paren">(</span><em>fromlines</em>, <em>tolines</em>, <em>fromdesc=''</em>, <em>todesc=''</em>, <em>context=False</em>, <em>numlines=5</em>, <em>*</em>, <em>charset='utf-8'</em><span class="sig-paren">)</span></span></dt><dd><p><span class="yiyi-st" id="yiyi-56">Compares <em>fromlines</em> and <em>tolines</em> (lists of strings) and returns a string which is a complete HTML file containing a table showing line-by-line differences with inter-line and intra-line changes highlighted.</span></p><p><span class="yiyi-st" id="yiyi-57"><em>fromdesc</em> and
<em>todesc</em> are optional keyword arguments to specify the from/to file column header strings (both default to an empty string).</span></p><p><span class="yiyi-st" id="yiyi-58"><em>context</em> and <em>numlines</em> are both optional keyword arguments.</span> <span class="yiyi-st" id="yiyi-59">Set <em>context</em> to <code class="docutils literal"><span class="pre">True</span></code> when contextual differences are to be shown; otherwise the default of <code class="docutils literal"><span class="pre">False</span></code> shows the full files.</span> <span class="yiyi-st" id="yiyi-60"><em>numlines</em> defaults to <code class="docutils literal"><span class="pre">5</span></code>.</span> <span class="yiyi-st" id="yiyi-61">When <em>context</em> is <code class="docutils literal"><span class="pre">True</span></code>, <em>numlines</em> controls the number of context lines which surround the difference highlights.</span> <span class="yiyi-st" id="yiyi-62">When <em>context</em> is <code class="docutils literal"><span class="pre">False</span></code>, <em>numlines</em> controls the number of lines which are shown before a difference highlight when using the "next" hyperlinks (setting it to zero would cause the "next" hyperlinks to place the next difference highlight at the top of the browser without any leading context).</span></p><div class="versionchanged"><p><span class="yiyi-st" id="yiyi-63"><span class="versionmodified">Changed in version 3.5: </span>The <em>charset</em> keyword-only argument was added.</span> <span class="yiyi-st" id="yiyi-64">The default charset of the HTML document changed from <code class="docutils literal"><span class="pre">'ISO-8859-1'</span></code> to <code class="docutils literal"><span class="pre">'utf-8'</span></code>.</span></p></div></dd></dl><dl class="method"><dt id="difflib.HtmlDiff.make_table"><span class="yiyi-st" id="yiyi-65"><code class="descname">make_table</code><span class="sig-paren">(</span><em>fromlines</em>, <em>tolines</em>, <em>fromdesc=''</em>, <em>todesc=''</em>, <em>context=False</em>, <em>numlines=5</em><span class="sig-paren">)</span></span></dt><dd><p><span class="yiyi-st" id="yiyi-66">Compares <em>fromlines</em> and <em>tolines</em> (lists of strings) and returns a string which is a complete HTML table showing line-by-line differences with inter-line and intra-line changes highlighted.</span></p><p><span class="yiyi-st" id="yiyi-67">The arguments for this method are the same as those for the <a class="reference internal" href="#difflib.HtmlDiff.make_file" title="difflib.HtmlDiff.make_file"><code class="xref py py-meth docutils literal"><span
class="pre">make_file()</span></code></a> method.</span></p></dd></dl><p><span class="yiyi-st" id="yiyi-68"><code class="file docutils literal"><span class="pre">Tools/scripts/diff.py</span></code> is a command-line front-end to this class and contains a good example of its use.</span></p></dd></dl><dl class="function"><dt id="difflib.context_diff"><span class="yiyi-st" id="yiyi-69"><code class="descclassname">difflib.</code><code class="descname">context_diff</code><span class="sig-paren">(</span><em>a</em>, <em>b</em>, <em>fromfile=''</em>, <em>tofile=''</em>, <em>fromfiledate=''</em>, <em>tofiledate=''</em>, <em>n=3</em>, <em>lineterm='\n'</em><span class="sig-paren">)</span></span></dt><dd><p><span class="yiyi-st" id="yiyi-70">Compare <em>a</em> and <em>b</em> (lists of strings); return a delta (a <a class="reference internal" href="../glossary.html#term-generator"><span class="xref std std-term">generator</span></a> generating the delta lines) in context diff format.</span></p><p><span class="yiyi-st" id="yiyi-71">Context diffs are a compact way of showing just the lines that have changed plus a few lines of context.</span> <span class="yiyi-st" id="yiyi-72">The changes are shown in a before/after style.</span> <span class="yiyi-st" id="yiyi-73">The number of context lines is set by <em>n</em>, which defaults to three.</span></p><p><span class="yiyi-st" id="yiyi-74">By default, the diff control lines (those with <code class="docutils literal"><span class="pre">***</span></code> or <code class="docutils literal"><span class="pre">---</span></code>) are created with a trailing newline.</span> <span class="yiyi-st" id="yiyi-75">This is helpful so that inputs created from <a class="reference internal" href="io.html#io.IOBase.readlines" title="io.IOBase.readlines"><code class="xref py py-func docutils literal"><span class="pre">io.IOBase.readlines()</span></code></a> result in diffs that are suitable for use with <a class="reference internal" href="io.html#io.IOBase.writelines" title="io.IOBase.writelines"><code class="xref py py-func docutils literal"><span class="pre">io.IOBase.writelines()</span></code></a>, since both the inputs and outputs have trailing newlines.</span></p><p><span class="yiyi-st" id="yiyi-76">For inputs that do not have trailing newlines, set the <em>lineterm</em> argument to <code class="docutils literal"><span class="pre">""</span></code> so that the output will be uniformly newline free.</span></p><p><span class="yiyi-st" id="yiyi-77">The context diff format normally has a header for filenames and modification times.</span> <span
class="yiyi-st" id="yiyi-78">Any or all of these may be specified using strings for <em>fromfile</em>, <em>tofile</em>, <em>fromfiledate</em>, and <em>tofiledate</em>.</span> <span class="yiyi-st" id="yiyi-79">The modification times are normally expressed in the ISO 8601 format.</span> <span class="yiyi-st" id="yiyi-80">If not specified, the strings default to blanks.</span></p><pre><code class="language-python"><span></span><span class="gp">>>> </span><span class="kn">import</span> <span class="nn">sys</span>
<span class="gp">>>> </span><span class="kn">from</span> <span class="nn">difflib</span> <span class="kn">import</span> <span class="n">context_diff</span>
<span class="gp">>>> </span><span class="n">s1</span> <span class="o">=</span> <span class="p">[</span><span class="s1">'bacon</span><span class="se">\n</span><span class="s1">'</span><span class="p">,</span> <span class="s1">'eggs</span><span class="se">\n</span><span class="s1">'</span><span class="p">,</span> <span class="s1">'ham</span><span class="se">\n</span><span class="s1">'</span><span class="p">,</span> <span class="s1">'guido</span><span class="se">\n</span><span class="s1">'</span><span class="p">]</span>
<span class="gp">>>> </span><span class="n">s2</span> <span class="o">=</span> <span class="p">[</span><span class="s1">'python</span><span class="se">\n</span><span class="s1">'</span><span class="p">,</span> <span class="s1">'eggy</span><span class="se">\n</span><span class="s1">'</span><span class="p">,</span> <span class="s1">'hamster</span><span class="se">\n</span><span class="s1">'</span><span class="p">,</span> <span class="s1">'guido</span><span class="se">\n</span><span class="s1">'</span><span class="p">]</span>
<span class="gp">>>> </span><span class="n">sys</span><span class="o">.</span><span class="n">stdout</span><span class="o">.</span><span class="n">writelines</span><span class="p">(</span><span class="n">context_diff</span><span class="p">(</span><span class="n">s1</span><span class="p">,</span> <span class="n">s2</span><span class="p">,</span> <span class="n">fromfile</span><span class="o">=</span><span class="s1">'before.py'</span><span class="p">,</span> <span class="n">tofile</span><span class="o">=</span><span class="s1">'after.py'</span><span class="p">))</span>
<span class="go">*** before.py</span>
<span class="go">--- after.py</span>
<span class="go">***************</span>
<span class="go">*** 1,4 ****</span>
<span class="go">! bacon</span>
<span class="go">! eggs</span>
<span class="go">! ham</span>
<span class="go">  guido</span>
<span class="go">--- 1,4 ----</span>
<span class="go">! python</span>
<span class="go">! eggy</span>
<span class="go">! hamster</span>
<span class="go">  guido</span>
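</code></pre><p>As a minimal sketch of the <em>lineterm</em> advice for <code>context_diff()</code> (the sample lists repeat the example data with the newlines stripped; the variable names are illustrative only), passing <code>lineterm=''</code> should yield delta lines without any trailing newline, so they can be joined explicitly:</p>

```python
import difflib

# Inputs without trailing newlines, e.g. as produced by str.splitlines().
before = ['bacon', 'eggs', 'ham', 'guido']
after = ['python', 'eggy', 'hamster', 'guido']

# lineterm='' keeps the control lines newline-free, matching the inputs.
delta = list(difflib.context_diff(before, after,
                                  fromfile='before.py', tofile='after.py',
                                  lineterm=''))

# The caller now decides how the lines are terminated.
print('\n'.join(delta))
```

<pre><code class="language-python">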
</code></pre><p><span class="yiyi-st" id="yiyi-81">See <a class="reference internal" href="#difflib-interface"><span>A command-line interface to difflib</span></a> for a more detailed example.</span></p></dd></dl><dl class="function"><dt id="difflib.get_close_matches"><span class="yiyi-st" id="yiyi-82"> <code class="descclassname">difflib.</code><code class="descname">get_close_matches</code><span class="sig-paren">(</span><em>word</em>, <em>possibilities</em>, <em>n=3</em>, <em>cutoff=0.6</em><span class="sig-paren">)</span></span></dt><dd><p><span class="yiyi-st" id="yiyi-83">Return a list of the best "good enough" matches.</span> <span class="yiyi-st" id="yiyi-84"><em>word</em> is a sequence for which close matches are desired (typically a string), and <em>possibilities</em> is a list of sequences against which to match <em>word</em> (typically a list of strings).</span></p><p><span class="yiyi-st" id="yiyi-85">Optional argument <em>n</em> (default <code class="docutils literal"><span class="pre">3</span></code>) is the maximum number of close matches to return; <em>n</em> must be greater than <code class="docutils literal"><span class="pre">0</span></code>.</span></p><p><span class="yiyi-st" id="yiyi-86">Optional argument <em>cutoff</em> (default <code class="docutils literal"><span class="pre">0.6</span></code>) is a float in the range [0, 1].</span> <span class="yiyi-st" id="yiyi-87">Possibilities that do not score at least that similar to <em>word</em> are ignored.</span></p><p><span class="yiyi-st" id="yiyi-88">The best (no more than <em>n</em>) matches among the possibilities are returned in a list, sorted by similarity score, most similar first.</span></p><pre><code class="language-python"><span></span><span class="gp">>>> </span><span class="kn">from</span> <span class="nn">difflib</span> <span class="kn">import</span> <span class="n">get_close_matches</span>
<span class="gp">>>> </span><span class="n">get_close_matches</span><span class="p">(</span><span class="s1">'appel'</span><span class="p">,</span> <span class="p">[</span><span class="s1">'ape'</span><span class="p">,</span> <span class="s1">'apple'</span><span class="p">,</span> <span class="s1">'peach'</span><span class="p">,</span> <span class="s1">'puppy'</span><span class="p">])</span>
<span class="go">['apple', 'ape']</span>
<span class="gp">>>> </span><span class="kn">import</span> <span class="nn">keyword</span>
<span class="gp">>>> </span><span class="n">get_close_matches</span><span class="p">(</span><span class="s1">'wheel'</span><span class="p">,</span> <span class="n">keyword</span><span class="o">.</span><span class="n">kwlist</span><span class="p">)</span>
<span class="go">['while']</span>
<span class="gp">>>> </span><span class="n">get_close_matches</span><span class="p">(</span><span class="s1">'pineapple'</span><span class="p">,</span> <span class="n">keyword</span><span class="o">.</span><span class="n">kwlist</span><span class="p">)</span>
<span class="go">[]</span>
<span class="gp">>>> </span><span class="n">get_close_matches</span><span class="p">(</span><span class="s1">'accept'</span><span class="p">,</span> <span class="n">keyword</span><span class="o">.</span><span class="n">kwlist</span><span class="p">)</span>
<span class="go">['except']</span>
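</code></pre><p>The effect of the <em>n</em> and <em>cutoff</em> arguments can be sketched as follows. This is an illustrative example rather than part of the reference text; the candidate list is the one used in the session above, and the similarity score is inspected through <code>SequenceMatcher.ratio()</code>:</p>

```python
import difflib

candidates = ['ape', 'apple', 'peach', 'puppy']

# n=1 keeps only the single best candidate.
best = difflib.get_close_matches('appel', candidates, n=1)

# A stricter cutoff rejects candidates whose similarity score is too low.
strict = difflib.get_close_matches('appel', candidates, cutoff=0.9)

# Inspect the similarity score of one pair directly.
score = difflib.SequenceMatcher(None, 'appel', 'apple').ratio()

print(best, strict, round(score, 2))
```

<pre><code class="language-python">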
</code></pre></dd></dl><dl class="function"><dt id="difflib.ndiff"><span class="yiyi-st" id="yiyi-89"><code class="descclassname">difflib.</code><code class="descname">ndiff</code><span class="sig-paren">(</span><em>a</em>, <em>b</em>, <em>linejunk=None</em>, <em>charjunk=IS_CHARACTER_JUNK</em><span class="sig-paren">)</span></span></dt><dd><p><span class="yiyi-st" id="yiyi-90">Compare <em>a</em> and <em>b</em> (lists of strings); return a <a class="reference internal" href="#difflib.Differ" title="difflib.Differ"><code class="xref py py-class docutils literal"><span class="pre">Differ</span></code></a>-style delta (a <a class="reference internal" href="../glossary.html#term-generator"><span class="xref std std-term">generator</span></a> generating the delta lines).</span></p><p><span class="yiyi-st" id="yiyi-91">Optional keyword parameters <em>linejunk</em> and <em>charjunk</em> are filtering functions (or <code class="docutils literal"><span class="pre">None</span></code>):</span></p><p><span class="yiyi-st" id="yiyi-92"><em>linejunk</em>: A function that accepts a single string argument, and returns true if the string is junk, or false if not.</span> <span class="yiyi-st" id="yiyi-93">The default is <code class="docutils literal"><span class="pre">None</span></code>.</span> <span class="yiyi-st" id="yiyi-94">There is also a module-level function <a class="reference internal" href="#difflib.IS_LINE_JUNK" title="difflib.IS_LINE_JUNK"><code class="xref py py-func docutils literal"><span class="pre">IS_LINE_JUNK()</span></code></a>, which filters out lines without visible characters, except for at most one pound character (<code class="docutils literal"><span class="pre">'#'</span></code>) - however, the underlying <a class="reference internal" href="#difflib.SequenceMatcher" title="difflib.SequenceMatcher"><code class="xref py py-class docutils literal"><span class="pre">SequenceMatcher</span></code></a> class does a dynamic analysis of which lines are so frequent as to constitute noise, and this usually works better than using this function.</span></p><p><span class="yiyi-st" id="yiyi-95"><em>charjunk</em>: A function that accepts a character (a string of length 1), and returns true if the character is junk, or false if not.</span> <span class="yiyi-st" id="yiyi-96">The default is the module-level function <a class="reference internal" href="#difflib.IS_CHARACTER_JUNK" title="difflib.IS_CHARACTER_JUNK"><code class="xref py py-func docutils literal"><span
class="pre">IS_CHARACTER_JUNK()</span></code></a>, which filters out whitespace characters (a blank or tab; it is a bad idea to include newline in this!</span><span class="yiyi-st" id="yiyi-97">).</span></p><p><span class="yiyi-st" id="yiyi-98"><code class="file docutils literal"><span class="pre">Tools/scripts/ndiff.py</span></code> is a command-line front-end to this function.</span></p><pre><code class="language-python"><span></span><span class="gp">>>> </span><span class="kn">from</span> <span class="nn">difflib</span> <span class="kn">import</span> <span class="n">ndiff</span>
<span class="gp">>>> </span><span class="n">diff</span> <span class="o">=</span> <span class="n">ndiff</span><span class="p">(</span><span class="s1">'one</span><span class="se">\n</span><span class="s1">two</span><span class="se">\n</span><span class="s1">three</span><span class="se">\n</span><span class="s1">'</span><span class="o">.</span><span class="n">splitlines</span><span class="p">(</span><span class="n">keepends</span><span class="o">=</span><span class="kc">True</span><span class="p">),</span>
<span class="gp">... </span> <span class="s1">'ore</span><span class="se">\n</span><span class="s1">tree</span><span class="se">\n</span><span class="s1">emu</span><span class="se">\n</span><span class="s1">'</span><span class="o">.</span><span class="n">splitlines</span><span class="p">(</span><span class="n">keepends</span><span class="o">=</span><span class="kc">True</span><span class="p">))</span>
<span class="gp">>>> </span><span class="nb">print</span><span class="p">(</span><span class="s1">''</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">diff</span><span class="p">),</span> <span class="n">end</span><span class="o">=</span><span class="s2">""</span><span class="p">)</span>
<span class="go">- one</span>
<span class="go">?  ^</span>
<span class="go">+ ore</span>
<span class="go">?  ^</span>
<span class="go">- two</span>
<span class="go">- three</span>
<span class="go">?  -</span>
<span class="go">+ tree</span>
<span class="go">+ emu</span>
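</code></pre><p>Because every delta line starts with one of the two-character codes from the <code>Differ</code> table, a delta can be split by prefix. The following is a small sketch under that assumption; the names <code>removed</code>, <code>added</code> and <code>hints</code> are illustrative and not part of the module:</p>

```python
import difflib

old = 'one\ntwo\nthree\n'.splitlines(keepends=True)
new = 'ore\ntree\nemu\n'.splitlines(keepends=True)

# Materialize the generator so the delta can be scanned several times.
delta = list(difflib.ndiff(old, new))

# Split the delta by its two-character prefix.
removed = [line[2:] for line in delta if line.startswith('- ')]
added = [line[2:] for line in delta if line.startswith('+ ')]
hints = [line for line in delta if line.startswith('? ')]

print(removed, added, len(hints))
```

<pre><code class="language-python">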
</code></pre></dd></dl><dl class="function"><dt id="difflib.restore"><span class="yiyi-st" id="yiyi-99"><code class="descclassname">difflib.</code><code class="descname">restore</code><span class="sig-paren">(</span><em>sequence</em>, <em>which</em><span class="sig-paren">)</span></span></dt><dd><p><span class="yiyi-st" id="yiyi-100">Return one of the two sequences that generated a delta.</span></p><p><span class="yiyi-st" id="yiyi-101">Given a <em>sequence</em> produced by <a class="reference internal" href="#difflib.Differ.compare" title="difflib.Differ.compare"><code class="xref py py-meth docutils literal"><span class="pre">Differ.compare()</span></code></a> or <a class="reference internal" href="#difflib.ndiff" title="difflib.ndiff"><code class="xref py py-func docutils literal"><span class="pre">ndiff()</span></code></a>, extract the lines originating from file 1 or 2 (parameter <em>which</em>), stripping off the line prefixes.</span></p><p><span class="yiyi-st" id="yiyi-102">Example:</span></p><pre><code class="language-python"><span></span><span class="gp">>>> </span><span class="kn">from</span> <span class="nn">difflib</span> <span class="kn">import</span> <span class="n">ndiff</span><span class="p">,</span> <span class="n">restore</span>
<span class="gp">>>> </span><span class="n">diff</span> <span class="o">=</span> <span class="n">ndiff</span><span class="p">(</span><span class="s1">'one</span><span class="se">\n</span><span class="s1">two</span><span class="se">\n</span><span class="s1">three</span><span class="se">\n</span><span class="s1">'</span><span class="o">.</span><span class="n">splitlines</span><span class="p">(</span><span class="n">keepends</span><span class="o">=</span><span class="kc">True</span><span class="p">),</span>
<span class="gp">... </span> <span class="s1">'ore</span><span class="se">\n</span><span class="s1">tree</span><span class="se">\n</span><span class="s1">emu</span><span class="se">\n</span><span class="s1">'</span><span class="o">.</span><span class="n">splitlines</span><span class="p">(</span><span class="n">keepends</span><span class="o">=</span><span class="kc">True</span><span class="p">))</span>
<span class="gp">>>> </span><span class="n">diff</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">diff</span><span class="p">)</span> <span class="c1"># materialize the generated delta into a list</span>
<span class="gp">>>> </span><span class="nb">print</span><span class="p">(</span><span class="s1">''</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">restore</span><span class="p">(</span><span class="n">diff</span><span class="p">,</span> <span class="mi">1</span><span class="p">)),</span> <span class="n">end</span><span class="o">=</span><span class="s2">""</span><span class="p">)</span>
<span class="go">one</span>
<span class="go">two</span>
<span class="go">three</span>
<span class="gp">>>> </span><span class="nb">print</span><span class="p">(</span><span class="s1">''</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">restore</span><span class="p">(</span><span class="n">diff</span><span class="p">,</span> <span class="mi">2</span><span class="p">)),</span> <span class="n">end</span><span class="o">=</span><span class="s2">""</span><span class="p">)</span>
<span class="go">ore</span>
<span class="go">tree</span>
<span class="go">emu</span>
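</code></pre><p>The session above converts the delta to a list before calling <code>restore()</code> twice. A short sketch of why that matters, assuming only that <code>ndiff()</code> returns a one-shot generator as described earlier (the variable names are illustrative):</p>

```python
import difflib

old = 'one\ntwo\nthree\n'.splitlines(keepends=True)
new = 'ore\ntree\nemu\n'.splitlines(keepends=True)

# ndiff() returns a generator, so it can only be walked once.
lazy_delta = difflib.ndiff(old, new)
first = list(difflib.restore(lazy_delta, 1))
second = list(difflib.restore(lazy_delta, 2))  # generator already exhausted

# Materializing the delta allows both sequences to be recovered.
delta = list(difflib.ndiff(old, new))
recovered_old = list(difflib.restore(delta, 1))
recovered_new = list(difflib.restore(delta, 2))

print(first == old, second, recovered_new == new)
```

<pre><code class="language-python">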
</code></pre></dd></dl><dl class="function"><dt id="difflib.unified_diff"><span class="yiyi-st" id="yiyi-103"><code class="descclassname">difflib.</code><code class="descname">unified_diff</code><span class="sig-paren">(</span><em>a</em>, <em>b</em>, <em>fromfile=''</em>, <em>tofile=''</em>, <em>fromfiledate=''</em>, <em>tofiledate=''</em>, <em>n=3</em>, <em>lineterm='\n'</em><span class="sig-paren">)</span></span></dt><dd><p><span class="yiyi-st" id="yiyi-104">Compare <em>a</em> and <em>b</em> (lists of strings); return a delta (a <a class="reference internal" href="../glossary.html#term-generator"><span class="xref std std-term">generator</span></a> generating the delta lines) in unified diff format.</span></p><p><span class="yiyi-st" id="yiyi-105">Unified diffs are a compact way of showing just the lines that have changed plus a few lines of context.</span> <span class="yiyi-st" id="yiyi-106">The changes are shown in an inline style (instead of separate before/after blocks).</span> <span class="yiyi-st" id="yiyi-107">The number of context lines is set by <em>n</em>, which defaults to three.</span></p><p><span class="yiyi-st" id="yiyi-108">By default, the diff control lines (those with <code class="docutils literal"><span class="pre">---</span></code>, <code class="docutils literal"><span class="pre">+++</span></code>, or <code class="docutils literal"><span class="pre">@@</span></code>) are created with a trailing newline.</span> <span class="yiyi-st" id="yiyi-109">This is helpful so that inputs created from <a class="reference internal" href="io.html#io.IOBase.readlines" title="io.IOBase.readlines"><code class="xref py py-func docutils literal"><span class="pre">io.IOBase.readlines()</span></code></a> result in diffs that are suitable for use with <a class="reference internal" href="io.html#io.IOBase.writelines" title="io.IOBase.writelines"><code class="xref py py-func docutils literal"><span class="pre">io.IOBase.writelines()</span></code></a>, since both the inputs and outputs have trailing newlines.</span></p><p><span class="yiyi-st" id="yiyi-110">For inputs that do not have trailing newlines, set the <em>lineterm</em> argument to <code class="docutils literal"><span class="pre">""</span></code> so that the output will be uniformly newline free.</span></p><p><span class="yiyi-st" id="yiyi-111">The context diff format normally has a header for filenames and modification times.</span> <span class="yiyi-st" id="yiyi-112">Any or all of these may be specified using strings for <em>fromfile</em>, <em>tofile</em>, <em>fromfiledate</em>, and <em>tofiledate</em>.</span> <span
class="yiyi-st" id="yiyi-113">The modification times are normally expressed in the ISO 8601 format.</span> <span class="yiyi-st" id="yiyi-114">If not specified, the strings default to blanks.</span></p><pre><code class="language-python"><span></span><span class="gp">>>> </span><span class="kn">import</span> <span class="nn">sys</span>
<span class="gp">>>> </span><span class="kn">from</span> <span class="nn">difflib</span> <span class="kn">import</span> <span class="n">unified_diff</span>
<span class="gp">>>> </span><span class="n">s1</span> <span class="o">=</span> <span class="p">[</span><span class="s1">'bacon</span><span class="se">\n</span><span class="s1">'</span><span class="p">,</span> <span class="s1">'eggs</span><span class="se">\n</span><span class="s1">'</span><span class="p">,</span> <span class="s1">'ham</span><span class="se">\n</span><span class="s1">'</span><span class="p">,</span> <span class="s1">'guido</span><span class="se">\n</span><span class="s1">'</span><span class="p">]</span>
<span class="gp">>>> </span><span class="n">s2</span> <span class="o">=</span> <span class="p">[</span><span class="s1">'python</span><span class="se">\n</span><span class="s1">'</span><span class="p">,</span> <span class="s1">'eggy</span><span class="se">\n</span><span class="s1">'</span><span class="p">,</span> <span class="s1">'hamster</span><span class="se">\n</span><span class="s1">'</span><span class="p">,</span> <span class="s1">'guido</span><span class="se">\n</span><span class="s1">'</span><span class="p">]</span>
<span class="gp">>>> </span><span class="n">sys</span><span class="o">.</span><span class="n">stdout</span><span class="o">.</span><span class="n">writelines</span><span class="p">(</span><span class="n">unified_diff</span><span class="p">(</span><span class="n">s1</span><span class="p">,</span> <span class="n">s2</span><span class="p">,</span> <span class="n">fromfile</span><span class="o">=</span><span class="s1">'before.py'</span><span class="p">,</span> <span class="n">tofile</span><span class="o">=</span><span class="s1">'after.py'</span><span class="p">))</span>
<span class="go">--- before.py</span>
<span class="go">+++ after.py</span>
<span class="go">@@ -1,4 +1,4 @@</span>
<span class="go">-bacon</span>
<span class="go">-eggs</span>
<span class="go">-ham</span>
<span class="go">+python</span>
<span class="go">+eggy</span>
<span class="go">+hamster</span>
<span class="go"> guido</span>
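</code></pre><p>The <em>n</em> argument controls how much unchanged context surrounds each change. As a hedged sketch using the same sample data (variable names are illustrative), setting <code>n=0</code> should drop the unchanged <code>guido</code> line from the delta entirely:</p>

```python
import difflib

s1 = ['bacon\n', 'eggs\n', 'ham\n', 'guido\n']
s2 = ['python\n', 'eggy\n', 'hamster\n', 'guido\n']

# n=0 asks for no context lines around each change.
delta = list(difflib.unified_diff(s1, s2, fromfile='before.py',
                                  tofile='after.py', n=0))

# The inputs keep their newlines, so the lines can be joined as they are.
print(''.join(delta), end='')
```

<pre><code class="language-python">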
</code></pre><p><span class="yiyi-st" id="yiyi-115">See <a class="reference internal" href="#difflib-interface"><span>A command-line interface to difflib</span></a> for a more detailed example.</span></p></dd></dl><dl class="function"><dt id="difflib.diff_bytes"><span class="yiyi-st" id="yiyi-116"><code class="descclassname">difflib.</code><code class="descname">diff_bytes</code><span class="sig-paren">(</span><em>dfunc</em>, <em>a</em>, <em>b</em>, <em>fromfile=b''</em>, <em>tofile=b''</em>, <em>fromfiledate=b''</em>, <em>tofiledate=b''</em>, <em>n=3</em>, <em>lineterm=b'\n'</em><span class="sig-paren">)</span></span></dt><dd><p><span class="yiyi-st" id="yiyi-117">Compare <em>a</em> and <em>b</em> (lists of bytes objects) using <em>dfunc</em>; yield a sequence of delta lines (also bytes) in the format returned by <em>dfunc</em>.</span> <span class="yiyi-st" id="yiyi-118"><em>dfunc</em> must be a callable, typically either <a class="reference internal" href="#difflib.unified_diff" title="difflib.unified_diff"><code class="xref py py-func docutils literal"><span class="pre">unified_diff()</span></code></a> or <a class="reference internal" href="#difflib.context_diff" title="difflib.context_diff"><code class="xref py py-func docutils literal"><span class="pre">context_diff()</span></code></a>.</span></p><p><span class="yiyi-st" id="yiyi-119">Allows you to compare data with unknown or inconsistent encoding.</span> <span class="yiyi-st" id="yiyi-120">All inputs except <em>n</em> must be bytes objects, not str.</span> <span class="yiyi-st" id="yiyi-121">Works by losslessly converting all inputs (except <em>n</em>) to str, and calling <code class="docutils literal"><span class="pre">dfunc(a,</span> <span class="pre">b,</span> <span class="pre">fromfile,</span> <span class="pre">tofile,</span> <span class="pre">fromfiledate,</span> <span class="pre">tofiledate,</span> <span class="pre">n,</span> <span class="pre">lineterm)</span></code>.</span> <span class="yiyi-st" id="yiyi-122">The output of <em>dfunc</em> is then converted back to bytes, so the delta lines that you receive have the same unknown/inconsistent encodings as <em>a</em> and <em>b</em>.</span></p><div class="versionadded"><p><span class="yiyi-st" id="yiyi-123"><span class="versionmodified">New in version 3.5.</span></span></p></div></dd></dl><dl class="function"><dt
id="difflib.IS_LINE_JUNK"><span class="yiyi-st" id="yiyi-124"> <code class="descclassname">difflib.</code><code class="descname">IS_LINE_JUNK</code><span class="sig-paren">(</span><em>line</em><span class="sig-paren">)</span></span></dt><dd><p><span class="yiyi-st" id="yiyi-125">Return true for ignorable lines.</span> <span class="yiyi-st" id="yiyi-126">The line <em>line</em> is ignorable if <em>line</em> is blank or contains a single <code class="docutils literal"><span class="pre">'#'</span></code>, otherwise it is not ignorable.</span> <span class="yiyi-st" id="yiyi-127">Used as a default for parameter <em>linejunk</em> in <a class="reference internal" href="#difflib.ndiff" title="difflib.ndiff"><code class="xref py py-func docutils literal"><span class="pre">ndiff()</span></code></a> in older versions.</span></p></dd></dl><dl class="function"><dt id="difflib.IS_CHARACTER_JUNK"><span class="yiyi-st" id="yiyi-128"><code class="descclassname">difflib.</code><code class="descname">IS_CHARACTER_JUNK</code><span class="sig-paren">(</span><em>ch</em><span class="sig-paren">)</span></span></dt><dd><p><span class="yiyi-st" id="yiyi-129">Return true for ignorable characters.</span> <span class="yiyi-st" id="yiyi-130">The character <em>ch</em> is ignorable if <em>ch</em> is a space or tab, otherwise it is not ignorable.</span> <span class="yiyi-st" id="yiyi-131">Used as a default for parameter <em>charjunk</em> in <a class="reference internal" href="#difflib.ndiff" title="difflib.ndiff"><code class="xref py py-func docutils literal"><span class="pre">ndiff()</span></code></a>.</span></p></dd></dl><div class="admonition seealso"><p class="first admonition-title"><span class="yiyi-st" id="yiyi-132">See also</span></p><dl class="last docutils"><dt><span class="yiyi-st" id="yiyi-133"><a class="reference external" href="http://www.drdobbs.com/database/pattern-matching-the-gestalt-approach/184407970">Pattern Matching: The Gestalt Approach</a></span></dt><dd><span class="yiyi-st" id="yiyi-134">Discussion of a similar algorithm by John W. Ratcliff and D. E.
Metzener的类似算法。</span><span class="yiyi-st" id="yiyi-135">这发表在<a class="reference external" href="http://www.drdobbs.com/">Dr。 Dobb's Journal</a>。</span></dd></dl></div><div class="section" id="sequencematcher-objects"><h2><span class="yiyi-st" id="yiyi-136">6.3.1.</span><span class="yiyi-st" id="yiyi-137">SequenceMatcher对象</span></h2><p><span class="yiyi-st" id="yiyi-138"><a class="reference internal" href="#difflib.SequenceMatcher" title="difflib.SequenceMatcher"><code class="xref py py-class docutils literal"><span class="pre">SequenceMatcher</span></code></a>类具有此构造函数:</span></p><dl class="class"><dt><span class="yiyi-st" id="yiyi-139"><em class="property">class </em><code class="descclassname">difflib.</code><code class="descname">SequenceMatcher</code><span class="sig-paren">(</span><em>isjunk=None</em>, <em>a=''</em>, <em>b=''</em>, <em>autojunk=True</em><span class="sig-paren">)</span></span></dt><dd><p><span class="yiyi-st" id="yiyi-140">可选参数<em>isjunk</em>必须是<code class="docutils literal"><span class="pre">None</span></code>(默认值) 或一个单参数的函数,采用序列的元素,并返回 true,当且仅当该元素是"垃圾",并且应该忽略。</span><span class="yiyi-st" id="yiyi-141">将<code class="docutils literal"><span class="pre">None</span></code>传递给<em>isjunk</em>等效于传递<code class="docutils literal"><span class="pre">lambda</span> <span class="pre">x:</span> <span class="pre">0</span> </code>;换句话说,没有元素被忽略。</span><span class="yiyi-st" id="yiyi-142">例如,将传递:</span></p><pre><code class="language-python"><span></span><span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="n">x</span> <span class="ow">in</span> <span class="s2">" </span><span class="se">\t</span><span class="s2">"</span>
</code></pre><p><span class="yiyi-st" id="yiyi-143">if you're comparing lines as sequences of characters, and don't want to synch up on blanks or hard tabs.</span></p><p><span class="yiyi-st" id="yiyi-144">The optional arguments <em>a</em> and <em>b</em> are sequences to be compared; both default to empty strings. </span><span class="yiyi-st" id="yiyi-145">The elements of both sequences must be <a class="reference internal" href="../glossary.html#term-hashable"><span class="xref std std-term">hashable</span></a>.</span></p><p><span class="yiyi-st" id="yiyi-146">The optional argument <em>autojunk</em> can be used to disable the automatic junk heuristic.</span></p><div class="versionadded"><p><span class="yiyi-st" id="yiyi-147"><span class="versionmodified">New in version 3.2: </span>The <em>autojunk</em> parameter.</span></p></div><p><span class="yiyi-st" id="yiyi-148">SequenceMatcher objects get three data attributes: <em>bjunk</em> is the set of elements of <em>b</em> for which <em>isjunk</em> is <code class="docutils literal"><span class="pre">True</span></code>; <em>bpopular</em> is the set of non-junk elements considered popular by the heuristic (if it is not disabled); <em>b2j</em> is a dict mapping the remaining elements of <em>b</em> to a list of positions where they occur. </span><span class="yiyi-st" id="yiyi-149">All three are reset whenever <em>b</em> is reset with <a class="reference internal" href="#difflib.SequenceMatcher.set_seqs" title="difflib.SequenceMatcher.set_seqs"><code class="xref py py-meth docutils literal"><span class="pre">set_seqs()</span></code></a> or <a class="reference internal" href="#difflib.SequenceMatcher.set_seq2" title="difflib.SequenceMatcher.set_seq2"><code class="xref py py-meth docutils literal"><span class="pre">set_seq2()</span></code></a>.</span></p><div class="versionadded"><p><span class="yiyi-st" id="yiyi-150"><span class="versionmodified">New in version 3.2: </span>The <em>bjunk</em> and <em>bpopular</em> attributes.</span></p></div><p><span class="yiyi-st" id="yiyi-151"><a class="reference internal" href="#difflib.SequenceMatcher" title="difflib.SequenceMatcher"><code class="xref py py-class docutils literal"><span class="pre">SequenceMatcher</span></code></a> objects have the following methods:</span></p><dl class="method"><dt id="difflib.SequenceMatcher.set_seqs"><span class="yiyi-st" id="yiyi-152"><code class="descname">set_seqs</code><span class="sig-paren">(</span><em>a</em>, <em>b</em><span 
class="sig-paren">)</span></span></dt><dd><p><span class="yiyi-st" id="yiyi-153">Set the two sequences to be compared.</span></p></dd></dl><p><span class="yiyi-st" id="yiyi-154"><a class="reference internal" href="#difflib.SequenceMatcher" title="difflib.SequenceMatcher"><code class="xref py py-class docutils literal"><span class="pre">SequenceMatcher</span></code></a> computes and caches detailed information about the second sequence, so if you want to compare one sequence against many sequences, use <a class="reference internal" href="#difflib.SequenceMatcher.set_seq2" title="difflib.SequenceMatcher.set_seq2"><code class="xref py py-meth docutils literal"><span class="pre">set_seq2()</span></code></a> to set the commonly used sequence once and call <a class="reference internal" href="#difflib.SequenceMatcher.set_seq1" title="difflib.SequenceMatcher.set_seq1"><code class="xref py py-meth docutils literal"><span class="pre">set_seq1()</span></code></a> repeatedly, once for each of the other sequences.</span></p><dl class="method"><dt id="difflib.SequenceMatcher.set_seq1"><span class="yiyi-st" id="yiyi-155"><code class="descname">set_seq1</code><span class="sig-paren">(</span><em>a</em><span class="sig-paren">)</span></span></dt><dd><p><span class="yiyi-st" id="yiyi-156">Set the first sequence to be compared. </span><span class="yiyi-st" id="yiyi-157">The second sequence to be compared is not changed.</span></p></dd></dl><dl class="method"><dt id="difflib.SequenceMatcher.set_seq2"><span class="yiyi-st" id="yiyi-158"><code class="descname">set_seq2</code><span class="sig-paren">(</span><em>b</em><span class="sig-paren">)</span></span></dt><dd><p><span class="yiyi-st" id="yiyi-159">Set the second sequence to be compared. </span><span class="yiyi-st" id="yiyi-160">The first sequence to be compared is not changed.</span></p></dd></dl><dl class="method"><dt id="difflib.SequenceMatcher.find_longest_match"><span class="yiyi-st" id="yiyi-161"><code class="descname">find_longest_match</code><span class="sig-paren">(</span><em>alo</em>, <em>ahi</em>, <em>blo</em>, <em>bhi</em><span class="sig-paren">)</span></span></dt><dd><p><span class="yiyi-st" id="yiyi-162">Find longest matching block in <code class="docutils literal"><span class="pre">a[alo:ahi]</span></code> and <code class="docutils literal"><span class="pre">b[blo:bhi]</span></code>.</span></p>
<p><span class="yiyi-st" id="yiyi-163">If <em>isjunk</em> was omitted or <code class="docutils literal"><span class="pre">None</span></code>, <a class="reference internal" href="#difflib.SequenceMatcher.find_longest_match" title="difflib.SequenceMatcher.find_longest_match"><code class="xref py py-meth docutils literal"><span class="pre">find_longest_match()</span></code></a> returns <code class="docutils literal"><span class="pre">(i,</span> <span class="pre">j,</span> <span class="pre">k)</span></code> such that <code class="docutils literal"><span class="pre">a[i:i+k]</span></code> is equal to <code class="docutils literal"><span class="pre">b[j:j+k]</span></code>, where <code class="docutils literal"><span class="pre">alo</span> <span class="pre"><=</span> <span class="pre">i</span> <span class="pre"><=</span> <span class="pre">i+k</span> <span class="pre"><=</span> <span class="pre">ahi</span></code> and <code class="docutils literal"><span class="pre">blo</span> <span class="pre"><=</span> <span class="pre">j</span> <span class="pre"><=</span> <span class="pre">j+k</span> <span class="pre"><=</span> <span class="pre">bhi</span></code>. 
</span><span class="yiyi-st" id="yiyi-164">For all <code class="docutils literal"><span class="pre">(i',</span> <span class="pre">j',</span> <span class="pre">k')</span></code> meeting those conditions, the additional conditions <code class="docutils literal"><span class="pre">k</span> <span class="pre">>=</span> <span class="pre">k'</span></code>, <code class="docutils literal"><span class="pre">i</span> <span class="pre"><=</span> <span class="pre">i'</span></code>, and if <code class="docutils literal"><span class="pre">i</span> <span class="pre">==</span> <span class="pre">i'</span></code>, <code class="docutils literal"><span class="pre">j</span> <span class="pre"><=</span> <span class="pre">j'</span></code> are also met. </span><span class="yiyi-st" id="yiyi-165">In other words, of all maximal matching blocks, return one that starts earliest in <em>a</em>, and of all those maximal matching blocks that start earliest in <em>a</em>, return the one that starts earliest in <em>b</em>.</span></p><pre><code class="language-python"><span></span><span class="gp">>>> </span><span class="n">s</span> <span class="o">=</span> <span class="n">SequenceMatcher</span><span class="p">(</span><span class="kc">None</span><span class="p">,</span> <span class="s2">" abcd"</span><span class="p">,</span> <span class="s2">"abcd abcd"</span><span class="p">)</span>
<span class="gp">>>> </span><span class="n">s</span><span class="o">.</span><span class="n">find_longest_match</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">9</span><span class="p">)</span>
<span class="go">Match(a=0, b=4, size=5)</span>
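# Sketch (not part of the session above): reuse one matcher when comparing a
# fixed sequence against many candidates. Information about the second
# sequence is cached, so call set_seq2() once and set_seq1() per candidate.
# The names target, candidates and scores are illustrative assumptions.
from difflib import SequenceMatcher

target = "abcd"
candidates = ["abcd", "bcde", "wxyz"]

matcher = SequenceMatcher()      # isjunk=None; both sequences start empty
matcher.set_seq2(target)         # analysed and cached once
scores = {}
for candidate in candidates:
    matcher.set_seq1(candidate)  # the cached data about seq2 is kept
    scores[candidate] = matcher.ratio()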
</code></pre><p><span class="yiyi-st" id="yiyi-166">If <em>isjunk</em> was provided, first the longest matching block is determined as above, but with the additional restriction that no junk element appears in the block. </span><span class="yiyi-st" id="yiyi-167">Then that block is extended as far as possible by matching (only) junk elements on both sides. </span><span class="yiyi-st" id="yiyi-168">So the resulting block never matches on junk except as identical junk happens to be adjacent to an interesting match.</span></p><p><span class="yiyi-st" id="yiyi-169">Here's the same example as before, but considering blanks to be junk. </span><span class="yiyi-st" id="yiyi-170">That prevents <code class="docutils literal"><span class="pre">'</span> <span class="pre">abcd'</span></code> from matching the <code class="docutils literal"><span class="pre">'</span> <span class="pre">abcd'</span></code> at the tail end of the second sequence directly. </span><span class="yiyi-st" id="yiyi-171">Instead only the <code class="docutils literal"><span class="pre">'abcd'</span></code> can match, and matches the leftmost <code class="docutils literal"><span class="pre">'abcd'</span></code> in the second sequence:</span></p><pre><code class="language-python"><span></span><span class="gp">>>> </span><span class="n">s</span> <span class="o">=</span> <span class="n">SequenceMatcher</span><span class="p">(</span><span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="n">x</span><span class="o">==</span><span class="s2">" "</span><span class="p">,</span> <span class="s2">" abcd"</span><span class="p">,</span> <span class="s2">"abcd abcd"</span><span class="p">)</span>
<span class="gp">>>> </span><span class="n">s</span><span class="o">.</span><span class="n">find_longest_match</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">9</span><span class="p">)</span>
<span class="go">Match(a=1, b=0, size=4)</span>
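# Sketch (not part of the session above): the result is a named tuple, so its
# fields can be read by name or unpacked like a plain tuple; when nothing
# matches, the result is (alo, blo, 0). Variable names are illustrative
# assumptions.
from difflib import SequenceMatcher

matcher = SequenceMatcher(None, " abcd", "abcd abcd")
match = matcher.find_longest_match(0, 5, 0, 9)
matched_text = " abcd"[match.a:match.a + match.size]    # fields read by name

no_match = SequenceMatcher(None, "abc", "xyz").find_longest_match(1, 3, 2, 3)
i, j, k = no_match                                       # tuple unpacking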
</code></pre><p><span class="yiyi-st" id="yiyi-172">If no blocks match, this returns <code class="docutils literal"><span class="pre">(alo,</span> <span class="pre">blo,</span> <span class="pre">0)</span></code>.</span></p><p><span class="yiyi-st" id="yiyi-173">This method returns a <a class="reference internal" href="../glossary.html#term-named-tuple"><span class="xref std std-term">named tuple</span></a> <code class="docutils literal"><span class="pre">Match(a,</span> <span class="pre">b,</span> <span class="pre">size)</span></code>.</span></p></dd></dl><dl class="method"><dt id="difflib.SequenceMatcher.get_matching_blocks"><span class="yiyi-st" id="yiyi-174"><code class="descname">get_matching_blocks</code><span class="sig-paren">(</span><span class="sig-paren">)</span></span></dt><dd><p><span class="yiyi-st" id="yiyi-175">Return list of triples describing matching subsequences. </span><span class="yiyi-st" id="yiyi-176">Each triple is of the form <code class="docutils literal"><span class="pre">(i,</span> <span class="pre">j,</span> <span class="pre">n)</span></code>, and means that <code class="docutils literal"><span class="pre">a[i:i+n]</span> <span class="pre">==</span> <span class="pre">b[j:j+n]</span></code>. </span><span class="yiyi-st" id="yiyi-177">The triples are monotonically increasing in <em>i</em> and <em>j</em>.</span></p><p><span class="yiyi-st" id="yiyi-178">The last triple is a dummy, and has the value <code class="docutils literal"><span class="pre">(len(a),</span> <span class="pre">len(b),</span> <span class="pre">0)</span></code>. </span><span class="yiyi-st" id="yiyi-179">It is the only triple with <code class="docutils literal"><span class="pre">n</span> <span class="pre">==</span> <span class="pre">0</span></code>. </span><span class="yiyi-st" id="yiyi-180">If <code class="docutils literal"><span class="pre">(i,</span> <span class="pre">j,</span> <span class="pre">n)</span></code> and <code class="docutils literal"><span class="pre">(i',</span> <span class="pre">j',</span> <span class="pre">n')</span></code> are adjacent triples in the list, and the second is not the last triple in the list, then <code class="docutils literal"><span class="pre">i+n</span> <span class="pre">!=</span> <span class="pre">i'</span></code> or <code class="docutils literal"><span class="pre">j+n</span> <span class="pre">!=</span> <span class="pre">j'</span></code>; in other words, adjacent triples always describe non-adjacent equal blocks.</span></p><pre><code class="language-python"><span></span><span class="gp">>>> </span><span class="n">s</span> <span class="o">=</span> <span class="n">SequenceMatcher</span><span class="p">(</span><span class="kc">None</span><span class="p">,</span> <span class="s2">"abxcd"</span><span class="p">,</span> <span class="s2">"abcd"</span><span class="p">)</span>
<span class="gp">>>> </span><span class="n">s</span><span class="o">.</span><span class="n">get_matching_blocks</span><span class="p">()</span>
<span class="go">[Match(a=0, b=0, size=2), Match(a=3, b=2, size=2), Match(a=5, b=4, size=0)]</span>
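# Sketch (not part of the session above): every triple (i, j, n) marks a pair
# of equal slices, and ratio() equals 2.0*M/T, where M is the total size of
# the matching blocks. Variable names are illustrative assumptions.
from difflib import SequenceMatcher

a, b = "abxcd", "abcd"
matcher = SequenceMatcher(None, a, b)
blocks = matcher.get_matching_blocks()

slices_agree = all(a[i:i + n] == b[j:j + n] for i, j, n in blocks)
matched = sum(n for _, _, n in blocks)    # M; the trailing dummy block adds 0
total = len(a) + len(b)                   # T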
</code></pre></dd></dl><dl class="method"><dt id="difflib.SequenceMatcher.get_opcodes"><span class="yiyi-st" id="yiyi-181"><code class="descname">get_opcodes</code><span class="sig-paren">(</span><span class="sig-paren">)</span></span></dt><dd><p><span class="yiyi-st" id="yiyi-182">Return list of 5-tuples describing how to turn <em>a</em> into <em>b</em>. </span><span class="yiyi-st" id="yiyi-183">Each tuple is of the form <code class="docutils literal"><span class="pre">(tag,</span> <span class="pre">i1,</span> <span class="pre">i2,</span> <span class="pre">j1,</span> <span class="pre">j2)</span></code>. </span><span class="yiyi-st" id="yiyi-184">The first tuple has <code class="docutils literal"><span class="pre">i1</span> <span class="pre">==</span> <span class="pre">j1</span> <span class="pre">==</span> <span class="pre">0</span></code>, and remaining tuples have <em>i1</em> equal to the <em>i2</em> from the preceding tuple, and, likewise, <em>j1</em> equal to the previous <em>j2</em>.</span></p><p><span class="yiyi-st" id="yiyi-185">The <em>tag</em> values are strings, with these meanings:</span></p><table border="1" class="docutils"><thead valign="bottom"><tr class="row-odd"><th class="head"><span class="yiyi-st" id="yiyi-186">Value</span></th><th class="head"><span class="yiyi-st" id="yiyi-187">Meaning</span></th></tr></thead><tbody valign="top"><tr class="row-even"><td><span class="yiyi-st" id="yiyi-188"><code class="docutils literal"><span class="pre">'replace'</span></code></span></td><td><span class="yiyi-st" id="yiyi-189"><code class="docutils literal"><span class="pre">a[i1:i2]</span></code> should be replaced by <code class="docutils literal"><span class="pre">b[j1:j2]</span></code>.</span></td></tr><tr class="row-odd"><td><span class="yiyi-st" id="yiyi-190"><code class="docutils literal"><span class="pre">'delete'</span></code></span></td><td><span class="yiyi-st" id="yiyi-191"><code class="docutils literal"><span class="pre">a[i1:i2]</span></code> should be deleted. </span><span class="yiyi-st" id="yiyi-192">Note that <code class="docutils literal"><span class="pre">j1</span> <span class="pre">==</span> <span class="pre">j2</span></code> in this case.</span></td></tr><tr 
class="row-even"><td><span class="yiyi-st" id="yiyi-193"><code class="docutils literal"><span class="pre">'insert'</span></code></span></td><td><span class="yiyi-st" id="yiyi-194"><code class="docutils literal"><span class="pre">b[j1:j2]</span></code> should be inserted at <code class="docutils literal"><span class="pre">a[i1:i1]</span></code>. </span><span class="yiyi-st" id="yiyi-195">Note that <code class="docutils literal"><span class="pre">i1</span> <span class="pre">==</span> <span class="pre">i2</span></code> in this case.</span></td></tr><tr class="row-odd"><td><span class="yiyi-st" id="yiyi-196"><code class="docutils literal"><span class="pre">'equal'</span></code></span></td><td><span class="yiyi-st" id="yiyi-197"><code class="docutils literal"><span class="pre">a[i1:i2]</span> <span class="pre">==</span> <span class="pre">b[j1:j2]</span></code> (the sub-sequences are equal).</span></td></tr></tbody></table><p><span class="yiyi-st" id="yiyi-198">For example:</span></p><pre><code class="language-python"><span></span><span class="gp">>>> </span><span class="n">a</span> <span class="o">=</span> <span class="s2">"qabxcd"</span>
<span class="gp">>>> </span><span class="n">b</span> <span class="o">=</span> <span class="s2">"abycdf"</span>
<span class="gp">>>> </span><span class="n">s</span> <span class="o">=</span> <span class="n">SequenceMatcher</span><span class="p">(</span><span class="kc">None</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span>
<span class="gp">>>> </span><span class="k">for</span> <span class="n">tag</span><span class="p">,</span> <span class="n">i1</span><span class="p">,</span> <span class="n">i2</span><span class="p">,</span> <span class="n">j1</span><span class="p">,</span> <span class="n">j2</span> <span class="ow">in</span> <span class="n">s</span><span class="o">.</span><span class="n">get_opcodes</span><span class="p">():</span>
<span class="gp">... </span>    <span class="nb">print</span><span class="p">(</span><span class="s1">'</span><span class="si">{:7}</span><span class="s1">   a[</span><span class="si">{}</span><span class="s1">:</span><span class="si">{}</span><span class="s1">] --> b[</span><span class="si">{}</span><span class="s1">:</span><span class="si">{}</span><span class="s1">] </span><span class="si">{!r:>8}</span><span class="s1"> --> </span><span class="si">{!r}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span>
<span class="gp">... </span>        <span class="n">tag</span><span class="p">,</span> <span class="n">i1</span><span class="p">,</span> <span class="n">i2</span><span class="p">,</span> <span class="n">j1</span><span class="p">,</span> <span class="n">j2</span><span class="p">,</span> <span class="n">a</span><span class="p">[</span><span class="n">i1</span><span class="p">:</span><span class="n">i2</span><span class="p">],</span> <span class="n">b</span><span class="p">[</span><span class="n">j1</span><span class="p">:</span><span class="n">j2</span><span class="p">]))</span>
<span class="go">delete    a[0:1] --> b[0:0]      'q' --> ''</span>
<span class="go">equal     a[1:3] --> b[0:2]     'ab' --> 'ab'</span>
<span class="go">replace   a[3:4] --> b[2:3]      'x' --> 'y'</span>
<span class="go">equal     a[4:6] --> b[3:5]     'cd' --> 'cd'</span>
<span class="go">insert    a[6:6] --> b[5:6]       '' --> 'f'</span>
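# Sketch (not part of the session above): replaying the opcodes rebuilds b
# from a. The helper name apply_opcodes is an illustrative assumption, not
# part of difflib.
from difflib import SequenceMatcher

def apply_opcodes(a, b):
    """Rebuild string b from pieces of a, taking new material from b."""
    pieces = []
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, a, b).get_opcodes():
        if tag == "equal":
            pieces.append(a[i1:i2])       # unchanged run, copied from a
        elif tag in ("replace", "insert"):
            pieces.append(b[j1:j2])       # new material comes from b
        # 'delete': a[i1:i2] is dropped, so nothing is appended
    return "".join(pieces)

rebuilt = apply_opcodes("qabxcd", "abycdf")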
</code></pre></dd></dl><dl class="method"><dt id="difflib.SequenceMatcher.get_grouped_opcodes"><span class="yiyi-st" id="yiyi-199"><code class="descname">get_grouped_opcodes</code><span class="sig-paren">(</span><em>n=3</em><span class="sig-paren">)</span></span></dt><dd><p><span class="yiyi-st" id="yiyi-200">Return a <a class="reference internal" href="../glossary.html#term-generator"><span class="xref std std-term">generator</span></a> of groups with up to <em>n</em> lines of context.</span></p><p><span class="yiyi-st" id="yiyi-201">Starting with the groups returned by <a class="reference internal" href="#difflib.SequenceMatcher.get_opcodes" title="difflib.SequenceMatcher.get_opcodes"><code class="xref py py-meth docutils literal"><span class="pre">get_opcodes()</span></code></a>, this method splits out smaller change clusters and eliminates intervening ranges which have no changes.</span></p><p><span class="yiyi-st" id="yiyi-202">The groups are returned in the same format as <a class="reference internal" href="#difflib.SequenceMatcher.get_opcodes" title="difflib.SequenceMatcher.get_opcodes"><code class="xref py py-meth docutils literal"><span class="pre">get_opcodes()</span></code></a>.</span></p></dd></dl><dl class="method"><dt id="difflib.SequenceMatcher.ratio"><span class="yiyi-st" id="yiyi-203"><code class="descname">ratio</code><span class="sig-paren">(</span><span class="sig-paren">)</span></span></dt><dd><p><span class="yiyi-st" id="yiyi-204">Return a measure of the sequences' similarity as a float in the range [0, 1].</span></p><p><span class="yiyi-st" id="yiyi-205">Where T is the total number of elements in both sequences, and M is the number of matches, this is 2.0*M / T. Note that this is <code class="docutils literal"><span class="pre">1.0</span></code> if the sequences are identical, and <code class="docutils literal"><span class="pre">0.0</span></code> if they have nothing in common.</span></p><p><span class="yiyi-st" id="yiyi-206">This is expensive to compute if <a class="reference internal" href="#difflib.SequenceMatcher.get_matching_blocks" title="difflib.SequenceMatcher.get_matching_blocks"><code class="xref py py-meth docutils literal"><span class="pre">get_matching_blocks()</span></code></a> or <a class="reference internal" href="#difflib.SequenceMatcher.get_opcodes" title="difflib.SequenceMatcher.get_opcodes"><code class="xref py py-meth docutils literal"><span class="pre">get_opcodes()</span></code></a> hasn't already been called, in which case you may want to try <a class="reference internal" href="#difflib.SequenceMatcher.quick_ratio" title="difflib.SequenceMatcher.quick_ratio"><code class="xref py py-meth docutils literal"><span class="pre">quick_ratio()</span></code></a> or <a class="reference internal" href="#difflib.SequenceMatcher.real_quick_ratio" title="difflib.SequenceMatcher.real_quick_ratio"><code class="xref py py-meth docutils literal"><span class="pre">real_quick_ratio()</span></code></a> first to get an upper bound.</span></p></dd></dl>
<dl class="method"><dt id="difflib.SequenceMatcher.quick_ratio"><span class="yiyi-st" id="yiyi-207"><code class="descname">quick_ratio</code><span class="sig-paren">(</span><span class="sig-paren">)</span></span></dt><dd><p><span class="yiyi-st" id="yiyi-208">Return an upper bound on <a class="reference internal" href="#difflib.SequenceMatcher.ratio" title="difflib.SequenceMatcher.ratio"><code class="xref py py-meth docutils literal"><span class="pre">ratio()</span></code></a> relatively quickly.</span></p></dd></dl><dl class="method"><dt id="difflib.SequenceMatcher.real_quick_ratio"><span class="yiyi-st" id="yiyi-209"><code class="descname">real_quick_ratio</code><span class="sig-paren">(</span><span class="sig-paren">)</span></span></dt><dd><p><span class="yiyi-st" id="yiyi-210">Return an upper bound on <a class="reference internal" href="#difflib.SequenceMatcher.ratio" title="difflib.SequenceMatcher.ratio"><code class="xref py py-meth docutils literal"><span class="pre">ratio()</span></code></a> very quickly.</span></p></dd></dl></dd></dl>
<p><span class="yiyi-st" id="yiyi-211">The three methods that return the ratio of matching to total characters can give different results due to differing levels of approximation, although <code class="xref py py-meth docutils literal"><span class="pre">quick_ratio()</span></code> and <code class="xref py py-meth docutils literal"><span class="pre">real_quick_ratio()</span></code> are always at least as large as <code class="xref py py-meth docutils literal"><span class="pre">ratio()</span></code>:</span></p><pre><code class="language-python"><span></span><span class="gp">>>> </span><span class="n">s</span> <span class="o">=</span> <span class="n">SequenceMatcher</span><span class="p">(</span><span class="kc">None</span><span class="p">,</span> <span class="s2">"abcd"</span><span class="p">,</span> <span class="s2">"bcde"</span><span class="p">)</span>
<span class="gp">>>> </span><span class="n">s</span><span class="o">.</span><span class="n">ratio</span><span class="p">()</span>
<span class="go">0.75</span>
<span class="gp">>>> </span><span class="n">s</span><span class="o">.</span><span class="n">quick_ratio</span><span class="p">()</span>
<span class="go">0.75</span>
<span class="gp">>>> </span><span class="n">s</span><span class="o">.</span><span class="n">real_quick_ratio</span><span class="p">()</span>
<span class="go">1.0</span>
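# Sketch (not part of the session above): because both quick variants are
# upper bounds on ratio(), they can reject poor candidates before the
# expensive call. The helper name is_close and its cutoff default are
# illustrative assumptions.
from difflib import SequenceMatcher

def is_close(a, b, cutoff=0.6):
    matcher = SequenceMatcher(None, a, b)
    # Cheapest test first; `and` short-circuits, so ratio() only runs when
    # both upper bounds reach the cutoff.
    return (matcher.real_quick_ratio() >= cutoff
            and matcher.quick_ratio() >= cutoff
            and matcher.ratio() >= cutoff)

close = is_close("abcd", "bcde")    # ratio() is 0.75 for this pair
far = is_close("abcd", "wxyz")      # no shared elements: quick_ratio() is 0.0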
</code></pre></div><div class="section" id="sequencematcher-examples"><h2><span class="yiyi-st" id="yiyi-212">6.3.2. </span><span class="yiyi-st" id="yiyi-213">SequenceMatcher Examples</span></h2><p><span class="yiyi-st" id="yiyi-214">This example compares two strings, considering blanks to be "junk":</span></p><pre><code class="language-python"><span></span><span class="gp">>>> </span><span class="n">s</span> <span class="o">=</span> <span class="n">SequenceMatcher</span><span class="p">(</span><span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="n">x</span> <span class="o">==</span> <span class="s2">" "</span><span class="p">,</span>
<span class="gp">... </span> <span class="s2">"private Thread currentThread;"</span><span class="p">,</span>
<span class="gp">... </span> <span class="s2">"private volatile Thread currentThread;"</span><span class="p">)</span>
</code></pre><p><span class="yiyi-st" id="yiyi-215"><code class="xref py py-meth docutils literal"><span class="pre">ratio()</span></code> returns a float in [0, 1], measuring the similarity of the sequences. </span><span class="yiyi-st" id="yiyi-216">As a rule of thumb, a <code class="xref py py-meth docutils literal"><span class="pre">ratio()</span></code> value over 0.6 means the sequences are close matches:</span></p><pre><code class="language-python"><span></span><span class="gp">>>> </span><span class="nb">print</span><span class="p">(</span><span class="nb">round</span><span class="p">(</span><span class="n">s</span><span class="o">.</span><span class="n">ratio</span><span class="p">(),</span> <span class="mi">3</span><span class="p">))</span>
<span class="go">0.866</span>
</code></pre><p><span class="yiyi-st" id="yiyi-217">If you're only interested in where the sequences match, <code class="xref py py-meth docutils literal"><span class="pre">get_matching_blocks()</span></code> is handy:</span></p><pre><code class="language-python"><span></span><span class="gp">>>> </span><span class="k">for</span> <span class="n">block</span> <span class="ow">in</span> <span class="n">s</span><span class="o">.</span><span class="n">get_matching_blocks</span><span class="p">():</span>
<span class="gp">... </span> <span class="nb">print</span><span class="p">(</span><span class="s2">"a[</span><span class="si">%d</span><span class="s2">] and b[</span><span class="si">%d</span><span class="s2">] match for </span><span class="si">%d</span><span class="s2"> elements"</span> <span class="o">%</span> <span class="n">block</span><span class="p">)</span>
<span class="go">a[0] and b[0] match for 8 elements</span>
<span class="go">a[8] and b[17] match for 21 elements</span>
<span class="go">a[29] and b[38] match for 0 elements</span>
</code></pre><p><span class="yiyi-st" id="yiyi-218">Note that the last tuple returned by <code class="xref py py-meth docutils literal"><span class="pre">get_matching_blocks()</span></code> is always a dummy, <code class="docutils literal"><span class="pre">(len(a),</span> <span class="pre">len(b),</span> <span class="pre">0)</span></code>, and this is the only case in which the last tuple element (number of elements matched) is <code class="docutils literal"><span class="pre">0</span></code>.</span></p><p><span class="yiyi-st" id="yiyi-219">If you want to know how to change the first sequence into the second, use <code class="xref py py-meth docutils literal"><span class="pre">get_opcodes()</span></code>:</span></p><pre><code class="language-python"><span></span><span class="gp">>>> </span><span class="k">for</span> <span class="n">opcode</span> <span class="ow">in</span> <span class="n">s</span><span class="o">.</span><span class="n">get_opcodes</span><span class="p">():</span>
<span class="gp">... </span> <span class="nb">print</span><span class="p">(</span><span class="s2">"</span><span class="si">%6s</span><span class="s2"> a[</span><span class="si">%d</span><span class="s2">:</span><span class="si">%d</span><span class="s2">] b[</span><span class="si">%d</span><span class="s2">:</span><span class="si">%d</span><span class="s2">]"</span> <span class="o">%</span> <span class="n">opcode</span><span class="p">)</span>
<span class="go"> equal a[0:8] b[0:8]</span>
<span class="go">insert a[8:8] b[8:17]</span>
<span class="go"> equal a[8:29] b[17:38]</span>
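# Sketch (not part of the session above): get_grouped_opcodes(n) keeps at most
# n elements of unchanged context around each change, which suits diff-style
# output. The sample data below is an illustrative assumption.
from difflib import SequenceMatcher

old = [str(number) for number in range(20)]
new = list(old)
new[10] = "x"                           # a single replacement in the middle

matcher = SequenceMatcher(None, old, new)
groups = [list(group) for group in matcher.get_grouped_opcodes(2)]
# Only the change and two elements of context on each side remain.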
</code></pre><div class="admonition seealso"><p class="first admonition-title"><span class="yiyi-st" id="yiyi-220">See also</span></p><ul class="last simple"><li><span class="yiyi-st" id="yiyi-221">The <a class="reference internal" href="#difflib.get_close_matches" title="difflib.get_close_matches"><code class="xref py py-func docutils literal"><span class="pre">get_close_matches()</span></code></a> function in this module, which shows how simple code building on <a class="reference internal" href="#difflib.SequenceMatcher" title="difflib.SequenceMatcher"><code class="xref py py-class docutils literal"><span class="pre">SequenceMatcher</span></code></a> can be used to do useful work.</span></li><li><span class="yiyi-st" id="yiyi-222"><a class="reference external" href="https://code.activestate.com/recipes/576729/">Simple version control recipe</a> for a small application built with <a class="reference internal" href="#difflib.SequenceMatcher" title="difflib.SequenceMatcher"><code class="xref py py-class docutils literal"><span class="pre">SequenceMatcher</span></code></a>.</span></li></ul></div></div><div class="section" id="differ-objects"><h2><span class="yiyi-st" id="yiyi-223">6.3.3. </span><span class="yiyi-st" id="yiyi-224">Differ Objects</span></h2><p><span class="yiyi-st" id="yiyi-225">Note that <a class="reference internal" href="#difflib.Differ" title="difflib.Differ"><code class="xref py py-class docutils literal"><span class="pre">Differ</span></code></a>-generated deltas make no claim to be <strong>minimal</strong> diffs. </span><span class="yiyi-st" id="yiyi-226">To the contrary, minimal diffs are often counter-intuitive, because they synch up anywhere possible, sometimes on accidental matches 100 pages apart. </span><span class="yiyi-st" id="yiyi-227">Restricting synch points to contiguous matches preserves some notion of locality, at the occasional cost of producing a longer diff.</span></p><p><span class="yiyi-st" id="yiyi-228">The <a class="reference internal" href="#difflib.Differ" title="difflib.Differ"><code class="xref py py-class docutils literal"><span class="pre">Differ</span></code></a> class has this constructor:</span></p><dl class="class"><dt><span class="yiyi-st" id="yiyi-229"><em class="property">class </em><code class="descclassname">difflib.</code><code class="descname">Differ</code><span 
class="sig-paren">(</span><em>linejunk=None</em>, <em>charjunk=None</em><span class="sig-paren">)</span></span></dt><dd><p><span class="yiyi-st" id="yiyi-230">Optional keyword parameters <em>linejunk</em> and <em>charjunk</em> are for filter functions (or <code class="docutils literal"><span class="pre">None</span></code>):</span></p><p><span class="yiyi-st" id="yiyi-231"><em>linejunk</em>: A function that accepts a single string argument, and returns true if the string is junk. </span><span class="yiyi-st" id="yiyi-232">The default is <code class="docutils literal"><span class="pre">None</span></code>, meaning that no line is considered junk.</span></p><p><span class="yiyi-st" id="yiyi-233"><em>charjunk</em>: A function that accepts a single character argument (a string of length 1), and returns true if the character is junk. </span><span class="yiyi-st" id="yiyi-234">The default is <code class="docutils literal"><span class="pre">None</span></code>, meaning that no character is considered junk.</span></p><p><span class="yiyi-st" id="yiyi-235">These junk-filtering functions speed up matching to find differences and do not cause any differing lines or characters to be ignored. </span><span class="yiyi-st" id="yiyi-236">Read the description of the <a class="reference internal" href="#difflib.SequenceMatcher.find_longest_match" title="difflib.SequenceMatcher.find_longest_match"><code class="xref py py-meth docutils literal"><span class="pre">find_longest_match()</span></code></a> method's <em>isjunk</em> parameter for an explanation.</span></p><p><span class="yiyi-st" id="yiyi-237"><a class="reference internal" href="#difflib.Differ" title="difflib.Differ"><code class="xref py py-class docutils literal"><span class="pre">Differ</span></code></a> objects are used (deltas generated) via a single method:</span></p><dl class="method"><dt id="difflib.Differ.compare"><span class="yiyi-st" id="yiyi-238"><code class="descname">compare</code><span class="sig-paren">(</span><em>a</em>, <em>b</em><span class="sig-paren">)</span></span></dt><dd><p><span class="yiyi-st" id="yiyi-239">Compare two sequences of lines, and generate the delta (a sequence of lines).</span></p>
<p><span class="yiyi-st" id="yiyi-240">Each sequence must contain individual single-line strings ending with newlines. </span><span class="yiyi-st" id="yiyi-241">Such sequences can be obtained from the <a class="reference internal" href="io.html#io.IOBase.readlines" title="io.IOBase.readlines"><code class="xref py py-meth docutils literal"><span class="pre">readlines()</span></code></a> method of file-like objects. </span><span class="yiyi-st" id="yiyi-242">The delta generated also consists of newline-terminated strings, ready to be printed as-is via the <a class="reference internal" href="io.html#io.IOBase.writelines" title="io.IOBase.writelines"><code class="xref py py-meth docutils literal"><span class="pre">writelines()</span></code></a> method of a file-like object.</span></p></dd></dl></dd></dl></div><div class="section" id="differ-example"><h2><span class="yiyi-st" id="yiyi-243">6.3.4. </span><span class="yiyi-st" id="yiyi-244">Differ Example</span></h2><p><span class="yiyi-st" id="yiyi-245">This example compares two texts. </span><span class="yiyi-st" id="yiyi-246">First we set up the texts, sequences of individual single-line strings ending with newlines (such sequences can also be obtained from the <code class="xref py py-meth docutils literal"><span class="pre">readlines()</span></code> method of file-like objects):</span></p><pre><code class="language-python"><span></span><span class="gp">>>> </span><span class="n">text1</span> <span class="o">=</span> <span class="s1">''' 1. Beautiful is better than ugly.</span>
|
||
<span class="gp">... </span><span class="s1"> 2. Explicit is better than implicit.</span>
|
||
<span class="gp">... </span><span class="s1"> 3. Simple is better than complex.</span>
|
||
<span class="gp">... </span><span class="s1"> 4. Complex is better than complicated.</span>
|
||
<span class="gp">... </span><span class="s1">'''</span><span class="o">.</span><span class="n">splitlines</span><span class="p">(</span><span class="n">keepends</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
|
||
<span class="gp">>>> </span><span class="nb">len</span><span class="p">(</span><span class="n">text1</span><span class="p">)</span>
|
||
<span class="go">4</span>
|
||
<span class="gp">>>> </span><span class="n">text1</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span>
|
||
<span class="go">'\n'</span>
|
||
<span class="gp">>>> </span><span class="n">text2</span> <span class="o">=</span> <span class="s1">''' 1. Beautiful is better than ugly.</span>
|
||
<span class="gp">... </span><span class="s1"> 3. Simple is better than complex.</span>
|
||
<span class="gp">... </span><span class="s1"> 4. Complicated is better than complex.</span>
|
||
<span class="gp">... </span><span class="s1"> 5. Flat is better than nested.</span>
|
||
<span class="gp">... </span><span class="s1">'''</span><span class="o">.</span><span class="n">splitlines</span><span class="p">(</span><span class="n">keepends</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
|
||
</code></pre><p><span class="yiyi-st" id="yiyi-247">Next we instantiate a Differ object:</span></p><pre><code class="language-python"><span></span><span class="gp">>>> </span><span class="n">d</span> <span class="o">=</span> <span class="n">Differ</span><span class="p">()</span>
</code></pre><p><span class="yiyi-st" id="yiyi-248">Note that when instantiating a <a class="reference internal" href="#difflib.Differ" title="difflib.Differ"><code class="xref py py-class docutils literal"><span class="pre">Differ</span></code></a> object we may pass functions to filter out line and character "junk". See the <a class="reference internal" href="#difflib.Differ" title="difflib.Differ"><code class="xref py py-meth docutils literal"><span class="pre">Differ()</span></code></a> constructor for details.</span></p><p><span class="yiyi-st" id="yiyi-249">Finally, we compare the two:</span></p><pre><code class="language-python"><span></span><span class="gp">>>> </span><span class="n">result</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">d</span><span class="o">.</span><span class="n">compare</span><span class="p">(</span><span class="n">text1</span><span class="p">,</span> <span class="n">text2</span><span class="p">))</span>
</code></pre><p><span class="yiyi-st" id="yiyi-250"><code class="docutils literal"><span class="pre">result</span></code> is a list of strings, so let's pretty-print it:</span></p><pre><code class="language-python"><span></span><span class="gp">>>> </span><span class="kn">from</span> <span class="nn">pprint</span> <span class="k">import</span> <span class="n">pprint</span>
<span class="gp">>>> </span><span class="n">pprint</span><span class="p">(</span><span class="n">result</span><span class="p">)</span>
<span class="go">['    1. Beautiful is better than ugly.\n',</span>
<span class="go"> '-   2. Explicit is better than implicit.\n',</span>
<span class="go"> '-   3. Simple is better than complex.\n',</span>
<span class="go"> '+   3.   Simple is better than complex.\n',</span>
<span class="go"> '?     ++\n',</span>
<span class="go"> '-   4. Complex is better than complicated.\n',</span>
<span class="go"> '?            ^                     ---- ^\n',</span>
<span class="go"> '+   4. Complicated is better than complex.\n',</span>
<span class="go"> '?           ++++ ^                      ^\n',</span>
<span class="go"> '+   5. Flat is better than nested.\n']</span>
</code></pre><p><span class="yiyi-st" id="yiyi-251">As a single multi-line string it looks like this:</span></p><pre><code class="language-python"><span></span><span class="gp">>>> </span><span class="kn">import</span> <span class="nn">sys</span>
<span class="gp">>>> </span><span class="n">sys</span><span class="o">.</span><span class="n">stdout</span><span class="o">.</span><span class="n">writelines</span><span class="p">(</span><span class="n">result</span><span class="p">)</span>
<span class="go">    1. Beautiful is better than ugly.</span>
<span class="go">-   2. Explicit is better than implicit.</span>
<span class="go">-   3. Simple is better than complex.</span>
<span class="go">+   3.   Simple is better than complex.</span>
<span class="go">?     ++</span>
<span class="go">-   4. Complex is better than complicated.</span>
<span class="go">?            ^                     ---- ^</span>
<span class="go">+   4. Complicated is better than complex.</span>
<span class="go">?           ++++ ^                      ^</span>
<span class="go">+   5. Flat is better than nested.</span>
</code></pre></div><div class="section" id="a-command-line-interface-to-difflib"><h2><span class="yiyi-st" id="yiyi-252">6.3.5. </span><span class="yiyi-st" id="yiyi-253">A command-line interface to difflib</span></h2><p><span class="yiyi-st" id="yiyi-254">This example shows how to use difflib to create a <code class="docutils literal"><span class="pre">diff</span></code>-like utility. </span><span class="yiyi-st" id="yiyi-255">It is also included in the Python source distribution, as <code class="file docutils literal"><span class="pre">Tools/scripts/diff.py</span></code>.</span></p><pre><code class="language-python"><span></span><span class="ch">#!/usr/bin/env python3</span>
<span class="sd">""" Command line interface to difflib.py providing diffs in four formats:</span>

<span class="sd">* ndiff:    lists every line and highlights interline changes.</span>
<span class="sd">* context:  highlights clusters of changes in a before/after format.</span>
<span class="sd">* unified:  highlights clusters of changes in an inline format.</span>
<span class="sd">* html:     generates side by side comparison with change highlights.</span>

<span class="sd">"""</span>

<span class="kn">import</span> <span class="nn">sys</span><span class="o">,</span> <span class="nn">os</span><span class="o">,</span> <span class="nn">time</span><span class="o">,</span> <span class="nn">difflib</span><span class="o">,</span> <span class="nn">argparse</span>
<span class="kn">from</span> <span class="nn">datetime</span> <span class="k">import</span> <span class="n">datetime</span><span class="p">,</span> <span class="n">timezone</span>

<span class="k">def</span> <span class="nf">file_mtime</span><span class="p">(</span><span class="n">path</span><span class="p">):</span>
    <span class="n">t</span> <span class="o">=</span> <span class="n">datetime</span><span class="o">.</span><span class="n">fromtimestamp</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">stat</span><span class="p">(</span><span class="n">path</span><span class="p">)</span><span class="o">.</span><span class="n">st_mtime</span><span class="p">,</span>
                               <span class="n">timezone</span><span class="o">.</span><span class="n">utc</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">t</span><span class="o">.</span><span class="n">astimezone</span><span class="p">()</span><span class="o">.</span><span class="n">isoformat</span><span class="p">()</span>

<span class="k">def</span> <span class="nf">main</span><span class="p">():</span>

    <span class="n">parser</span> <span class="o">=</span> <span class="n">argparse</span><span class="o">.</span><span class="n">ArgumentParser</span><span class="p">()</span>
    <span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s1">'-c'</span><span class="p">,</span> <span class="n">action</span><span class="o">=</span><span class="s1">'store_true'</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
                        <span class="n">help</span><span class="o">=</span><span class="s1">'Produce a context format diff (default)'</span><span class="p">)</span>
    <span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s1">'-u'</span><span class="p">,</span> <span class="n">action</span><span class="o">=</span><span class="s1">'store_true'</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
                        <span class="n">help</span><span class="o">=</span><span class="s1">'Produce a unified format diff'</span><span class="p">)</span>
    <span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s1">'-m'</span><span class="p">,</span> <span class="n">action</span><span class="o">=</span><span class="s1">'store_true'</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
                        <span class="n">help</span><span class="o">=</span><span class="s1">'Produce HTML side by side diff '</span>
                             <span class="s1">'(can use -c and -l in conjunction)'</span><span class="p">)</span>
    <span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s1">'-n'</span><span class="p">,</span> <span class="n">action</span><span class="o">=</span><span class="s1">'store_true'</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
                        <span class="n">help</span><span class="o">=</span><span class="s1">'Produce a ndiff format diff'</span><span class="p">)</span>
    <span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s1">'-l'</span><span class="p">,</span> <span class="s1">'--lines'</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">int</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span>
                        <span class="n">help</span><span class="o">=</span><span class="s1">'Set number of context lines (default 3)'</span><span class="p">)</span>
    <span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s1">'fromfile'</span><span class="p">)</span>
    <span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s1">'tofile'</span><span class="p">)</span>
    <span class="n">options</span> <span class="o">=</span> <span class="n">parser</span><span class="o">.</span><span class="n">parse_args</span><span class="p">()</span>

    <span class="n">n</span> <span class="o">=</span> <span class="n">options</span><span class="o">.</span><span class="n">lines</span>
    <span class="n">fromfile</span> <span class="o">=</span> <span class="n">options</span><span class="o">.</span><span class="n">fromfile</span>
    <span class="n">tofile</span> <span class="o">=</span> <span class="n">options</span><span class="o">.</span><span class="n">tofile</span>

    <span class="n">fromdate</span> <span class="o">=</span> <span class="n">file_mtime</span><span class="p">(</span><span class="n">fromfile</span><span class="p">)</span>
    <span class="n">todate</span> <span class="o">=</span> <span class="n">file_mtime</span><span class="p">(</span><span class="n">tofile</span><span class="p">)</span>
    <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">fromfile</span><span class="p">)</span> <span class="k">as</span> <span class="n">ff</span><span class="p">:</span>
        <span class="n">fromlines</span> <span class="o">=</span> <span class="n">ff</span><span class="o">.</span><span class="n">readlines</span><span class="p">()</span>
    <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">tofile</span><span class="p">)</span> <span class="k">as</span> <span class="n">tf</span><span class="p">:</span>
        <span class="n">tolines</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">readlines</span><span class="p">()</span>

    <span class="k">if</span> <span class="n">options</span><span class="o">.</span><span class="n">u</span><span class="p">:</span>
        <span class="n">diff</span> <span class="o">=</span> <span class="n">difflib</span><span class="o">.</span><span class="n">unified_diff</span><span class="p">(</span><span class="n">fromlines</span><span class="p">,</span> <span class="n">tolines</span><span class="p">,</span> <span class="n">fromfile</span><span class="p">,</span> <span class="n">tofile</span><span class="p">,</span> <span class="n">fromdate</span><span class="p">,</span> <span class="n">todate</span><span class="p">,</span> <span class="n">n</span><span class="o">=</span><span class="n">n</span><span class="p">)</span>
    <span class="k">elif</span> <span class="n">options</span><span class="o">.</span><span class="n">n</span><span class="p">:</span>
        <span class="n">diff</span> <span class="o">=</span> <span class="n">difflib</span><span class="o">.</span><span class="n">ndiff</span><span class="p">(</span><span class="n">fromlines</span><span class="p">,</span> <span class="n">tolines</span><span class="p">)</span>
    <span class="k">elif</span> <span class="n">options</span><span class="o">.</span><span class="n">m</span><span class="p">:</span>
        <span class="n">diff</span> <span class="o">=</span> <span class="n">difflib</span><span class="o">.</span><span class="n">HtmlDiff</span><span class="p">()</span><span class="o">.</span><span class="n">make_file</span><span class="p">(</span><span class="n">fromlines</span><span class="p">,</span><span class="n">tolines</span><span class="p">,</span><span class="n">fromfile</span><span class="p">,</span><span class="n">tofile</span><span class="p">,</span><span class="n">context</span><span class="o">=</span><span class="n">options</span><span class="o">.</span><span class="n">c</span><span class="p">,</span><span class="n">numlines</span><span class="o">=</span><span class="n">n</span><span class="p">)</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="n">diff</span> <span class="o">=</span> <span class="n">difflib</span><span class="o">.</span><span class="n">context_diff</span><span class="p">(</span><span class="n">fromlines</span><span class="p">,</span> <span class="n">tolines</span><span class="p">,</span> <span class="n">fromfile</span><span class="p">,</span> <span class="n">tofile</span><span class="p">,</span> <span class="n">fromdate</span><span class="p">,</span> <span class="n">todate</span><span class="p">,</span> <span class="n">n</span><span class="o">=</span><span class="n">n</span><span class="p">)</span>

    <span class="n">sys</span><span class="o">.</span><span class="n">stdout</span><span class="o">.</span><span class="n">writelines</span><span class="p">(</span><span class="n">diff</span><span class="p">)</span>

<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s1">'__main__'</span><span class="p">:</span>
    <span class="n">main</span><span class="p">()</span>
</code></pre></div></div></div>
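<p>The format selection performed by the command-line script can also be tried on in-memory data, without touching the file system. What follows is a minimal sketch and not part of <code>Tools/scripts/diff.py</code>: the helper <code>make_diff</code>, its <code>fmt</code> selector, the sample lines, and the labels <code>old.txt</code> and <code>new.txt</code> are assumptions made for illustration. It relies only on <code>difflib.unified_diff</code>, <code>difflib.context_diff</code>, and <code>difflib.ndiff</code>.</p>

```python
import difflib

# Sample inputs (hypothetical); each line keeps its trailing newline,
# just as the strings returned by readlines() would.
old_lines = ["apple\n", "banana\n", "cherry\n"]
new_lines = ["apple\n", "blueberry\n", "cherry\n"]


def make_diff(a, b, fmt="context", n=3):
    """Return the diff of two line lists as a list of strings.

    `fmt` is a hypothetical selector standing in for the script's
    -u / -n flags; any other value falls back to the context format.
    """
    if fmt == "unified":
        return list(difflib.unified_diff(a, b, "old.txt", "new.txt", n=n))
    if fmt == "ndiff":
        return list(difflib.ndiff(a, b))
    return list(difflib.context_diff(a, b, "old.txt", "new.txt", n=n))


unified = make_diff(old_lines, new_lines, "unified")
# The diff lines already end with newlines, so they can be joined and
# printed as-is (or passed to sys.stdout.writelines).
print("".join(unified), end="")
```

<p>Because every diff function here returns a generator of newline-terminated strings, the sketch wraps each call in <code>list()</code> so the result can be inspected more than once. The HTML format is left out because <code>HtmlDiff.make_file</code> returns a single string rather than a sequence of lines.</p>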