<pid="module-searx.engines.startpage">Startpage’s language & region selectors are a mess ..</p>
<sectionid="startpage-regions">
<spanid="id2"></span><h2><aclass="toc-backref"href="#id9"role="doc-backlink">Startpage regions</a><aclass="headerlink"href="#startpage-regions"title="Link to this heading">¶</a></h2>
<p>In the list of regions there are tags we need to map to common region tags:</p>
<p>For reference, see the language subtag registry at IANA; <codeclass="docutils literal notranslate"><spanclass="pre">no</span></code> is a macrolanguage <aclass="footnote-reference brackets"href="#id5"id="id3"role="doc-noteref"><spanclass="fn-bracket">[</span>1<spanclass="fn-bracket">]</span></a>, and
the W3C recommends specific subtags over macrolanguages <aclass="footnote-reference brackets"href="#id6"id="id4"role="doc-noteref"><spanclass="fn-bracket">[</span>2<spanclass="fn-bracket">]</span></a>.</p>
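<p>The mapping of region tags to common tags can be sketched as follows. This is a minimal illustration, not the engine's code: the helper name and the mapping table are hypothetical, and it assumes that Norwegian Bokmål (<code>nb</code>) is the specific subtag intended for the macrolanguage <code>no</code>.</p>

```python
# Hypothetical table: macrolanguage -> more specific primary language subtag.
# Only "no" is mentioned in the text; "nb" is an assumption of this sketch.
MACROLANGUAGE_TO_SPECIFIC = {
    "no": "nb",  # Norwegian (macrolanguage) -> Norwegian Bokmål
}


def normalize_region_tag(tag: str) -> str:
    """Return a 'lang-REGION' tag, replacing a macrolanguage by a specific subtag."""
    # Accept both "no_NO" and "no-NO" style separators.
    lang, sep, region = tag.replace("_", "-").partition("-")
    lang = MACROLANGUAGE_TO_SPECIFIC.get(lang.lower(), lang.lower())
    return f"{lang}-{region.upper()}" if sep else lang


print(normalize_region_tag("no_NO"))
print(normalize_region_tag("en-GB"))
```

<p>Keeping the table separate from the function means further tags can be added without touching the normalization logic.</p>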
<p>Use macrolanguages with care. Some language subtags have a Scope field set to
macrolanguage, i.e. this primary language subtag encompasses a number of more
specific primary language subtags in the registry. … As we recommended for
the collection subtags mentioned above, in most cases you should try to use
the more specific subtags … <aclass="reference external"href="https://www.w3.org/International/questions/qa-choosing-language-tags#langsubtag">W3: The primary language subtag</a></p>
</aside>
</aside>
</section>
<sectionid="startpage-languages">
<spanid="id7"></span><h2><aclass="toc-backref"href="#id10"role="doc-backlink">Startpage languages</a><aclass="headerlink"href="#startpage-languages"title="Link to this heading">¶</a></h2>
<dl>
<dt><aclass="reference internal"href="#searx.engines.startpage.send_accept_language_header"title="searx.engines.startpage.send_accept_language_header"><codeclass="xref py py-obj docutils literal notranslate"><spanclass="pre">send_accept_language_header</span></code></a>:</dt><dd><p>The name displayed on Startpage’s settings page depends on the location of the
IP when the <codeclass="docutils literal notranslate"><spanclass="pre">Accept-Language</span></code> HTTP header is unset. <aclass="reference internal"href="#searx.engines.startpage.fetch_traits"title="searx.engines.startpage.fetch_traits"><codeclass="xref py py-obj docutils literal notranslate"><spanclass="pre">fetch_traits</span></code></a>
therefore sends a fixed <codeclass="docutils literal notranslate"><spanclass="pre">Accept-Language</span></code> header to get uniform names independent of the IP.</p>
</dd>
</dl>
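<p>One way to make the displayed names independent of the IP is to always send an explicit <code>Accept-Language</code> header. The sketch below shows how such a header could be built from a locale tag; the function name, the parameter names and the default weight of 0.5 are assumptions for illustration and are not taken from the engine.</p>

```python
def accept_language(locale_tag: str, fallback_q: float = 0.5) -> dict:
    """Build an Accept-Language header dict for a tag such as 'en-US'.

    The full tag is listed first; the bare language follows with a lower
    quality value so that the server can fall back to it.
    """
    lang = locale_tag.split("-")[0]
    if lang == locale_tag:
        # No region part, nothing to fall back to.
        value = locale_tag
    else:
        value = f"{locale_tag},{lang};q={fallback_q}"
    return {"Accept-Language": value}


headers = accept_language("en-US")
print(headers)
```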
</section>
<sectionid="startpage-categories">
<spanid="id8"></span><h2><aclass="toc-backref"href="#id11"role="doc-backlink">Startpage categories</a><aclass="headerlink"href="#startpage-categories"title="Link to this heading">¶</a></h2>
<p>Startpage’s category (for Web-search, News, Videos, ..) is set by
<aclass="reference internal"href="#searx.engines.startpage.startpage_categ"title="searx.engines.startpage.startpage_categ"><codeclass="xref py py-obj docutils literal notranslate"><spanclass="pre">startpage_categ</span></code></a> in settings.yml:</p>
<p>The default category is <codeclass="docutils literal notranslate"><spanclass="pre">web</span></code> .. and other categories than <codeclass="docutils literal notranslate"><spanclass="pre">web</span></code> are not
<spanclass="sig-prename descclassname"><spanclass="pre">searx.engines.startpage.</span></span><spanclass="sig-name descname"><spanclass="pre">startpage_categ</span></span><emclass="property"><spanclass="w"></span><spanclass="p"><spanclass="pre">=</span></span><spanclass="w"></span><spanclass="pre">'web'</span></em><aclass="headerlink"href="#searx.engines.startpage.startpage_categ"title="Link to this definition">¶</a></dt>
<spanclass="sig-prename descclassname"><spanclass="pre">searx.engines.startpage.</span></span><spanclass="sig-name descname"><spanclass="pre">send_accept_language_header</span></span><emclass="property"><spanclass="w"></span><spanclass="p"><spanclass="pre">=</span></span><spanclass="w"></span><spanclass="pre">True</span></em><aclass="headerlink"href="#searx.engines.startpage.send_accept_language_header"title="Link to this definition">¶</a></dt>
<dd><p>Startpage tries to guess the user’s language and territory from the HTTP
<codeclass="docutils literal notranslate"><spanclass="pre">Accept-Language</span></code> header. Optionally, the user can select a search language (which can
differ from the UI language) and a region filter.</p>
<spanclass="sig-prename descclassname"><spanclass="pre">searx.engines.startpage.</span></span><spanclass="sig-name descname"><spanclass="pre">max_page</span></span><emclass="property"><spanclass="w"></span><spanclass="p"><spanclass="pre">=</span></span><spanclass="w"></span><spanclass="pre">18</span></em><aclass="headerlink"href="#searx.engines.startpage.max_page"title="Link to this definition">¶</a></dt>
<dd><p>Up to 18 pages were tested (argument <codeclass="docutils literal notranslate"><spanclass="pre">page</span></code>); to be safe, the maximum is set to 18.</p>
<spanclass="sig-prename descclassname"><spanclass="pre">searx.engines.startpage.</span></span><spanclass="sig-name descname"><spanclass="pre">search_form_xpath</span></span><emclass="property"><spanclass="w"></span><spanclass="p"><spanclass="pre">=</span></span><spanclass="w"></span><spanclass="pre">'//form[@id="search"]'</span></em><aclass="headerlink"href="#searx.engines.startpage.search_form_xpath"title="Link to this definition">¶</a></dt>
<dd><p>XPath of Startpage’s original search form.</p>
<spanclass="sig-prename descclassname"><spanclass="pre">searx.engines.startpage.</span></span><spanclass="sig-name descname"><spanclass="pre">sc_code_cache_sec</span></span><emclass="property"><spanclass="w"></span><spanclass="p"><spanclass="pre">=</span></span><spanclass="w"></span><spanclass="pre">30</span></em><aclass="headerlink"href="#searx.engines.startpage.sc_code_cache_sec"title="Link to this definition">¶</a></dt>
<dd><p>Time in seconds for which the sc-code is cached in memory (see <aclass="reference internal"href="#searx.engines.startpage.get_sc_code"title="searx.engines.startpage.get_sc_code"><codeclass="xref py py-obj docutils literal notranslate"><spanclass="pre">get_sc_code</span></code></a>).</p>
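<p>A time-limited in-memory cache of this kind can be sketched as follows. The class, its attributes and the injectable clock are hypothetical and only illustrate the idea; the default of 30 seconds mirrors the value of <code>sc_code_cache_sec</code> shown above.</p>

```python
import time


class ScCodeCache:
    """Keep one value in memory and refresh it after `ttl` seconds."""

    def __init__(self, fetch, ttl=30, clock=time.monotonic):
        self._fetch = fetch      # callable that returns a fresh sc-code
        self._ttl = ttl          # lifetime of a cached value in seconds
        self._clock = clock      # injectable clock, which simplifies testing
        self._value = None
        self._fetched_at = 0.0

    def get(self):
        now = self._clock()
        if self._value is None or now - self._fetched_at > self._ttl:
            # Cache is empty or expired: fetch a new value.
            self._value = self._fetch()
            self._fetched_at = now
        return self._value
```

<p>With this shape, the expensive step (requesting Startpage's home page) runs at most once per cache lifetime, no matter how many searches arrive in between.</p>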
<spanclass="sig-prename descclassname"><spanclass="pre">searx.engines.startpage.</span></span><spanclass="sig-name descname"><spanclass="pre">get_sc_code</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">searxng_locale</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">params</span></span></em><spanclass="sig-paren">)</span><aclass="reference internal"href="../../../_modules/searx/engines/startpage.html#get_sc_code"><spanclass="viewcode-link"><spanclass="pre">[source]</span></span></a><aclass="headerlink"href="#searx.engines.startpage.get_sc_code"title="Link to this definition">¶</a></dt>
<dd><p>Get a current <codeclass="docutils literal notranslate"><spanclass="pre">sc</span></code> argument from Startpage’s search form (HTML page).</p>
<p>Startpage puts an <codeclass="docutils literal notranslate"><spanclass="pre">sc</span></code> argument on every HTML <aclass="reference internal"href="#searx.engines.startpage.search_form_xpath"title="searx.engines.startpage.search_form_xpath"><codeclass="xref py py-obj docutils literal notranslate"><spanclass="pre">search</span> <spanclass="pre">form</span></code></a>. Without this argument, Startpage assumes the request
comes from a bot. We do not know what is encoded in the value of the <codeclass="docutils literal notranslate"><spanclass="pre">sc</span></code>
argument, but it seems to be a kind of <em>time-stamp</em>.</p>
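<p>Reading the <code>sc</code> value from the search form can be sketched with the standard library alone. Note the assumptions: this sketch uses <code>html.parser</code> instead of the XPath <code>//form[@id="search"]</code>, the class and function names are hypothetical, and the sample HTML is invented for illustration rather than copied from Startpage.</p>

```python
from html.parser import HTMLParser


class SearchFormInputs(HTMLParser):
    """Collect name/value pairs of <input> elements inside <form id="search">."""

    def __init__(self):
        super().__init__()
        self.in_form = False
        self.inputs = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and attrs.get("id") == "search":
            self.in_form = True
        elif tag == "input" and self.in_form and attrs.get("name"):
            # Inputs without a value attribute are stored as empty strings.
            self.inputs[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def extract_sc(html_text):
    """Return the value of the sc input, or None if the form has none."""
    parser = SearchFormInputs()
    parser.feed(html_text)
    return parser.inputs.get("sc")


# Invented sample page, for illustration only.
SAMPLE = (
    '<form id="search">'
    '<input type="hidden" name="sc" value="abc123">'
    '<input type="text" name="query">'
    "</form>"
)
print(extract_sc(SAMPLE))
```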
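<p>Startpage’s search form generates a new sc-code on each request. This
function scrapes a new sc-code from Startpage’s home page every
<aclass="reference internal"href="#searx.engines.startpage.sc_code_cache_sec"title="searx.engines.startpage.sc_code_cache_sec"><codeclass="xref py py-obj docutils literal notranslate"><spanclass="pre">sc_code_cache_sec</span></code></a> seconds.</p>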
<spanclass="sig-prename descclassname"><spanclass="pre">searx.engines.startpage.</span></span><spanclass="sig-name descname"><spanclass="pre">request</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">query</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">params</span></span></em><spanclass="sig-paren">)</span><aclass="reference internal"href="../../../_modules/searx/engines/startpage.html#request"><spanclass="viewcode-link"><spanclass="pre">[source]</span></span></a><aclass="headerlink"href="#searx.engines.startpage.request"title="Link to this definition">¶</a></dt>
<dd><p>Assemble a Startpage request.</p>
<p>To avoid a CAPTCHA, we need to send a well-formed HTTP POST request with a
cookie. The request must be identical to the one built by
Startpage’s search form:</p>
<ulclass="simple">
<li><p>in the cookie the <strong>region</strong> is selected</p></li>
<li><p>in the HTTP POST data the <strong>language</strong> is selected</p></li>
</ul>
<p>Additionally, the arguments from Startpage’s search form need to be set in the
HTTP POST data; compare the <codeclass="docutils literal notranslate"><spanclass="pre">&lt;input&gt;</span></code> elements of <aclass="reference internal"href="#searx.engines.startpage.search_form_xpath"title="searx.engines.startpage.search_form_xpath"><codeclass="xref py py-obj docutils literal notranslate"><spanclass="pre">search_form_xpath</span></code></a>.</p>
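<p>The request assembly described above can be sketched as follows. Everything named here is hypothetical: the function, the field names <code>query</code> and <code>language</code>, and the cookie name <code>region</code> are placeholders, since the text does not state the names or the cookie encoding Startpage actually uses.</p>

```python
def build_request(query, language, region, form_inputs):
    """Describe an HTTP POST request that mimics the search form.

    form_inputs holds the name/value pairs of the form's <input> elements
    (including the sc argument) and is copied into the POST data unchanged.
    """
    data = dict(form_inputs)        # arguments taken from the search form
    data["query"] = query           # hypothetical field name
    data["language"] = language     # the language is selected in the POST data
    cookies = {"region": region}    # the region is selected in the cookie (hypothetical name)
    return {"method": "POST", "data": data, "cookies": cookies}


req = build_request("searxng", "english", "en-GB", {"sc": "abc123"})
print(req)
```

<p>Separating the form arguments from the language and region makes it visible which part of the request is copied from Startpage and which part reflects the user's selection.</p>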
<spanclass="sig-prename descclassname"><spanclass="pre">searx.engines.startpage.</span></span><spanclass="sig-name descname"><spanclass="pre">fetch_traits</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">engine_traits</span></span><spanclass="p"><spanclass="pre">:</span></span><spanclass="w"></span><spanclass="n"><aclass="reference internal"href="../enginelib.html#searx.enginelib.traits.EngineTraits"title="searx.enginelib.traits.EngineTraits"><spanclass="pre">EngineTraits</span></a></span></em><spanclass="sig-paren">)</span><aclass="reference internal"href="../../../_modules/searx/engines/startpage.html#fetch_traits"><spanclass="viewcode-link"><spanclass="pre">[source]</span></span></a><aclass="headerlink"href="#searx.engines.startpage.fetch_traits"title="Link to this definition">¶</a></dt>
<dd><p>Fetch <aclass="reference internal"href="#startpage-languages"><spanclass="std std-ref">languages</span></a> and <aclass="reference internal"href="#startpage-regions"><spanclass="std std-ref">regions</span></a> from Startpage.</p>