mirror of https://github.com/webrecorder/pywb.git synced 2025-03-15 00:03:28 +01:00

proxy-mode tweaks: (fixes #302): (#304)

- don't include wombat.js in banner only mode, including in proxy mode
  (instead, do set devicePixelRatio to fix certain fidelity issues)
- default_banner: set title to document.title on load when frameless, including in proxy mode
- improve docs for configuring proxy mode cert
- tests: update tests to ensure no wombat.js injected in proxy or banner-only mode
Ilya Kreymer 2018-02-27 15:52:19 -08:00 committed by GitHub
parent e2cbdbc27c
commit 61bf5e09ca
GPG Key ID: 4AEE18F83AFDEB23
6 changed files with 33 additions and 15 deletions


@ -403,7 +403,7 @@ Configuring HTTP Proxy
At this time, pywb requires the collection to be configured at setup time (though collection switching will be added soon).
The collection can be specified by running: ``wayback --proxy my-coll`` or by adding to the config::
To enable proxy mode, the collection can be specified by running: ``wayback --proxy my-coll`` or by adding to the config::
proxy:
coll: my-coll
@ -432,24 +432,28 @@ HTTPS Proxy and pywb Certificate Authority
For HTTPS proxy access, pywb provides its own Certificate Authority and dynamically generates certificates for each host and signs the responses
with these certificates. By design, this allows pywb to act as "man-in-the-middle" serving archived copies of a given site.
However, the pywb certificate authority (CA) will need to be accepted by the browser. The CA cert can be downloaded from pywb directly
However, the pywb Certificate Authority (CA) certificate will need to be accepted by the browser. The CA cert can be downloaded from pywb directly
using the special download paths. Recommended set up for using the proxy is as follows:
1. Configure the browser proxy settings host port, for example ``localhost`` and ``8080`` (if running locally)
1. Start pywb with proxy mode enabled (with ``--proxy`` option or with a ``proxy:`` option block present in the config).
2. Download the CA:
(The CA root certificate will be auto-created when first starting pywb with proxy mode if it doesn't exist.)
2. Configure the browser proxy settings host port, for example ``localhost`` and ``8080`` (if running locally)
3. Download the CA:
* For most browsers, use the PEM format: ``http://wsgiprox/download/pem``
* For windows, use the PKCS12 format: ``http://wsgiprox/download/p12``
3. You may need to agree to "Trust this CA" to identify websites.
4. You may need to agree to "Trust this CA" to identify websites.
The pywb CA file is automatically generated if it does not exist, and may be added to the key store directly.
The auto-generated pywb CA, created at ``./proxy-certs/pywb-ca.pem`` may also be added to a keystore directly.
Additional proxy options ``ca_name`` and ``ca_file_cache`` allow configuring the location and name of the CA file.
The location of the CA file and the CA name displayed can be changed by setting the ``ca_file_cache`` and ``ca_name`` proxy options, respectively.
The following are all the available proxy options (only ``coll`` is required)::
The following are all the available proxy options -- only ``coll`` is required::
proxy:
coll: my-coll
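
A client-side sketch of the setup steps in the docs hunk above, assuming pywb is already running in proxy mode on ``localhost:8080``. The names `build_proxy_opener` and `CA_DOWNLOAD_URLS` are illustrative and not part of pywb; the download paths and the ``./proxy-certs/pywb-ca.pem`` location are the ones quoted in the docs.

```python
import ssl
import urllib.request

# Download paths as quoted in the docs; they only resolve when the
# request is routed through the pywb proxy.
CA_DOWNLOAD_URLS = {
    "pem": "http://wsgiprox/download/pem",   # most browsers
    "p12": "http://wsgiprox/download/p12",   # Windows (PKCS12)
}


def build_proxy_opener(host="localhost", port=8080, ca_file=None):
    """Return a urllib opener that sends HTTP and HTTPS traffic via the proxy.

    ca_file is an optional path to the pywb CA certificate (for example
    ./proxy-certs/pywb-ca.pem); when given, it is trusted in addition to
    nothing else, so HTTPS responses signed by the pywb CA verify cleanly.
    """
    proxy_url = "http://{}:{}".format(host, port)
    proxy_handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    # With cafile=None the system default trust store is loaded instead.
    context = ssl.create_default_context(cafile=ca_file)
    https_handler = urllib.request.HTTPSHandler(context=context)
    return urllib.request.build_opener(proxy_handler, https_handler)


# Example use (requires a running pywb instance in proxy mode):
#   opener = build_proxy_opener(ca_file="./proxy-certs/pywb-ca.pem")
#   html = opener.open("http://example.com/").read()
```

Passing the CA file explicitly keeps the trust decision local to this opener, rather than adding the pywb CA to a system-wide keystore.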


@ -142,6 +142,8 @@ For example, ``wayback --proxy my-web-archive`` will start pywb and enable proxy
You can then configure a browser to Proxy Settings host port to: ``localhost:8080`` and then loading any url, eg. ``http://example.com/`` should
load the latest copy from the ``my-web-archive`` collection.
See :ref:`https-proxy` section for additional configuration details.
Deployment
----------


@ -94,7 +94,8 @@ This file is part of pywb, https://github.com/webrecorder/pywb
set_banner(window.wbinfo.url,
window.wbinfo.timestamp,
window.wbinfo.is_live);
window.wbinfo.is_live,
window.wbinfo.is_framed ? "" : document.title);
} else {
init("_wb_frame_top_banner");


@ -1,7 +1,6 @@
<!-- WB Insert -->
<script src='{{ host_prefix }}/{{ static_path }}/wombat.js'> </script>
<script>
{% set urlsplit = cdx.url | urlsplit %}
{% set urlsplit = cdx.url | urlsplit %}
wbinfo = {}
wbinfo.url = "{{ cdx.url }}";
wbinfo.timestamp = "{{ cdx.timestamp }}";
@ -14,8 +13,11 @@
wbinfo.coll = "{{ coll }}";
wbinfo.proxy_magic = "{{ env.pywb_proxy_magic }}";
wbinfo.static_prefix = "{{ host_prefix }}/{{ static_path }}/";
</script>
{% if not wb_url.is_banner_only %}
<script src='{{ host_prefix }}/{{ static_path }}/wombat.js'> </script>
<script>
wbinfo.wombat_ts = "{{ wombat_ts }}";
wbinfo.wombat_sec = "{{ wombat_sec }}";
wbinfo.wombat_scheme = "{{ urlsplit.scheme }}";
@ -30,9 +32,12 @@
} else {
console.warn("_wb_wombat missing!");
}
{% endif %}
</script>
{% else %}
<script>
window.devicePixelRatio = 1;
</script>
{% endif %}
{% if config.enable_flash_video_rewrite %}
<script src='{{ host_prefix }}/{{ static_path }}/vidrw.js'> </script>
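
Because this hunk interleaves removed and added lines, the resulting branch can be hard to read. The following is a simplified Python model of the new template logic, not pywb's actual rendering code: `render_head_insert` and the default `static_prefix` value are hypothetical, and the `wbinfo` setup is reduced to a single placeholder line.

```python
def render_head_insert(is_banner_only, static_prefix="/static/"):
    """Model the head insert after this change.

    The wbinfo block is always emitted. wombat.js is included only when
    the page is NOT in banner-only mode; otherwise devicePixelRatio is
    pinned to 1 instead.
    """
    parts = [
        "<!-- WB Insert -->",
        "<script>",
        "wbinfo = {};",  # placeholder for the full wbinfo setup
        "</script>",
    ]
    if not is_banner_only:
        # Full rewriting mode: load the client-side rewriter.
        parts.append(
            "<script src='{}wombat.js'> </script>".format(static_prefix)
        )
    else:
        # Banner-only mode: no wombat.js, only the fidelity workaround.
        parts.extend([
            "<script>",
            "window.devicePixelRatio = 1;",
            "</script>",
        ])
    return "\n".join(parts)
```

This mirrors what the updated tests check: the insert marker is still present in banner-only output, while the string `wombat.js` is not.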


@ -163,8 +163,8 @@ class TestWbIntegration(BaseConfigTest):
def test_replay_banner_only(self):
resp = self.testapp.get('/pywb/20140126201054bn_/http://www.iana.org/domains/reserved')
# wombat.js header insertion
assert 'wombat.js' in resp.text
# wombat.js header not inserted
assert 'wombat.js' not in resp.text
# no wombat present
assert '_WBWombat' not in resp.text


@ -60,6 +60,9 @@ class TestProxy(BaseTestProxy):
assert 'WB Insert' in res.text
assert 'Example Domain' in res.text
# no wombat.js
assert 'wombat.js' not in res.text
assert res.headers['Link'] == '<http://example.com>; rel="memento"; datetime="Mon, 27 Jan 2014 17:12:51 GMT"; collection="pywb"'
assert res.headers['Memento-Datetime'] == 'Mon, 27 Jan 2014 17:12:51 GMT'
@ -73,6 +76,9 @@ class TestProxy(BaseTestProxy):
assert 'WB Insert' in res.text
assert 'Example Domain' in res.text
# no wombat.js
assert 'wombat.js' not in res.text
assert res.headers['Link'] == '<http://test@example.com/>; rel="memento"; datetime="Mon, 29 Jul 2013 19:51:51 GMT"; collection="pywb"'
assert res.headers['Memento-Datetime'] == 'Mon, 29 Jul 2013 19:51:51 GMT'