From 819e8adf4880fe89cdf5973594444c15b0888407 Mon Sep 17 00:00:00 2001
From: Ilya Kreymer
Date: Wed, 27 Jun 2018 09:02:01 -0700
Subject: [PATCH] text updates: (#352)

- Update CHANGES.rst for 2.0.4
- Docs: Improve new proxy docs for (#316), fix URL-T->URI-T
- Requirements: bump to wsgiprox>=1.5.1
---
 CHANGES.rst                 | 37 ++++++++++++++++++++++++++++++++-----
 docs/manual/configuring.rst |  5 +++--
 docs/manual/memento.rst     |  2 +-
 requirements.txt            |  2 +-
 4 files changed, 37 insertions(+), 9 deletions(-)

diff --git a/CHANGES.rst b/CHANGES.rst
index 9a15945d..a1c15156 100644
--- a/CHANGES.rst
+++ b/CHANGES.rst
@@ -2,18 +2,45 @@ pywb 2.0.4 changelist
 ~~~~~~~~~~~~~~~~~~~~~
 
 * Replay Fidelity Improvements:
-  - Improved wombat's ``document.write`` and ``document.writeln`` overrides to account for the variadic case (#325)
-  - Improved wombat's ``postmessage`` override's handling of the sending the message to the target origin (#328 and #338)
+  - Ensure title-only change event correctly handled by top-frame banner (#327)
+  - Improved wombat ``document.write`` and ``document.writeln`` overrides to account for the variadic case (#325)
+  - Improved wombat ``postMessage`` override logic of determining correct target origin (#328 and #338)
   - Improved server-side rewriting of ``link[rel=preload]`` (#332)
-  - Improved server-side and client-side rewriting of "super relative" script src values ``script[src=path/it.php?js]`` (#334)
-  - Improved wombat's un-rewrite regular expression (#332)
-  - Improved wombat's ``Node.[appendChild|replaceChild|insertBefore]`` overrides to account for edge cases (#332)
+  - Improved server-side and client-side rewriting of "super relative" script src values ``script[src=path/it.php?js]`` (#334)
+  - Improved wombat un-rewrite regular expression (#332)
+  - Improved wombat ``Node.[appendChild|replaceChild|insertBefore]`` overrides to account for edge cases (#332)
   - Added ``MouseEvent`` override to wombat (#332)
   - Added ``insertAdjacentElement`` override to wombat (#332)
   - Added client-side rewriting of ``link[rel=preload]`` and ``link[rel=import]`` to wombat (#332)
   - Added FontFace override to wombat (#340)
   - Added server-side rewriting of ``link[rel=import]`` (#334)
   - Added SVG filter attribute rewriting to wombat (#341)
+  - Improved detection of ServiceWorker JS, use ``sw_`` modifier which performs no rewriting but adds ``Service-Worker-Allowed`` header.
+  - Don't bind already overridden ``requestAnimationFrame/clearAnimationFrame`` functions via JS object proxy (#350)
+  - Don't reinit wombat in same window if new document is imported (#339)
+  - Cookies: Use default mod ``mp_`` for client-side rewriting to ensure cookies set correctly on client-side documents (#330)
+
+* Server-Side Rewriting:
+  - Flash: Improved Rewriting for AMF, supporting py2 and py3 (#321)
+  - Improved ``Origin`` header detection: Detect from ``Referer`` header if available (#329)
+  - Expand JSONP matching if url contains 'callback=jsonp' (#336)
+  - Ensure entity-escaped urls are rewritten, with escaping preserved (#337)
+
+* Redirect Improvements:
+  - Improved self-redirect detection for adjacent self-redirect capture results, avoiding self-redirect loops (#345)
+  - Fix possible leak when handling self-redirects
+  - Add slash-preserving redirect, if original ended in '/', ensure replayed version also ends with '/' (#344, #346)
+
+* Misc Fixes:
+  - Testing: Run local ``httpbin`` for any ``httpbin.org`` or ``test.httpbin.org`` tests to avoid external dependency.
+  - Indexing: Avoid indexing error in py2 by decoding in utf-8 if warc has non-ascii target url (#312)
+  - Gevent: Preserve %-escaped request url via ``REQUEST_URI`` (if available) to pass correct url to live upstream.
+
+* Proxy Mode Options (#316, #317):
+  - Add ``use_banner`` option, if false, disables banner insert in proxy mode (default: true)
+  - Add ``use_head_insert`` option, if false, disables injecting ``head_insert.html`` in proxy mode (default: true)
+  - Add ``FrontEndApp.proxy_route_request()`` to allow more customized proxy routing (default: route to fixed default collection)
+  - Expand proxy mode docs
 
 
 pywb 2.0.3 changelist
diff --git a/docs/manual/configuring.rst b/docs/manual/configuring.rst
index 42d98852..d65d07aa 100644
--- a/docs/manual/configuring.rst
+++ b/docs/manual/configuring.rst
@@ -481,10 +481,11 @@ The following are all the available proxy options -- only ``coll`` is required::
 
 The HTTP/S functionality is provided by the separate :mod:`wsgiprox` utility which provides HTTP/S proxy routing to any WSGI application.
 
-Using ``wsgiprox``, pywb sets ``FrontEndApp.proxy_route_request()`` as the proxy resolver, and this function returns the full collection path that pywb uses to route each proxy request.
+Using `wsgiprox `_, pywb sets ``FrontEndApp.proxy_route_request()`` as the proxy resolver, and this function returns the full collection path that pywb uses to route each proxy request. The default implementation returns a path to the fixed collection ``coll`` and injects content into ```` if ``use_head_insert`` is true. The default banner is inserted if ``use_banner`` is set to true.
+
 Extensions to pywb can override ``proxy_route_request()`` to provide custom handling, such as setting the collection dynamically or based on external data sources.
 
-See the `wsgiprox README `_ for additional details on how it works.
+See the `wsgiprox README `_ for additional details on setting a proxy resolver.
 
 For more information on custom certificate authority (CA) installation, the `mitmproxy certificate page `_ provides a good overview for installing a custom CA on different platforms.
 
diff --git a/docs/manual/memento.rst b/docs/manual/memento.rst
index 8b128eb6..541a8453 100644
--- a/docs/manual/memento.rst
+++ b/docs/manual/memento.rst
@@ -14,7 +14,7 @@ TimeMap API
 
 The timemap API is available at ``//timemap//`` for any pywb collection ```` and ```` in the collection.
 
-The timemap (URL-T) can be provided in several output formats, as specified by the ```` param:
+The timemap (URI-T) can be provided in several output formats, as specified by the ```` param:
 
 * ``link`` -- returns an ``application/link-format`` as required by the `Memento spec `_
 * ``cdxj`` -- returns a timemap in the native CDXJ format.
diff --git a/requirements.txt b/requirements.txt
index f86c8c1c..f877ac8c 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -12,4 +12,4 @@ webencodings
 gevent==1.2.2
 webassets==0.12.1
 portalocker
-wsgiprox>=1.5.0
+wsgiprox>=1.5.1
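
Note on the ``configuring.rst`` hunk: the new text says extensions can override ``proxy_route_request()`` to choose the collection dynamically. The routing idea might look roughly like the sketch below. This is not pywb or wsgiprox code: the function name ``route_proxy_request``, the ``host_to_coll`` mapping, and the ``/<coll>/<url>`` path layout are assumptions for illustration only. The real method signature and path format should be checked against the pywb source before writing an override.

```python
from urllib.parse import urlsplit


def route_proxy_request(url, host_to_coll, default_coll):
    """Hypothetical resolver: pick a collection per host, else a fixed default.

    The '/<coll>/<url>' layout returned here is an assumed format,
    not necessarily the one pywb uses internally.
    """
    # hostname is lowercased by urlsplit; it is None for urls without a host
    host = urlsplit(url).hostname or ''
    coll = host_to_coll.get(host, default_coll)
    return '/{0}/{1}'.format(coll, url)


if __name__ == '__main__':
    mapping = {'example.com': 'news'}
    print(route_proxy_request('http://example.com/page', mapping, 'pywb'))
    print(route_proxy_request('http://other.org/', mapping, 'pywb'))
```

Falling back to ``default_coll`` mirrors the default behaviour described in the docs, where every proxy request is routed to the fixed collection ``coll``.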