pywb 0.11.5 changelist
~~~~~~~~~~~~~~~~~~~~~~
* cdx index bug fix: fix bug with cdx indexing with post-append when WARC request and response records do not alternate in the WARC.
* load yaml config: ensure file stream gets closed.
* zipnum: resolve paths specified in zipnum .loc file relative to the .loc file, not to application root.
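The zipnum path fix above can be illustrated with a minimal sketch. The helper name ``resolve_loc_path`` is hypothetical and is not pywb's actual function; it assumes the ``.loc`` entries are local file system paths.

```python
import os


def resolve_loc_path(loc_file, entry):
    """Resolve a path listed in a .loc file relative to the .loc file itself.

    Hypothetical helper for illustration only, not part of pywb.
    """
    base_dir = os.path.dirname(os.path.abspath(loc_file))
    # os.path.join returns `entry` unchanged when it is already absolute.
    return os.path.normpath(os.path.join(base_dir, entry))
```

With this approach the result no longer depends on the directory the application was started from.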
pywb 0.11.4 changelist
~~~~~~~~~~~~~~~~~~~~~~
* wombat: overrides ``window.crypto.getRandomValues()`` to use predictable 'random' values for improved
replayability in many JS applications.
* fix gevent/uwsgi: run ``gevent.monkey.patch_all()`` explicitly when loading ``pywb.apps.wayback`` if ``GEVENT_MONKEY_PATCH=1`` env var is set. Set by default in ``uwsgi.ini`` for use with uwsgi. (Previously relied on uwsgi ``gevent-early-monkey-patch``, but this flag is not available until uwsgi 2.1 is released).
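The env-var-gated monkey patching described above might look roughly like the following. The function name ``maybe_monkey_patch`` and the silent fallback when gevent is missing are assumptions for illustration; only ``gevent.monkey.patch_all()`` and the ``GEVENT_MONKEY_PATCH`` variable come from the changelog.

```python
import os


def maybe_monkey_patch(environ=os.environ):
    """Apply gevent monkey patching only when explicitly requested.

    Returns True if patching was applied, False otherwise.
    """
    if environ.get('GEVENT_MONKEY_PATCH') != '1':
        return False
    try:
        from gevent import monkey
    except ImportError:
        # Assumption of this sketch: skip quietly if gevent is not installed.
        return False
    monkey.patch_all()
    return True
```

Calling such a function at import time, before any sockets are created, is what makes the patching "early".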
pywb 0.11.3 changelist
~~~~~~~~~~~~~~~~~~~~~~
* rewrite: fix typo in ```` rewrite (modifier was not being set)
pywb 0.11.2 changelist
~~~~~~~~~~~~~~~~~~~~~~
* Rewriting: if no charset is specified in the original page, don't add a charset, allowing the browser to detect it.
* Rewriting: rewrite ```` attribute if it is a url.
* wb.js: pad shorter timestamp to 14 digits.
* Indexing: fixed exception when indexing empty files.
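The timestamp padding item above can be sketched in Python as follows. The changelog does not state which fill value ``wb.js`` uses, so zero-filling on the right is an assumption here, and ``pad_timestamp`` is a hypothetical name.

```python
def pad_timestamp(ts, length=14, fill='0'):
    """Right-pad a short timestamp string to `length` digits.

    Assumption: missing digits are filled with '0'. Timestamps that are
    already `length` characters or longer are returned unchanged.
    """
    return ts.ljust(length, fill)
```

For example, a year-only timestamp such as ``2015`` becomes a full 14-digit value that fixed-width parsing code can handle.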
pywb 0.11.1 changelist
~~~~~~~~~~~~~~~~~~~~~~
* WombatLocation: overridden properties (href, host, etc.) are enumerable to match Location, in order to support cloning methods.
* WombatLocation: reload() override now works.
* Proxy: Custom ``Pywb-Rewrite-Prefix`` allows adding a custom prefix for proxy mode rewriting
* Proxy: Better error for invalid collection in ip resolve mode
* Warc Indexing Refactor: Allow custom iterators to buffer payload by overriding ``create_payload_buffer()`` to return a writable buffer.
pywb 0.11.0 changelist
~~~~~~~~~~~~~~~~~~~~~~
* New client-side test system for Wombat.js in place using Karma and SauceLabs with initial set of tests and travis integration.
* Wombat Improvements:
- Better Safari/IE support: accessors overridden only when actually supported in browser, override gracefully skipped otherwise
- Use ``getOwnPropertyDescriptor()`` to get properties in addition to ``__lookupGetter__``, ``__lookupSetter__``
- ``baseURI`` overridden on correct prototype
- ``CSSStyleSheet.href`` override
- ``HTMLAnchorElement.toString()`` override
- Avoid making ``.href`` read-only
* Proxy Mode Improvements:
- To avoid breaking HTTPS envelope, if no content-length provided, chunked encoding is used (HTTP/1.1) or response is buffered and content-length is computed (HTTP/1.0)
- Rewriter: Scheme-only rewriter converts embedded urls to http or https to match the scheme of containing page.
- IP Resolver: Supports IP cache in Redis
- Default resolver set to cookie resolver, eg. ``cookie_resolver: true`` is the default.
- Collection/datetime switching options removed from UI when using auth or ip resolvers.
* Encoding: Use webencoding lib to better encode head-insert to match page encoding
* Live Proxy: Support for explicit recording mode, decoupled from using http/https proxy. Enabled when ``LiveRewriter.is_recording()`` is true. By default, http/https proxies imply recording, but this can be overridden in a derived class.
* Rewriting: Convert relative urls for ``rel=canonical`` to absolute urls, even if not rewriting to ensure correct url.
* UI: Use custom webkit scrollbars to minimize scrollbar-in-iframe issues that sometimes occur in Chrome.
* Memento Improvements:
- ``/collinfo.json`` by default returns a JSON spec for all collections as Memento endpoints, in a format compatible with MemGator.
- ``/collinfo.json`` endpoint is customizable via ``templates/collinfo.json`` and must be enabled with ``enable_coll_info: true``
- 'Not Found' error for timemap query returns empty timemap instead of standard HTML 404.
* WARC Indexing:
- Better detection of content-length < payload, skip to next record boundary and warn, if possible.
- Use ujson if proper version (without forward-slash escaping) is available when writing CDXJ
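The proxy mode rule above for responses without a content-length can be summarized as a small decision function. This is a sketch of the described behavior only; ``choose_framing`` and its return values are hypothetical names, not pywb's API.

```python
def choose_framing(http_version, headers):
    """Decide how to delimit a proxied response body.

    `headers` is a list of (name, value) tuples.
    Returns 'content-length', 'chunked', or 'buffer'.
    """
    names = {name.lower() for name, _ in headers}
    if 'content-length' in names:
        # Length is already known, nothing else to do.
        return 'content-length'
    if http_version == 'HTTP/1.1':
        # HTTP/1.1 clients understand chunked transfer encoding.
        return 'chunked'
    # HTTP/1.0: buffer the whole body and compute its length.
    return 'buffer'
```

Either way the end of the body is explicit, so the client does not have to rely on the connection closing inside the HTTPS envelope.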
pywb 0.10.10 changelist
~~~~~~~~~~~~~~~~~~~~~~~
* extensible BlockLoaders: 'http', 'https', 's3' and local file system are supported, and additional
loaders can now be registered by scheme.
* rewriting fixes:
- wombat: fix occasional style rewrite bug that resulted in leaks.
- strip leading or trailing spaces in url
- charset: default to utf-8 if unknown charset specified in HTML
* live rewrite: LiveRewriter class overridable in config
* WARC indexing: ignore empty records when indexing and continue, rather than stopping at first empty record.
* tests: refactor integration tests to run significantly faster.
* cdx-indexer
pywb 0.10.9.1 changelist
~~~~~~~~~~~~~~~~~~~~~~~~
* wombat: fix relative '/' rewrite which incorrectly handles rel scheme '//' urls
pywb 0.10.9 changelist
~~~~~~~~~~~~~~~~~~~~~~
* IPProxyResolver: Support new simple proxy resolver where collection and timestamp are stored in a server-side cache by IP and set via a REST API through ``pywb.proxy``, eg: ``curl -x "localhost:8080" http://pywb.proxy/set?ts=2015&coll=all``. No cookies or proxy auth needed in this mode. Useful for Docker-based deployments where virtual IP is fixed. Enabled with ``cookie_resolver: ip`` in ``proxy_options``.
* CDX Server: Add support for timestamp-bounded CDX queries with ``from=`` and ``to=``, also support calendar query with (inclusive) ranges, eg. ``/2010-2015/example.com``, ``/2010-/example.com/``, ``/-2015/example.com/``.
* Proxy options: add ``use_banner`` to toggle banner insert, and ``use_client_rewrite`` to toggle wombat rewriting in proxy mode. (Client rewriting requires banner insert).
* Proxy and Video: When in proxy mode, load youtube-dl video info via proxy magic host ``pywb.proxy``, and ensure CORS support.
* Rewrite: ensure ```` tag has trailing slash, or add ```` with trailing slash for host-name only urls, eg: ``http://localhost:8080/example.com``
* Rules: improved blogspot nav and yt rules, rule file cleanup
* Wombat 2.9 improvements, including:
- improved handling of relative paths, '..', '.', '/'
- better support for proxy mode, avoid cross-origin top-frame issues
- rewrite_html() (document.write) override only if any html changed
- improved form action rewrite
- improved rewriting in 'root collection' mode
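The calendar range forms listed above (``/2010-2015/example.com`` and its open-ended variants) can be related to ``from=`` and ``to=`` bounds with a short sketch. The function ``range_to_cdx_params`` is hypothetical, and the one-to-one mapping between a calendar path and CDX query parameters is an assumption made for illustration.

```python
def range_to_cdx_params(path):
    """Turn a calendar range path into CDX-style query parameters.

    Assumes `path` looks like '/<from>-<to>/<url>', where either bound
    may be empty. Open-ended bounds are simply omitted from the result.
    """
    _, segment, url = path.split('/', 2)
    start, sep, end = segment.partition('-')
    if not sep:
        raise ValueError('not a range: %r' % segment)
    params = {'url': url}
    if start:
        params['from'] = start
    if end:
        params['to'] = end
    return params
```

Both bounds are inclusive according to the entry above, so ``/2010-2015/example.com`` covers captures from 2010 through 2015.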
pywb 0.10.8 changelist
~~~~~~~~~~~~~~~~~~~~~~
* Rewrite: url attribute entity unencoding only if attr starts with 'http', catch any exceptions.
* Fix top frame detection to avoid occasional banner insertion into intermediate frames.
* Fix special case ``href = "."`` rewriting.
pywb 0.10.7 changelist
~~~~~~~~~~~~~~~~~~~~~~
* wombat 2.8 improvements, including:
- cookies: fixed rewriting with respect to comma, proper path and domain replacement
- form action and textContent rewriting
- document.write() improvements, buffering split tag and removing extraneous end tag
- document.writeln() rewriting
- object data attr conditional rewriting
- proper ``setAttribute("style", ...`` rewriting
- style rewrite regex now case-insensitive
* 10-field CDX format fully supported.
* rewrite: "background" attr rewriting, proper rewriting of entity-encoded attributes.
* Fix for regression for Vimeo videos that were recorded as Flash but replay as HTML.
pywb 0.10.6 changelist
~~~~~~~~~~~~~~~~~~~~~~
* Disable url rewriting in JS by default! No longer needed due to improved client side rewriting of all urls.
* wombat 2.7 more rewriting improvements:
- ``document.write`` override rewrites all elements, not just a single top-level element.
- iframe ``srcdoc`` also rewritten.
- support for custom modifiers, such as ``js_`` for ``SCRIPT`` tag rewriting, otherwise for element overrides.
- improved css rewriting, override standard css attributes on ``CSSStyleDeclaration`` to avoid mutation observers, rewrite ``STYLE`` text content.
- ``postMessage``: original ``source`` window now also preserved along with origin.
- cookie rewrite: don't remove expires, but adjust by date offset. Allow cookies to be deleted by setting to expired date.
* Embed mode, pywb framed replay can now be embedded in an iframe when ``embeddable: True`` option is set. ``postMessage`` on framed replay proxies between replay frame and embedded frame, and ``window.parent`` is not set to top replay frame, allowing access to containing frame.
* vidrw: don't replace video with generic swf, find better match.
* path index loader: ensure each request handled by own file reader.
pywb 0.10.5 changelist
~~~~~~~~~~~~~~~~~~~~~~
* wombat 2.6 client side rewriting improvements:
- Override JS prototype getters and setters on ``href`` and ``src`` attributes of standard HTML elements, so that JavaScript access receives and sets the original url, but the element actually contains the rewritten url internally.
- For ```` element override other url properties ``href``, ``hostname``, ``host``, ``pathname``, ``origin``, ``search``, ``port``, ``protocol``
- Improved ``postMessage`` emulation: Ensure the original ``origin`` of the caller is saved, by wrapping ``X.postMessage`` in a special ``X.__WB_pmw(window).postMessage()`` call which will save origin of current window in X. Store origin and destination hosts.
- Improved ``message`` listener emulation: Add filtering to skip messages that were not intended for destination host.
- Restored wombat if wiped by ``document.write`` / ``document.open`` (happens on FF).
- When rewriting html for ``document.write``, keep ````, ````, ```` tags in rewritten html.
* Relative urls rewritten to stay relative, eg. ``/path/file.html`` -> ``/coll/http://example.com/path/file.html``
Can be disabled with ``no_match_rel=True`` in ``rewrite_opts``.
* Optional ``force_html_decl`` option to add a ```` or other HTML declaration if none is present.
* Improved handling for ``redir_to_exact=False`` mode. When set, no redirect on memento timegate, and serve ``Content-Location`` headers for actual memento, in conformance with Memento RFC Pattern 2.2 (http://tools.ietf.org/html/rfc7089#section-4.2.2)
* Proxy Mode Fixes: Ensure ``Content-Length`` header is always added and correct in proxy mode, needed for proper HTTPS
handling within ``CONNECT`` envelope.
* New default ``HostScopeCookieRewriter`` sets cookies with domain ``/coll/https://example.com/`` instead of ``/coll/``.
Can be specified with ``cookie_scope: host`` per collection.
This is now the default live rewrite proxy and should be much safer and more secure. For rare login use cases, the collection
root scope can be specified with ``cookie_scope: coll``.
* Cookie ``Path=`` value always a relative path for all cookie scopes, previously were often absolute paths.
* Default WSGI handler for ``wayback`` back to ``wsgiref``, as ``waitress`` does not support proxy mode.
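The relative url example above (``/path/file.html`` -> ``/coll/http://example.com/path/file.html``) can be reproduced with a minimal sketch. ``rewrite_rel`` is a hypothetical helper that handles only host-relative paths; it is not pywb's rewriter.

```python
from urllib.parse import urljoin


def rewrite_rel(url, coll_prefix, page_url):
    """Rewrite a host-relative url so the result stays host-relative.

    Only urls starting with a single '/' are handled in this sketch;
    anything else (including scheme-relative '//' urls) is returned as-is.
    """
    if not url.startswith('/') or url.startswith('//'):
        return url
    # Resolve against the original page, then prepend the collection prefix.
    return coll_prefix + urljoin(page_url, url)
```

The rewritten value has no scheme or host of the replay server in it, so it keeps working if the replay host name changes.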
pywb 0.10.2 changelist
~~~~~~~~~~~~~~~~~~~~~~
* wombat 2.5 update -- significant wombat improvements:
- Cookies: more comprehensive client-side cookie overriding, including Path, Domain, and expires removal.
- ``WB_wombat_location`` overridden on Object prototype, defaults to ``location`` if the actual property, ``_WB_wombat_location``, is not set.
- ``WB_wombat_location.href`` proxies to actual location, responsive to ``pushState`` / ``replaceState`` location changes.
- ``.href`` and ``.src`` attributes correctly return original url in JavaScript.
- More consistent ``lookupGetter/lookupSetter`` overrides with ``Object.defineProperty``.
- Added baseURI override on ``Element.prototype`` and ``document``.
- Added ``insertAdjacentHTML()`` override.
- Improved iframe override, including check for ``contentDocument`` changes.
- Don't rewrite urls that start with ``{``
- Frames mode: ensure hash changes synchronized between inner and outer frames.
- video: don't rewrite generic 'swf' with flowplayer
- deprefix: support deprefixing of url-encoded queries.
pywb 0.10.1 changelist
~~~~~~~~~~~~~~~~~~~~~~
- Support ``Content-Encoding: deflate`` which was not being handled.
- Fix issues with ``fallback`` handlers: A POST request could result in double read of POST input data.
- ``youtube-dl`` removed from dependency as it is only needed for live proxy. (related tests only run if ``youtube-dl`` is installed).
pywb 0.10.0 changelist
~~~~~~~~~~~~~~~~~~~~~~
* Per-collection caching settings: ``rewrite_opts.http_cache`` can be set to:
- ``pass`` - keep caching headers as-is (applies to ``Cache-Control``, ``Expires``, ``Etag`` and ``Last-Modified``)
- ``0`` - add ``Cache-Control: no-cache; no-store``
- ``N`` - add ``Cache-Control: max-age=N`` and corresponding ``Expires`` header
- None (default) -- Rewrite cache headers, effectively removing them (current behavior)
* New improved Wombat, including:
- better handling of new iframes set to ``about:blank``, add all overrides
- createElement() override (can be disabled)
- innerHTML prototype override (can be disabled)
* Rules: Improved rewriting for Google+, Twitter, YT comments
* Video: Improved support for LiveStream playlist, detect newly added