- improved worker rewriting: updated worker rewriting handles non-blob urls, added SharedWorker override
ww_rw.js:
- updated to be a much more complete rewriting system: overrides for importScripts and fetch
content_rewriter.py:
- added wkr_ mod for handling Worker/SharedWorker, follows convention of service worker
test_content_rewriter.py:
- added test for content rewriting of Worker/SharedWorker
* bump version to 2.0.5
* regexrewriter: work on splitting rules into separate class hierarchy from rewriter.
rules logic and regexes can be inited once, while the rewriter is created per response being rewritten
* regexrewriter: refactor remaining rewriters to use a shared rules factory to avoid reiniting rules
* fix spacing
* fixes: ensure custom rules added first, fix fb rewrite_dash
content_rewriter tests: update tests to check with location-only and js obj proxy rewriter, check fb dash rewriter
* simplify JSNoneRewriter
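A minimal sketch of the rules/rewriter split described in the regexrewriter entries above: the regexes are compiled once and shared, while a lightweight rewriter is created per response. The names `SketchRules`, `SketchRewriter`, `add_prefix` and the `/coll/mp_/` prefix are hypothetical and only illustrate the pattern; they are not the actual pywb classes.

```python
import re


class SketchRules:
    """Holds compiled regex rules; built once and shared (hypothetical name)."""

    def __init__(self, rules):
        # rules: list of (pattern, replacement function) pairs
        self.rules = [(re.compile(pattern), repl) for pattern, repl in rules]


class SketchRewriter:
    """Created per response; reuses the already-compiled rules."""

    def __init__(self, url_prefix, rules):
        self.url_prefix = url_prefix
        self.rules = rules

    def rewrite(self, text):
        for regex, repl in self.rules.rules:
            text = regex.sub(lambda m: repl(m, self.url_prefix), text)
        return text


def add_prefix(match, url_prefix):
    # Illustrative rule: prepend the replay prefix to an absolute URL
    return url_prefix + match.group(0)


# Compiled once at startup...
SHARED_RULES = SketchRules([(r"https?://[^\s'\"]+", add_prefix)])

# ...then reused by a fresh rewriter for each response.
rewriter = SketchRewriter("/coll/mp_/", SHARED_RULES)
result = rewriter.rewrite('var u = "http://example.com/a.js";')
```

Because the per-response object only stores a prefix and a reference to the shared rules, constructing it does not recompile any regex.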
New integration tests using webrecorder-tests:
- WR_TEST=true is set for integration test run (only run with py3.6, excluded for py2.7, 3.5)
- Added .travis directory that includes two scripts: install.sh and test.sh.
- install.sh handles all installation and test.sh handles running of unit or integration tests
- sudo: true required to run headless chrome
- Removed strict version limit (1.2.2), using latest gevent
- changed the import "gevent.wsgi" to "gevent.pywsgi" (needed in latest gevent)
- Installing with extra requirement gevent[dnspython] (existing dns resolver in gevent considered deprecated)
* Updated html_rewriter.py to account for rewriting of script[src] values that are super relative (http://fotopaulmartens.netcam.nl/vucht.php) and added link rel='import' rewriting
Updated test_html_rewriter.py for super rel script[src] rewriting and link rel='import'
Updated wombat to account for the new rewriting of script[src] (http://fotopaulmartens.netcam.nl/vucht.php)
Changed the postMessage override in wombat to use $wbwindow rather than window to fix google calendar replay / recording (http://qasrcc.org/events/calendar/)
* Updated tests for forcing absolute and fixed merge conflicts
* wombat: extracted removal and retrieval of __wb_original_src into own functions
* self-redirect fix for multiple continuous 3xx responses: if after one self-redirect, next match is also a redirect where url canonicalizes to same as previously rejected, also treat as self-redirect
tests: add new test_self_redirect for generating example pattern where self-redirect could occur
* self-redirect: ensure warc records are closed when handling self-redirect exception!
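The self-redirect handling above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual implementation: `canon_key` is a simplified stand-in for real URL canonicalization, and captures are modeled as plain dicts with hypothetical `url`, `status` and `location` keys.

```python
from urllib.parse import urlsplit


def canon_key(url):
    """Rough stand-in for real URL canonicalization (assumption of this sketch)."""
    parts = urlsplit(url.lower())
    host = parts.netloc[4:] if parts.netloc.startswith("www.") else parts.netloc
    return host + (parts.path.rstrip("/") or "/")


def first_usable(request_url, captures):
    """Return the first capture that is not treated as a self-redirect."""
    request_key = canon_key(request_url)
    rejected_key = None
    for cap in captures:
        if 300 <= cap["status"] < 400:
            # Redirect pointing back at the requested url
            same_target = canon_key(cap["location"]) == request_key
            # Redirect whose own url canonicalizes to the one just rejected
            same_as_rejected = canon_key(cap["url"]) == rejected_key
            if same_target or same_as_rejected:
                rejected_key = canon_key(cap["url"])
                continue
        return cap
    return None


captures = [
    {"url": "http://example.com/", "status": 301, "location": "http://example.com"},
    {"url": "http://www.example.com/", "status": 302, "location": "https://example.com/login"},
    {"url": "http://example.com/", "status": 200, "location": None},
]
chosen = first_usable("http://example.com/", captures)
```

In this example the first capture redirects to itself and is rejected, the second is a redirect whose url canonicalizes to the rejected one and is skipped as well, and the 200 capture is used.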
* service worker rewrite work:
- use sw_ modifier to add Service-Worker-Allowed: <domain root>
- force scope if none set to domain url
- resolve sw url to absolute url
* wombat: don't reinit wombat paths if already inited (eg. from imported documents)
* service-worker rewrite test: add test to verify sw rewrite is identity, Service-Worker-Allowed header is added
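As a rough illustration of the service worker steps above (resolve the worker url to an absolute url, default the scope to the domain url, and add a Service-Worker-Allowed header for the domain root), here is a small sketch. The function name `service_worker_info` and the returned dict layout are hypothetical; in actual replay these values would presumably sit under the replay prefix, which this sketch leaves out.

```python
from urllib.parse import urljoin, urlsplit


def service_worker_info(page_url, sw_url, scope=None):
    # Resolve a possibly relative service worker url against the page url
    abs_sw_url = urljoin(page_url, sw_url)

    # Domain root of the resolved worker url, e.g. https://example.com/
    parts = urlsplit(abs_sw_url)
    domain_root = f"{parts.scheme}://{parts.netloc}/"

    return {
        "url": abs_sw_url,
        # Force a scope when none was set
        "scope": scope or domain_root,
        "headers": {"Service-Worker-Allowed": domain_root},
    }


info = service_worker_info("https://example.com/app/index.html", "./sw.js")
```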
- renamed obj to this_obj to reflect that we are using the deproxied this
- use this_obj rather than window in the first if block that populates
the from variable in order to match the logic in pm_origin and
because proxy_to_obj returns raw this if not proxy
- treat as jsonp if url query contains 'callback=jsonp'
- fuzzy match query containing 'callback=jsonp'
- tests: add test for additional jsonp matching
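A minimal sketch of the jsonp check described above, assuming a plain substring test on the query string is sufficient; the helper name `looks_like_jsonp` is hypothetical.

```python
from urllib.parse import urlsplit


def looks_like_jsonp(url):
    # Only the query string is inspected, not the path or fragment
    return "callback=jsonp" in urlsplit(url).query


is_jsonp = looks_like_jsonp("http://example.com/api?x=1&callback=jsonp12345")
not_jsonp = looks_like_jsonp("http://example.com/api?callback=foo")
```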
Updated rewrite modifiers for server-side rewriting of `link rel='preload' as='x'`
Added client-side rewriting of `link rel='[preload|import]' as='x'`
Added helper method for determining the correct rewrite modifier to be used in client-side rewriting and updated duplicate modifier logic in wombat
Added Element.insertAdjacentElement override and added special case rewriting of nested elements in insertAdjacentElement and Node.[appendChild|replaceChild|insertBefore]
Add MouseEvent override to account for the view argument which is windowProxy
Fixed implicit variable declaration that resulted in global pollution and possible variable collisions in rewriting logic
Updated wb_unrewrite_rx to now consider protocol and host as optional to fix imgur
Nit: in the document.[write|writeln] override, use Array.join directly instead of Array.apply followed by Array.join, as join works on array-like objects
- server-side: rewrite '}(this)' or '})(this)' with js object proxy override convert
- client-side: fix typo in 'onstorage' override, fix typo that prevented SameOriginListener() from being used -- ensure
custom 'onstorage' events only sent to original window
* Add representation for Amf requests to index them correctly
* rewind the stream if an error happens during amf decoding (pyamf seems to have a problem supporting multi-byte UTF-8)
* fix python 2.7 backward compatibility
* update inputrequest.py
* reorganize imports and force appveyor to retest
- if targetOrigin is the replay host, default to unrewritten from origin, not '*'
- don't set targetOrigin to 'null' or empty to avoid errors
- if target window's unrewritten origin is actually 'null' or '', don't pass message at all, and don't set to '*' -- represents actual behavior,
as postMessage to 'null' origin (about:blank page) will be received only if targetOrigin is already '*'.
(Origin header received will be the pywb host, using Referer will result in more accurate Origin, which may not be the target url)
tests: add tests to verify Origin header with and without Referer
tests and LiveIndexSource improvements:
- run local instance of httpbin in separate gevent server for any httpbin.org requests
- LiveIndexSource: has overridable get_load_url(), also use 'load_url' for HEAD check, remove unused proxy_url
- test update: add HttpBinLiveTests which patches LiveIndexSource.get_load_url() to redirect httpbin requests to local instance
- test update: use httpbin.org/get instead of httpbin.org/anything, which is unsupported in the older version (0.5.0) required for Windows support
- setup: add 'httpbin==0.5.0' to test requires, remove jinja2 pin to old version
* proxy mode options: #316
- add 'use_banner' option, if false, will disable standard banner.html from being added
- add 'use_head_insert' option, if false, will disable injecting head_insert.html in proxy mode
both options default to true
* docs: add docs for new proxy options
* also add 'override_route' option and docs for extending proxy routing
geventserver: use custom handler to set raw 'REQUEST_URI' when running default gevent wsgi server. (uwsgi already sets REQUEST_URI)
testing: add REQUEST_URI check to proxy tests as real server is being used (webtest tests decodes %-encoding)
bump version to 2.0.4
- fix memento aggregation if timeout is 0.0
- use default timeout (5.0), instead of defaulting to 0.0 and failing
- add 'timeout' property to warcserver aggregation tests
- docs: mention property in warcserver docs also
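The timeout fix above amounts to falling back to a usable default. A small sketch, assuming that both a missing value and an explicit 0.0 should fall back to 5.0 (the second part is an assumption of this sketch, as are the names `DEFAULT_TIMEOUT` and `resolve_timeout`):

```python
DEFAULT_TIMEOUT = 5.0


def resolve_timeout(config):
    timeout = config.get("timeout")
    # None, 0 and 0.0 are all treated as "not set" in this sketch
    if not timeout:
        return DEFAULT_TIMEOUT
    return float(timeout)


unset = resolve_timeout({})
zero = resolve_timeout({"timeout": 0.0})
custom = resolve_timeout({"timeout": 2})
```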
* redisindex: use decode_responses=True
* recorder: close_file(): return true if closed, close_key(): return filename if closed
* logging: if debug=True, log warc load failures
* appveyor build fix: add pypiwin32 as dependency for windows build