- reworked both proxy and non-proxy mode backing workers to no-longer fetch in burst mode but as sent with a maximum of 20 fetches running at a time
- added just-fetch to non-proxy mode backing worker
- updated the auto fetch worker abstraction in non-proxy mode used by wombat to exposed like in proxy mode and ensured that value property for the srcset object is used when sending rewritten srcset values to the backing worker
- combined the backing worker proxy & non-proxy mode into a single file
- added rollup config for back auto fetch worker
* proxy: add option to set default timestamp for proxy mode, fixes#452
- set via flag --proxy-default-timestamp or config 'proxy_options.default_timestamp'
- can be iso date or all-digit timestamp
- overridable via accept-datetime header
* docs: update docs for proxy timestamp
- add docs on memento support in proxy mode
* update-version: script can update version only, commit with 'update-version.sh commit'
* indexer post append: remove 'WB_wombat_' from POST query, could have been added in previous versions of pywb!
Rename:
- rename auto-fetch config to 'enable_auto_fetch' and '--enable-auto-fetch' cli param
- rename 'use_head_insert' -> 'enable_content_rewrite'
- rename 'use_banner' -> 'enable_banner'
- rename 'use_wombat' -> 'enable_wombat'
Misc Cleanup:
- enable_auto_fetch applies to both proxy and non-proxy mode
- remove setting 'wbinfo.use_wombat', implied if wombatProxyMode.js is included
- docs: add docs for auto-fetch system, improved docs for proxy rewrite options
- tests: test with enable_auto_fetch, update tests for renames
- bump version to 2.1.0 due to breaking changes
- changelist: updates to changelist
- requirements: use bounded version for gevent
- Split wombat and auto-fetch worker into two files (proxy mode and non-proxy mode)
- Renamed preservationWorker to autoFetchWorker in order to better convey what it does
- Root config file control over including wombat and auto-fetch worker in proxy or non-proxy mode
- Added additional proxy mode + auto-fetch worker only route for fetching the auto-fetch worker code nicely for CORS
- templateview: add 'tobool' formatter to more cleanly format python bools to JS 'true'/'false'
- proxy options: config and command line:
'use_auto_fetch_worker' and '--proxy-with-auto-fetch'
'use_wombat' and '--proxy-with-wombat'
- head_insert.html: only include wombat in proxy mode when use_wombat or use_auto_fetch_worker are set.
- wombatProxyMode.js: slimmed down wombat for proxy mode only including auto-fetch support.
- more consistent naming: rename 'preserveWorker' and 'autoArchive' to 'auto-fetch'
Updated tests:
- test_wbrequestresponse.py: added tests covering constructor defaults, _init_derived, options_response, json_response, encode_stream, text_stream
- test_auto_colls.py: fixed broken test test_more_custom_templates, reason using ujson now not json so spacing was off
- test_proxy.py: updated existing tests to reflect splitting wombat into proxy and non-proxy mode, added tests covering auto-fetch worker specific endpoints in proxy mode
removed duplicate addons key in .travis.yml
- test_cli.py: updated to properly test the cli with these changes
added ultrajon dep to tests_require in setup.py to reflect its usage by wbrequestresponse.py
Fully documented:
- cli.py
- frontendapp.py
- templateview.py
- wbrequestresponse.py
Removed duplicate addons key in .travis.yml
Added ultrajson dependency to tests_require in setup.py to reflect its usage by wbrequestresponse.py
Fixes#371
tests and LiveIndexSource improvements:
- run local instance of httpbin in separate gevent server for any httpbin.org requests
- LiveIndexSource: has overridable get_load_url(), also use 'load_url' for HEAD check, remove unused proxy_url
- test update: add HttpBinLiveTests which patches LiveIndexSource.get_load_url() to redirect httpbin requests to local instance
- test update: just use httpbin.org/get instead of httpbin.org/anything, unsupported in older version (0.5.0) require for windows support
- setup: add 'httpbin==0.5.0' to test requires, remove jinja2 pin to old version
* proxy mode options: #316
- add 'use_banner' option, if false, will disable standard banner.html from being added
- add 'use_head_insert' option, if false, will disable injecting head_insert.html in proxy mode
both options default to true
* docs: add docs for new proxy options
* also add 'override_route' option and docs for extending proxy routing
geventserver: use custom handler to set raw 'REQUEST_URI' when running default gevent wsgi server. (uwsgi already sets REQUEST_URI)
testing: add REQUEST_URI check to proxy tests as real server is being used (webtest tests decodes %-encoding)
bump version to 2.0.4
- don't include wombat.js in banner only mode, including in proxy mode
(instead, do set devicePixelRatio to fix certain fidelity issues)
- default_banner: set title to document.title on load when frameless, including in proxy mode
- improve docs for configuring proxy mode cert
- tests: update tests to ensure no wombat.js injected in proxy or banner-only mode
* support for 'classic' pywb features and misc improvements:
- add support for redirect to exact timestamp mode via 'redirect_to_exact: true' config setting
- tests: ensure memento headers added for redirect-to-exact
- memento: ensure Link header added for intermediate resources, check for 'enable_memento' before adding
- config: config passed to head_insert template as 'config'
- insert legacy 'vidrw.js' script if 'enable_flash_video_rewrite' config is set to true
- config: use_js_obj_proxy now defaults to true
- memento/tests: add proxy with custom accept-datetime test
* include the collection in Memento Link outputs:
- add new cdx 'source-coll' field, storing only the collection
- ensure rel="collection" property included in the TimeMap and Link header
- tests: update all tests to include the 'source-coll' property
- docs: add 'collection provenance' to auto-all collection configuration docs
- auto/dyn collections: use overridable 'index_paths' and 'archive_paths', support list for archive_paths
- all-auto collection: supported at warcserver layer via special '$all' index
- cleanup default_config.yaml and config.yaml, remove obsolete properties
- remove obsolete docker-compose.yaml
- default_config: simplify list of managed properties
- test_cli: add tests for cli options
- proxy and recorder config loaded from 'proxy' and 'recorder' string or dicts in config
- proxy settings loaded from config, wsgiproxmiddleware applied within main init path
- cli --proxy-record add to indicate recording, optional dict to set options
- optional recorder dict to configure other recorder options, file max_size, filename_template, etc..
- proxy tests: add proxy cli tests
- recorder tests: add recorder custom config test
proxy mode support readded!
- use wsgiprox wrapper in FrontEndApp.init_proxy() with fixed collection prefix, ca options
- cli --proxy <coll> flag added to specify proxy collection
- cleanup: remove cookie rw (already disabled), fix post handling paths
- headers: ensure request headers are not rewritten when in proxy mode, response headers marked with 'url-rewrite' also no rewritten if no url rewrite/proxy mode
- urlrewriter: add IdentityRewriter with no rewriting as default, instead of SchemeOnlyUrlRewriter
- memento support: for now, only include rel="original" and Memento-Datetime in for proxy replay response
- responseloader: disable urllib3 unsecure response warnings
- tests: add test for proxy replay and proxy record/replay of new collection
rename tests to test_proxy_https_cookie, test_proxy_http_cookie, test_proxy_http_auth
to allow for testing, http proxy, cookie resolver, proxy auth and https support in a more
orthogonal way
rename configs to test_config_proxy_http and test_config_proxy_https
will run (this was cumbersome to maintain and not really useful)
ReferRedirect just checks that the current request host header, if present, matches that of the referrer
and checks that the coll and script name match.
* removed proxy_pac as it was also unneeded/unused and required use of the hostpaths
* added test for invalid CONNECT usage (405 response)