* eval fix: instead of rewriting to 'WB_wombat_eval', rewrite to 'self.eval' for non-top-level eval
the wombat object will handle rewriting the eval arg on 'self.eval'
tighten rewriting for top-level 'eval', add additional tests
part of fix for #663
* rewrite wrap: add extra {, } to avoid collisions, as suggested in webrecorder/wombat#72
eval rewrite: exclude ',eval' as more likely than not causing a false positive, as per #643
* update to latest wombat 3.3.0 with corresponding fixes
* add support for custom data being added via 'PUT /<coll>/record' when in recording mode and 'enable_put_custom_record: true' set in 'recorder' config
- url specified via 'url' query arg and content type via request Content-Type
- update docs for put custom record options
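A minimal sketch of how a client might build such a request. Only the `/<coll>/record` path, the `url` query arg and the Content-Type header come from the notes above; the host `http://localhost:8080`, the collection name `my-coll` and the helper `build_put_record_request` are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_put_record_request(base, coll, target_url, body, content_type):
    """Build (but do not send) a PUT request that adds a custom record.

    Hypothetical helper: the target url goes in the 'url' query arg,
    the content type in the request Content-Type header.
    """
    query = urlencode({"url": target_url})
    return Request(
        f"{base}/{coll}/record?{query}",
        data=body,
        method="PUT",
        headers={"Content-Type": content_type},
    )


# Assumed local instance and collection name, for illustration only.
req = build_put_record_request(
    "http://localhost:8080",
    "my-coll",
    "https://example.com/custom/data.json",
    b'{"hello": "world"}',
    "application/json",
)
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen`, assuming the instance is in recording mode with 'enable_put_custom_record: true' set.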
* bump version to 2.6.0b4
update CHANGES
comment out default locales in config.yaml
only show warning for installing i18n extra when locales actually specified in config
bump to 2.6.0b3
* more locale fixes:
- fix running wb-manager w/o i18n dependencies
- dependencies: move babel to extras_require, show warning if locale used or 'wb-manager i18n' called and i18n dependencies are not installed
- not found page: don't show language switch header banner on nested content frame
- replace erroneous/outdated `/coll-cdx` API endpoint with the default API endpoint `/<coll>/cdx`
- if clear from preceding context: reduce examples to params only `?url=...&param1=...`
* localization / doc fixes:
- add missing header.html
- docs: support 'i18n' extra, mention in docs
- use 'default_locale' for html lang tag
- access control docs: fix documentation for adding user with acl command
* localization: add compile_catalog after extract as well to simplify updates for identity (en) locale
* ui:
- include locale in home page collection listing
- keep locale on error page home link
* autoescape:
- ensure jinja2 templates are autoescaped to prevent xss issues (thanks @sebastian-nagel for suggested fix)
- ensure banner inserts are not double-escaped
- update tests for template autoescaping
* update CHANGES.rst
* bump version to 2.6.0b1
* add localization utilities:
- add locmanager to support extract, update, remove, list using pybabel
- add po2csv/csv2po conversion with translate-utils
- docs: add localization.rst to manual!
* add language switch header (via header.html) to all pages if more than one locale is present.
* localization: wrap more text strings in existing templates
* docs:
- document `wb-manager i18n` commands
- mention `<html lang>` setting
- include csv example
- add info about adding localizable text in templates
* add localization to CHANGES
* embargo: add support for per-collection date range embargo with embargo options of 'before', 'after', 'newer' and 'older'
'before' and 'after' accept a timestamp
'newer' and 'older' options configured with a dictionary consisting of any combo of 'years', 'months', 'days'
add basic test for each embargo option
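The four embargo options can be sketched roughly as follows. This is an illustration under stated assumptions, not the pywb implementation: `is_embargoed` is a hypothetical name, timestamps are assumed to be full 14-digit values, and years and months are approximated as 365 and 30 days.

```python
from datetime import datetime, timedelta

TS_FORMAT = "%Y%m%d%H%M%S"  # assumes full 14-digit timestamps


def is_embargoed(capture_ts, embargo, now):
    """Return True if a capture falls inside the embargoed range.

    embargo is a single-key dict, e.g. {"before": "20200601000000"}
    or {"newer": {"years": 1, "months": 6}}.
    """
    ((kind, value),) = embargo.items()
    if kind not in ("before", "after", "newer", "older"):
        raise ValueError(f"unknown embargo option: {kind}")

    capture = datetime.strptime(capture_ts, TS_FORMAT)

    if kind in ("before", "after"):
        # Absolute cutoff given as a timestamp.
        cutoff = datetime.strptime(value, TS_FORMAT)
    else:
        # Relative cutoff; years/months approximated (assumption).
        days = (
            value.get("years", 0) * 365
            + value.get("months", 0) * 30
            + value.get("days", 0)
        )
        cutoff = now - timedelta(days=days)

    if kind in ("before", "older"):
        return capture < cutoff
    return capture > cutoff
```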
* acl/embargo work:
- support acl access value 'allow_ignore_embargo' for overriding embargo
- support 'user' in acl setting, matched with value of 'X-Pywb-ACL-User' header
- support passing through 'X-Pywb-ACL-User' setting to warcserver
- aclmanager: support -u/--user param for adding, removing and matching rules
- tests: add test for 'allow_ignore_embargo', user-specific acl rule matching
* docs: add docs for new embargo system!
* docs: add info on how to configure ACL header with short examples to usage page.
sample-deploy: add examples of configuring 'X-Pywb-ACL-User' header based on IP for nginx and apache sample deployments
* docs: fix access control page header, text tweaks
* bump version to 2.6.0b0
* post append improvements:
- parse json primitives for post query
- for text/plain, attempt to parse as json, then as binary
- standardize post append indexing
- include '__wb_method' in urlkey
- add 'requestBody' and 'method' to cdxj
- support unique dupe params for json-to-query conversion
* test fixes:
- update tests for test_inputreq
- update post-test.cdxj and post-test.cdx
* ci: fixes
- tox: run full test suite!
- disable appveyor
* inputrequest buffering fix:
- never truncate reading POST request, must read entire POST data to avoid hung request in live mode
- truncate final query string to 4096
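The buffering rule above can be sketched like this, assuming a file-like request body and a known content length. `read_post_query` and the 8192-byte chunk size are hypothetical; only the 4096 limit comes from the notes.

```python
import io

MAX_QUERY_LENGTH = 4096  # limit named in the notes above


def read_post_query(stream, content_length, limit=MAX_QUERY_LENGTH):
    """Read the whole POST body, then truncate only the resulting query."""
    remaining = content_length
    chunks = []
    while remaining > 0:
        chunk = stream.read(min(remaining, 8192))
        if not chunk:
            break  # stream ended earlier than announced
        chunks.append(chunk)
        remaining -= len(chunk)

    # The body is consumed in full so the connection is never left
    # with unread data; truncation is applied afterwards.
    query = b"".join(chunks).decode("utf-8", errors="replace")
    return query[:limit]
```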
The field is unfortunately misnamed compressedendoffset in XML but OWB
actually uses this for the compressed length 'S' CDX field.
Without this field when WARC files are accessed over HTTP pywb will make
open byte range requests which results in a lot more data being read
from disk than necessary.
* FrontendApp: forward HTTP status of CDX backend to allow clients
to handle errors more easily
* Handle CDXExceptions properly, returning the exception status code
- make sure that CDXException is raised early so that it can be handled
in the IndexHandler
* WarcServer: keep the HTTP status lines short
- append the exception message only if the status isn't a string
(WbException and inherited classes already have nice status string)
- avoid overlong status lines, eg.
HTTP/1.1 404 Not Found No Captures found for: https://very-long.url/...
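A short sketch of the status-line rule described above. `build_status` is a hypothetical name, and the assumption is that a status is either a ready-made string such as '404 Not Found' or a bare numeric code.

```python
def build_status(status, message):
    """Return a short HTTP status string.

    A string status is assumed to be complete already and is used as-is;
    the exception message is appended only to a non-string status.
    """
    if isinstance(status, str):
        return status
    return f"{status} {message}"
```

With this rule, `build_status("404 Not Found", "No Captures found for: https://very-long.url/...")` stays at `404 Not Found` rather than carrying the full URL.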
* Add unit test to verify whether ACL exact-match rules in a single-line
*.aclj file are found
* Fix AccessChecker to match exact rules in a single-line rule file
* dedup improvements on top of #597, work towards patching support (#601)
- single key 'dedup_policy' of 'skip', 'revisit', 'keep'
- optional 'dedup_index_url', defaults to redis urls
- support for 'cache: always' to further add caching on all requests that have a referrer
- updated docs to mention latest config, explain 'instant replay' that is possible when dedup_policy is set
- add check to ensure only redis:// URLs can be set for dedup_index_url for now
- config: convert shorthand 'recorder: <source_coll>' setting string to dict, don't override custom config
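The config handling described above could look roughly like the following. This is a sketch under assumptions: `normalize_recorder_config` is a hypothetical name, and 'source_coll' is assumed to be the dict key the shorthand string expands to.

```python
VALID_DEDUP_POLICIES = ("skip", "revisit", "keep")


def normalize_recorder_config(recorder):
    """Expand the shorthand string form and validate dedup settings."""
    if isinstance(recorder, str):
        # Shorthand 'recorder: <source_coll>' becomes a dict (assumed key).
        config = {"source_coll": recorder}
    else:
        # Copy so that custom config passed in is not overridden.
        config = dict(recorder)

    policy = config.get("dedup_policy")
    if policy is not None and policy not in VALID_DEDUP_POLICIES:
        raise ValueError(f"invalid dedup_policy: {policy}")

    index_url = config.get("dedup_index_url")
    if index_url is not None and not index_url.startswith("redis://"):
        raise ValueError("dedup_index_url must be a redis:// URL")

    return config
```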
* rules: updated rule to fix replay of latest youtube watch and embed pages
include youtube-nocookie variant
fixes #607
part of fix for webrecorder/browsertrix-crawler#4
* rules: additional rules fix for vimeo