mirror of https://github.com/webrecorder/pywb.git synced 2025-03-15 16:14:48 +01:00

2205 Commits

Author SHA1 Message Date
Ilya Kreymer
02cc7035e8
query: fix query for IE11, don't use ES6 syntax, add URL polyfill (#514) 2019-10-31 17:09:42 -07:00
Yvan
8baa8cbdb7 docs: fix doc typo in BaseWarcServer example (#507) 2019-10-31 17:09:25 -07:00
Ilya Kreymer
fed3263ac6
Docs: Fix access controls and ui customizations docs links (#513)
* docs: ensure docs added to access controls, fix typos

* begin changelist for 2.4.0
2019-10-31 16:56:36 -07:00
Ilya Kreymer
6f79840b79
Docs, custom metadata improvements (#509)
* metadata/coll_config: don't confuse user metadata with collection config, don't display collection config settings as metadata (ukwa/ukwa-pywb#47)
- for collection template, add separate 'coll_config' dict, keep user metadata only in 'metadata' dict (default to empty)
- for static collections, assume metadata is in the 'metadata' dict of collection config
- for dynamic collections, load metadata.yaml into 'metadata' dict
- ensure 'metadata' key is passed to frame_insert
- ensure 'metadata' added consistently in framed and non-framed mode
- tests: update tests to ensure metadata is added consistently

- fuzzymatch: don't match 204 OPTIONS responses, update fuzzymatcher test

* documentation
- add documentation for metadata in ui-customization, rebuild docs
- add link to ui customization from configuring
- work on access control docs
* fixed small typos in ui-customization.rst
* frontendapp: fix doc string

- misc: remove warning on urllib3 Retry init

- set version to pywb 2.4.0rc0

Co-Authored-By: John Berlin <n0tan3rd@gmail.com>
2019-10-27 01:39:52 +01:00
John Berlin
35004c1675 Fixed calendar view dropping query parameters by using encodeURIComponent fixes #510 (#512) 2019-10-26 09:25:13 +01:00
Ilya Kreymer
59b735ee99
tests: fix all tests for updated to webenact, use https when possible for webenact and example page tests (#511) 2019-10-26 09:03:25 +01:00
Ilya Kreymer
9ce324212a
Merge pull request #453 from webrecorder/ukwa-merge
Merge ukwa/pywb changes into mainline!
2019-10-08 14:13:44 -07:00
Ilya Kreymer
dc30c890a6 enable new transclusion system for tests (not enabled by default) v-2.4.0-beta 2019-09-11 09:34:57 -07:00
Ilya Kreymer
2f6fb74ea1 bump version to 2.4.0 2.4.0-beta 2019-09-11 09:17:41 -07:00
Ilya Kreymer
a3294c8b25 fix exception handling:
- don't rethrow HTTPException from WbException
- catch RequestRedirect to issue 307 redirect, check referrer
- tests: add referrer redirect tests with missing slash
defaults: don't enable new transclusions by default
2019-09-11 09:03:55 -07:00
John Berlin
802b9fa4f5
apps:
- frontendapp.py: restored the pulling out of collection route creation into its own function
- rewriterapp.py: reformatted file and added documentation

utils:
- geventserver.py: added documentation
- wbexception.py: updated documentation
2019-09-10 14:45:05 -04:00
John Berlin
379f7de1ba
manual
- split out the ui customization documentation into its own file ui-customization.rst
 - added initial documentation covering the new template setup to the ui-customization.rst
2019-09-05 18:13:12 -04:00
John Berlin
d6ab31d529
templates:
- migrated proxy templates to use new template setup
2019-09-05 16:41:14 -04:00
John Berlin
5ab97a41c2
templates:
- not_found.html: removed un-needed closing div
2019-09-04 15:39:47 -04:00
John Berlin
69f7f02006
static files:
- re-formatted: default_banner.js, queryWorker.js, search.js, wb_frame.js
2019-09-04 14:59:50 -04:00
John Berlin
ae78a955de
templates
- base.html: removed including the query pages query.css in every page
 - query.html: include query.css in head block
2019-09-04 14:57:09 -04:00
John Berlin
e34606cecb
static files:
- formatted them according to project
 - query.js: ensured correct timestamp to date function is used
templates:
 - head_insert.html: is_framed check is now a boolean, not a string; corrected redirect check
tests:
 - test_html_rewriter.py: added missing rewrite modifier test checking i.style containing a background image html encoded
 warcserver:
  - added missing quote_plus import and cleaned up imports
2019-09-04 14:28:54 -04:00
John Berlin
61b6ff21e1
added missing comma to setup.py's tests_require list
removed package.json from project as it is no longer required
removed npm install command from .travis/install.sh
2019-09-04 13:41:56 -04:00
John Berlin
8d98b9111e
added additional code documentation in order to meet the documentation requirements of pywb 2019-09-03 18:40:35 -04:00
John Berlin
9a40d29ac3
added lxml requirements entry to extra_requirments.txt and documented pywb.warcserver.index.indexsource.XmlQueryIndexSource 2019-09-03 18:39:31 -04:00
John Berlin
41c37129c0
documented and cleaned up the aclmanager.py2 2019-09-03 18:37:46 -04:00
John Berlin
1a7fdd0d70
documented and cleaned up the aclmanager.py 2019-09-03 18:37:45 -04:00
Ilya Kreymer
ce10d9af7c
docstrings: add docstrings, remove duplicate call, cleanup ACLManager init 2019-09-03 18:37:45 -04:00
Ilya Kreymer
e04adea7a8
transclusions/augmentations: add new video/audio transclusions script
- enabled with 'transclusions: 2' (default) config option
- legacy flash-supporting transclusions script (still working) available via 'transclusions: 1' or enable_flash_video_rewrite option
- add transclusions.js with support for poster image
- legacy vidrw: don't add undefined url as source
- localization: wrap text in not_found.html to be translatable
2019-09-03 18:37:15 -04:00
Ilya Kreymer
7ac9a37bb4
acl: support for exact acl rules via '###' suffix
- ex: rule 'com,example)/###' matches http://example.com/ only
- wb-manager acl add/remove --exact-match adds/remove exact match rules
- tests: add tests for exact match queries, acl
2019-09-03 18:37:14 -04:00
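
The exact-match rule in the commit above can be illustrated with a minimal sketch. The function name `rule_matches` is hypothetical and is not pywb's API; the sketch also assumes that ordinary rules apply by key prefix, while a rule ending in '###' applies only to the identical key.

```python
EXACT_SUFFIX = '###'  # marker named in the commit message


def rule_matches(rule_key: str, url_key: str) -> bool:
    """Return True if an ACL rule key applies to a SURT-style URL key.

    Hypothetical helper for illustration only.
    """
    if rule_key.endswith(EXACT_SUFFIX):
        # Exact rule: strip the marker and require an identical key.
        return url_key == rule_key[:-len(EXACT_SUFFIX)]
    # Assumption: an ordinary rule covers every key it is a prefix of.
    return url_key.startswith(rule_key)
```

Under these assumptions, 'com,example)/###' applies to the key 'com,example)/' but not to 'com,example)/page', whereas the plain rule 'com,example)/' applies to both.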
Ilya Kreymer
3589240431
ui template overhaul to simplify customization:
- add base.html template with head, header, footer optional customizations
- refactor all top-level templates to extend base.html, except frame_insert.html
- localization: add placeholder support for jinja2 localization extension, '{% trans %}' and _('') tags, placeholder null localization
- refactor new query UI to support localization
- update some text to match localized versions used in ukwa-pywb, update test
2019-09-03 18:37:14 -04:00
Ilya Kreymer
1b0c9c6895
misc fixes from merge:
- xmlqueryindexsource: fix typo, improve tests to be more clear with url encoding
- exceptions: move UpstreamException and AppNotFound to wbexceptions
- docker: ensure sample_archive is added to Dockerfile still
- yaml: use python Loader to support custom interpolation of env vars
- content rewrite: ensure custom exceptions passed up to frontendapp
2019-09-03 18:30:42 -04:00
Ilya Kreymer
42b8c3a22b
merge: additional fixes after merge of ukwa/pywb and 2.2
rewrite: remove custom modifiers for now, use oe_ for non-import css embeds
bump version to 2.3.dev0
2019-09-03 18:26:09 -04:00
Ilya Kreymer
e92b1969e8
xmlindexsource: fix tests for double escaping of query (for ukwa/ukwa-pywb#29) 2019-09-03 18:24:03 -04:00
Andrew Jackson
cb3d1196f2
Use space and let quote_plus encode to plus (and avoid it becoming %2B). For ukwa/ukwa-pywb#29. 2019-09-03 18:24:02 -04:00
Andrew Jackson
2a30731a0c
Log query being executed. 2019-09-03 18:24:02 -04:00
Andy Jackson
c00f30e897
Double-quoting XmlQueryIndexSource lookups for #29
The OpenWayback reference implementation of this API relies on doubly-escaped queries. This change should bring this implementation into line with OutbackCDX and OWB's original API.
2019-09-03 18:24:02 -04:00
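
The double escaping described in the commits above can be sketched with the standard library's `quote_plus`. The function name `build_lookup_query` and the query template are assumptions made for illustration, not the code from these commits.

```python
from urllib.parse import quote_plus


def build_lookup_query(url: str) -> str:
    """Escape the URL once, embed it in a query string, then escape again."""
    # First pass: escape the target URL on its own.
    # Assumption: the template 'type:urlquery url:' is illustrative.
    inner = 'type:urlquery url:' + quote_plus(url)
    # Second pass: the literal space becomes '+', and each '%' produced by
    # the first pass becomes '%25', giving a doubly-escaped URL.
    return quote_plus(inner)
```

Keeping a literal space in the template matters: `quote_plus` turns a space into '+', whereas a literal '+' in the template would be escaped to '%2B'.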
Ilya Kreymer
54a4e38531
memento 404 fix: ensure timemap only includes memento headers on success 200 response
fuzzy match limit: add 'fuzzy_search_limit' option to default_filters in rules.yaml
default fuzzy matching search limit to 100 results to avoid timeouts for large result sets that don't have any matches
2019-09-03 18:24:01 -04:00
Ilya Kreymer
0a9ad5c8dc
timemap format fix: fixes ukwa-pywb/pywb#37
- ensure timemap returns full url-m
- warcserver supports 'memento_format' param which, if present, specifies full format to use for memento links in timemap
- memento tests: timemap tests include full url-m, test both framed and frameless timemap responses
2019-09-03 18:24:01 -04:00
Ilya Kreymer
3868f5b915
fix typo: undo unintended change from warning in earlier commit, stick with 'not items', fixes issue with 404s mentioned in ukwa/ukwa-pywb#39 2019-09-03 18:24:01 -04:00
Ilya Kreymer
5da6122d83
memento timemap fix: further fix for ukwa/ukwa-pywb#37
- fix timemap in 'redirect-to-exact' mode, (ensure timegate redirect condition applies only to top-frame)
- tests: add additional timemap tests, with and without exact redirect
2019-09-03 18:24:00 -04:00
Ilya Kreymer
c65f66e03a
acl optimize/fixes:
- optimize 'wb-manager acl match' command to not load entire file before matching
- acl match <coll_or_file>: if 'coll_or_file' exists as file, use it, don't check if auto-collection exists
2019-09-03 18:24:00 -04:00
Ilya Kreymer
9b2ae35b93
acl optimization: fixes ukwa/ukwa-pywb#39
- don't parse json on every aclj line until key prefix matches, resulting in speed boost!
- convert aclj to dict (via cdxobject) only when match is found (disable aggregator source tracking)
2019-09-03 18:23:59 -04:00
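
The idea of deferring JSON parsing until a key matches can be sketched as follows. This assumes each line has the form `<key> - <json>` and that a rule matches when its key is a prefix of the URL key; the function name `first_matching_rule` is hypothetical.

```python
import json


def first_matching_rule(aclj_lines, url_key):
    """Return (key, rule_dict) for the first matching line, else None."""
    for line in aclj_lines:
        # Cheap string work only: take the key before the first space.
        key = line.split(' ', 1)[0]
        if not url_key.startswith(key):
            continue  # no JSON parsing for non-matching lines
        # Parse the JSON payload only once a match is found.
        return key, json.loads(line[line.index('{'):])
    return None
```

Only the single matching line pays the cost of `json.loads`; every other line is rejected with a string comparison.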
Ilya Kreymer
ce0ed610bd
memento-fix: fix for ukwa/ukwa-pywb#37.
- support memento timegate on top-frame (when no timestamp is provided)
- treat top-frame no-timestamp url as canonical timegate
- tests: update tests, add memento redirect mode tests for timegate, timegate with accept-dt header
2019-09-03 18:19:59 -04:00
Ilya Kreymer
0c08b9b5d5
acl optimization: addresses ukwa/ukwa-pywb#38
- stop checking acl rules linearly if acl key < tld
- use existing rule for same url (at least until date-range checking)
2019-09-03 18:13:20 -04:00
Andrew Jackson
60ad1739b7
Moar prints. 2019-09-03 18:13:20 -04:00
Ilya Kreymer
b8124e3931
lxml query parsing fix: (addressing part of ukwa/ukwa-pywb#38)
- ensure lxml-enabled parsing in XmlQueryIndexSource works by passing the raw bytestring instead of unicode text to the parser
- tests: add lxml and non-lxml parsing tests to test_xmlquery_indexsource.py, add lxml to test install
- misc fixes: fix typo in banner.html, update gevent api to support latest gevent
2019-09-03 18:13:19 -04:00
Andrew Jackson
8bf2f9debb
Added some print statements for debugging. 2019-09-03 18:12:28 -04:00
Ilya Kreymer
465195f203
static path prefix fix to support non-root pywb deployment:
- store original wsgi SCRIPT_NAME (before collection path is pushed)
- add 'static_prefix' jinja env global which defaults to original prefix + /static/
- update existing templates to use '{{ static_prefix }}' instead of '{{ host_prefix }}/{{ static_path }}'
- set 'pywb.host_prefix' via rewriterapp, set 'static_prefix' to absolute url if available (to support proxy mode)
2019-09-03 18:12:28 -04:00
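
A minimal sketch of the prefix computation described above, assuming the original SCRIPT_NAME is available as a string and that an absolute host prefix, when present, is simply prepended. The function name `compute_static_prefix` is hypothetical.

```python
def compute_static_prefix(script_name: str, host_prefix: str = '') -> str:
    """Build the static prefix from the original (pre-collection) SCRIPT_NAME."""
    # Default: original prefix + /static/
    path = script_name.rstrip('/') + '/static/'
    # Assumption: with an absolute host prefix (e.g. in proxy mode),
    # return an absolute URL instead of a path.
    return host_prefix.rstrip('/') + path if host_prefix else path
```

With these assumptions, a deployment mounted at '/archive' yields '/archive/static/', and a root deployment yields '/static/'.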
Ilya Kreymer
af3e9c6293
error reporting: ensure NotFoundException used for replay not found errors! 2019-09-03 18:08:35 -04:00
Ilya Kreymer
43537fead3
error messaging: when app path is not found, use default error.html template
- add AppPageNotFound() exception to differentiate app-level not found path from replay content not found
- add custom error messages for collection not found and static file not found
tests: add tests for collection not found and static file not found errors
2019-09-03 18:08:35 -04:00
Ilya Kreymer
f30b280437
self-redirect check: run redirect check if status code is blank or does not start with 2, 4, 5,
to more aggressively check invalid status codes, should fix ukwa/ukwa-pywb#21
2019-09-03 17:59:09 -04:00
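
The condition in the commit above can be written as a small predicate. This is a sketch with a hypothetical function name, assuming the status is available as a string such as '302 Found'.

```python
def needs_self_redirect_check(status) -> bool:
    """True if the status is blank or does not start with 2, 4, or 5."""
    status = (status or '').strip()
    if not status:
        return True  # blank status: always check
    # Any other leading character, including non-digits from an invalid
    # status line, triggers the check.
    return status[0] not in ('2', '4', '5')
```

Because the test is on the first character only, malformed status values fall through to the check rather than being skipped.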
Ilya Kreymer
871cef26a8
proxy mode and prefer header: (ukwa/ukwa-pywb#16)
- fix proxy mode when 'redirect_to_exact=True' is set in config, don't redirect in proxy mode
- more general prefer support, moved to content_rewriter to support preference<->mod mappings
- add 'banner-only' preference mapped to bn_ modifier
- proxy mode: allow 'raw' and 'banner-only' preferences
- proxy mode: 'Prefer: rewritten' forced to 'banner-only', served with 'Preference-Applied: banner-only'
- tests: test proxy with prefer header, 'redirect_to_exact=True', add 'banner-only' to Prefer header tests in rewriting mode
2019-09-03 17:59:09 -04:00
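
The proxy-mode rules listed above can be sketched as a lookup. The function name `proxy_preference_applied` is hypothetical, and the sketch assumes the Prefer header carries a single bare preference token.

```python
# Preferences the commit message says proxy mode allows.
PROXY_ALLOWED = {'raw', 'banner-only'}


def proxy_preference_applied(prefer):
    """Return the Preference-Applied value for proxy mode, or None."""
    pref = (prefer or '').strip().lower()
    if pref == 'rewritten':
        # Per the commit: 'rewritten' is forced to 'banner-only' in proxy mode.
        return 'banner-only'
    # Assumption: unknown or missing preferences are simply not applied.
    return pref if pref in PROXY_ALLOWED else None
```

A request with 'Prefer: rewritten' would therefore be answered with 'Preference-Applied: banner-only', matching the behavior described in the commit.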
Ilya Kreymer
a301dda0fb
memento prefer header improvements: (ukwa/ukwa-pywb#12)
- support Prefer on top-frame url in framed mode, Prefer check runs before custom response
- update Prefer test fixtures to test framed vs frameless and no-mod vs mp_ modifier, all combinations
2019-09-03 17:59:08 -04:00
Ilya Kreymer
5364275ef5
memento prefer header: add support for Prefer header for specifying 'raw' or 'rewritten' mementos (ukwa/ukwa-pywb#12, based on mementoweb/rfc-extensions#6)
- 'enable_prefer: true' in config can be used to enable experimental Memento Prefer behavior
- Prefer header supports both redirect and non-redirect style negotiation, extending existing Memento patterns
- Prefer header can be applied both on memento and timegate endpoints
- for redirect style negotiation, Prefer results in a redirect to final memento (if needed), both on Timegate and URL-M (Memento Pattern 2.3)
- for non-redirect style negotiation (Memento Pattern 2.2), Prefer header affects content being served and changes the Content-Location to the canonical representation
- Vary: Prefer and Preference-Applied headers always added to URL-M and Timegate responses
2019-09-03 17:59:08 -04:00