1
0
mirror of https://github.com/webrecorder/pywb.git synced 2025-03-14 15:53:28 +01:00

46 Commits

Author SHA1 Message Date
Tessa Walsh
83b2113be2
Add config.yaml UI option to disable printing from replay banner (#815)
* Add UI option to disable printing

* Initialize VueUI.main with config dict
2023-03-27 10:23:37 -04:00
Tessa Walsh
3d0673e32a
Add ui.logo_home_url as config.yaml option (#790)
* Add ui.logo_home_url as config option
* Add ui.logo_home_url to docs
2023-01-05 17:33:00 -05:00
Tessa Walsh
6cc9cdc3ad Remove debugging cdx sources 2022-11-24 16:03:10 -05:00
Tessa Walsh
c28941a0b6 Rework Vue banner UI
- Make Vue banner responsive with Bootstrap 4
- Add previous/next year arrows to calendar
- Make navbar background, text color, and button outlines configurable
via config.yaml
- Toggle calendar and timeline separately
- Fix bug preventing title from displaying
- Make app keyboard-navigable
- Fix banner background color configuration
- Comment out vue_navbar_background_hash
- Display linear timeline tooltip centrally on enter
- Improve header styling on small screens
- Add titles to font awesome icons
- Remove old default banner (calendar retained for advanced search
  results)
- Fix TimelineLinear TypeError that broke calendar
- Bump version to 2.7.0b2
- Set Cache-Control header on CDXJ API response to mark returned CDX as
stale after 1 day
- Add commented out UI values to config.yaml to aid users
- Remove timeline and calendar card borders
- Fix issues with snapshot navigation
- Center search bar and align with buttons
- Make Vue app bfcache-ineligible: By adding an empty unload event
listener, we make pages serving the Vue app ineligible for bfcache,
which prevents unexpected behavior when navigating via the browser's
back/forward buttons.
2022-11-21 12:46:09 -05:00
Ilya Kreymer
19032e4512 misc vue + i18n fixes:
- make calendar tooltips into real links and clickable and ctrl+clickable
- ensure advanced view uses old ui (not supported in new ui)
- add language popup on frame_insert for vueui
- i18n: consolidate loc strings for vueui into vue_loc.html
- spinner: for replay, only show over banner
- spinner: show after 500ms delay
- i18n: add one more string ('no captures')
- make calendar popup links regular links to support cmd+click
- i18n: implement lang switcher in vue as well (for query and framed_insert), don't include non-vue language switcher in header
2022-11-21 12:46:09 -05:00
Ivan Velev
dc81e78393 vueui: second batch of fixes from @vanecat:
- support for 'cdx-simulator'
- add loading spinner
2022-11-21 12:46:09 -05:00
Ivan Velev
028e7102c0 initial pass on integrating vueui calendar + banner from @vanecat
- use rollup to build vue ui, build at vue/vueui.js
- calendar page renders via both /*/ and /*? queries
- banner support for framed mode only for now
- ensure banner updated in response to inner frame events
- combine cdxquery classes into vue build, expose single entrypoint VueUI.main

- initial set of vueui dev, calendar, banner, calendar hover mode

- ensure embedded mode replay is working

- config: rename ui vars to 'vue_calendar_ui' and 'vue_timeline_banner'

- dependencies: update to wombat 3.3.6

- bump to 2.7.0b0
2022-11-21 12:46:06 -05:00
Ilya Kreymer
e64e58f040
2.6.2 fix (#682)
2.6.2 release:
* fix for regression caused by 2.6.1, invalid static path #681
* add missing base.css
2021-11-12 17:51:34 -08:00
Ilya Kreymer
81308780ec
version display: add -V/--version flag to wb-manager and wayback/pywb commands to display version and exit (#654)
update CHANGES
comment out default locales in config.yaml
only show warning for installing i18n extra when locales actually specified in config

bump to 2.6.0b3
2021-06-24 11:28:48 -07:00
Ilya Kreymer
3ca765f847 add autoescapding disable to banner.html
update CHANGES
bump version to 2.6.0b2
2021-06-17 17:40:15 -07:00
Ilya Kreymer
f7bd84cdac
Localization / doc fixes (#650)
* localization / doc fixes:
- add missing header.html
- docs: support 'i18n' extra, mention in docs
- use 'default_locale' for html lang tag
- access control docs: fix documentation for adding user with acl command

* localization: add compile_catalog after extract as well to simplify updates for identity (en) locale

* ui: 
- include locale in home page collection listing
- keep locale on error page home link

* autoescape:
- ensure jinja2 templates are autoescaped to prevent xss issues (thanks @sebastian-nagel for suggested fix)
- ensure banner inserts are not double-escaped
- update tests for template autoescaping

* update CHANGES.rst

* bump version to 2.6.0b1
2021-06-14 17:09:00 -07:00
Ilya Kreymer
12fcc87962
Localization Support (#647)
* add localization utilities:
- add locmanager to support extract, update, remove, list using pybabel
- add po2csv/csv2po conversion with translate-utils
- docs: add localization.rst to manual!

* add language switch header (via header.html) to all pages if more than one locale is present.

* localization: wrap more text strings in templates in existing templates

* docs:
- document `wb-manager i18n` commands
- mention `<html lang>` setting
- include csv example
- add info about adding localizable text in templates

* add localization to CHANGES
2021-06-09 13:12:53 -07:00
Ilya Kreymer
459cd706d3 include the collection in Memento Link outputs: (#259)
* include the collection in Memento Link outputs:
- add new cdx 'source-coll' field, storing only the collection
- ensure rel="collection" property included in the TimeMap and Link header
- tests: update all tests to include the 'source-coll' property
- docs: add 'collection provenance' to auto-all collection configuration docs
2017-10-23 15:33:23 -07:00
Ilya Kreymer
1dbabef410 config: custom rules.yaml support and config improvements (addresses #176) (#257)
- allow custom 'rules.yaml' to be specified via 'rules_file' config entry,
and used by FuzzyMatcher and DefaultRewriter
- default rules file specified by DEFAULT_RULES_FILE in pywb package
- 'archive_paths' is the key for archive paths instead of 'resource'
- 'use_js_obj_proxy' not auto-added to metadata, just set per-deployment
2017-10-18 10:39:18 -07:00
Ilya Kreymer
b631a24a0e config cleanup:
- auto/dyn collections: use overridable 'index_paths' and 'archive_paths', support list for archive_paths
- all-auto collection: supported at warcserver layer via special '$all' index
- cleanup default_config.yaml and config.yaml, remove obsolete properties
- remove obsolete docker-compose.yaml
- default_config: simplify list of managed properties
- test_cli: add tests for cli options
2017-10-03 15:31:08 -07:00
Ilya Kreymer
1360723f95 Fuzzy Rules Improvements (#231)
* separate default rules config for query matching: 'not_exts', 'mimes', and new 'url_normalize'
- regexes in 'url_normalize' applied on each cdx entry to see if there's a match with requested url
- jsonp: allow for '/* */' comments prefix in jsonp (experimental)
- fuzzy rule: add rule for '\w+=jquery[\d]+' collapsing, supports any callback name
- fuzzy rule: add rule for more generic 'cache busting' params, 'bust' in name, possible timestamp in value (experimental)
- fuzzy rule add: add ga utm_* rule & tests
tests: improve fuzzy matcher tests to use indexing system, test all new rules
tests: add jsonp_rewriter tests
config: use_js_obj_proxy=true in default config.yaml, setting added to each collection's metadata
2017-08-21 11:01:31 -07:00
Ilya Kreymer
a6ab167dd3 JS Object Proxy Override System (#224)
* Init commit for Wombat JS Proxies off of https://github.com/ikreymer/pywb/tree/develop

Changes
- cli.py: add import os for os.chdir(self.r.directory)
- frontendapp.py: added initial support for cors requests.
- static_handler.py: add import for NotFoundException
- wbrequestresponse.py: added the intital implementation for cors requests, webrecoder needs this for recording!
- default_rewriter.py: added JSWombatProxyRewriter to default js rewriter class for internal testing
- html_rewriter.py: made JSWombatProxyRewriter to be default js rewriter class for internal testing
- regex_rewriters.py: implemented JSWombatProxyRewriter and JSWombatProxyRewriter to support wombat JS Proxy
- wombat.js: added JS Proxy support
- remove print

* wombat proxy: simplify mixin using 'first_buff'

* js local scope rewrite/proxy work:
- add DefaultHandlerWithJSProxy to enable new proxy rewrite (disabled by default)
- new proxy toggleable with 'js_local_scope_rewrite: true'
- work on integrating john's proxy work
- getAllOwnProps() to generate list of functions that need to be rebound
- remove non-proxy related changes for now, remove angular special cases (for now)

* local scope proxy work:
- add back __WB_pmw() prefix for postMessage
- don't override postMessage() in proxy obj
- MessageEvent resolve proxy to original window obj

* js obj proxy: use local_init() to load local vars from proxy obj

* wombat: js object proxy improvements:
- use same object '_WB_wombat_obj_proxy' on window and document objects
- reuse default_proxy_get() for get operation from window or document
- resolve and Window/Document object to the proxy, eg. if '_WB_wombat_obj_proxy' exists, return that
- override MessageEvent.source to return window proxy object

* obj proxy work:
- window proxy: defineProperty() override calls Reflect.defineProperty on dummy object as well as window to avoid exception
- window proxy: set() also sets on dummy object, and returns false if Reflect.set returns false (eg. altered by Reflect.defineProperty disabled writing)
- add override_prop_to_proxy() to add override to return proxy obj for attribute
- add override for Node.ownerDocument and HTMLElement.parentNode to return document proxy
server side rewrite: generalize local proxy insert, add list for local let overrides

* js obj proxy work:
- add default '__WB_pmw' to self if undefined (for service workers)
- document.origin override
- proxy obj: improved defineProperty override to work with safari
- proxy obj: catch any exception in dummy obj setter

* client-side rewriting:
- proxy obj: catch exception (such as cross-domain access) in own props init
- proxy obj: check for self reference '_WB_wombat_obj_proxy' access to avoid infinite recurse
- rewrite style: add 'cursor' attr for css url rewriting

* content rewriter: if is_ajax(), skip JS proxy obj rewriting also (html rewrite also skipped)

* client-side rewrite: rewrite 'data:text/css' as inline stylesheet when set via setAttribute() on 'href' in link

* client-side document override improvements:
- fix document.domain, document.referrer, forms add document.origin overrides to use only the document object
- init_doc_overrides() called as part of proxy init
- move non-document overrides to main init
rewrite: add rewrite for "Function('return this')" pattern to use proxy obj

* js obj proxy: now a per-collection (and even a per-request) setting 'use_js_obj_prox' (defaults to False)
live-rewrite-server: defaults to enabled js obj proxy
metadata: get_metadata() loads metadata.yaml for config settings for dynamic collections),
or collection config for static collections
warcserver: get_coll_config() returns config for static collection
tests: use custom test dir instead of default 'collections' dir
tests: add basic test for js obj proxy
update to warcio>=1.4.0

* karma tests: update to safari >10

* client-side rewrite:
- ensure wombat.js is ES5 compatible (don't use let)
- check if Proxy obj exists before attempting to init

* js proxy obj: RewriteWithProxyObj uses user-agent to determine if Proxy obj can be supported
content_rewriter: add overridable get_rewriter()
content_rewriter: fix elif -> if in should_rw_content()
tests: update js proxy obj test with different user agents (supported and unsupported)
karma: reset test to safari 9

* compatibility: remove shorthand notation from wombat.js

* js obj proxy: override MutationObserver.observe() to retrieve original object from proxy
wombat.js: cleanup, remove commented out code, label new proxy system functions, bump version to 2.40
2017-08-05 10:37:32 -07:00
Ilya Kreymer
ec7a29a3ba static paths: ensure consistent renaming of static/default -> static/__pywb for bundled static path 2015-03-23 16:15:37 -07:00
Ilya Kreymer
fe1c32c8f7 cdxj: support loading cdxj (#76)
cdx obj: allow alt field names to be used (eg. mime, mimetype, m)
(status/statuscode/s) in querying and reading cdx
cdx minimal: (#75) now implies cdxj to avoid more formats
minimal includes digest always and mime when warc/revisit
tests for cdxj loading
indexing optimization: reuse same entry obj for records of same type
2015-03-19 12:36:49 -07:00
Ilya Kreymer
0efa2dc0ad rewrite/banner: add a seperate 'banner_html' setting which allows
overriding just the banner (and not the entire head_insert). Setting
banner_html: False will disable the banner, or setting to a custom
template will insert that template. Default template loads
default_banner.js which does the actual initialization.
2014-10-17 08:28:06 -07:00
Ilya Kreymer
9be7074183 bump version to 0.6.1
fix small typo in cert_download for not-available message, spacing in config.yaml
2014-09-07 11:58:03 -07:00
Ilya Kreymer
751084b097 update CHANGES, config.yaml docs for proxy mode
ensure proxy_options match defaults in config.yaml
default cookie_resolver to true
2014-09-06 17:03:04 -07:00
Ilya Kreymer
eaaefbfd24 * config cleanup: remove 'hostpaths' setting entirely, avoiding the need to specify host on which pywb
will run (this was cumbersome to maintain and not really useful)
ReferRedirect just checks that the current request host header, if present, matches that of the referrer
and checks that the coll and script name match.
* removed proxy_pac as it was also unneeded/unused and required use of the hostpaths
* added test for invalid CONNECT usage (405 response)
2014-08-20 02:02:47 -04:00
Ilya Kreymer
eca3cf5fbf https proxy: add ca generator!
support uwsgi, gunicorn and ref
better handling of 407, other error responses in response to CONNECT
2014-07-26 13:24:53 -07:00
Ilya Kreymer
7c57345363 proxy: add 'unaltered_replay' option to proxy_options to replay
all content unaltered (no rewriting html, no banner, no wombat)
use 'proxy_options' instead of 'routing_options', add additional
tests for proxy mode
2014-07-21 16:42:14 -07:00
Ilya Kreymer
1b1a1f8115 proxy: add 'proxy_coll_select' config which will require a proxy-auth to select a collection for proxy mode.
Otherwise, defaults to first available collection, though proxy-auth can still be sent to specify different collection
2014-07-14 19:12:30 -07:00
Ilya Kreymer
70b7e29b36 pass raw bytes to htmlparser, assuming ascii-compatibility
(todo: add tests for non-ascii compatible encodings)
improved rendering of certain pages, needs more testing

lxml: remove lxml and complexity associated with having the parser,
as its too unpredictable for older html, does its own decoding.
2014-06-27 19:03:06 -07:00
Ilya Kreymer
80e80e97d3 replay: support 'framed_replay' option in config for both replay and live rewrite
split replay view into BaseContentView and ReplayView
refactor RewriteLiveHandler into RewriteLiveView
add additional tests for framed and non-framed mode
default to framed replay!
2014-06-14 18:26:19 -07:00
Ilya Kreymer
d7516f4cd7 rewrite: fix <base> rewriting, urlrewriter replacement
turn off lxml rewriter by default
2014-06-13 16:44:37 -07:00
Ilya Kreymer
80f2da9548 refactor: move configs/config.yaml to root again
remove cdx-server specific config, instead make cdx server api-only
path configurable from regular config
2014-04-02 21:26:53 -07:00
Ilya Kreymer
093d8310e5 config: move config files to ./configs/
PYWB_CONFIG_FILE setting overrides passed in config
2014-03-27 14:31:27 -07:00
Ilya Kreymer
f35e82a4d5 ensure final output from close() is encoded!
add config option to 'use_lxml_parser' if available, if not,
will default to regular parser
testing on travis with lxml (not adding to dep yet)
2014-03-17 13:19:51 -07:00
Ilya Kreymer
6461af030b refactoring: clean up handlers and replay_views for pep8
use BlockLoader().load for StaticHandler static file resolving
update static paths to point to pywb/static instead of static
2014-03-14 18:17:22 -07:00
Ilya Kreymer
a1ab54c340 first pass at memento support #10!
memento support enabled by default, togglable via 'enable_memento' config property
supporting timegate and memento apis, no timemap yet
supporting pattern 2.3 for archival and pattern 1.3 for proxy modes
also:
simplify exception hierarchy a bit more, move down to utils
make WbRequest and WbResponse extensible with mixins (eg for memento)
2014-03-14 10:46:20 -07:00
Ilya Kreymer
ff428ed43e exclusions: add AllAllowPerms and refactor exclusions interface
add TestExclusionPerms and a sample exclusion integration test
refactor cdx server init params into **kwargs
convert all cdx params to use camelCase
2014-02-19 20:20:31 -08:00
Ilya Kreymer
a09dec4b3e cdx: add domain-specific rules at cdx layer for custom canonicalization!
and 'fuzzy' matching when not found
handled via cdxdomainspecific.py
BaseCDXServer contains a canonicalizer object and a fuzzy query
canonicalizer abstracted to seperate class (in canonicalizer.py)
clean up cdx related exceptions
default rules read from cdx/rules.yaml
filename configurable via 'domain_specific_rules' setting in config.yaml
fix typo in pywb/rewrite
2014-02-18 14:56:13 -08:00
Ilya Kreymer
44f38f44d5 paths cleanup:
- don't store explicit static path, but allow it to be set in the insert
- store host_prefix, which is either server name or empty
- for archival mode, absolute_paths settings controls if using absolute paths,
- for proxy always use absolute_paths
- default static path is: /static/default/
- allow extension apps to provide custom /static/X/ path

Route overriding:
- ability to set Route class
- custom init method

Archival Relative Redirect:
- if starting with timestamp, drop timestamp and assume host-relative path

Integration Tests:
- test proxy mode by using REQUEST_URI
- test archival relative redirect!
2014-02-08 20:07:16 -08:00
Ilya Kreymer
b11f4fad93 add support for pywb static content routes (seperate from uwsgi)
adding StaticHandler and loading templates and static resources from current package
add default template and static data to be included in the pywb package
add test for custom static route
2014-02-07 19:32:58 -08:00
Ilya Kreymer
00a7691f69 add optional filters to default Route
add examples to config.yaml and test_config.yaml and integration test
per route config is inherited globally if only name is set
2014-02-06 17:28:08 -08:00
Ilya Kreymer
1a1aa814d0 first pass at simple http proxy! #8
* proxy router for handling only proxy
* proxy/archival router for handling both archival and proxy mode,
  togglable with 'enable_http_proxy' setting in config
* supports only most recent capture playback -- no support for
selecting replay date/calendar view yet
* not testable with WebTest -- need better way to unit test proxy mode
2014-02-05 13:08:10 -08:00
Ilya Kreymer
3168b80cfa improve docs for config.yaml, group all ui settings together
create seperate test_config.yaml for testing
rename ArchivalRequestRouter -> ArchivalRouter for consistency
2014-02-05 10:10:33 -08:00
Ilya Kreymer
6388a78162 refactor: replay_views to support cleaner inheritance, no longer
wrapping previous WbResponse

overhaul yaml config to be much simpler, move best resolver and
best index reader to respective classes

add config_utils for sharing config, standard non-yaml config
provides defaults for testing

fix bug in query.html
2014-02-03 09:24:40 -08:00
Ilya Kreymer
86a093d164 support cdx server query at (/cdx in default config)
also enable /echo_env and /echo_req debug handlers
2014-02-01 00:43:24 -08:00
Ilya Kreymer
304ddbec84 Support for new UI, as per #16
* Refactor views class to support more Jinja2 views (J2Template)
* Add a home page, collection search page, and error pages, all optional
* all exceptions appear on error page
* wbrequest supports a request with an empty or / wb_url
2014-01-31 10:04:21 -08:00
Ilya Kreymer
411e7fe8a3 cleanup pywb_init, work on documenting config.yaml! 2014-01-29 00:03:24 -08:00
Ilya Kreymer
43a46b373d move sample/test data to ./sample_archive/warcs and ./sample_archive/cdx
pywb_init now driven by config.yaml! (#14)

Not yet supporting customized handlers, views, etc...
2014-01-28 22:03:01 -08:00