mirror of https://github.com/webrecorder/pywb.git synced 2025-04-01 11:41:27 +02:00

140 Commits

Author SHA1 Message Date
Ilya Kreymer
381f350917 proxy: switching not available for ip resolver either
tests: update tests for auth and ip resolver to check that proxy magic is not set
2015-12-12 22:59:32 -08:00
Ilya Kreymer
e9b11fcbf2 proxy: default to cookie resolver, which allows switching collections and datetime, instead of auth resolver
auth resolver can be used by setting 'cookie_resolver: false' explicitly. when using auth resolver,
don't set proxy magic path as switching collections or datetime is not possible with auth resolver
closes #160
2015-12-12 21:58:12 -08:00
Ilya Kreymer
7a0680fb35 memento: for not found timemap query, return empty timemap, instead of html query error page, closes #158 2015-11-30 09:40:07 -08:00
Ilya Kreymer
d98c1f6cf7 memento/api: add a new /collinfo.json end-point, enabled with 'enable_coll_info' config setting, which returns
the value of the collinfo.json template. Default template returns an entry for each handler route,
including the route path (id), title (name) and memento timegate and timemap paths, to be used with
an aggregator. A custom 'info_json' template can specify a different collinfo template, as an alternative to #69 (local aggregation)
Closes #146
2015-11-04 15:36:44 -08:00
Ilya Kreymer
3132bfa7f4 cache: add a simple RedisCache implementation (alongside local and uwsgi)
proxy_ip_resolver: add option to use RedisCache if redis_cache_key set in config
proxy_ip_resolver: add 'delete' option to delete ip from cache, closes #145
2015-10-30 13:15:07 -07:00
Ilya Kreymer
16cf997a07 proxy: stick with http1.0 as not really supporting 1.1, #143 2015-10-26 15:25:00 -07:00
Ilya Kreymer
eeb35ea3b4 proxy: add ProxyRouter wrapper to check for content-length and, if missing, perform full buffering (http1.0) or chunked encoding (http1.1) (separate from replay view buffering)
add tests for buffering and chunked encoding, fixes #143, also tests no banner url-rewrite only proxy related to #142
2015-10-25 18:02:51 -07:00
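A minimal sketch of the buffering-or-chunking decision described in the commit above, assuming a response is represented as a list of header tuples plus an iterable of byte chunks. The function name `ensure_length` and its signature are hypothetical and are not pywb's actual ProxyRouter API.

```python
def ensure_length(headers, body_iter, http_version="HTTP/1.0"):
    """Return (headers, body) with the response length made explicit.

    headers: list of (name, value) tuples; body_iter: iterable of bytes.
    Hypothetical helper for illustration only.
    """
    # If the length is already known, pass the response through untouched.
    if any(name.lower() == "content-length" for name, _ in headers):
        return headers, body_iter

    if http_version == "HTTP/1.1":
        # HTTP/1.1: stream the body using chunked transfer encoding.
        def chunked():
            for chunk in body_iter:
                if chunk:
                    yield b"%x\r\n" % len(chunk) + chunk + b"\r\n"
            yield b"0\r\n\r\n"  # terminating zero-length chunk

        return headers + [("Transfer-Encoding", "chunked")], chunked()

    # HTTP/1.0: buffer the full body so Content-Length can be computed.
    body = b"".join(body_iter)
    return headers + [("Content-Length", str(len(body)))], [body]
```

Full buffering trades memory for compatibility with HTTP/1.0 clients, while chunking keeps the response streaming.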
Ilya Kreymer
0c96591c49 proxy: change HttpsUrlRewriter to SchemeOnlyUrlRewriter, which fixes http->https or https->http to match
the scheme of the current page.
url-rewrite-only mode: add uo_ mod and use that to rewrite only urls (no banner, no client side rewrite)
addresses #142
2015-10-24 15:10:30 -07:00
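The scheme-matching idea in the commit above could look roughly like the following. This is only a sketch under the assumption that the rewriter swaps http and https to match the current page and leaves everything else alone; `match_scheme` is a hypothetical name, not the SchemeOnlyUrlRewriter interface.

```python
from urllib.parse import urlsplit, urlunsplit


def match_scheme(url, page_url):
    """Rewrite only the scheme of url so it matches the scheme of page_url."""
    page_scheme = urlsplit(page_url).scheme
    parts = urlsplit(url)
    # Only swap between http and https; other schemes are left untouched.
    if parts.scheme in ("http", "https") and page_scheme in ("http", "https"):
        parts = parts._replace(scheme=page_scheme)
    return urlunsplit(parts)
```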
Ilya Kreymer
4ba4521b56 tests: use random port instead of 8080 for cli test to avoid conflicts with running services 2015-10-23 11:53:28 -07:00
Ilya Kreymer
b612c584de tests: test fixes for windows 2015-10-13 21:36:27 -07:00
Ilya Kreymer
4dfe187174 proxy improvements:
use proxy_magic path to get video info to ensure video info, addresses #106
video info: ensure vi_ replay has CORS support to support serving from magic path
proxy & wombat improvements: set replay_top to window.top and avoid causing cross-domain errors
2015-10-11 21:03:30 -07:00
Ilya Kreymer
6f7bd8c291 proxy resolvers: add tests for ip-based resolver
cache: default cache returns empty instead of raise KeyError on invalid key, to be consistent with uwsgi
2015-10-11 17:46:12 -07:00
Ilya Kreymer
d6ccee6650 proxy: more resolver typo fix 2015-09-09 13:22:32 -07:00
Ilya Kreymer
1392168ed0 proxy: ip resolver fix typo 2015-09-09 13:22:32 -07:00
Ilya Kreymer
31912b3bf7 proxy: update tests for new use_banner, use_client_rewrite options, #107 2015-09-09 13:22:32 -07:00
Ilya Kreymer
e3f734d99d proxy: better options 'use_banner' to specify using banner insert,
'use_client_rewrite' to specify adding wombat client rewriting, as per #107
2015-09-09 13:22:32 -07:00
Ilya Kreymer
49fe672b91 proxy: add support for new ip-address based resolver 2015-09-09 13:22:32 -07:00
Ilya Kreymer
0fd3d39ab8 cli: use wsgiref w/ threading by default to support proxy mode
waitress still available with --server=waitress flag
2015-07-31 02:47:03 -07:00
Ilya Kreymer
c2f99d6cfd replay/memento: always include 'Content-Location' for no-redir mode replay (not just for memento timegate), #122 2015-07-19 00:11:25 -07:00
Ilya Kreymer
66f5ad62b3 memento: when redir_to_exact is false, don't redirect latest replay/timegate to current timestamp, but return directly latest capture.
when memento enabled, the timegate now follows memento pattern 2.2 (http://tools.ietf.org/html/rfc7089#section-4.2.2)
also return content-location instead of location, update memento no-redirect tests to match new behavior. closes #122
2015-07-18 23:30:31 -07:00
Ilya Kreymer
2d0c526053 post handling: when reading post data in extract_post_query(), add optional buffer_stream which would hold the original POST
data. This is necessary to override the `wsgi.input` to allow the post data to be read again via a fallback handler, even
after reading POST query data in replay handler, addresses #117
2015-06-25 15:58:58 -07:00
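A rough sketch of the re-readable POST body idea from the commit above, assuming a standard WSGI environ. The function `read_post_query` and its `buffer_input` flag are hypothetical stand-ins; the real extract_post_query() and its buffer_stream parameter may behave differently.

```python
import io
from urllib.parse import parse_qsl


def read_post_query(environ, buffer_input=True):
    """Read POST form data, optionally leaving the body readable again."""
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = environ["wsgi.input"].read(length)
    if buffer_input:
        # Replace the consumed stream with an in-memory copy so that a
        # fallback handler can read the same POST data a second time.
        environ["wsgi.input"] = io.BytesIO(body)
    return parse_qsl(body.decode("utf-8"))
```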
Ilya Kreymer
74c6b60d5e wombat customization: pass custom options from config.yaml 'rewrite_opts.client' as a json obj
to wombat.js #96
currently supporting no_rewrite_prefixes, and skipping dom, setAttribute and postmessage overrides
(used by via.hypothes.is) -- other options to be added later
2015-04-16 12:24:01 -07:00
Ilya Kreymer
4469754a5a routing: improved support for root collection via empty route. If '' route present,
add it last to avoid conflicting w/ other routes (#94)
templates: pass in custom params to jinja2 template via env 'pywb.template_params' dictionary.
If present, dictionary contents will be added directly to jinja context for the request (#95)
2015-04-10 14:47:02 -07:00
Ilya Kreymer
c089ba35bc proxy init: instead of using first route, find first valid route (eg. not static)
move static paths to be checked first
2015-04-04 12:54:32 -07:00
Ilya Kreymer
fcb6e94736 framework refactor: move rel_request_uri() call down to the routers, for easier reuse
each router now calls ensure_rel_uri_set() to ensure that REL_REQUEST_URI field is set before
use. allows router to be called directly without setup.
add optional fallback_app to allow acting as middleware
2015-04-03 08:45:18 -07:00
Ilya Kreymer
8bd6787595 'inverse' framed replay: ensure memento headers point to actual memento in inverse framed replay
add additional test for inverse framed replay, #92
fix framed replay url replace slash
2015-04-01 16:21:44 -07:00
Ilya Kreymer
bd21480db9 framed replay: add support for 'inverting' frame and replay modifiers,
setting default mod to be top-frame and inner frame to be 'mp_' #92
can enable this mode by setting framed_replay: inverse instead of true
modifiers passed to client side script via wbinfo as well
2015-04-01 10:13:56 -07:00
Ilya Kreymer
002fe6a338 certauth: change 'get_cert_for_host' -> 'cert_for_host' 2015-03-30 15:47:53 -07:00
Ilya Kreymer
dd30e3f2a7 refactor: fixes for compat with latest certauth>=1.1.0 2015-03-30 09:38:42 -07:00
Ilya Kreymer
cda7705075 split and refactor: remove certauth.py / test_certauth.py and instead use this functionality from 'certauth' package. Also remove proxy-cert-auth cli as
the 'certauth' tool supersedes this functionality. (#90).
To use https proxy mode, 'pip install certauth' is required. (update travis config)
2015-03-29 17:38:57 -07:00
Ilya Kreymer
df76bc3500 cli: change cdx-server and live-rewrite-server to go through shared cli
entry point
2015-03-23 09:08:09 -07:00
Ilya Kreymer
cc068f8ee8 init/import path: move DEFAULT_CONFIG to __init__ for faster shared import
proxy: move certauth/openssl init to happen only if enable_https_proxy is set, so
the slow openssl import runs only when used
2015-03-22 17:52:07 -07:00
Ilya Kreymer
ea460bb0f0 cdxj: support cdx json output from cdx server with output='json' (not yet default)
cdx field renaming: canonical cdx field name changes
statuscode -> status
mimetype -> mime
original -> url
old names still accepted for query/filtering, however, cdx json will use new names
ensures consistency between .cdxj field names and names used by cdx server json output
collections manager now creates .cdxj by default
bump version to 0.9.0b2!
2015-03-19 13:33:49 -07:00
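The field renaming listed in the commit above can be sketched as a small lookup table. The mapping itself comes from the commit message; the helper names `canonical_field` and `canonicalize_record` are hypothetical and only illustrate how old names could still be accepted while output uses the new ones.

```python
# Old cdx field name -> new canonical name, as listed in the commit message.
OLD_TO_NEW = {
    "statuscode": "status",
    "mimetype": "mime",
    "original": "url",
}


def canonical_field(name):
    """Map an old field name to its new name; unknown names pass through."""
    return OLD_TO_NEW.get(name, name)


def canonicalize_record(record):
    """Return a copy of a cdx record dict keyed by the new field names."""
    return {canonical_field(key): value for key, value in record.items()}
```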
Ilya Kreymer
3d53fdde9e cleanup: remove unused __str__ from Handlers / Route, not as useful anymore 2015-03-15 22:55:23 -07:00
Ilya Kreymer
30454abb6b metadata: add support for user-defined per-collection metadata! #78
metadata stored in wbrequest.user_metadata and available to all templates

collections manager: refactor to use subparsers, add list collections and set metadata commands
update tests for new commands
index template: use user metadata title for collections listing
search template: display all metadata and title, if available
2015-03-15 21:24:15 -07:00
Ilya Kreymer
a932235f85 Merge branch 'develop' into config-work 2015-02-24 10:40:58 -08:00
Ilya Kreymer
39824711f0 memento tweak: ensure rel=memento link for timegate uses exact in Location (cdx original) as opposed to url from request 2015-02-23 23:21:39 -08:00
Ilya Kreymer
435fa390ed config system: initial work on automated directory-convention based config!
config.yaml file now optional, add default_config.yaml for default settings #55
2015-02-23 21:59:41 -08:00
Ilya Kreymer
9623f95439 memento: add rel="memento" header to timegate as well, improve memento test, clearly differentiate between
timegate redirect and intermediate resource redirect, related to #70
2015-02-16 09:59:03 -08:00
Ilya Kreymer
afe49a91f4 rewrite: more fixes for IDN #66 - add _do_percent_encode field to wburl itself
defaults to true, may be disabled with 'punycode_links'
remove wbrequest and urlrewriter from get_url path, simply call wb_url.get_url() to get properly formatted url
2015-02-14 20:55:36 -08:00
Ilya Kreymer
f9452bf48e rewrite: refactor IDN support: instead of returning IRI, return utf-8 %-encoded url
remove support for returning IRI, as that requires detecting charset, instead just use %-encoded form
and let browser decode. Should address #66

Add rewrite option 'punycode_links_only' (default to false) to skip the %-encoded conversion of host, and just return punycode.

wombat: use getAttribute('href') on <a> tag to get original url, not punycode version

replay: add extra sanity check on Location header to ensure utf-8
2015-02-14 17:26:39 -08:00
Ilya Kreymer
79cfdd6a08 framework/urlrewriter: allow overriding UrlRewriter with optional urlrewriter_class param,
easier to override create_rebased_rewriter() with custom rewriter as well
2015-02-12 10:34:04 -08:00
Ilya Kreymer
55426e7619 memento: fix headers to be more consistent for framed replay. when using
frames, outer frame 'mirrors' mementos of the inner frame to be
discoverable by client side memento tools, tracked via #70
2015-01-29 22:27:15 -08:00
Ilya Kreymer
695245d9e8 wburl idn: more complete support for idn urls (#66)
add distinct to_iri() and to_uri() functions in WbUrl
internal representation is always as ascii uri
for rewriting, defaults to iri representation unless
'rewrite_ascii_only_urls' is set to true per collection
add wbrequest.get_url() to get url as either iri or uri to be passed
to templates
2015-01-26 11:07:59 -08:00
Ilya Kreymer
8449647c5f wbexception: remove unused status in WbException, set default error for
any uncaught exception to 500, instead of 400
2015-01-11 23:53:34 -08:00
Ilya Kreymer
db75bda736 file open() pass: convert all read and write to ensure binary 'b' flag is set (#56) 2015-01-11 18:54:11 -08:00
Ilya Kreymer
14657fbe15 certauth: fix max cert duration to avoid int overflow 2015-01-11 15:04:19 -08:00
Ilya Kreymer
7ae0ff86d2 test certauth: fix paths 2015-01-11 13:10:14 -08:00
Ilya Kreymer
ad5a43db76 replay redirect: ensure no timestamp redirect when range request is
present, alter test to include inexact timestamp
2014-12-23 21:19:39 -08:00
Ilya Kreymer
181c18a1b8 pep8 pass: fix spacing, line length, issues
also remove references to obsolete cached_replay, hostnames in pywb_init
2014-12-23 15:14:03 -08:00