mirror of https://github.com/webrecorder/pywb.git synced 2025-03-15 16:14:48 +01:00

1660 Commits

Author SHA1 Message Date
Ilya Kreymer
37609ebdc9 rewrite: support custom cookie_rewriter passed to 'rewrite_content' 2016-04-30 01:35:55 -07:00
Ilya Kreymer
e669ecba15 rewrite: html rewrite fix such that head insert is placed before other <script> tags even if no head 2016-04-30 01:32:16 -07:00
Ilya Kreymer
7a0dd463cd webagg: responseloader: use urllib3 directly instead of requests to
take advantage of connection pooling w/o storing/sharing cookies
2016-04-27 10:16:54 -07:00
Ilya Kreymer
9010e52663 urlrewrite: refactor simpleapp to support live/record/replay 2016-04-27 10:15:48 -07:00
Ilya Kreymer
f119d05724 recorder: fix simplerec init
tests: improve tests for skipping request and response headers
2016-04-27 09:52:56 -07:00
Ilya Kreymer
a1e0c29a85 rules: add rule for twitter timeline 2016-04-26 17:02:54 -07:00
Ilya Kreymer
658303caad rewrite headers: undo not rewriting x- headers, needs more research and exclusions (eg. x-frame-options) 2016-04-26 13:11:08 -07:00
Ilya Kreymer
cf6cfc0c44 tests: fix cookie rewriter tests to exclude 2.6 2016-04-26 10:32:43 -07:00
Ilya Kreymer
4a60e15577 cookie rewrite improvements: #177
- don't remove max-age and expires if in 'live' rewrite mode (flag set on urlrewriter)
- remove secure only if replay prefix is not https
- fix expires UTC->GMT as cookie parsing chokes on UTC
- other rewriting: don't append rewrite prefix to x- headers
tests: add more cookie rewriting tests
2016-04-26 09:45:23 -07:00
Ilya Kreymer
a82e2785c7 tests: add basic test for rewriterapp 2016-04-25 14:29:28 -07:00
Ilya Kreymer
3b6cab1730 urlrewrite: remove dependency on bottle from rewriterapp,
add overridable error and query views, with extensible get_query_params() and process_cdx_query()
to extend cdx for query view
add get_top_url() for adding custom top_url for frame insert
add call_with_params() for adding custom params to environ
2016-04-25 12:05:43 -07:00
Ilya Kreymer
b056acd88e urlrewrite: add support for index query 2016-04-15 04:01:36 +00:00
Ilya Kreymer
0370470e68 urlrewrite: http range: support skipping record for range requests not starting at 0-
and performing async request,
support converting unbounded 0- to non-ranged and back
2016-04-15 02:21:39 +00:00
Ilya Kreymer
0b255819ff recorder warcwriter: allow skipping writing of only request or only response by overriding _is_write_req and _is_write_resp in subclass
(todo: rethink the interface)
2016-04-15 02:19:34 +00:00
Ilya Kreymer
a93f75dca2 webagg: add preliminary 'fuzzy matching' fallback support, currently enabled for all sources
(todo: need to only include sources that support it)
2016-04-15 02:18:20 +00:00
Ilya Kreymer
61381fcac6 wombat rewrite: remove cookie domain if hostname is an IP address 2016-04-07 15:53:26 -07:00
Ilya Kreymer
00bdddd1e9 recorder: SkipDupePolicy only skips if url is an exact match (not just by urlkey) 2016-04-07 10:44:05 -07:00
Ilya Kreymer
f4cc143dc7 urlrewrite: generalize support for overridable handle_custom_response() callback for handling modifiers (default support top-frame)
pass headers to add_custom_params, include error message on error if available
headers: use add_header() to support multiple headers with same name
is_ajax(): check for X-Pywb-Requested-With header to mark as ajax and not pass to upstream
2016-04-07 10:39:12 -07:00
Ilya Kreymer
95a212ed79 wombat rewrite: add custom X-Pywb-Requested-With header which turns off rewriting and is never sent upstream 2016-04-06 12:05:53 -07:00
Ilya Kreymer
fa5d5e6bcc urlrewrite templates: add get_top_frame_params() callback for adding custom params for top frame,
also inject env['webrec.template_params'] if set
2016-04-05 02:45:00 -07:00
Ilya Kreymer
d40edfc22d warcwriter: add create_warcinfo_record() for creating a warcinfo and a SimpleTempWARCWriter for writing records to temp buff/file 2016-04-03 12:19:54 -07:00
Ilya Kreymer
fd76030cb3 urlrewriter: allow passing in existing jinja_env wrapper 2016-04-02 21:36:54 -07:00
Ilya Kreymer
01c21d3a43 recorder: redis indexer accepts arg list, supports separate redis and key_template args
add length param to add_urls_to_index() in redis indexer, return cdx list
2016-04-02 21:36:36 -07:00
Ilya Kreymer
6157cebcc9 testutils: when mock patching FakeStrictRedis, use a subclass with a shared pubsub (to match real redis) 2016-04-02 21:33:39 -07:00
Ilya Kreymer
ddee9236c6 webagg: rename key_prefix -> key_template 2016-04-02 21:33:23 -07:00
Ilya Kreymer
4b753d2612 Merge branch '0.11.5' into develop 2016-03-31 13:16:53 -07:00
Ilya Kreymer
9381acdaaf Merge branch 'zip-loc-fix' into develop 2016-03-31 13:14:39 -07:00
Ilya Kreymer
b901343067 update CHANGES.rst 2016-03-31 13:14:04 -07:00
Ilya Kreymer
e5ef51363c zipnum: backport fix for #173, paths specified in a zipnum .loc file are relative to the .loc file, not to
the working dir of the application
warnings: don't warn on .gz cdx files
2016-03-31 13:09:57 -07:00
Ilya Kreymer
ba7ac56230 release: bump to 0.11.5, update version and changelist 2016-03-31 12:45:16 -07:00
Ilya Kreymer
b5cf79072d loaders: ensure loader stream closed in load_yaml_config() 2016-03-31 12:42:23 -07:00
Ilya Kreymer
8e51ddc544 archiveiterator: don't reuse entries when post-append, as they may be cached for merge -- can break if records do not alternate
request/response fixes #175
2016-03-31 12:42:23 -07:00
Ilya Kreymer
70fbb5f7a6 urlrewrite: fix typos, add full package paths 2016-03-28 22:59:22 -07:00
Ilya Kreymer
f12be3bc91 urlrewrite app: add bottle-based app, templateview separate from pywb webapp framework 2016-03-27 17:34:45 -04:00
Ilya Kreymer
f8f0c3a76e loader: ensure file closed in load_yaml_config() 2016-03-27 13:56:19 -04:00
Ilya Kreymer
017e9802f8 tests: fix fakeredis patch not running on test_handlers,
use exc str instead of repr for error message for consistency
all tests pass on py2 and py3 again!
2016-03-26 22:32:21 -04:00
Ilya Kreymer
0399cc1046 webagg app: support bottle debug properly as opt param 2016-03-26 22:30:47 -04:00
Ilya Kreymer
3eac9be00b warc: ArchiveLoadFailed: add space in exception string 2016-03-26 22:28:38 -04:00
Ilya Kreymer
c5a166f601 tests: use httpbin.org instead of example.com/ for range-request test 2016-03-26 22:28:04 -04:00
Ilya Kreymer
7884d4394b recorder: close_file() by params rather than exact path, update tests 2016-03-26 13:07:53 -04:00
Ilya Kreymer
7deba42851 add urlrewrite pywb-adapter PlatformHandler for using traditional pywb
setup with webrecorder components recorder and webagg
2016-03-24 16:33:03 -04:00
Ilya Kreymer
2bfe5d4f9e inputreq: only use REQUEST_URI if no SCRIPT_NAME is set (otherwise reconstruct the path) 2016-03-24 16:17:46 -04:00
Ilya Kreymer
b6e988d9a1 self-redirect: if 'status' is a 3xx, call raise_on_self_redirect() to check Location for exact url redirect.
supports both WARC and live loaders, addresses #1
2016-03-24 16:08:29 -04:00
Ilya Kreymer
61921d6c4a tests: add FakeRedisTests class mixin for patching in FakeRedis for tests 2016-03-24 10:45:48 -04:00
Ilya Kreymer
7cc772329c redis: add tests for RedisMultiKeyIndexSource 2016-03-24 10:44:14 -04:00
Ilya Kreymer
64b32dc57a redis support: add RedisMultiKeyIndexSource for using redis SCAN wildcard query and aggregate results from several
redis keys
2016-03-24 01:17:18 -04:00
Ilya Kreymer
e5ddf9d4f4 utils: res_template() supports extra params for interpolation 2016-03-23 23:58:49 -04:00
Ilya Kreymer
ba66d0bb5e recorder: use res_template() to resolve params, rename indexing method to add_urls_to_index 2016-03-23 23:55:21 -04:00
Ilya Kreymer
5fd49f35ee zipnum: when using .loc file, resolve shard paths relative to the .loc file, not from working directory, fixes #173 2016-03-22 11:31:08 -07:00
Ilya Kreymer
a0347a3c42 typo fix 0.11.4 2016-03-21 13:09:03 -07:00