mirror of https://github.com/webrecorder/pywb.git synced 2025-03-15 16:14:48 +01:00

1678 Commits

Author SHA1 Message Date
Ilya Kreymer
1e7d4d27e3 bump version to 0.30.2 2016-05-06 09:43:11 -07:00
Ilya Kreymer
ab3af90df2 cookie_tracker: add support for redis-based subdomain cookie tracker, which temp caches cookies with Domain= set in redis and passes them upstream
when rewriting. addresses webrecorder/webrecorder#79
2016-05-04 16:39:47 -07:00
Ilya Kreymer
8e473f01fa add changelist for 0.30.1 2016-05-04 11:33:43 -07:00
Ilya Kreymer
2795802c77 recordloader: for request/response/revisit records, only parse urls starting with http:/https: as http 2016-05-04 11:20:38 -07:00
Ilya Kreymer
af920d77a0 rules: add fuzzy rules for TW video 2016-05-03 17:33:13 -07:00
Ilya Kreymer
07cc4fae0b bump version to 0.30.1 2016-05-03 17:32:35 -07:00
Ilya Kreymer
3a3110efdb fix README typo 2016-05-01 11:57:37 -07:00
Ilya Kreymer
e458bdcc77 CHANGES tweaks 2016-05-01 11:53:23 -07:00
Ilya Kreymer
033909efe0 wombat: set version to 1.12
return 'null' for frameElement override instead of undefined
2016-05-01 11:46:36 -07:00
Ilya Kreymer
4df45b4338 Update CHANGES for 0.30.0! 2016-05-01 11:45:01 -07:00
Ilya Kreymer
dd8ac42f2c encoding: ensure cdx fields are in the native encoding, except filename, which should stay as unicode in py2 for further use 2016-04-30 16:08:43 -07:00
Ilya Kreymer
e8c77c0538 encoding: encode before quote
setup: enable zip_safe=True again
2016-04-30 15:15:35 -07:00
Ilya Kreymer
ab8b4efaec encoding: cdx: only quote-encode 'url'
warc: ensure path index loads are utf-8 decoded
2016-04-30 14:38:48 -07:00
Ilya Kreymer
228ca58c5b recorder: actually fix content-type on warcinfo, add to test! 2016-04-30 13:07:53 -07:00
Ilya Kreymer
0fbae1c7f8 recorder: ensure warcinfo record has a content-type 2016-04-30 10:19:20 -07:00
Ilya Kreymer
67a02613e7 remove: remove unused/extraneous __iter__ 2016-04-30 01:43:53 -07:00
Ilya Kreymer
1c97a67763 rewrite client-side improvements:
add WB_wombat_frameElement Object prototype property to support frameElement rewriting
document.domain: allow changing to higher-level domain
rewrite_elem: also rewrite <form> action and <input> value, if they are absolute urls
2016-04-30 01:43:40 -07:00
Ilya Kreymer
1bea9d73ed rewrite: rewrite .frameElement -> WB_wombat_frameElement server-side to handle cases when default frameElement can not be overridden 2016-04-30 01:36:26 -07:00
Ilya Kreymer
37609ebdc9 rewrite: support custom cookie_rewriter passed to 'rewrite_content' 2016-04-30 01:35:55 -07:00
Ilya Kreymer
e669ecba15 rewrite: html rewrite fix such that head insert is placed before other <script> tags even if no head 2016-04-30 01:32:16 -07:00
Ilya Kreymer
7a0dd463cd webagg: responseloader: use urllib3 directly instead of requests to
take advantage of connection pooling w/o storing/sharing cookies
2016-04-27 10:16:54 -07:00
Ilya Kreymer
9010e52663 urlrewrite: refactor simpleapp to support live/record/replay 2016-04-27 10:15:48 -07:00
Ilya Kreymer
f119d05724 recorder: fix simplerec init
tests: improve tests for skipping request and response headers
2016-04-27 09:52:56 -07:00
Ilya Kreymer
a1e0c29a85 rules: add rule for twitter timeline 2016-04-26 17:02:54 -07:00
Ilya Kreymer
658303caad rewrite headers: undo not rewriting x- headers, needs more research and exclusions (eg. x-frame-options) 2016-04-26 13:11:08 -07:00
Ilya Kreymer
cf6cfc0c44 tests: fix cookie rewriter tests to exclude 2.6 2016-04-26 10:32:43 -07:00
Ilya Kreymer
4a60e15577 cookie rewrite improvements: #177
- don't remove max-age and expires if in 'live' rewrite mode (flag set on urlrewriter)
- remove secure only if replay prefix is not https
- fix expires UTC->GMT as cookie parsing chokes on UTC
- other rewriting: don't append rewrite prefix to x- headers
tests: add more cookie rewriting tests
2016-04-26 09:45:23 -07:00
Ilya Kreymer
a82e2785c7 tests: add basic test for rewriterapp 2016-04-25 14:29:28 -07:00
Ilya Kreymer
3b6cab1730 urlrewrite: remove dependency on bottle from rewriterapp,
add overridable error and query views, with extensible get_query_params() and process_cdx_query()
to extend cdx for query view
add get_top_url() for adding custom top_url for frame insert
add call_with_params() for adding custom params to environ
2016-04-25 12:05:43 -07:00
Ilya Kreymer
b056acd88e urlrewrite: add support for index query 2016-04-15 04:01:36 +00:00
Ilya Kreymer
0370470e68 urlrewrite: http range: support skipping record for range requests not starting at 0-
and performing async request,
support converting unbounded 0- to non-ranged and back
2016-04-15 02:21:39 +00:00
Ilya Kreymer
0b255819ff recorder warcwriter: allow skipping writing of only request or only response by overriding _is_write_req and _is_write_resp in subclass
(todo: rethink the interface)
2016-04-15 02:19:34 +00:00
Ilya Kreymer
a93f75dca2 webagg: add preliminary 'fuzzy matching' fallback support, currently enabled for all sources
(todo: need to only include sources that support it)
2016-04-15 02:18:20 +00:00
Ilya Kreymer
61381fcac6 wombat rewrite: remove cookie domain if hostname is an IP address 2016-04-07 15:53:26 -07:00
Ilya Kreymer
00bdddd1e9 recorder: SkipDupePolicy only skips if url is an exact match (not just by urlkey) 2016-04-07 10:44:05 -07:00
Ilya Kreymer
f4cc143dc7 urlrewrite: generalize support for overridable handle_custom_response() callback for handling modifiers (default support top-frame)
pass headers to add_custom_params, include error message on error if available
headers: use add_header() to support multiple headers with same name
is_ajax(): check for X-Pywb-Requested-With header to mark as ajax and not pass to upstream
2016-04-07 10:39:12 -07:00
Ilya Kreymer
95a212ed79 wombat rewrite: add custom X-Pywb-Requested-With header which turns off rewriting and is never sent upstream 2016-04-06 12:05:53 -07:00
Ilya Kreymer
fa5d5e6bcc urlrewrite templates: add get_top_frame_params() callback for adding custom params for top frame,
also inject env['webrec.template_params'] if set
2016-04-05 02:45:00 -07:00
Ilya Kreymer
d40edfc22d warcwriter: add create_warcinfo_record() for creating a warcinfo and a SimpleTempWARCWriter for writing records to temp buff/file 2016-04-03 12:19:54 -07:00
Ilya Kreymer
fd76030cb3 urlrewriter: allow passing in existing jinja_env wrapper 2016-04-02 21:36:54 -07:00
Ilya Kreymer
01c21d3a43 recorder: redis indexer accepts arg list, supports separate redis and key_template args
add length param to add_urls_to_index() in redis indexer, return cdx list
2016-04-02 21:36:36 -07:00
Ilya Kreymer
6157cebcc9 testutils: when mock patching FakeStrictRedis, use a subclass with a shared pubsub (to match real redis) 2016-04-02 21:33:39 -07:00
Ilya Kreymer
ddee9236c6 webagg: rename key_prefix -> key_template 2016-04-02 21:33:23 -07:00
Ilya Kreymer
4b753d2612 Merge branch '0.11.5' into develop 2016-03-31 13:16:53 -07:00
Ilya Kreymer
9381acdaaf Merge branch 'zip-loc-fix' into develop 2016-03-31 13:14:39 -07:00
Ilya Kreymer
b901343067 update CHANGES.rst 2016-03-31 13:14:04 -07:00
Ilya Kreymer
e5ef51363c zipnum: backport fix for #173, paths specified in a zipnum .loc file are relative to the .loc file, not to
the working dir of the application
warnings: don't warn on .gz cdx files
2016-03-31 13:09:57 -07:00
Ilya Kreymer
ba7ac56230 release: bump to 0.11.5, update version and changelist 2016-03-31 12:45:16 -07:00
Ilya Kreymer
b5cf79072d loaders: ensure loader stream closed in load_yaml_config() 2016-03-31 12:42:23 -07:00
Ilya Kreymer
8e51ddc544 archiveiterator: don't reuse entries when post-append, as they may be cached for merge -- can break if records do not alternate request/response
fixes #175
2016-03-31 12:42:23 -07:00