1
0
mirror of https://github.com/webrecorder/pywb.git synced 2025-03-15 16:14:48 +01:00

1706 Commits

Author SHA1 Message Date
Ilya Kreymer
f119d05724 recorder: fix simplerec init
tests: improve tests for skipping request and response headers
2016-04-27 09:52:56 -07:00
Ilya Kreymer
a1e0c29a85 rules: add rule for twitter timeline 2016-04-26 17:02:54 -07:00
Ilya Kreymer
658303caad rewrite headers: undo not rewriting x- headers, needs more research and exclusions (eg. x-frame-options) 2016-04-26 13:11:08 -07:00
Ilya Kreymer
cf6cfc0c44 tests: fix cookie rewriter tests to exclude 2.6 2016-04-26 10:32:43 -07:00
Ilya Kreymer
4a60e15577 cookie rewrite improvements: #177
- don't remove max-age and expires if in 'live' rewrite mode (flag set on urlrewriter)
- remove secure only if replay prefix is not https
- fix expires UTC->GMT as cookie parsing chokes on UTC
- other rewriting: don't append rewrite prefix to x- headers
tests: add more cookie rewriting tests
2016-04-26 09:45:23 -07:00
Ilya Kreymer
a82e2785c7 tests: add basic test for rewriterapp 2016-04-25 14:29:28 -07:00
Ilya Kreymer
3b6cab1730 urlrewrite: remove dependency on bottle from rewriterapp,
add overridable error and query views, with extensible get_query_params() and process_cdx_query()
to extend cdx for query view
add get_top_url() for adding custom top_url for frame insert
add call_with_params() for adding custom params to environ
2016-04-25 12:05:43 -07:00
Ilya Kreymer
b056acd88e urlrewrite: add support for index query 2016-04-15 04:01:36 +00:00
Ilya Kreymer
0370470e68 urlrewrite: http range: support skipping record for range requests not starting at 0-
and performing async request,
support converting unbounded 0- to non-ranged and back
2016-04-15 02:21:39 +00:00
Ilya Kreymer
0b255819ff recorder warcwriter: allow skipping writing of only request or only response by overriding _is_write_req and _is_write_resp in subclass
(todo: rethink the interface)
2016-04-15 02:19:34 +00:00
Ilya Kreymer
a93f75dca2 webagg: add preliminary 'fuzzy matching' fallback support, currently enabled for all sources
(todo: need to only include sources that support it)
2016-04-15 02:18:20 +00:00
Ilya Kreymer
61381fcac6 wombat rewrite: remove cookie domain if hostname is an IP address 2016-04-07 15:53:26 -07:00
Ilya Kreymer
00bdddd1e9 recorder: SkipDupePolicy only skips if url is an exact match (not just by urlkey) 2016-04-07 10:44:05 -07:00
Ilya Kreymer
f4cc143dc7 urlrewrite: generalize support for overridable handle_custom_response() callback for handling modifiers (default support top-frame)
pass headers to add_custom_params, include error message on error if available
headers: use add_header() to support multiple headers with same name
is_ajax(): check for X-Pywb-Requested-With header to make as ajax and not pass to upstream
2016-04-07 10:39:12 -07:00
Ilya Kreymer
95a212ed79 wombat rewrite: add custom X-Pywb-Requested-With header with turns off rewriting and is never sent upstream 2016-04-06 12:05:53 -07:00
Ilya Kreymer
fa5d5e6bcc urlrewrite templates: add get_top_frame_params() callback for adding custom params for top frame,
also inject env['webrec.template_params'] if set
2016-04-05 02:45:00 -07:00
Ilya Kreymer
d40edfc22d warcwriter: add create_warcinfo_record() for creating a warcinfo and a SimpleTempWARCWriter for writing records to temp buff/file 2016-04-03 12:19:54 -07:00
Ilya Kreymer
fd76030cb3 urlrewriter: allow passing in existing jinja_env wrapper 2016-04-02 21:36:54 -07:00
Ilya Kreymer
01c21d3a43 recorder: redis indexer accepts arg list, supports separate redis and key_template args
add length param to add_urls_to_index() in redis indexer, return cdx list
2016-04-02 21:36:36 -07:00
Ilya Kreymer
6157cebcc9 testutils: when mock patching FakeStrictRedis, use a subclass with a shared pubsub (to match real redis) 2016-04-02 21:33:39 -07:00
Ilya Kreymer
ddee9236c6 webagg: rename key_prefix -> key_template 2016-04-02 21:33:23 -07:00
Ilya Kreymer
4b753d2612 Merge branch '0.11.5' into develop 2016-03-31 13:16:53 -07:00
Ilya Kreymer
9381acdaaf Merge branch 'zip-loc-fix' into develop 2016-03-31 13:14:39 -07:00
Ilya Kreymer
b901343067 update CHANGES.rst 2016-03-31 13:14:04 -07:00
Ilya Kreymer
e5ef51363c zipnum: backport fix for #173, paths specified in a zipnum .loc file are relative to the .loc file, not to
the working dir of the application
warnings: don't warn on .gz cdx files
2016-03-31 13:09:57 -07:00
Ilya Kreymer
ba7ac56230 release: bump to 0.11.5, update version and changelist 2016-03-31 12:45:16 -07:00
Ilya Kreymer
b5cf79072d loaders: ensure loader stream closed in load_yaml_config() 2016-03-31 12:42:23 -07:00
Ilya Kreymer
8e51ddc544 archiveiterator: don't reuse entries when post-append, as they may be cached for merge -- can break if records do not alternate
request/response fixes #175
2016-03-31 12:42:23 -07:00
Ilya Kreymer
70fbb5f7a6 ulrewrite: fix typos, add full package paths 2016-03-28 22:59:22 -07:00
Ilya Kreymer
f12be3bc91 urlrewrite app: add bottle-based app, templateview separate from pywb webapp framework 2016-03-27 17:34:45 -04:00
Ilya Kreymer
f8f0c3a76e loader: ensure file closed in load_yaml_config() 2016-03-27 13:56:19 -04:00
Ilya Kreymer
017e9802f8 tests: fix fakeredis patch not running on test_handlers,
use exc str instead of repr for error message for consistency
all tests pass on py2 and py3 again!
2016-03-26 22:32:21 -04:00
Ilya Kreymer
0399cc1046 webagg app: support bottle debug properly as opt param 2016-03-26 22:30:47 -04:00
Ilya Kreymer
3eac9be00b warc: ArchiveLoadFailed: add space in exception string 2016-03-26 22:28:38 -04:00
Ilya Kreymer
c5a166f601 tests: use httpbin.org instead of example.com/ for range-request test 2016-03-26 22:28:04 -04:00
Ilya Kreymer
7884d4394b recorder: close_file() by params rather than exact path, update tests 2016-03-26 13:07:53 -04:00
Ilya Kreymer
7deba42851 add urlrewrite pywb-adapter PlatformHandler for using traditional pywb
setup with webrecorder components recorder and webagg
2016-03-24 16:33:03 -04:00
Ilya Kreymer
2bfe5d4f9e inputreq: only use REQUEST_URI if no SCRIPT_NAME is set (otherwise reconstruct the path) 2016-03-24 16:17:46 -04:00
Ilya Kreymer
b6e988d9a1 self-redirect: if 'status' is a 3xx, call raise_on_self_redirect() to check Location for exact url redirect.
supports both WARC and live loaders, addresses #1
2016-03-24 16:08:29 -04:00
Ilya Kreymer
61921d6c4a tests: add FakeRedisTests class mixin for patching in FakeRedis for tests 2016-03-24 10:45:48 -04:00
Ilya Kreymer
7cc772329c redis: add tests for RedisMultiKeyIndexSource 2016-03-24 10:44:14 -04:00
Ilya Kreymer
64b32dc57a redis support: add RedisMultiKeyIndexSource for using redis SCAN wildcard query and aggregate results from several
redis keys
2016-03-24 01:17:18 -04:00
Ilya Kreymer
e5ddf9d4f4 utils: res_template() supports extra params for interpolation 2016-03-23 23:58:49 -04:00
Ilya Kreymer
ba66d0bb5e recorder: use res_template() to resolve params, rename indexing method to add_urls_to_index 2016-03-23 23:55:21 -04:00
Ilya Kreymer
5fd49f35ee zipnum: when using .loc file, resolve shard paths relative to the .loc file, not from working directory, fixes #173 2016-03-22 11:31:08 -07:00
Ilya Kreymer
a0347a3c42 typo fix 0.11.4 2016-03-21 13:09:03 -07:00
Ilya Kreymer
aa80cd6881 recorder: add simple recorder config indexing to redis 2016-03-21 11:50:01 -07:00
Ilya Kreymer
d38bb5a1fd filters: add extensible 'skip filters', with default filters to accept certain collections, filter out
recording of range requests. Opportunity to skip recording at request or response time
RespWrapper handles reading stream fully on close() (no need for old ReadFullyStream),
skips recording if read was interrupted/incomplete
writer: avoiding writing duplicate content-length/content-type headers
2016-03-21 11:47:12 -07:00
Ilya Kreymer
cbe7d1c981 webagg: add tests for RedisPathResolver and errors on missing warc, missing warc keys 2016-03-21 11:44:32 -07:00
Ilya Kreymer
22ead52604 webagg: convert StreamIter to generate, remove unused ReadFullyStream
loaders: add support for RedisResolver as well as PathPrefixResolver
inputreq: reconstruct_request() skips host header if already present
improve test app to include replay
2016-03-21 11:04:52 -07:00