mirror of https://github.com/webrecorder/pywb.git synced 2025-03-15 08:04:49 +01:00

1591 Commits

Author SHA1 Message Date
Ilya Kreymer
80d9805a58 webagg: tests: flush fakeredis for reentrancy
utils: add load_config() with option for main and override configs
2016-05-19 17:01:09 -07:00
Ilya Kreymer
8ad66249c7 blockloader: support for loader profiles, specified via 'profile+scheme://...' urls. Profiles specify additional settings (eg. credentials) that are not included in the url. To enable custom profiles, call BlockLoader.set_profile_loader(callable) with a callable that will return custom config, addresses #180
2016-05-18 16:34:58 -07:00
Ilya Kreymer
d11bd444ad s3 loader: unurlencode username/password 2016-05-17 19:24:14 -07:00
Ilya Kreymer
119074e0ee s3 loader improvements: support AWS cred in username and password part of url, stream s3 response directly 2016-05-17 18:55:10 -07:00
Ilya Kreymer
94afab0bb2 wombat rewrite: don't add duplicate slash in rel-url resolve 2016-05-17 18:53:00 -07:00
Ilya Kreymer
10d8e4b3be bump version to 0.31.0 2016-05-17 18:38:57 -07:00
Ilya Kreymer
45c8fcddbd recorder: add max_idle_secs / close_idle_files() to close any open files that have not been modified longer than set threshold, in prep for webrecorder/webrecorder#92
indexer: add 'full_warc_prefix' for setting full path prefix in add_warc_file() (eg. for http load) for webrecorder/webrecorder#95
2016-05-11 21:40:02 -07:00
Ilya Kreymer
94d6098238 app: separate json_encode() func
compat: py2 fixes
2016-05-11 11:38:59 -07:00
Ilya Kreymer
c45f5cb749 webagg: use werkzeug routing instead of wrapping Bottle app 2016-05-10 16:31:44 -07:00
Ilya Kreymer
464eca2fa0 test apps: enable debugging for test apps
test recorder: write to a temp dir for each run
2016-05-06 16:33:18 -07:00
Ilya Kreymer
e64ae780c6 urlrewrite: improve POST request support for ikreymer/pywb#178 2016-05-06 16:32:13 -07:00
Ilya Kreymer
87da25c703 post request mapping improvements: work on #178, including:
- mapping multipart/form-data same as x-www-form-urlencoded
- parsing application/x-amf with pyamf
- RewriteContentAMF for rewriting AMF response to match request
- default encoding of other POST data as base64 encoded __wb_post_data param
2016-05-06 10:19:08 -07:00
Ilya Kreymer
e5e7c5a7df wombat: ensure Math.random() overrides use the current window 2016-05-06 09:48:38 -07:00
Ilya Kreymer
1e7d4d27e3 bump version to 0.30.2 2016-05-06 09:43:11 -07:00
Ilya Kreymer
ab3af90df2 cookie_tracker: add support for redis-based subdomain cookie tracker, which temp caches cookies with Domain= set in redis and passes them upstream
when rewriting. addresses webrecorder/webrecorder#79
2016-05-04 16:39:47 -07:00
Ilya Kreymer
8e473f01fa add changelist for 0.30.1 2016-05-04 11:33:43 -07:00
Ilya Kreymer
2795802c77 recordloader: for request/response/revisit records, only parse urls starting with http:/https: as http 2016-05-04 11:20:38 -07:00
Ilya Kreymer
af920d77a0 rules: add fuzzy rules for TW video 2016-05-03 17:33:13 -07:00
Ilya Kreymer
07cc4fae0b bump version to 0.30.1 2016-05-03 17:32:35 -07:00
Ilya Kreymer
3a3110efdb fix README typo 2016-05-01 11:57:37 -07:00
Ilya Kreymer
e458bdcc77 CHANGES tweaks 2016-05-01 11:53:23 -07:00
Ilya Kreymer
033909efe0 wombat: set version to 1.12
return 'null' for frameElement override instead of undefined
2016-05-01 11:46:36 -07:00
Ilya Kreymer
4df45b4338 Update CHANGES for 0.30.0! 2016-05-01 11:45:01 -07:00
Ilya Kreymer
dd8ac42f2c encoding: ensure cdx fields are in the native encoding, except filename, which should stay as unicode in py2 for further use 2016-04-30 16:08:43 -07:00
Ilya Kreymer
e8c77c0538 encoding: encode before quote
setup: enable zip_safe=True again
2016-04-30 15:15:35 -07:00
Ilya Kreymer
ab8b4efaec encoding: cdx: only quote-encode 'url'
warc: ensure path index loads are utf-8 decoded
2016-04-30 14:38:48 -07:00
Ilya Kreymer
228ca58c5b recorder: actually fix content-type on warcinfo, add to test! 2016-04-30 13:07:53 -07:00
Ilya Kreymer
0fbae1c7f8 recorder: ensure warcinfo record has a content-type 2016-04-30 10:19:20 -07:00
Ilya Kreymer
67a02613e7 remove: remove unused/extraneous __iter__ 2016-04-30 01:43:53 -07:00
Ilya Kreymer
1c97a67763 rewrite client-side improvements:
add WB_wombat_frameElement Object prototype property to support frameElement rewriting
document.domain: allow changing to higher-level domain
rewrite_elem: also rewrite <form> action and <input> value, if they are absolute urls
2016-04-30 01:43:40 -07:00
Ilya Kreymer
1bea9d73ed rewrite: rewrite .frameElement -> WB_wombat_frameElement server-side to handle cases when default frameElement can not be overridden 2016-04-30 01:36:26 -07:00
Ilya Kreymer
37609ebdc9 rewrite: support custom cookie_rewriter passed to 'rewrite_content' 2016-04-30 01:35:55 -07:00
Ilya Kreymer
e669ecba15 rewrite: html rewrite fix such that head insert is placed before other <script> tags even if no head 2016-04-30 01:32:16 -07:00
Ilya Kreymer
7a0dd463cd webagg: responseloader: use urllib3 directly instead of requests to
take advantage of connection pooling w/o storing/sharing cookies
2016-04-27 10:16:54 -07:00
Ilya Kreymer
9010e52663 urlrewrite: refactor simpleapp to support live/record/replay 2016-04-27 10:15:48 -07:00
Ilya Kreymer
f119d05724 recorder: fix simplerec init
tests: improve tests for skipping request and response headers
2016-04-27 09:52:56 -07:00
Ilya Kreymer
a1e0c29a85 rules: add rule for twitter timeline 2016-04-26 17:02:54 -07:00
Ilya Kreymer
658303caad rewrite headers: undo not rewriting x- headers, needs more research and exclusions (eg. x-frame-options) 2016-04-26 13:11:08 -07:00
Ilya Kreymer
cf6cfc0c44 tests: fix cookie rewriter tests to exclude 2.6 2016-04-26 10:32:43 -07:00
Ilya Kreymer
4a60e15577 cookie rewrite improvements: #177
- don't remove max-age and expires if in 'live' rewrite mode (flag set on urlrewriter)
- remove secure only if replay prefix is not https
- fix expires UTC->GMT as cookie parsing chokes on UTC
- other rewriting: don't append rewrite prefix to x- headers
tests: add more cookie rewriting tests
2016-04-26 09:45:23 -07:00
Ilya Kreymer
a82e2785c7 tests: add basic test for rewriterapp 2016-04-25 14:29:28 -07:00
Ilya Kreymer
3b6cab1730 urlrewrite: remove dependency on bottle from rewriterapp,
add overridable error and query views, with extensible get_query_params() and process_cdx_query()
to extend cdx for query view
add get_top_url() for adding custom top_url for frame insert
add call_with_params() for adding custom params to environ
2016-04-25 12:05:43 -07:00
Ilya Kreymer
b056acd88e urlrewrite: add support for index query 2016-04-15 04:01:36 +00:00
Ilya Kreymer
0370470e68 urlrewrite: http range: support skipping record for range requests not starting at 0-
and performing async request,
support converting unbounded 0- to non-ranged and back
2016-04-15 02:21:39 +00:00
Ilya Kreymer
0b255819ff recorder warcwriter: allow skipping writing of only request or only response by overriding _is_write_req and _is_write_resp in subclass
(todo: rethink the interface)
2016-04-15 02:19:34 +00:00
Ilya Kreymer
a93f75dca2 webagg: add preliminary 'fuzzy matching' fallback support, currently enabled for all sources
(todo: need to only include sources that support it)
2016-04-15 02:18:20 +00:00
Ilya Kreymer
61381fcac6 wombat rewrite: remove cookie domain if hostname is an IP address 2016-04-07 15:53:26 -07:00
Ilya Kreymer
00bdddd1e9 recorder: SkipDupePolicy only skips if url is an exact match (not just by urlkey) 2016-04-07 10:44:05 -07:00
Ilya Kreymer
f4cc143dc7 urlrewrite: generalize support for overridable handle_custom_response() callback for handling modifiers (default support top-frame)
pass headers to add_custom_params, include error message on error if available
headers: use add_header() to support multiple headers with same name
is_ajax(): check for X-Pywb-Requested-With header to mark as ajax and not pass to upstream
2016-04-07 10:39:12 -07:00
Ilya Kreymer
95a212ed79 wombat rewrite: add custom X-Pywb-Requested-With header which turns off rewriting and is never sent upstream 2016-04-06 12:05:53 -07:00