mirror of https://github.com/webrecorder/pywb.git synced 2025-03-15 08:04:49 +01:00

1553 Commits

Author SHA1 Message Date
Ilya Kreymer
895a01933c wb: allow multiple readystateevent changes, in case data changes (eg. title is available later) 2016-09-02 12:04:30 -07:00
Ilya Kreymer
70a25b6d0f client rewrite: ensure window.open() windows have wombat inited. if they are set to about:blank, use parser from opener to ensure proper relative url resolving 2016-08-20 13:03:17 -04:00
Ilya Kreymer
099a81b786 wb_frame: add support for optional 'wbinfo.outer_prefix' which if set, is used for making the top frame url (#191) 2016-08-20 00:03:21 -04:00
Ilya Kreymer
892ebacead cross-frame improvements: #191
- make hashchange functions use postMessage(), support setting top->replay and replay->top
- special postMessage() option for sending message from top frame -> replay frame
- fix history navigation, mimic top frame history same as replay frame as much as possible
- remove iframe_loaded() callback, using postMessage() notifications only
- include document title in 'load' message
2016-08-19 23:44:15 -04:00
Ilya Kreymer
6af1a7856e top-frame handling: don't access contents of top frame directly to support cross-domain frames
set __WB_top_Frame in wombat if is_framed property is true, don't check wbinfo (#191)
2016-08-19 13:59:42 -04:00
Ilya Kreymer
2fb1df34c9 recorder: add upload/streaming support with put_record=stream where the content being uploaded is already in WARC record form 2016-08-12 21:23:25 -04:00
Ilya Kreymer
c8b6a48005 webagg: use prepare_auth() to ensure Authorization header is set for http://user:pass@host urls 2016-08-12 21:22:17 -04:00
Ilya Kreymer
82d3b61523 recorder: catch exception in close_idle_files() if file no longer exists and ensure it's removed 2016-08-12 01:19:30 -04:00
Ilya Kreymer
594aff86d3 webagg: response self-redir: don't check if live, throw correct exception 2016-08-10 00:50:43 -04:00
Ilya Kreymer
92dfcbfcbe rewrite: don't rewrite 'www-authenticate' and 'proxy-authenticate' headers 2016-08-10 00:02:53 -04:00
Ilya Kreymer
cca0c01547 urlrewrite misc fixes:
- ensure content-length is converted to str
- templateview: support optional extensions
- fix test
2016-08-09 19:53:22 -04:00
Ilya Kreymer
b22a29df5f vidrw: also check for 'src' param as well as movie 2016-08-08 19:50:16 -04:00
Ilya Kreymer
c93d7ecafc webagg: Fix loading of url-lookup (url agnostic) revisits, ensure all params passed to cdx lookup, add tests for url-agnostic revisit lookup 2016-08-04 16:53:24 -04:00
Ilya Kreymer
e04095ffbb rewrite css: leave spaces in css url, eg url(' http://example.com/ ') rewritten with spaces intact 2016-08-01 10:29:04 -04:00
Ilya Kreymer
d5adc05cbb history rewrite check: don't check empty urls (#188) 2016-08-01 10:27:38 -04:00
Ilya Kreymer
20b161bf90 debug: print stack trace when debugging 2016-08-01 02:12:15 -04:00
Ilya Kreymer
68b94fe671 record parser: arc-to-warc: support converting arc records to warc 'response' records on-the-fly to simplify
processing for tools that read WARC records. arc headers are converted to equivalent warc headers, WARC-Record-ID
generated on the fly #190
2016-07-31 22:31:21 -04:00
Ilya Kreymer
66ca8d8b26 http block loader: raise exception for 4xx, 5xx responses
tests: add tests for limitreader posting, fix charset for frame test
2016-07-31 12:56:00 -04:00
Ilya Kreymer
db3b92e228 writing: add write_stream_to_file() function to write an existing input stream to a WARC
refactor _do_write_req_resp to pass callback to actual writing (eg. _write_to_file)
2016-07-31 00:49:57 -04:00
Ilya Kreymer
1b09015954 recorder: split up _open_file() into get_new_filename() and allow_new_file() to customize skipping recording by returning false
from allow_new_file()
create_warcinfo_record() - switch to dict args over kwargs, update tests
2016-07-30 13:11:12 -04:00
Ilya Kreymer
c3389987cd frame timestamp extract: fix timestamp extraction for non-html resources for use with frame display (#189) 2016-07-28 10:06:10 -04:00
Ilya Kreymer
c8c0cecda3 rewrite improvements: if content-type is text/plain but mod is js_ or cs_, treat as js or css (#31)
header rewriter: ensure removed content-length and content-encoding are added back if no rewriting performed on response body
2016-07-27 21:34:58 -04:00
Ilya Kreymer
cd15dbfe48 head_insert: add decodeURI() to prefix to ensure unicode prefix string 2016-07-27 10:34:54 -04:00
Ilya Kreymer
498f87fb54 add Dockerfile to git! 2016-07-26 19:42:59 -04:00
Ilya Kreymer
a5696fc2d4 rewriter: range massage for patch as well as record 2016-07-26 19:42:32 -04:00
Ilya Kreymer
14cf68e4e5 custom record: don't override WARC-Date if provided in request header,
return chosen WARC-Date in json response
2016-07-26 19:41:47 -04:00
Ilya Kreymer
6928d72f68 rewrite css: handle rewriting with entities around url() css by leaving them in place, eg: url(&quot;http://example.com/&quot;) 2016-07-26 18:12:32 -04:00
Ilya Kreymer
782f95fa97 rules: rules for yt video info update 2016-07-24 19:39:43 -04:00
Ilya Kreymer
34a710e51a custom response: add utf-8 encoding, unless framed replay 2016-07-24 00:14:43 -04:00
Ilya Kreymer
9588e8622f responseloader: quote/unquote Webagg-Source-Coll header as source may contain unicode chars 2016-07-23 21:57:24 -04:00
Ilya Kreymer
42a2fa02fe wombat: history check fix: ensure check applies to absolute url #188 2016-07-16 13:32:46 -04:00
Ilya Kreymer
64a49b3e4d wombat: history change improvements (#188):
- ensure back, go, forward also propagated to top frame
- ensure pushState propagated as pushState and replaceState as replaceState to top frame
- security: prevent pushState or replaceState from changing to different domain
2016-07-16 13:18:08 -04:00
Ilya Kreymer
605ee22bec html rewrite: rewrite href on any element, not just a few designated ones, as client side rewriting does the same.
avoids edge cases where href used on other tags (eg. a div) that results in incorrect rewriting, #187
2016-07-16 12:55:24 -04:00
Ilya Kreymer
b46cf8492f bump version to 0.31.5 2016-07-16 12:48:26 -04:00
Ilya Kreymer
ae290587f6 temp cookie store: add add_cookie() function for explicitly adding cookie, make expiry configurable
related to webrecorder/webrecorder#79
2016-07-01 10:15:59 -04:00
Ilya Kreymer
0b57f4a352 cookie notification: use postMessage() instead of callback to notify top frame of cookie setting with custom domain, #186 2016-07-01 09:58:25 -04:00
Ilya Kreymer
827ba9b50f cookies: add optional callback when setting cookie with domain (to experiment with server side handling of custom domain) 2016-06-30 12:26:18 -04:00
Ilya Kreymer
f4e5a7df5d Merge branch 'develop' 2016-06-16 00:41:08 -04:00
Ilya Kreymer
2fba97683a CHANGES for 0.31.0 2016-06-16 00:40:53 -04:00
Ilya Kreymer
5024234552 CHANGES for 0.31.0 2016-06-16 00:39:51 -04:00
Ilya Kreymer
d457223555 tests: add brotli compression test #184 2016-06-16 00:00:47 -04:00
Ilya Kreymer
457a1a564c bufferedreader: support brotli decompression
rewrite: handle Content-Encoding: br using brotli decompressor
setup: add brotlipy as dependency
2016-06-15 01:37:29 -04:00
Ilya Kreymer
bc36ae1302 rewriter: update for moved RewriterAMF in pywb 2016-06-14 00:14:29 -04:00
Ilya Kreymer
c1d7111841 webagg: store original 'source' value in cdx for properly mapping in WARC file resolver
error handling: ensure 'last_exc' is a string
2016-06-14 00:13:01 -04:00
Ilya Kreymer
3b68ef6540 html rewriter: cleanup rewrite_srcset, add more tests for empty rewrite 2016-06-12 01:57:21 -04:00
Ilya Kreymer
6a5842d983 Merge branch 'chdorner-fix-empty-srcset' into empty-attr 2016-06-12 01:53:53 -04:00
Ilya Kreymer
1bfec37970 html rewriter: attr rewrite ops check for empty/blank attr value, return empty string 2016-06-12 01:50:55 -04:00
Ilya Kreymer
d2c37f7d91 html parser: attr_value can now be None -- default to '' for string ops, write attr w/o assignment 2016-06-12 01:38:03 -04:00
Ilya Kreymer
0f530a3e0e dependencies: remove pyamf, update to latest surt (0.3.0) 2016-06-12 00:44:52 -04:00
Ilya Kreymer
9f299eb8e9 amf rewriting: move to separate file, mark as experimental, and don't include as default (for now) 2016-06-12 00:40:35 -04:00