mirror of https://github.com/webrecorder/pywb.git synced 2025-03-15 08:04:49 +01:00

1670 Commits

Author SHA1 Message Date
Ilya Kreymer
9a3017bfcd bump version to 0.32.1 2016-09-20 15:44:49 -07:00
Ilya Kreymer
5c499753f8 webrecorder Docker: update Docker file to latest pywb, python, starting to use versioning! 2016-09-16 18:43:26 -07:00
Ilya Kreymer
874bef0ab1 Update CHANGES 0.32.0 2016-09-15 14:18:44 -07:00
Ilya Kreymer
dc05d14934 Merge pull request #194 from nlevitt/cli-desc
fix/tweak for cli --help
2016-09-15 14:16:42 -07:00
Ilya Kreymer
c3f98c3d38 Merge branch 'develop' 2016-09-15 14:15:34 -07:00
Ilya Kreymer
d95116885a Update CHANGES for 0.32.0 2016-09-15 14:14:05 -07:00
Ilya Kreymer
86cbb366f3 rules: undo yt rules change (will revisit later) 2016-09-15 10:01:36 -07:00
Ilya Kreymer
0a76a56b91 wombat: edge case: correctly handle <iframe src="javascript:WB_wombat_location=...> assignment created via JS.. custom rewrite_frame_src() added for use with rewrite_elem(), ensures wombat init is inserted first thing after 'javascript:' 2016-09-14 15:44:20 -07:00
Ilya Kreymer
cc65ce914d wombat improvements (2.16):
- rewrite_elem() also rewrite 'poster'
- extract_orig() don't add http:// if nothing extracted
- new override: navigator.sendBeacon() if available
2016-09-14 14:13:59 -07:00
Ilya Kreymer
5fede0fea3 wombat: turn off debugging (accidentally committed) 2016-09-14 13:39:10 -07:00
Ilya Kreymer
1fb6e9b5fa rewrite: url rewriter: don't rewrite relative urls, only those that start with scheme, / or contain ../ #195
update tests to reflect this new behavior
2016-09-14 13:04:46 -07:00
Noah Levitt
1620668363 fix/tweak for cli --help 2016-09-14 09:58:44 -07:00
Ilya Kreymer
70fdaae2b3 rules: rewrite location string for periscope js 2016-09-12 20:07:14 -07:00
Ilya Kreymer
1a37d789ed cdx-api: when using cdx server api, return no captures 404 error in json format if output=json, plain text otherwise instead of as html #193 2016-09-08 18:59:52 -07:00
Ilya Kreymer
f47ae0bb7e rewrite: for rewriting on* attr, add 'window.' before WB_wombat_ as window may not be in scope (if no '.' before WB_wombat) 2016-09-08 18:38:35 -07:00
Ilya Kreymer
6452c72b4f bump versions 2016-09-08 10:31:07 -07:00
Ilya Kreymer
1fe201c528 rewrite: html: rewrite svg <image> tag
client: update textContent after rewrite_style() in rewrite_elem()
2016-09-08 10:06:47 -07:00
Ilya Kreymer
895a01933c wb: allow multiple readystateevent changes, in case data changes (eg. title is available later) 2016-09-02 12:04:30 -07:00
Ilya Kreymer
70a25b6d0f client rewrite: ensure window.open() windows have wombat inited. if they are set to about:blank, use parser from opener to ensure proper relative url resolving 2016-08-20 13:03:17 -04:00
Ilya Kreymer
099a81b786 wb_frame: add support for optional 'wbinfo.outer_prefix' which if set, is used for making the top frame url (#191) 2016-08-20 00:03:21 -04:00
Ilya Kreymer
892ebacead cross-frame improvements: #191
- make hashchange functions use postMessage(), support setting top->replay and replay->top
- special postMessage() option for sending message from top frame -> replay frame
- fix history navigation, mimic top frame history same as replay frame as much as possible
- remove iframe_loaded() callback, using postMessage() notifications only
- include document title in 'load' message
2016-08-19 23:44:15 -04:00
Ilya Kreymer
6af1a7856e top-frame handling: don't access contents of top frame directly to support cross-domain frames
set __WB_top_Frame in wombat if is_framed property is true, don't check wbinfo (#191)
2016-08-19 13:59:42 -04:00
Ilya Kreymer
2fb1df34c9 recorder: add upload/streaming support with put_record=stream where the content being uploaded is already in WARC record form 2016-08-12 21:23:25 -04:00
Ilya Kreymer
c8b6a48005 webagg: use prepare_auth() to ensure Authorization header is set for http://user:pass@host urls 2016-08-12 21:22:17 -04:00
Ilya Kreymer
82d3b61523 recorder: catch exception in close_idle_files() if file no longer exists and ensure it's removed 2016-08-12 01:19:30 -04:00
Ilya Kreymer
594aff86d3 webagg: response self-redir: don't check if live, throw correct exception 2016-08-10 00:50:43 -04:00
Ilya Kreymer
92dfcbfcbe rewrite: don't rewrite 'www-authenticate' and 'proxy-authenticate' headers 2016-08-10 00:02:53 -04:00
Ilya Kreymer
cca0c01547 urlrewrite misc fixes:
- ensure content-length is converted to str
- templateview: support optional extensions
- fix test
2016-08-09 19:53:22 -04:00
Ilya Kreymer
b22a29df5f vidrw: also check for 'src' param as well as movie 2016-08-08 19:50:16 -04:00
Ilya Kreymer
c93d7ecafc webagg: Fix loading of url-lookup (url agnostic) revisits, ensure all params passed to cdx lookup, add tests for url-agnostic revisit lookup 2016-08-04 16:53:24 -04:00
Ilya Kreymer
e04095ffbb rewrite css: leave spaces in css url, eg url(' http://example.com/ ') rewritten with spaces intact 2016-08-01 10:29:04 -04:00
Ilya Kreymer
d5adc05cbb history rewrite check: don't check empty urls (#188) 2016-08-01 10:27:38 -04:00
Ilya Kreymer
20b161bf90 debug: print stack trace when debugging 2016-08-01 02:12:15 -04:00
Ilya Kreymer
68b94fe671 record parser: arc-to-warc: support converting arc records to warc 'response' records on-the-fly to simplify
processing for tools that read WARC records. arc headers are converted to equivalent warc header, WARC-Record-ID
generated on the fly #190
2016-07-31 22:31:21 -04:00
Ilya Kreymer
66ca8d8b26 http block loader: raise exception for 4xx, 5xx responses
tests: add tests for limitreader posting, fix charset for frame test
2016-07-31 12:56:00 -04:00
Ilya Kreymer
db3b92e228 writing: add write_stream_to_file() function to be able to write an existing input stream to a WARC
refactor _do_write_req_resp to pass callback to actual writing (eg. _write_to_file)
2016-07-31 00:49:57 -04:00
Ilya Kreymer
1b09015954 recorder: split up _open_file() into get_new_filename() and allow_new_file() to customize skipping recording by returning false
from allow_new_file()
create_warcinfo_record() - switch to dict args over kwargs, update tests
2016-07-30 13:11:12 -04:00
Ilya Kreymer
c3389987cd frame timestamp extract: fix timestamp extraction for non-html resources for use with frame display (#189) 2016-07-28 10:06:10 -04:00
Ilya Kreymer
c8c0cecda3 rewrite improvements: if content-type is text/plain but mod is js_ or cs_, treat as js or css (#31)
header rewriter: ensure removed content-length and content-encoding are added back if no rewriting performed on response body
2016-07-27 21:34:58 -04:00
Ilya Kreymer
cd15dbfe48 head_insert: add decodeURI() to prefix to ensure unicode prefix string 2016-07-27 10:34:54 -04:00
Ilya Kreymer
498f87fb54 add Dockerfile to git! 2016-07-26 19:42:59 -04:00
Ilya Kreymer
a5696fc2d4 rewriter: range massage for patch as well as record 2016-07-26 19:42:32 -04:00
Ilya Kreymer
14cf68e4e5 custom record: don't override WARC-Date if provided in request header,
return chosen WARC-Date in json response
2016-07-26 19:41:47 -04:00
Ilya Kreymer
6928d72f68 rewrite css: handle rewriting with entities around url() css by leaving them in place, eg: url(&quot;http://example.com/&quot;) 2016-07-26 18:12:32 -04:00
Ilya Kreymer
782f95fa97 rules: rules for yt video info update 2016-07-24 19:39:43 -04:00
Ilya Kreymer
34a710e51a custom response: add utf-8 encoding, unless framed replay 2016-07-24 00:14:43 -04:00
Ilya Kreymer
9588e8622f responseloader: quote/unquote Webagg-Source-Coll header as source may contain unicode chars 2016-07-23 21:57:24 -04:00
Ilya Kreymer
42a2fa02fe wombat: history check fix: ensure check applies to absolute url #188 2016-07-16 13:32:46 -04:00
Ilya Kreymer
64a49b3e4d wombat: history change improvements (#188):
- ensure back, go, forward also propagated to top frame
- ensure pushState propagated as pushState and replaceState as replaceState to top frame
- security: prevent pushState or replaceState from changing to different domain
2016-07-16 13:18:08 -04:00
Ilya Kreymer
605ee22bec html rewrite: rewrite href on any element, not just few designated ones, as client side rewriting does the same.
avoids edge cases where href used on other tags (eg. a div) that results in incorrect rewriting, #187
2016-07-16 12:55:24 -04:00