mirror of https://github.com/webrecorder/pywb.git synced 2025-03-15 16:14:48 +01:00

1024 Commits

Author SHA1 Message Date
Ilya Kreymer
27212488e3 tests: zipnum: better test coverage for incorrect idx or loc files, add invalid sample files zipnum-bad{.idx, .loc}, #112 2015-06-05 17:46:45 -07:00
Ilya Kreymer
2b9e1b97c3 rules: disable tw rewrite rule as it was causing page reloads 2015-06-04 17:31:16 -07:00
Ilya Kreymer
f80be17392 buffering: when buffered_replay is enabled, only buffer responses that do not have a content-length
(e.g. rewritten text content), buffer only up to buffer_max_size (default 16384), and stream the remainder.
if the response already has a content-length, no buffering is performed #111
2015-05-29 19:40:25 -07:00
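
The buffering behavior described in this commit message can be sketched roughly as follows. This is a minimal illustration, not pywb's actual code: the function name `buffer_response` and its return shape are hypothetical, and only the `buffer_max_size` name and its 16384 default come from the message.

```python
import itertools

def buffer_response(chunks, content_length=None, buffer_max_size=16384):
    """Hypothetical helper: return (content_length or None, body iterator)."""
    # A response that already declares its length is passed through untouched.
    if content_length is not None:
        return content_length, iter(chunks)

    source = iter(chunks)
    buffered = []
    size = 0
    for chunk in source:
        buffered.append(chunk)
        size += len(chunk)
        if size > buffer_max_size:
            # Too large to buffer fully: emit what was buffered, then
            # stream the remainder without a known content-length.
            return None, itertools.chain(buffered, source)

    # The whole body fit in the buffer, so its length is now known.
    return size, iter(buffered)
```

In this sketch a small rewritten body gains a computed length, while a large one is streamed with the length left unknown.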
Ilya Kreymer
15c2ddbfcf header rewriter: cache options: use 'rewrite_opts.http_cache' to set caching headers options, #110
'pass': passthrough original cache headers unrewritten
None (default): rewrite cache headers and don't add anything else
N: set cache-control max-age: N and corresponding expires
N=0: set cache-control: no-cache; no-store
2015-05-29 12:53:29 -07:00
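
The four `http_cache` settings listed above could be modelled along these lines. This is a sketch under stated assumptions rather than the project's implementation: the function name `apply_http_cache`, the `X-Archive-Orig-` prefix used to "rewrite" a header, and the set of headers treated as cache headers are all assumptions made for illustration. The `no-cache; no-store` value is copied literally from the commit message.

```python
import time
from email.utils import formatdate

# Assumptions for illustration only: which headers count as cache headers,
# and the prefix used to neutralize ("rewrite") them.
CACHE_HEADERS = ('cache-control', 'expires', 'etag', 'last-modified')
ORIG_PREFIX = 'X-Archive-Orig-'

def apply_http_cache(headers, http_cache=None, now=None):
    """Return a new header list for the given http_cache setting."""
    # 'pass': original cache headers go through unrewritten.
    if http_cache == 'pass':
        return list(headers)

    # Otherwise rewrite the original cache headers by prefixing them.
    out = [(ORIG_PREFIX + name if name.lower() in CACHE_HEADERS else name, value)
           for name, value in headers]

    # None (default): rewrite and add nothing else.
    if http_cache is None:
        return out

    max_age = int(http_cache)
    if max_age == 0:
        # N=0: value as written in the commit message.
        out.append(('Cache-Control', 'no-cache; no-store'))
    else:
        # N: max-age plus a matching Expires timestamp.
        now = time.time() if now is None else now
        out.append(('Cache-Control', 'max-age=' + str(max_age)))
        out.append(('Expires', formatdate(now + max_age, usegmt=True)))
    return out
```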
Ilya Kreymer
bb250cafbc zipnum: add query arg to location resolver 2015-05-29 12:52:35 -07:00
Ilya Kreymer
c8980c3f8f query_handler: pass wbrequest.coll as 'coll' param to cdx query automatically 2015-05-29 11:51:16 -07:00
Ilya Kreymer
f26f74ec84 bump version to 0.10.0 2015-05-29 11:48:50 -07:00
Ilya Kreymer
a51b2936f3 zipnum: fix bug with urls in last block not being accessible. when iter_range() fails, check whether last_line == end_line,
and if so, check whether start_line should also be end_line #112
support non-linenumbered idx files w/o pagination queries
add new zipnum-sample to test cdx lines in last block (previous sample had only one line in last block except the first)
2015-05-29 11:46:00 -07:00
Ilya Kreymer
d104c03135 wombat: check coll prefix w/o mod or timestamp 2015-05-26 18:27:35 -07:00
Ilya Kreymer
07d6031d3e wombat: check for dropped collection and add back to avoid refer-relative redirect check on server 2015-05-26 18:16:17 -07:00
Ilya Kreymer
c4dad56681 rules: add custom js for resizing poster on twitter video images 2015-05-26 15:15:11 -07:00
Ilya Kreymer
ce8da00b89 wombat: tweak history override to be more consistent
add exported 'watch_elem' func to be used by rules for custom ops
2015-05-26 15:14:12 -07:00
Ilya Kreymer
ee20ac66d6 rules: tw video player rules, disable rewriting
rewrite: tweak location rule
wombat: add getAttribute() override, but disabled for now
store default getAttribute()/setAttribute() to refer internally
2015-05-25 17:52:03 -07:00
Ilya Kreymer
6c97fe1d44 vidrw: support livestream playlist, support for generic extractor 2015-05-25 17:48:25 -07:00
Ilya Kreymer
0a606ce558 cdxindexing: store arbitrary json metadata from WARC-Json-Metadata field (experimental) 2015-05-24 20:17:10 -07:00
Ilya Kreymer
b1c9503a9d rewrite: insert head-insert after <html>, <head> and before any other tags (if head is missing)
previously was being inserted after other head tags #109
2015-05-24 20:17:10 -07:00
Ilya Kreymer
af37b99e80 wombat: additional fixes/testing: for about:blank/empty iframes, initialize WB_wombat_location, document.WB_wombat_location and WB_wombat_top immediately.
disable redundant rewrites
vidrw: check for null parent node, fix bug with double-add! don't rewrite added elements
2015-05-24 20:17:04 -07:00
Ilya Kreymer
adb9448f27 rules: improved rules for googleplus! 2015-05-22 18:45:50 -07:00
Ilya Kreymer
d5b92dbb3c rules: update rules for yt comments 2015-05-21 17:20:40 -07:00
Ilya Kreymer
179f11198b fuzzy match: look at first occurrence, not last, of match separator
rules: add new rule for yt comments
2015-05-21 23:52:09 +00:00
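
The difference between splitting at the first and at the last occurrence of a separator matters when the separator appears more than once in a URL. A minimal sketch, assuming the separator is `?` (the commit message does not say which one) and using a hypothetical helper name:

```python
def split_at_separator(url, sep='?'):
    """Hypothetical helper: split url at the FIRST occurrence of sep."""
    # str.partition splits at the first match; str.rpartition would
    # split at the last one, cutting the URL inside its own query string.
    head, _found, tail = url.partition(sep)
    return head, tail
```

For a URL such as `http://example.com/a?x=1&next=/b?y=2`, splitting at the first `?` keeps the whole query string together, whereas splitting at the last would leave part of the query in the prefix.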
Ilya Kreymer
35e2e535bb def banner: ensure banner element isn't rewritten! 2015-05-21 12:24:16 -07:00
Ilya Kreymer
a929e96433 wombat: add rewrite_elem() back to main init_dom_override(), check if already overwritten
createElement(): add optional skip arg
2015-05-21 12:11:08 -07:00
Ilya Kreymer
b7e27ba1a8 tests: update tests for keeping scheme-relative, scheme-relative after rewrite #101
remove tests for document.cookie, document.referrer and document.domain rewrite, as this is now handled client-side
2015-05-21 11:38:06 -07:00
Ilya Kreymer
690106bcb4 wombat: more refactoring! enable http/src observer by default, add skip_createElement override
implement document.cookie, document.referrer and document.domain as property overrides instead of WB_wombat rewrites
when a new iframe is loaded, ensure the *document* is reinited with wombat, even if window already has wombat settings
2015-05-21 11:26:54 -07:00
Ilya Kreymer
4983bf4425 rewrite: keep relative scheme after all, to work with: scheme + "//..." constructions, #101 2015-05-21 11:26:54 -07:00
Ilya Kreymer
9912a31523 wombat: add prototype-level override for innerHTML and outerHTML 2015-05-21 11:26:54 -07:00
Ilya Kreymer
4e1be5c275 wombat work: add createElement() override, use current protocol instead of original url protocol
refactor init_dom_override() to only check children for fragments, add innerHTML override
2015-05-21 11:26:54 -07:00
Ilya Kreymer
058b25ec5a wombat: test with href overrides 2015-05-21 11:26:54 -07:00
Ilya Kreymer
c5a5d45a58 wombat: experimenting with initializing wombat on iframe init directly, rather than waiting for injected init... 2015-05-21 11:26:54 -07:00
Ilya Kreymer
4603b423f4 bump to 0.9.9 dev 2015-05-21 11:25:31 -07:00
Ilya Kreymer
ec3ea69225 Merge branch 'develop' 2015-05-14 22:32:47 -07:00
Ilya Kreymer
cb6efeff6a CHANGES.rst update 2015-05-14 22:32:10 -07:00
Ilya Kreymer
3ab4fa7487 More CHANGES.rst updates 2015-05-14 22:32:10 -07:00
Ilya Kreymer
0223ac0489 rewrite: top rewrite: avoid rewriting 'top(' 2015-05-14 22:32:10 -07:00
Ilya Kreymer
d55bac70c1 update version for 0.9.8 release 2015-05-14 22:32:10 -07:00
Ilya Kreymer
e92f657cca update CHANGES for 0.9.8 2015-05-14 22:32:10 -07:00
Ilya Kreymer
5cf7368f90 default config: set default 'archive_paths' to current directory, to avoid exception on startup 2015-05-14 22:32:10 -07:00
Ilya Kreymer
557f26b852 config: allow custom config.yaml settings for automatic collections.
settings in config.yaml are merged with collection-specific settings, which take precedence
(before, the config.yaml settings were being overwritten) #103
2015-05-14 22:32:09 -07:00
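
The merge order described in this commit message (global config.yaml settings first, collection-specific settings taking precedence) can be illustrated with plain dictionaries. This is a sketch only: the function name `merge_config` is hypothetical, and a shallow merge is assumed because the message does not say how nested keys are handled.

```python
def merge_config(global_config, coll_config):
    """Hypothetical helper: shallow-merge settings, collection values win."""
    merged = dict(global_config)   # start from the config.yaml settings
    merged.update(coll_config)     # collection-specific settings take precedence
    return merged
```

Neither input is modified, so the global settings can be reused for each collection.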
Ilya Kreymer
d2763004dd wombat: for now, disable node observers by default 2015-05-14 22:32:09 -07:00
Ilya Kreymer
d8b11db1e7 wombat:
ajax: always explicitly add X-Requested-With: XMLHttpRequest to ajax requests
mutation obs: don't rewrite <link rel=canonical> with node added observer
2015-05-14 22:32:09 -07:00
Ilya Kreymer
e94b239d84 rewrite: when rewriting scheme-relative urls, if adding an absolute prefix, use the scheme of the prefix
otherwise, keep relative scheme #101
2015-05-14 22:32:09 -07:00
Ilya Kreymer
fd4a0cc9b1 wombat: add extra mutation observer for any nodes added 2015-05-14 22:32:09 -07:00
Ilya Kreymer
1b9ef4e325 html_rewriter: handle parse_comments by rewriting as html, not as js, should address ikreymer/pywb-webrecorder#7 2015-05-14 22:32:09 -07:00
Ilya Kreymer
40f15cf6ea rules: add location rewrite only rule for disqus
wombat: ensure _orig_setAttribute is still set even if setAttribute rewriting disabled!
2015-05-14 22:32:09 -07:00
Ilya Kreymer
86be72b30a query_handler: specify matchType exact for all queries, in case url ends in * 2015-05-14 22:32:09 -07:00
Ilya Kreymer
b2e26eeb27 wombat: remove timezone offset, as Date.now() is already UTC 2015-05-14 22:32:09 -07:00
Ilya Kreymer
7cbf43872f wombat: obey _no_rewrite for rewrite_elem() 2015-05-14 22:32:08 -07:00
Ilya Kreymer
15ac7ea1f8 vidrw: just check 'ustream' in url 2015-05-14 22:32:08 -07:00
Ilya Kreymer
7a0ab76a07 vidrw work: limit flashvar parsing to ustream (for now) 2015-05-14 22:32:08 -07:00
Ilya Kreymer
9a90af595c views: don't add head_insert for ajax requests! 2015-05-14 22:32:08 -07:00