Ilya Kreymer
d8f035642b
fuzzymatching: add new ext-based rule: fuzzy match if url has an ext, except those on the 'not_ext' list (#218)
2017-05-19 10:53:09 -07:00
Ilya Kreymer
f0f274c0c9
wb_frame: allow "load" event to pushState() instead of replaceState() if window.pushStateOnLoad.
...
This is necessary to have working history when running in electron, which does not combine
iframe history into the top-frame history
2017-05-16 17:18:37 -07:00
Ilya Kreymer
d6cfb7cd2d
wb_frame/wb.js: don't call push_state() if already on the current state,
...
eg. if two load events received for different readyState
add document.readyState to load event
2017-05-15 22:26:52 -07:00
Ilya Kreymer
762f669d13
rules: fuzzy match update:
...
- ignore all query args for flash files
- ignore cb= param for all urls
2017-05-12 08:55:03 -07:00
Ilya Kreymer
94262546d5
integration tests: add fixture to run all relevant tests in framed and non-framed mode
...
rename test_framed_inverse -> test_memento, remove unneeded test config
2017-05-03 20:05:07 -07:00
Ilya Kreymer
296b4ed94d
client-side rewrite: remove WB_wombat_ from any id/class= in document.write()
2017-05-03 15:31:06 -07:00
Ilya Kreymer
7434cb619e
config: ensure 'framed_replay' config is loaded again (default to true)
...
config template overrides: check config for overrides for all templates again
fixes #216
2017-05-02 10:05:11 -07:00
Ilya Kreymer
3fea5288b2
tests: fix memento not found test to use different timegate (webenact)
2017-05-01 21:51:59 -07:00
Ilya Kreymer
147c3217dd
update to warcio==1.3
...
recorder: use ArcWarcRecordLoader() for parsing response record
multifilewarcwriter: ensure digest is computed before trying to lookup revisits
2017-05-01 21:50:39 -07:00
Ilya Kreymer
58f39f0558
setup: update to warcio==1.2
...
add ensure_http_headers=True when reading WARC records
tests: fix pytest warnings, use webtest.TestApp instead of TestApp
2017-04-29 13:47:54 -07:00
Ilya Kreymer
14af9287dc
warc loading tests: use custom __repr__ to match results after latest warcio change (for now)
2017-04-28 15:56:58 -07:00
Ilya Kreymer
74e64e701d
py27 fix: add to_native_str() for new url, header usage
2017-04-28 14:40:42 -07:00
Ilya Kreymer
40f4b6bd94
urlrewrite cleanup:
...
frontendapp: pass properly decoded url from router
rewriterapp: read upstream cdx from Webagg-Cdx header
cleanup unused code
2017-04-28 12:37:24 -07:00
Ilya Kreymer
46e2d27e54
webagg improvements:
...
- add _get_referrer() access to index source, can pass to loader via cdx['set_referrer']
- make MementoIndexSource more extensible
- move WAYBACK_ORIG_SUFFIX into BaseIndexSource for extensibility
- fix RemoteIndexSource 'closest' not being set, update template to use 'closest' instead of 'timestamp'
- update remote index tests to use 'closest' instead of 'timestamp'
- loader: set referrer via cdx['set_referrer']
- loader: pass cdx to downstream via Webagg-Cdx header
- utils: ParamFormatter also looks for unprefixed key in params
2017-04-28 12:32:45 -07:00
Ilya Kreymer
082487ab3c
support per-collection assets again:
...
- wb-manager added metadata now loaded dynamically, cached, for search and index pages (#196)
- metadata updated w/o restart (#87)
- per-collection template overrides and per-template static file support
tests: test_auto_colls.py fully ported to new system
(per-collection config.yaml no longer supported)
2017-04-26 12:18:36 -07:00
Ilya Kreymer
52dc46fe6a
remove obsolete code and tests!
...
disable test_auto_colls for now until fully supported in new system
2017-04-25 19:39:19 -07:00
Ilya Kreymer
24c968640d
fuzzymatcher: better fix for mime-type matching if no mime
2017-04-25 14:48:09 -07:00
Ilya Kreymer
b3bc7765a1
fuzzymatcher fix: don't assume 'mime' is always present
2017-04-25 14:42:49 -07:00
Ilya Kreymer
d32c6d492b
tests: disable webagg output tests until they can be stabilized
2017-04-24 16:34:53 -07:00
Ilya Kreymer
478600716d
urllib3: use version from requests
...
coverage: use gevent concurrency
2017-04-24 16:32:23 -07:00
Ilya Kreymer
7ceeb32531
proxy support: update for wsgiprox==1.2, transfer-encoding/buffering support now part of wsgiprox
...
frame insert: set 'iframe_url' to full rewritten url, or in proxy mode, original url with scheme matching current scheme
2017-04-24 15:08:42 -07:00
Ilya Kreymer
15a7b15d44
proxy mode support via rewriterapp!
...
- check for 'wsgiprox.fixed_host' and use that as host_prefix if set
- don't include Connection/Proxy-Connection headers in upstream request
- ensure proxy response has length or is chunk-encoded
2017-04-22 18:17:41 -07:00
Ilya Kreymer
e060ea7b56
frontendapp: encapsulate, don't extend rewriterapp
...
rewriterapp: add 'Content-Location' if fuzzy match, or if using memento
tests: fix test to check for Content-Location for fuzzy match instead of redirect
2017-04-21 15:37:21 -07:00
Ilya Kreymer
4b055c9394
client-rewrite: support proper srcset= attr rewriting
2017-04-21 12:31:56 -07:00
Ilya Kreymer
45869eab42
server-side rewrite: experiment with JSONP rewriter, running on all json content #213
...
(previous json-rewriting defaulted to none)
2017-04-19 15:42:13 -07:00
Ilya Kreymer
3dd6c442ed
client-side rewrite: unrewrite accessing Attr object value/nodeValue for href, src, poster attributes
2017-04-18 11:40:28 -07:00
Ilya Kreymer
8849eb494e
client-side: init postMessage override on iframe access
2017-04-17 13:39:41 -07:00
Ilya Kreymer
0c833eb27e
client-side rewrite fixes:
...
- rewrite-blob: more generic removal of postMessage override for worker scripts
- rewrite-style: wrap decodeURIComponent in exception handling
2017-04-15 23:37:07 -07:00
Ilya Kreymer
bc50b908b7
html rewrite: fix <base> tag rewriting
...
ensure 'rebased' urlrewriter is set to absolute url
tests: add test to verify <base> rewriting, relative and absolute
2017-04-15 12:32:16 -07:00
Ilya Kreymer
79a35bcf9c
options: add check for 'enable_memento' option before adding memento headers
...
pass options to frontend app
2017-04-15 08:32:20 -07:00
Ilya Kreymer
bae9a09671
client-side Date override: override 'constructor' property so 'new Date().constructor == Date'
2017-04-14 09:21:29 -07:00
Ilya Kreymer
f593b5f80f
trailing slash fix: add trailing slash, preserving query, if no slash present after hostname (#211)
2017-04-04 18:10:49 -07:00
Ilya Kreymer
7ca5795976
ensure trailing slash: redirect to ensure a host-only url has a trailing slash, eg. /live/http://example.com -> /live/http://example.com/
2017-04-04 15:41:03 -07:00
Ilya Kreymer
26662f7df3
setup: generate current git_hash into autogenerated 'pywb.git_hash' file, add to .gitignore
2017-03-28 10:31:43 -07:00
Ilya Kreymer
69af57dedf
js regex rewrite: fix ternary op rewrite, remove commented-out regexes, add a few more tests
2017-03-21 11:50:40 -07:00
Ilya Kreymer
15ad56c024
rewrite dash: support for using custom rewriting function (for FB)
...
rewrite_fb_dash() added for rewriting dash xml, embedded in js, embedded in html
todo: refactor to make more general support for custom rewriting functions
regex_rewriter: add ':' to exclude from rewrite again
2017-03-21 11:18:53 -07:00
Ilya Kreymer
a20480b9ab
wombat rewrite: rewrite href="data:text/css" using rewrite_style()
...
rewrite_style fix: replace all 'WB_wombat_' in text, not just the first one
2017-03-21 11:17:15 -07:00
Ilya Kreymer
55def50de7
rewriterapp: readd range: only convert to 206 if response is 200
2017-03-21 18:13:34 +00:00
Ilya Kreymer
5671017e8f
rewrite: add rewrite_dash.py for DASH and HLS rewriting
2017-03-20 15:15:00 -07:00
Ilya Kreymer
a82cfc1ab2
rewriter: add rewrite_dash for rewriting DASH and HLS manifests!
...
rewriter: refactor to use mixins to extend base rewriter (todo: more refactoring)
fuzzy-matcher: support for additional 'match_filters' to filter fuzzy results via optional regexes by mime type,
eg. allow more lenient fuzzy matching on DASH manifests than other resources (for now)
fuzzy-matching: add WebAgg-Fuzzy-Match response header if response is fuzzy matched, redirect to exact match in rewriterapp
2017-03-20 14:41:12 -07:00
Ilya Kreymer
22edb2f14b
frontendapp: fix error response return
2017-03-18 16:52:13 -07:00
Ilya Kreymer
0937c2b58f
recorder tests: fix revisit/skip tests by switching from httpbin.org/get to httpbin/user-agent,
...
as /get now inserts a random request id and no longer returns any duplicates
2017-03-18 10:34:28 -07:00
Ilya Kreymer
037fca5b78
tests: fix rewrite test for srcset
2017-03-15 11:43:40 -07:00
Ilya Kreymer
c421b1c5ea
html rewriter: srcset rewrite: don't add extra space
2017-03-15 11:15:20 -07:00
Ilya Kreymer
1344907032
wombat fixes: message listener fixes for multiple listeners
...
- don't reject multiple listeners
- create new WrappedListener() obj for each listener
- extract_orig() add current scheme if url starts with '//'
2017-03-15 11:14:04 -07:00
Ilya Kreymer
93f26452e5
wombat fixes:
...
- add service worker rewrite
- add documentURI rewrite
- allow history change from "about:blank"
2017-03-14 18:28:18 -07:00
Ilya Kreymer
20e49c7391
karma fixes: avoid accessing undef var
2017-03-14 12:28:13 -07:00
Ilya Kreymer
8ddf43684f
karma: add stack trace
2017-03-14 12:14:04 -07:00
Ilya Kreymer
09a0779abb
fix karma test for wombat change
2017-03-14 11:59:28 -07:00
Ilya Kreymer
a76dbefec2
regex rewrite: loosen rules for top & location rewrite, add tests
...
.WB_wombat_location and .WB_wombat_top overrides should help with less strict rewriting
2017-03-14 11:44:15 -07:00