Ilya Kreymer
762f669d13
rules: fuzzy match update:
...
- ignore all query args for flash files
- ignore cb= param for all urls
2017-05-12 08:55:03 -07:00
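A minimal sketch of the idea behind the two fuzzy-match rules above, not pywb's actual rule engine (which is driven by its rules configuration). The helper name `fuzzy_key` and the assumption that "flash files" means a `.swf` path are illustrative only.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def fuzzy_key(url, flash_exts=('.swf',)):
    """Return a normalized url to compare when looking for a fuzzy match.

    Hypothetical helper: drops every query arg for flash files and
    drops only the cache-busting 'cb' param for all other urls.
    """
    parts = urlsplit(url)
    if parts.path.lower().endswith(flash_exts):
        # flash file (assumed: .swf extension): ignore all query args
        query = ''
    else:
        # any other url: ignore only the cb= param, keep the rest in order
        pairs = [(k, v)
                 for k, v in parse_qsl(parts.query, keep_blank_values=True)
                 if k != 'cb']
        query = urlencode(pairs)
    return urlunsplit(parts._replace(query=query, fragment=''))
```

Two captures whose keys are equal under this normalization would be treated as candidates for the same resource.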
Ilya Kreymer
94262546d5
integration tests: add fixture to run all relevant tests in framed and non-framed mode
...
rename test_framed_inverse -> test_memento, remove unneeded test config
2017-05-03 20:05:07 -07:00
Ilya Kreymer
296b4ed94d
client-side rewrite: remove WB_wombat_ from any id/class= in document.write()
2017-05-03 15:31:06 -07:00
Ilya Kreymer
7434cb619e
config: ensure 'framed_replay' config is loaded again (default to true)
...
config template overrides: check config for overrides for all templates again
fixes #216
2017-05-02 10:05:11 -07:00
Ilya Kreymer
3fea5288b2
tests: fix memento not found test to use different timegate (webenact)
2017-05-01 21:51:59 -07:00
Ilya Kreymer
147c3217dd
update to warcio==1.3
...
recorder: use ArcWarcRecordLoader() for parsing response record
multifilewarcwriter: ensure digest is computed before trying to lookup revisits
2017-05-01 21:50:39 -07:00
Ilya Kreymer
58f39f0558
setup: update to warcio==1.2
...
add ensure_http_headers=True when reading WARC records
tests: fix pytest warnings, use webtest.TestApp instead of TestApp
2017-04-29 13:47:54 -07:00
Ilya Kreymer
14af9287dc
warc loading tests: use custom __repr__ to match results after latest warcio change (for now)
2017-04-28 15:56:58 -07:00
Ilya Kreymer
74e64e701d
py27 fix: add to_native_str() for new url, header usage
2017-04-28 14:40:42 -07:00
Ilya Kreymer
40f4b6bd94
urlrewrite cleanup:
...
frontendapp: pass properly decoded url from router
rewriterapp: read upstream cdx from Webagg-Cdx header
cleanup unused code
2017-04-28 12:37:24 -07:00
Ilya Kreymer
46e2d27e54
webagg improvements:
...
- add _get_referrer() access to index source, can pass to loader via cdx['set_referrer']
- make MementoIndexSource more extensible
- move WAYBACK_ORIG_SUFFIX into BaseIndexSource for extensibility
- fix RemoteIndexSource 'closest' not being set, update template to use 'closest' instead of 'timestamp'
- update remote index tests to use 'closest' instead of 'timestamp'
- loader: set referrer via cdx['set_referrer']
- loader: pass cdx to downstream via Webagg-Cdx header
- utils: ParamFormatter also looks for unprefixed key in params
2017-04-28 12:32:45 -07:00
Ilya Kreymer
082487ab3c
support per-collection assets again:
...
- metadata added via wb-manager is now loaded dynamically and cached for search and index pages (#196)
- metadata updated w/o restart (#87)
- per-collection template overrides and per-template static file support
tests: test_auto_colls.py fully ported to new system
(per-collection config.yaml no longer supported)
2017-04-26 12:18:36 -07:00
Ilya Kreymer
52dc46fe6a
remove obsolete code and tests!
...
disable test_auto_colls for now until fully supported in new system
2017-04-25 19:39:19 -07:00
Ilya Kreymer
24c968640d
fuzzymatcher: better fix for mime-type matching if no mime
2017-04-25 14:48:09 -07:00
Ilya Kreymer
b3bc7765a1
fuzzymatcher fix: don't assume 'mime' is always present
2017-04-25 14:42:49 -07:00
Ilya Kreymer
d32c6d492b
tests: disable webagg output tests until they can be stabilized
2017-04-24 16:34:53 -07:00
Ilya Kreymer
478600716d
urllib3: use version from requests
...
coverage: use gevent concurrency
2017-04-24 16:32:23 -07:00
Ilya Kreymer
7ceeb32531
proxy support: update for wsgiprox==1.2, transfer-encoding/buffering support now part of wsgiprox
...
frame insert: set 'iframe_url' to full rewritten url, or in proxy mode, original url with scheme matching current scheme
2017-04-24 15:08:42 -07:00
Ilya Kreymer
15a7b15d44
proxy mode support via rewriterapp!
...
- check for 'wsgiprox.fixed_host' and use that as host_prefix if set
- don't include Connection/Proxy-Connection headers in upstream request
- ensure proxy response has length or is chunk-encoded
2017-04-22 18:17:41 -07:00
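The header filtering described above could look roughly like the following. This is a sketch under the assumption that headers are held as a list of (name, value) tuples; `upstream_headers` is a hypothetical name, not a function from rewriterapp.

```python
# Headers that describe the client<->proxy connection and should not be
# forwarded to the upstream server (the two named in the commit above).
SKIPPED_HEADERS = {'connection', 'proxy-connection'}


def upstream_headers(headers):
    """Return a copy of headers without Connection/Proxy-Connection.

    Header names are compared case-insensitively.
    """
    return [(name, value) for name, value in headers
            if name.lower() not in SKIPPED_HEADERS]
```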
Ilya Kreymer
e060ea7b56
frontendapp: encapsulate, don't extend rewriterapp
...
rewriterapp: add 'Content-Location' if fuzzy match, or if using memento
tests: fix test to check for Content-Location for fuzzy match instead of redirect
2017-04-21 15:37:21 -07:00
Ilya Kreymer
4b055c9394
client-rewrite: support proper srcset= attr rewriting
2017-04-21 12:31:56 -07:00
Ilya Kreymer
45869eab42
server-side rewrite: experiment with JSONP rewriter, running on all json content #213
...
(previous json-rewriting defaulted to none)
2017-04-19 15:42:13 -07:00
Ilya Kreymer
3dd6c442ed
client-side rewrite: unrewrite accessing Attr object value/nodeValue for href, src, poster attributes
2017-04-18 11:40:28 -07:00
Ilya Kreymer
8849eb494e
client-side: init postMessage override on iframe access
2017-04-17 13:39:41 -07:00
Ilya Kreymer
0c833eb27e
client-side rewrite fixes:
...
- rewrite-blob: more generic removal of postMessage override for worker scripts
- rewrite-style: wrap decodeURIComponent in exception handling
2017-04-15 23:37:07 -07:00
Ilya Kreymer
bc50b908b7
html rewrite: fix <base> tag rewriting
...
ensure 'rebased' urlrewriter is set to absolute url
tests: add test to verify <base> rewriting, relative and absolute
2017-04-15 12:32:16 -07:00
Ilya Kreymer
79a35bcf9c
options: add check for 'enable_memento' option before adding memento headers
...
pass options to frontend app
2017-04-15 08:32:20 -07:00
Ilya Kreymer
bae9a09671
client-side Date override: override 'constructor' property so 'new Date().constructor == Date'
2017-04-14 09:21:29 -07:00
Ilya Kreymer
f593b5f80f
trailing slash fix: add trailing slash, preserving query, if no slash present after hostname (#211)
2017-04-04 18:10:49 -07:00
Ilya Kreymer
7ca5795976
ensure trailing slash: redirect to ensure a host-only url has a trailing slash, e.g. /live/http://example.com -> /live/http://example.com/
2017-04-04 15:41:03 -07:00
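A small sketch of the trailing-slash normalization described in the two commits above, assuming standard-library url parsing. `ensure_trailing_slash` is a hypothetical helper, not the function pywb uses; the real code issues a redirect to the normalized url.

```python
from urllib.parse import urlsplit, urlunsplit


def ensure_trailing_slash(url):
    """Add '/' after the hostname when the url has no path, keeping the query."""
    parts = urlsplit(url)
    if parts.netloc and not parts.path:
        # host-only url: insert the root path before any query string
        parts = parts._replace(path='/')
    return urlunsplit(parts)
```

A caller would redirect only when the returned url differs from the requested one.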
Ilya Kreymer
26662f7df3
setup: generate current git_hash into autogenerated 'pywb.git_hash' file, add to .gitignore
2017-03-28 10:31:43 -07:00
Ilya Kreymer
69af57dedf
js regex rewrite: fix ternary op rewrite, remove commented out regexes, add a few more tests
2017-03-21 11:50:40 -07:00
Ilya Kreymer
15ad56c024
rewrite dash: support for using custom rewriting function (for FB)
...
rewrite_fb_dash() added for rewriting dash xml, embedded in js, embedded in html
todo: refactor to make more general support for custom rewriting functions
regex_rewriter: add ':' to exclude from rewrite again
2017-03-21 11:18:53 -07:00
Ilya Kreymer
a20480b9ab
wombat rewrite: rewrite href="data:text/css" using rewrite_style()
...
rewrite_style fix: replace all 'WB_wombat_' in text, not just the first one
2017-03-21 11:17:15 -07:00
Ilya Kreymer
55def50de7
rewriterapp: readd range: only convert to 206 if response is 200
2017-03-21 18:13:34 +00:00
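The rule in this commit (only turn a full response into a 206 when the upstream status is 200) can be sketched as below. The function name, its signature, and the in-memory body are assumptions for illustration; rewriterapp's actual handling is not shown in this log.

```python
def apply_range(status, body, start, end):
    """Convert a full 200 response into a 206 partial response.

    Any other status (redirects, errors, ...) is passed through untouched.
    Returns (status, extra_headers, body).
    """
    if status != 200:
        return status, {}, body

    # clamp the inclusive end offset to the last byte of the body
    end = min(end, len(body) - 1)
    chunk = body[start:end + 1]
    headers = {
        'Content-Range': 'bytes {0}-{1}/{2}'.format(start, end, len(body)),
        'Content-Length': str(len(chunk)),
    }
    return 206, headers, chunk
```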
Ilya Kreymer
5671017e8f
rewrite: add rewrite_dash.py for DASH and HLS rewriting
2017-03-20 15:15:00 -07:00
Ilya Kreymer
a82cfc1ab2
rewriter: add rewrite_dash for rewriting DASH and HLS manifests!
...
rewriter: refactor to use mixins to extend base rewriter (todo: more refactoring)
fuzzy-matcher: support for additional 'match_filters' to filter fuzzy results via optional regexes by mime type,
eg. allow more lenient fuzzy matching on DASH manifests than other resources (for now)
fuzzy-matching: add WebAgg-Fuzzy-Match response header if response is fuzzy matched, redirect to exact match in rewriterapp
2017-03-20 14:41:12 -07:00
Ilya Kreymer
22edb2f14b
frontendapp: fix error response return
2017-03-18 16:52:13 -07:00
Ilya Kreymer
0937c2b58f
recorder tests: fix revisit/skip tests by switching from httpbin.org/get to httpbin/user-agent,
...
as /get now inserts a random request id and no longer returns any duplicates
2017-03-18 10:34:28 -07:00
Ilya Kreymer
037fca5b78
tests: fix rewrite test for srcset
2017-03-15 11:43:40 -07:00
Ilya Kreymer
c421b1c5ea
html rewriter: srcset rewrite: don't add extra space
2017-03-15 11:15:20 -07:00
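For illustration, a simplified srcset rewrite that normalizes spacing rather than adding to it. This is a sketch only: it assumes candidates are separated by commas that do not occur inside the urls, and `rewrite_srcset` with its `prefix` argument is a hypothetical stand-in for the real url rewriter.

```python
def rewrite_srcset(value, prefix):
    """Rewrite each url in a srcset value, keeping its descriptor (1x, 200w)."""
    out = []
    for candidate in value.split(','):
        candidate = candidate.strip()
        if not candidate:
            continue
        # first token is the url, the remainder (if any) is the descriptor
        url, _, descriptor = candidate.partition(' ')
        descriptor = descriptor.strip()
        rewritten = prefix + url
        out.append(rewritten + ' ' + descriptor if descriptor else rewritten)
    # exactly one space after each comma, none at the start or end
    return ', '.join(out)
```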
Ilya Kreymer
1344907032
wombat fixes: message listener fixes for multiple listeners
...
- don't reject multiple listeners
- create new WrappedListener() obj for each listener
- extract_orig() add current scheme if url starts with '//'
2017-03-15 11:14:04 -07:00
Ilya Kreymer
93f26452e5
wombat fixes:
...
- add service worker rewrite
- add documentURI rewrite
- allow history change from "about:blank"
2017-03-14 18:28:18 -07:00
Ilya Kreymer
20e49c7391
karma fixes: avoid accessing undef var
2017-03-14 12:28:13 -07:00
Ilya Kreymer
8ddf43684f
karma: add stack trace
2017-03-14 12:14:04 -07:00
Ilya Kreymer
09a0779abb
fix karma test for wombat change
2017-03-14 11:59:28 -07:00
Ilya Kreymer
a76dbefec2
regex rewrite: loosen rules for top & location rewrite, add tests
...
.WB_wombat_location and .WB_wombat_top overrides should help with less strict rewriting
2017-03-14 11:44:15 -07:00
Ilya Kreymer
0f0c20a03a
fuzzy matching: new, clean fuzzy matcher implementation for webagg
...
rules: default rule: fuzzy match urls ignoring prefix match (needs more testing)
tests: update tests for new broad fuzzy match rule
2017-03-14 11:44:15 -07:00
Ilya Kreymer
e0878f0f67
wombat: reinit paths if inited via new window creation/iframe to reflect correct url!
...
refactor wombat into single _WBWombat object
2017-03-14 11:44:09 -07:00
Ilya Kreymer
8fe2c1b5bd
apps & cli: remove old apps, keep:
...
- webagg-server
- wayback
- live-rewrite-server
support adding custom settings to AutoApp
support for --live flag that automatically adds live-web source at '/live'
tests: disable cdx_server tests as old cdx_server removed
2017-03-12 12:21:54 -07:00