mirror of https://github.com/webrecorder/pywb.git synced 2025-03-15 16:14:48 +01:00

2031 Commits

Author SHA1 Message Date
Ilya Kreymer
24c968640d fuzzymatcher: better fix for mime-type matching if no mime 2017-04-25 14:48:09 -07:00
Ilya Kreymer
b3bc7765a1 fuzzymatcher fix: don't assume 'mime' is always present 2017-04-25 14:42:49 -07:00
Ilya Kreymer
d32c6d492b tests: disable webagg output tests until they can be stabilized 2017-04-24 16:34:53 -07:00
Ilya Kreymer
478600716d urllib3: use version from requests
coverage: use gevent concurrency
2017-04-24 16:32:23 -07:00
Ilya Kreymer
7ceeb32531 proxy support: update for wsgiprox==1.2, transfer-encoding/buffering support now part of wsgiprox
frame insert: set 'iframe_url' to full rewritten url, or in proxy mode, original url with scheme matching current scheme
2017-04-24 15:08:42 -07:00
Ilya Kreymer
15a7b15d44 proxy mode support via rewriterapp!
- check for 'wsgiprox.fixed_host' and use that as host_prefix if set
- don't include Connection/Proxy-Connection headers in upstream request
- ensure proxy response has length or is chunk-encoded
2017-04-22 18:17:41 -07:00
Ilya Kreymer
e060ea7b56 frontendapp: encapsulate, don't extend rewriterapp
rewriterapp: add 'Content-Location' if fuzzy match, or if using memento
tests: fix test to check for Content-Location for fuzzy match instead of redirect
2017-04-21 15:37:21 -07:00
Ilya Kreymer
4b055c9394 client-rewrite: support proper srcset= attr rewriting 2017-04-21 12:31:56 -07:00
Ilya Kreymer
45869eab42 server-side rewrite: experiment with JSONP rewriter, running on all json content #213
(previous json-rewriting defaulted to none)
2017-04-19 15:42:13 -07:00
Ilya Kreymer
3dd6c442ed client-side rewrite: unrewrite accessing Attr object value/nodeValue for href, src, poster attributes 2017-04-18 11:40:28 -07:00
Ilya Kreymer
8849eb494e client-side: init postMessage override on iframe access 2017-04-17 13:39:41 -07:00
Ilya Kreymer
0c833eb27e client-side rewrite fixes:
- rewrite-blob: more generic removal of postMessage override for worker scripts
- rewrite-style: wrap decodeURIComponent in exception handling
2017-04-15 23:37:07 -07:00
Ilya Kreymer
bc50b908b7 html rewrite: fix <base> tag rewriting
ensure 'rebased' urlrewriter is set to absolute url
tests: add test to verify <base> rewriting, relative and absolute
2017-04-15 12:32:16 -07:00
Ilya Kreymer
79a35bcf9c options: add check for 'enable_memento' option before adding memento headers
pass options to frontend app
2017-04-15 08:32:20 -07:00
Ilya Kreymer
bae9a09671 client-side Date override: override 'constructor' property so 'new Date().constructor == Date' 2017-04-14 09:21:29 -07:00
Ilya Kreymer
f593b5f80f trailing slash fix: add trailing slash, preserving query, if no slash present after hostname (#211) 2017-04-04 18:10:49 -07:00
Ilya Kreymer
7ca5795976 ensure trailing slash: redirect to ensure a host-only url has a trailing slash, eg. /live/http://example.com -> /live/http://example.com/ 2017-04-04 15:41:03 -07:00
Ilya Kreymer
26662f7df3 setup: generate current git_hash into autogenerated 'pywb.git_hash' file, add to .gitignore 2017-03-28 10:31:43 -07:00
Ilya Kreymer
69af57dedf js regex rewrite: fix ternary op rewrite, remove commented out regexes, add a few more tests 2017-03-21 11:50:40 -07:00
Ilya Kreymer
15ad56c024 rewrite dash: support for using custom rewriting function (for FB)
rewrite_fb_dash() added for rewriting dash xml, embedded in js, embedded in html
todo: refactor to make more general support for custom rewriting functions
regex_rewriter: add ':' to exclude from rewrite again
2017-03-21 11:18:53 -07:00
Ilya Kreymer
a20480b9ab wombat rewrite: rewrite href="data:text/css" using rewrite_style()
rewrite_style fix: replace all 'WB_wombat_' in text, not just the first one
2017-03-21 11:17:15 -07:00
Ilya Kreymer
55def50de7 rewriterapp: re-add range: only convert to 206 if response is 200 2017-03-21 18:13:34 +00:00
Ilya Kreymer
5671017e8f rewrite: add rewrite_dash.py for DASH and HLS rewriting 2017-03-20 15:15:00 -07:00
Ilya Kreymer
a82cfc1ab2 rewriter: add rewrite_dash for rewriting DASH and HLS manifests!
rewriter: refactor to use mixins to extend base rewriter (todo: more refactoring)
fuzzy-matcher: support for additional 'match_filters' to filter fuzzy results via optional regexes by mime type,
eg. allow more lenient fuzzy matching on DASH manifests than other resources (for now)
fuzzy-matching: add WebAgg-Fuzzy-Match response header if response is fuzzy matched, redirect to exact match in rewriterapp
2017-03-20 14:41:12 -07:00
Ilya Kreymer
22edb2f14b frontendapp: fix error response return 2017-03-18 16:52:13 -07:00
Ilya Kreymer
0937c2b58f recorder tests: fix revisit/skip tests by switching from httpbin.org/get to httpbin/user-agent,
as /get now inserts a random request id and no longer returns any duplicates
2017-03-18 10:34:28 -07:00
Ilya Kreymer
037fca5b78 tests: fix rewrite test for srcset 2017-03-15 11:43:40 -07:00
Ilya Kreymer
c421b1c5ea html rewriter: srcset rewrite: don't add extra space 2017-03-15 11:15:20 -07:00
Ilya Kreymer
1344907032 wombat fixes: message listener fixes for multiple listeners
- don't reject multiple listeners
- create new WrappedListener() obj for each listener
- extract_orig() add current scheme if url starts with '//'
2017-03-15 11:14:04 -07:00
Ilya Kreymer
93f26452e5 wombat fixes:
- add service worker rewrite
- add documentURI rewrite
- allow history change from "about:blank"
2017-03-14 18:28:18 -07:00
Ilya Kreymer
20e49c7391 karma fixes: avoid accessing undef var 2017-03-14 12:28:13 -07:00
Ilya Kreymer
8ddf43684f karma: add stack trace 2017-03-14 12:14:04 -07:00
Ilya Kreymer
09a0779abb fix karma test for wombat change 2017-03-14 11:59:28 -07:00
Ilya Kreymer
a76dbefec2 regex rewrite: loosen rules for top & location rewrite, add tests
.WB_wombat_location and .WB_wombat_top overrides should help with less strict rewriting
2017-03-14 11:44:15 -07:00
Ilya Kreymer
0f0c20a03a fuzzy matching: new, clean fuzzy matcher implementation for webagg
rules: default rule: fuzzy match urls ignoring prefix match (needs more testing)
tests: update tests for new broad fuzzy match rule
2017-03-14 11:44:15 -07:00
Ilya Kreymer
e0878f0f67 wombat: reinit paths if inited via new window creation/iframe to reflect correct url!
refactor wombat into single _WBWombat object
2017-03-14 11:44:09 -07:00
Ilya Kreymer
8fe2c1b5bd apps & cli: remove old apps, keep:
- webagg-server
- wayback
- live-rewrite-server
support adding custom settings to AutoApp
support for --live flag that automatically adds live-web source at '/live'
tests: disable cdx_server tests as old cdx_server removed
2017-03-12 12:21:54 -07:00
Ilya Kreymer
ac84dcc2e3 setup: cleanup deps: remove urllib3 (installed by requests), add werkzeug to core deps 2017-03-12 12:21:23 -07:00
Ilya Kreymer
57eba8fcde client side rewrite: add override for window.frames access 2017-03-12 09:47:29 -07:00
Ilya Kreymer
cab1c43473 live: switch live-rewrite-server to new arch, remove old live_rewrite_server.py 2017-03-10 14:15:02 -08:00
Ilya Kreymer
544df71302 setup: use latest webtest again
tests: use geventwebserver for LiveServerTests instead of separate process
2017-03-10 11:19:27 -08:00
Ilya Kreymer
baa248c502 responseloader: for py2, look at the original header line only 2017-03-10 11:16:05 -08:00
Ilya Kreymer
d04f8fc2e3 recorder: cookie filter:
- update ExcludeSpecificHeaders() to be passed directly as a filter to warcio
- add ExcludeHttpOnlyCookiesHeader() to exclude only Set-Cookie if HttpOnly is present
remove unused code
2017-03-10 10:07:13 -08:00
Ilya Kreymer
7a8fed2681 update to warcio 1.1
archiveindexer: explicitly consume content for each record
2017-03-10 10:05:39 -08:00
Ilya Kreymer
af7bbfd6e1 build: update gevent, support py3.6 2017-03-09 11:59:54 -08:00
Ilya Kreymer
d4321792b7 tests: convert test_inputreq to use werkzeug (same as the app), remove bottle from test dependencies 2017-03-08 23:09:19 -08:00
Ilya Kreymer
e86e3e6d32 build process: simplify build process by moving essential deps to requirements.txt, and extras to extra_requirements.txt
setup.py just loads from requirements.txt
Dockerfile pip installs requirements, then extra requirements for improved caching
travis runs setup install, then installs extra requirements
2017-03-08 17:05:29 -08:00
Ilya Kreymer
738fc0e427 Merge pull request #209 from ikreymer/warcio-split
Warcio split
2017-03-08 16:35:08 -08:00
Ilya Kreymer
98c0475806 test: fix test to use closest='now' for live test 2017-03-08 12:50:51 -08:00
Ilya Kreymer
a2ffbde2f6 dockerfile: add portalocker
rewriterapp: don't add memento headers for ajax responses to avoid replay issues
2017-03-08 12:30:20 -08:00