Ilya Kreymer
a8c0ff3c06
client rewrite: fix window.fetch override, create new Request object if url is rewritten
2016-11-23 11:46:01 -08:00
Ilya Kreymer
e5adc5ba69
responseloader: ensure Host header is correct when sending non-live remote request
2016-11-23 11:41:13 -08:00
Ilya Kreymer
685d48d531
webagg: split RedisMultiKeyIndexSource into Base* version to make more extensible, support different agg mixin
...
indexsource: update __repr__ funcs to use current classname
2016-11-22 18:21:01 -08:00
Ilya Kreymer
99e533d31b
remoteindex: add limit when doing closest query
...
responseloader: ignore scheme from self-redir check
2016-11-21 21:42:07 -08:00
Ilya Kreymer
74276f58f3
webagg improvements:
...
responseloader: direct loader: unrewrite location, content-location headers for non-live responses
autoapp: support custom indexsource list
indexsource: ensure closest query is added for RemoteIndexSource
utils res_template: urlencode '{url}' param if after '?'
2016-11-21 18:59:22 -08:00
Ilya Kreymer
cbe7508afc
webagg: add ZipNumIndexSource, add zip and cdxops test using new webagg IndexSource system
...
autoapp: add init_index_agg() for initializing indexes from a config dict
autoapp config: use RedisMultiKeyIndexSource for redis url and ZipNumIndexSource as zipnum+
2016-11-18 16:40:14 -08:00
Ilya Kreymer
1d8ddb8d20
responseloader: support for gzip compressing warc record with 'compress=gzip' param
...
prefix resolver: if prefix contains '*', attempt to resolve with glob, ignore none prefix
2016-11-17 19:11:54 -08:00
Ilya Kreymer
eac5bdce26
webagg: add AutoConfigApp initing the webagg system from config.yaml
...
all index sources can be inited from string or dictionary (loaded from yaml)
support for dynamic directory-based collections based on file system, as well as static routes
specified explicitly
add `-cdx` path for compatibility with existing pywb -cdx interface
tests: add tests for AutoConfigApp yaml loading
add WSGI app shortcut for AutoConfigApp
2016-11-17 19:06:04 -08:00
Ilya Kreymer
34a03a78f6
app: fix missing import, add simple route path
...
test: fix typo
2016-11-17 19:00:29 -08:00
Ilya Kreymer
d24868db7a
tests: add MementoOverrideTests as a reusable class, convert memento_agg tests to use class,
...
handlers: add saved link header data for memento tests for handlers
2016-11-15 14:24:34 -08:00
Ilya Kreymer
cec0db1bdd
rules: instagram rules tweak, ignore query args
2016-11-14 13:19:26 -08:00
Ilya Kreymer
41f6ca9bb6
rules: update rules for medium, instagram
...
bump version to 0.33.1
2016-11-13 22:50:53 -08:00
Ilya Kreymer
008bc47fad
tests & travis: change live test to httpbin, remove 3.3 for now
2016-11-13 18:37:47 -08:00
Ilya Kreymer
36862fd9e9
recorder test: fix warc/revisit cdx test (don't assume exact order with 14-digit timestamp)
2016-11-13 11:46:10 -08:00
Ilya Kreymer
4a94aefead
travis fixes: add dependency, remove unnecessary include
2016-11-11 12:07:51 -08:00
Ilya Kreymer
8765de4fe7
refactor: updated dependencies, remove watchdog, add gevent and webassets
...
update tests, tests should pass for python 2 and 3!
2016-11-11 10:32:19 -08:00
Ilya Kreymer
ab77c1b6d9
refactor autoindex: switch to gevent-based simple polling, as watchdog doesn't work with gevent #200
2016-11-11 10:31:48 -08:00
Ilya Kreymer
fa247b8fe5
refactor: fix recorder and urlrewrite packages #200
2016-11-08 15:04:22 -08:00
Ilya Kreymer
6b4b038471
refactor: fix pywb.webagg package paths
...
all webagg tests working!
move testdata cdxj into sample_archive, remove rest (duplicates) #200
2016-11-08 14:30:09 -08:00
Ilya Kreymer
99e5008ac0
refactor: move newly merged packages to be pywb subpackages
2016-11-08 07:01:33 -08:00
Ilya Kreymer
c44e780c12
bump version to 0.33.0 for release
2016-10-24 10:45:30 -07:00
Ilya Kreymer
3f8480c37e
typo: fix typo after rename!
2016-10-20 11:47:06 -07:00
Ilya Kreymer
40b0a291a9
rewrite: don't rewrite ajax-requested html content
...
js regex: add special regex to rewrite '?location:'
2016-10-20 11:30:14 -07:00
Ilya Kreymer
52ce45beee
tests: additional test for new modifier form
2016-10-19 21:17:40 -07:00
Ilya Kreymer
42a31bbebf
wombat improvements:
...
- history change check: don't reject urls without a slash, check if new url == origin
- new api: override window.fetch() if it exists
- srcset elem rewriting, <source> element srcset override
- ajax: don't add X-Pywb-Requested-With header if url is a data: url
2016-10-19 21:11:16 -07:00
Ilya Kreymer
8b77f66a10
wb_frame.js: make more safe, check that frame actually exists before accessing
2016-10-19 20:57:56 -07:00
Ilya Kreymer
7b45df7338
wburl: support for new modifier form: $mod as well as 'mod_'
2016-10-10 17:00:36 -07:00
Ilya Kreymer
06b9e957e6
vidrw: when in proxy mode, use current protocol for vi_ query
2016-10-03 08:17:13 -07:00
Ilya Kreymer
28dd799516
wombat: auto-disable notifications and geolocation queries
2016-10-01 21:08:53 -07:00
Ilya Kreymer
b8769c7de0
proxy mode: use js_proxy rewriter for js embedded in html when in proxy mode #198
2016-10-01 21:08:08 -07:00
Ilya Kreymer
e97d2fb517
wombat unrewrite: if given a host-relative url (starting with '/') to extract_orig(), extract as host-relative as well if the host matches the current origin -- maintain host-relative urls when possible
2016-10-01 13:53:59 -07:00
Ilya Kreymer
950c31737c
wombat typo: check that __WB_top_frame is not null before using!
2016-09-30 13:49:57 -07:00
Ilya Kreymer
a4efa58d1e
proxy mode: add special 'proxy_js' rewriter which defaults to none rewriter, but supports custom rules
...
from rules.yaml, to avoid inserting WB_wombat_ overrides in proxy mode #198
2016-09-30 11:33:30 -07:00
Ilya Kreymer
2079ce191c
header rewriter improvements: better define headers rewritten/prefixed due to content rewrite vs url rewriting
...
when in proxy mode, don't rewrite headers unless related to content, transfer-encoding or caching (separate settings) #197
2016-09-30 09:02:50 -07:00
Ilya Kreymer
718cd43ae2
client rewrite: improvements for proxy mode
...
- disable most overrides when in proxy mode
- if using rewrite_url(), keep current scheme, instead of defaulting to http
- use 'window._wb_js' to check init
2016-09-29 15:26:12 -07:00
Ilya Kreymer
bdf4f9bc71
static handler: if 'wsgi.file_wrapper' throws exception, default to streaming directly
2016-09-29 15:23:40 -07:00
Ilya Kreymer
e61078ab96
memento: use replace_header() to avoid double adding Link, Memento-Datetime, Vary when using range request cache
2016-09-29 15:22:44 -07:00
Ilya Kreymer
4cdb99f415
rewrite: strip www redir check: use re.MULTILINE to include urls that may have a \r
2016-09-29 15:20:25 -07:00
Ilya Kreymer
98e8a75920
vidrw: more permissive flash video rewriting: consider any <object> with flashvars, attempt any youtube-dl playlist
...
bump version to 0.32.2
2016-09-21 11:37:31 -07:00
Ilya Kreymer
a6a186891e
wbrequestresponse: text response: calculate Content-Length from encoded utf-8 bytes, not the original text
2016-09-20 15:44:50 -07:00
Ilya Kreymer
1bb7aa01ce
wburl improved scheme detection: use regex to match acceptable scheme before :/, don't treat something like 'a.com/?x=http://' as having a scheme, update tests to check for this
2016-09-20 15:44:50 -07:00
Ilya Kreymer
9a3017bfcd
bump version to 0.32.1
2016-09-20 15:44:49 -07:00
Ilya Kreymer
dc05d14934
Merge pull request #194 from nlevitt/cli-desc
...
fix/tweak for cli --help
2016-09-15 14:16:42 -07:00
Ilya Kreymer
86cbb366f3
rules: undo yt rules change (will revisit later)
2016-09-15 10:01:36 -07:00
Ilya Kreymer
0a76a56b91
wombat: edge case: correctly handle <iframe src="javascript:WB_wombat_location=...> assignment created via JS.. custom rewrite_frame_src() added for use with rewrite_elem(), ensures wombat init is inserted first thing after 'javascript:'
2016-09-14 15:44:20 -07:00
Ilya Kreymer
cc65ce914d
wombat improvements (2.16):
...
- rewrite_elem() also rewrite 'poster'
- extract_orig() don't add http:// if nothing extracted
- new override: navigator.sendBeacon() if available
2016-09-14 14:13:59 -07:00
Ilya Kreymer
5fede0fea3
wombat: turn off debugging (accidentally committed)
2016-09-14 13:39:10 -07:00
Ilya Kreymer
1fb6e9b5fa
rewrite: url rewriter: don't rewrite relative urls, only those that start with scheme, / or contain ../ #195
...
update tests to reflect this new behavior
2016-09-14 13:04:46 -07:00
Noah Levitt
1620668363
fix/tweak for cli --help
2016-09-14 09:58:44 -07:00
Ilya Kreymer
70fdaae2b3
rules: rewrite location string for periscope js
2016-09-12 20:07:14 -07:00