mirror of https://github.com/webrecorder/pywb.git synced 2025-03-15 16:14:48 +01:00

1811 Commits

Author SHA1 Message Date
Ilya Kreymer
58b141bd53 add python 3 classifiers (#208) 2017-02-16 11:02:53 -08:00
Ilya Kreymer
531422fc1b client-side rewrite improvements:
- add overrides for document.URL, xhr.responseURL, function for general single property override
- postMessage: add overrides for additional MessageEvent properties, target, srcElement, path, eventPhase
- postMessage: avoid duplicate event listeners registered
- check for duplicate postMessage override inits
2017-02-15 17:03:15 -08:00
Ilya Kreymer
a5bc932e0c memento agg test: fix test to reflect change from link->* 2017-02-06 21:17:48 -05:00
Ilya Kreymer
06c6e0c6f8 memento agg: fix test to reflect change 2017-02-06 21:00:08 -05:00
Ilya Kreymer
1d5b48d3b6 indexsource: improve init_from_config() to always use current class
use '*' instead of 'link' for timemap for compatibility (for now)
2017-02-06 20:52:43 -05:00
Ilya Kreymer
564f548afa wombat improvements:
- xhr responseURL override, extract original url
- Worker override: if using 'blob:', extract blob and remove any postMessage() rewriting (workers won't have the __WB_pmw function)
- eval() override: conv to string before rewriting
2017-02-05 02:26:02 -05:00
Ilya Kreymer
7f8562a39d utils: LimitReader tell() proxies to original stream, available only if original has tell() 2017-02-04 22:54:43 -05:00
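The tell() behavior described in 7f8562a39d can be illustrated with a minimal sketch. This is a simplified stand-in, not pywb's actual LimitReader: the constructor signature and the read() details are assumptions made for illustration. The point shown is that tell() is exposed only when the wrapped stream has one, and it simply proxies to that stream.

```python
import io


class LimitReader(object):
    """Simplified stand-in: reads at most `limit` bytes from `stream`."""

    def __init__(self, stream, limit):
        self.stream = stream
        self.limit = limit
        # Expose tell() only if the original stream provides it;
        # calls are proxied straight to the original stream.
        if hasattr(stream, 'tell'):
            self.tell = stream.tell

    def read(self, length=None):
        # Cap the requested length at the number of bytes still allowed
        if length is None or length < 0:
            length = self.limit
        else:
            length = min(length, self.limit)

        if length == 0:
            return b''

        buff = self.stream.read(length)
        self.limit -= len(buff)
        return buff


reader = LimitReader(io.BytesIO(b'abcdefgh'), 5)
first = reader.read(3)
position = reader.tell()  # position reported by the underlying BytesIO
```

Because the attribute is set per instance, a wrapper around a stream without tell() has no tell attribute at all, so callers can check for it with hasattr().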
Ilya Kreymer
1a9f66f8b6 mementoindexsource: treat missing Link header as non-memento/not found 2017-01-27 00:07:38 -08:00
Ilya Kreymer
f92782d1dd utils: LimitReader: support tell() 2017-01-26 23:29:40 -08:00
Ilya Kreymer
9773eba47d setup reqs: use webassets==0.12.1 (with pyinstaller support), remove dependency on custom branch 2017-01-26 01:37:57 -08:00
Ilya Kreymer
84796ba810 setup req: fix jinja<2.9 for now due to issues in 2.9+ 2017-01-26 01:14:10 -08:00
Ilya Kreymer
2d54bb87be setup.py: ensure gevent monkey-patch is called before running tests with python setup.py test 2017-01-26 00:37:35 -08:00
Ilya Kreymer
bb64d0de54 url-rewrite cookie store: decode() only if redis returns byte strings in py3 2017-01-26 00:01:39 -08:00
Ilya Kreymer
2cc6f5b4d6 Merge pull request #203 from atomotic/new-pywb
replace fcntl with portalocker
2016-12-27 12:38:12 -08:00
raffaele messuti
524d9bfd26 portalocker for file locking check instead of fcntl. more portable on windows 2016-12-26 10:27:20 +01:00
Ilya Kreymer
0e414acfda setup: remove pyamf as default dep for now 2016-12-21 17:15:41 -08:00
Ilya Kreymer
3b82416ad3 setup: add specific dependencies for webassets, pyamf 2016-12-21 16:11:48 -08:00
Ilya Kreymer
c52efa0f9b loader improvements: add PackageLoader for pkg:// scheme
use pkgutil.get_data() instead of pkg_resources
template loading: load assets file through load() interface, use standard PackageLoader
2016-12-18 20:57:17 -08:00
Ilya Kreymer
fa85793e97 remove chunk_encoding of wsgi response: per pep 3333 (https://www.python.org/dev/peps/pep-3333/#other-http-features), the application/middleware should *not* add Transfer-Encoding header or chunk encode the response 2016-12-16 13:48:51 -08:00
Ilya Kreymer
5f7a62bd5e utils: expandvars only if not empty 2016-12-16 12:22:06 -08:00
Ilya Kreymer
d104b0f367 rewriterapp: ensure correct sized or chunked response:
if no content-length and http 1.1, chunk encode the response
if no content-length and http 1.0, buffer response and add content-length
utils: port buffer_iter() for buffering iter, returning another iter
utils load_config: expand any env vars
2016-12-16 11:19:40 -08:00
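The response-sizing rule in d104b0f367 can be sketched roughly as follows. This is an illustration under stated assumptions, not pywb's code: buffer_iter() here shares only its name with the utility mentioned in the commit (its signature is assumed), and chunk_encode_iter() and size_response() are hypothetical helpers.

```python
import io


def buffer_iter(headers, body_iter):
    """Buffer the whole body, add Content-Length, return a new iterator."""
    buff = io.BytesIO()
    for chunk in body_iter:
        buff.write(chunk)
    data = buff.getvalue()
    headers.append(('Content-Length', str(len(data))))
    return iter([data])


def chunk_encode_iter(body_iter):
    """Frame each non-empty chunk using HTTP/1.1 chunked transfer encoding."""
    for chunk in body_iter:
        if chunk:
            size_line = ('%X\r\n' % len(chunk)).encode('ascii')
            yield size_line + chunk + b'\r\n'
    # Zero-length chunk terminates the body
    yield b'0\r\n\r\n'


def size_response(protocol, headers, body_iter):
    """Hypothetical helper applying the rule from the commit message."""
    if any(name.lower() == 'content-length' for name, _ in headers):
        return body_iter  # already correctly sized
    if protocol == 'HTTP/1.1':
        headers.append(('Transfer-Encoding', 'chunked'))
        return chunk_encode_iter(body_iter)
    # HTTP/1.0: no chunked encoding available, so buffer and set the length
    return buffer_iter(headers, body_iter)
```

Note that the later commit fa85793e97 removes the chunk-encoding path, citing PEP 3333, so only the buffering branch reflects the behavior after that change.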
Ilya Kreymer
4ce65c5289 logging: disable excess print statements 2016-12-16 11:13:27 -08:00
Ilya Kreymer
fb91d116a9 urlrewrite cookietracker fix: rewrite Path of cookies retrieved from cookietracker (redis) using custom host scope rewriter (no other filtering) 2016-12-11 18:59:02 -08:00
Ilya Kreymer
bf402e68f6 warc: make ArcWarcRecord a class to allow modifying attribs
warcwriter: add option to not adjust content length if record already prepared
2016-12-09 18:09:42 -08:00
Ilya Kreymer
bbfe3a9d51 bufferedreader: read() op attempts to read entire buff or exact length, retries if boundary reached 2016-12-09 17:51:39 -08:00
Ilya Kreymer
50a3353da3 wsgi server: default to gevent-based wsgi server for all cmd line server apps, add -s command for specifying server #201
cli: add 'webagg-server' cli command for running new webagg system
tests: fix cli test for gevent server
2016-12-09 16:46:33 -08:00
Ilya Kreymer
4f9b963e13 tests: update test to support uncompressed followed after compressed block 2016-12-08 14:20:46 -08:00
Ilya Kreymer
bc219acb33 rewriterapp: fix async requests call typo! (was erroring with invalid params previously) 2016-12-08 13:44:11 -08:00
Ilya Kreymer
d772b05fd6 warc indexing improvements:
decompressor: allow plaintext after gzipped record fully finished, as next member
warc loader: ignore blank line records -- if empty statusheaders, set length to 0 and ignore, don't read indefinitely
2016-12-08 12:52:51 -08:00
Ilya Kreymer
2fa6aa7a20 Merge branch 'develop' into new-pywb 2016-12-02 20:43:10 -08:00
Ilya Kreymer
5690604556 client-side rewrite: add eval() override, add WB_wombat_ prefixes for location 2016-12-02 12:11:54 -08:00
Ilya Kreymer
936b8dfb86 live web loader: add support for optional forward proxy 2016-11-30 12:46:34 -08:00
Ilya Kreymer
fec907a299 responseloader live loader: increase httplib max headers to avoid 'too many headers' error 2016-11-28 10:36:58 -08:00
Ilya Kreymer
577ced76f0 dockerfile: set fixed requests version to avoid encoding issues in latest requests 2016-11-28 10:35:15 -08:00
Ilya Kreymer
1ef0a54988 recorder improvements:
- make recorder tempfile used by request/response wrappers overridable, better checks to ensure temp file is closed after recording is done/failed
- ensure ParamsFormatter inited for all requests
- writer: ensure writing from temp buffer done in BUFF_SIZE increments
2016-11-27 21:12:12 -05:00
Ilya Kreymer
d7d002c076 Merge branch 'develop' into new-pywb (rules and rewrite fixes) 2016-11-23 11:47:21 -08:00
Ilya Kreymer
a8c0ff3c06 client rewrite: fix window.fetch override, create new Request object if url is rewritten 2016-11-23 11:46:01 -08:00
Ilya Kreymer
e5adc5ba69 responseloader: ensure Host header is correct when sending non-live remote request 2016-11-23 11:41:13 -08:00
Ilya Kreymer
685d48d531 webagg: split RedisMultiKeyIndexSource into Base* version to make more extensible, support different agg mixin
indexsource: update __repr__ funcs to use current classname
2016-11-22 18:21:01 -08:00
Ilya Kreymer
99e533d31b remoteindex: add limit when doing closest query
responseloader: ignore scheme from self-redir check
2016-11-21 21:42:07 -08:00
Ilya Kreymer
c9a0259604 dockerfile: install dependencies first to speed up updates 2016-11-21 21:33:04 -08:00
Ilya Kreymer
74276f58f3 webagg improvements:
responseloader: direct loader: unrewrite location, content-location headers for non-live responses
autoapp: support custom indexsource list
indexsource: ensure closest query is added for RemoteIndexSource
utils res_template: urlencode '{url}' param if after '?'
2016-11-21 18:59:22 -08:00
Ilya Kreymer
cbe7508afc webagg: add ZipNumIndexSource, add zip and cdxops test using new webagg IndexSource system
autoapp: add init_index_agg() for initializing indexes from a config dict
autoapp config: use RedisMultiKeyIndexSource for redis url and ZipNumIndexSource as zipnum+
2016-11-18 16:40:14 -08:00
Ilya Kreymer
1d8ddb8d20 responseloader: support for gzip compressing warc record with 'compress=gzip' param
prefix resolver: if prefix contains '*', attempt to resolve with glob, ignore none prefix
2016-11-17 19:11:54 -08:00
Ilya Kreymer
eac5bdce26 webagg: add AutoConfigApp initing the webagg system from config.yaml
all index sources can be inited from string or dictionary (loaded from yaml)
support for dynamic directory-based collections based on file system, as well as static routes
specified explicitly
add `-cdx` path for compatibility with existing pywb -cdx interface
tests: add tests for AutoConfigApp yaml loading
add WSGI app shortcut for AutoConfigApp
2016-11-17 19:06:04 -08:00
Ilya Kreymer
34a03a78f6 app: fix missing import, add simple route path
test: fix typo
2016-11-17 19:00:29 -08:00
Ilya Kreymer
d24868db7a tests: add MementoOverrideTests as a reusable class, convert memento_agg tests to use class,
handlers: add saved link header data for memento tests for handlers
2016-11-15 14:24:34 -08:00
Ilya Kreymer
c7fa8b711c travis: trying 2.7, 3.5 only for now 2016-11-15 10:22:32 -08:00
Ilya Kreymer
cec0db1bdd rules: instagram rules tweak, ignore query args 2016-11-14 13:19:26 -08:00
Ilya Kreymer
41f6ca9bb6 rules: update rules for medium, instagram
bump version to 0.33.1
2016-11-13 22:50:53 -08:00