mirror of https://github.com/webrecorder/pywb.git synced 2025-03-15 16:14:48 +01:00

1599 Commits

Author SHA1 Message Date
Ilya Kreymer
3477cb0bb5 drop process/thread mixin support (doesn't work as well on py2); could re-add processes only if need arises, but for now focusing on gevent
rename header Source-Coll -> WebAgg-Source-Coll
2016-03-08 10:56:03 -08:00
Ilya Kreymer
348fb133e0 add upstream/proxy tests 2016-03-08 10:29:59 -08:00
Ilya Kreymer
107ba9aabc add ProxyLiveIndexSource for proxying upstream conn directly w/o a second index query
liveloader: if 'memento_url' key is set, then memento-datetime header must be present or it's an error response
liveindexsource: add option to specify custom live path (eg. prefix for caching)
fix test cases changed due to ia (todo: mock up all external data!)
2016-03-08 10:27:13 -08:00
Ilya Kreymer
b3372f64c3 timeutils: add datetime_to_iso_date 2016-03-08 08:39:45 -08:00
Ilya Kreymer
3af2979cf1 cdx: skip any fields starting with '_' when serializing 2016-03-08 08:38:55 -08:00
Ilya Kreymer
5ad01f7d64 statusandheaders: add a to_bytes() func for serializing header 2016-03-08 08:26:51 -08:00
Ilya Kreymer
c1895ae70f loaders: return full WARC record in response, no need for upstream response handler
add UpstreamAggIndexSource to simplify upstream aggregator config, add test for upstream config
bottle app: wrap in a ResAppAgg, allow multiple bottle apps
py2: non-gevent concurrency not supported
2016-03-06 23:12:14 -08:00
Ilya Kreymer
0823ff4bd0 added 'upstream' handler for connecting to another webagg when 'upstream_url' is set
output 'is_live' as string in live index
2016-03-06 09:10:17 -08:00
Ilya Kreymer
20ebccc13e handlers: return out_headers directly instead of setting bottle response, contains bottle dependency to app.py (to allow alternate impl not using bottle)
param parsing: instead of setting custom _src_params and _all_params, use a custom ParamFormatter which will check param dict for params with prefix and custom name
2016-03-05 16:49:26 -08:00
Ilya Kreymer
648e567805 statusandheaders: add __str__ func to reconstruct statusline + headers text 2016-03-04 12:48:36 -08:00
Mat Kelly
96da397456 Quick comment fix 2016-03-04 11:17:35 -05:00
Ilya Kreymer
bb806d7f26 Merge branch 'develop' into py3 2016-03-03 14:09:00 -08:00
Ilya Kreymer
5ddc843094 resolvingloader: use string_types instead of str for compat 2016-03-03 14:05:14 -08:00
Ilya Kreymer
bdda1b8c03 minor fixes for py2 support 2016-03-03 13:58:09 -08:00
Ilya Kreymer
a6dc57cf4a post query: ensure post query optional buffer is a byte not string buffer
exceptions: move LiveRequestException to wbexceptions
cdx query: support for 'alt_url' which, if set, is used to create start_key and end_key
2016-03-03 13:13:44 -08:00
Ilya Kreymer
896f81fd1c Add README.rst 2016-03-03 12:09:17 -08:00
Ilya Kreymer
ed1d3555c3 rename rezag -> webagg
rename aggindexsource -> aggregator
2016-03-03 11:55:43 -08:00
Ilya Kreymer
98830147b5 add memento headers to all response loaders, use BaseLoader base class, update tests
for memento headers
2016-03-03 11:04:28 -08:00
Ilya Kreymer
65e969a492 errors and timeouts reported back to the user via ResErrors header
add new /index, /resource access point system
2016-03-02 18:13:13 -08:00
Ilya Kreymer
1f3763d02c misc fixes: add route listing, more not found tests, timemap use file:// with ranges 2016-03-01 14:46:05 -08:00
Ilya Kreymer
008e5284b1 separate iter_sources from list_sources api
all errors returned as json block with error msg
tests for not found, invalid errors
2016-02-29 12:34:06 -08:00
Ilya Kreymer
68090d00c1 add routing setup via app.py
add full test suite for handlers and responseloaders, as well as timeouts
2016-02-28 14:33:08 -08:00
Ilya Kreymer
c88c5f4cca add new package setup!
add tests and testdata, splitting mem and dir agg tests
2016-02-26 18:25:10 -08:00
Ilya Kreymer
fc5d7cc7cd rewrite: add rewriting of <meta> content="" attribute if it is a url 2016-02-25 18:49:31 -08:00
Ilya Kreymer
8fc789cc8f rewrite: leave out charset in top-frame and don't modify it in replay frame
(to allow browser to detect best charset, as it would on original page if it is absent)
see #170 for details
2016-02-25 18:25:53 -08:00
Ilya Kreymer
c76aa17b78 wb.js: pad timestamp to 14 digits 2016-02-25 18:25:28 -08:00
Ilya Kreymer
e6361c58ac bump version to 0.11.2 2016-02-25 18:15:29 -08:00
Ilya Kreymer
398e8f1a77 inputrequest: add input request handling (direct wsgi headers) or as a prepared post request
add timemap link output
rename source_name -> source
2016-02-24 14:22:29 -08:00
Ilya Kreymer
20bd9d118b travis: remove --use-mirrors 2016-02-23 18:39:27 -08:00
Ilya Kreymer
1d5b23413f proxy: ensure proxy cert download sets content length
proxy options: 'use_default_coll' must specify exact default coll
(otherwise a random coll is chosen, as ordering is not defined)
travis: add py3.4, py3.5!
2016-02-23 18:09:09 -08:00
Ilya Kreymer
cebd6b6239 rewrite: fix rewriting encoding -- for best rewriting, keep strategy of encoding
insert to match page, then using latin-1 for rewriting. support for non-ascii
based encoding still needed
2016-02-23 18:07:34 -08:00
Ilya Kreymer
3a584a1ec3 py3: all tests pass, at last!
but not yet py2... need to resolve encoding in rewriting issues
2016-02-23 13:26:53 -08:00
Ilya Kreymer
0dff388e4e cdx: CDXQuery takes params dict not **params
CDXObject comparison using to_json()
2016-02-23 01:36:39 -08:00
Ilya Kreymer
57991fd0cf cdx: ensure url param required check is performed on init 2016-02-22 13:59:07 -08:00
Ilya Kreymer
af7c876263 cdx: ensure CDXQuery computes key and end_key automatically
key and end_key encoded as utf-8 by default
2016-02-22 13:39:47 -08:00
Ilya Kreymer
1a0b2fba17 add aggregate index source and tests! 2016-02-22 13:30:12 -08:00
Ilya Kreymer
7513011cac path resolvers: add PathResolverMapper for converting paths to resolvers,
ResolvingLoader takes a list of resolvers, not paths (to allow for custom overrides)
ResolvingLoader and ArcWarcRecordLoader support 'no_record_parse' on load to not parse http headers from stream
2016-02-19 22:33:38 -08:00
Ilya Kreymer
37198767ed add utils, responseloader and liverec 2016-02-19 17:27:19 -08:00
Ilya Kreymer
baa02add69 add indexloader and tests, including file, redis, remote cdx, memento, and live sources 2016-02-19 17:25:54 -08:00
Ilya Kreymer
bd841b91a9 more python 3 support work -- pywb.cdx, pywb.warc tests succeed
most relative imports replaced with absolute
2016-02-18 21:26:40 -08:00
Ilya Kreymer
b7008920de fix setup.py typo 2016-02-16 16:14:10 -08:00
Ilya Kreymer
3c85f7b7ac py3: make pywb.utils work with python 3! 2016-02-16 14:52:20 -08:00
Mat Kelly
50dab0bc98 Fixed misspelling 2016-01-12 18:21:05 -05:00
Ilya Kreymer
7cf81935e1 Update CHANGES for 0.11.1 2015-12-29 23:03:51 -08:00
Ilya Kreymer
d1c0bfac10 warc indexing: refactor to add create_payload_buffer() which can be overridden in custom iterators to create a file-like object
that will receive write() calls to buffer the payload when indexing. Default implementation does not buffer the payload
2015-12-29 17:01:25 -08:00
Ilya Kreymer
98843a2551 wombat: call reload() on actual location, possible fix related to #164 2015-12-29 16:17:39 -08:00
Ilya Kreymer
1e54f8c8fa proxy: add tests for proxy-mode 'Pywb-Rewrite-Prefix' header which adds optional prefix to proxy mode rewrites.. ensures such rewrites always absolute to include the prefix 2015-12-29 16:10:23 -08:00
Ilya Kreymer
a25096968a proxy: ip resolver: show 500 error if incorrect coll preconfigured for ip-based settings (todo: make it configurable?) 2015-12-29 14:53:50 -08:00
Ilya Kreymer
ba19ff1cd5 proxy: add custom rewrite prefix in proxy mode with 'pywb-proxy-rewrite: prefix' header 2015-12-23 23:14:47 -08:00
Ilya Kreymer
0cf6b40af9 wombat: add option to def_prop() to make overridden property enumerable, make WombatLocation and other loc overrides enumerable, fixes #163 2015-12-18 21:46:50 -08:00