mirror of https://github.com/webrecorder/pywb.git synced 2025-03-15 16:14:48 +01:00

1580 Commits

Author SHA1 Message Date
Ilya Kreymer
f962418c1f html rewrite typo: ensure rw_mod is set for meta content rewrite 2016-03-16 14:27:55 -07:00
Ilya Kreymer
c26660e20f cookies: use httplib headers pair list instead of requests headers dict to avoid 'set-cookie' headers being concatenated, as that messes up parsing in 3.5.1 2016-03-16 09:47:55 -07:00
Ilya Kreymer
ef5860901f warc record loader: if no content-length is specified on WARC record (as opposed to error or invalid), leave stream alone, don't force size of 0 and 204 2016-03-13 17:56:37 -07:00
Ilya Kreymer
8dc59ef6bd webagg: add test for live server config 2016-03-13 16:53:39 -07:00
Ilya Kreymer
06978bd8d2 recorder: check for empty input stream (support for direct proxy?) 2016-03-13 11:17:52 -07:00
Ilya Kreymer
709d2b1ea2 reorg: move StreamIter to utils 2016-03-12 23:29:23 -08:00
Ilya Kreymer
7a828017d1 recorder: clean up logging, ReadFullyStream moves to utils, get_request_uri to inputreq 2016-03-12 22:18:01 -08:00
Ilya Kreymer
49b6ae78a8 live loader: remove liverec (doesn't work well with gevent), use regular requests
instead of overridden version.
reconstruct header block from httplib header pairs list
move ReadFullyStream to utils
2016-03-12 22:15:24 -08:00
Ilya Kreymer
9adb8da3b7 recorder: add support for filtering collections to record by regex (default: .*)
add support for excluding certain headers when writing WARCs
tests: add first batch of tests for recorder, using live upstream server
2016-03-11 11:12:25 -08:00
Ilya Kreymer
2003925b75 setup: fix pywb py3 version to 0.30.0, add coverage for recorder 2016-03-11 11:11:43 -08:00
Ilya Kreymer
3b3e190cf4 testing: use test mixins for class-scope temp directory, live server creation
use processes instead of threads for live server
2016-03-11 11:10:22 -08:00
Ilya Kreymer
2051785e6b statusandheaders: add to_str() method with 'exclude_list' to support converting to str with certain headers
excluded. also supported by to_bytes()
2016-03-11 11:02:13 -08:00
Ilya Kreymer
46d013ab19 test redis: minor tweak to use @patch for fakeredis mock 2016-03-10 21:35:01 -08:00
Ilya Kreymer
c309637a3a tests: webagg test tweaks, create TempDirTests for sharing tests that require a temp dir 2016-03-10 16:04:27 -08:00
Ilya Kreymer
7b847311d5 dir agg: include filename in dir source name 2016-03-10 15:51:01 -08:00
Ilya Kreymer
3f734e1c98 tests: remove 3.2, fix auto_index test assert 2016-03-10 13:07:57 -08:00
Ilya Kreymer
42aa12f9ae test py3.2 also 2016-03-10 12:55:36 -08:00
Ilya Kreymer
34cc3ccacb versions and readme: update version to 0.30.0, update README with python 2 and 3 support 2016-03-10 12:51:14 -08:00
Ilya Kreymer
0f6e3da127 cdx: tests: add tests for comparison ops 2016-03-10 12:47:36 -08:00
Ilya Kreymer
e5ca9bf601 Merge branch 'master' into py3 2016-03-10 10:53:30 -08:00
Ilya Kreymer
effd618bb3 tests: add parse_comment test for html_rewriter 2016-03-10 10:10:51 -08:00
Ilya Kreymer
8ae692d630 Merge pull request #172 from machawk1/patch-3
Spelling: Quick comment fix
2016-03-10 09:52:25 -08:00
Ilya Kreymer
a25bb5e238 Merge pull request #166 from machawk1/patch-2
Fixed misspelling
2016-03-10 09:51:56 -08:00
Ilya Kreymer
12ecb29a01 tweak CHANGES 2016-03-10 09:48:28 -08:00
Ilya Kreymer
2f67b78023 Update CHANGES for 0.11.2 2016-03-10 09:47:12 -08:00
Ilya Kreymer
c1bdeac92b redis: fix redis key lookup, add tests for zrangebylex with new fakeredis 2016-03-09 18:33:04 -08:00
Ilya Kreymer
31fb2f926f add recorder app, initial pass! 2016-03-09 14:33:36 -08:00
Ilya Kreymer
1499f0e611 add shared README.rst and coverage 2016-03-09 14:33:11 -08:00
Ilya Kreymer
34386578a5 shared setup: move webagg test to webagg/test 2016-03-09 14:29:14 -08:00
Ilya Kreymer
0198bf1213 Merge branch 'develop' into py3 2016-03-09 07:29:54 -08:00
Ilya Kreymer
bc84c2fda0 indexing: declare 'record' and bail if no record was loaded, add test for empty file indexing, fixes #168 2016-03-09 07:27:03 -08:00
Ilya Kreymer
3477cb0bb5 drop process/thread mixin support (doesn't work as well on py2); could re-add processes only if need arises, but for now focusing on gevent
rename header Source-Coll -> WebAgg-Source-Coll
2016-03-08 10:56:03 -08:00
Ilya Kreymer
348fb133e0 add upstream/proxy tests 2016-03-08 10:29:59 -08:00
Ilya Kreymer
107ba9aabc add ProxyLiveIndexSource for proxying upstream conn directly w/o a second index query
liveloader: if 'memento_url' key is set, then memento-datetime header must be present or it's an error response
liveindexsource: add option to specify custom live path (e.g. prefix for caching)
fix test cases changed due to ia (todo: mock up all external data!)
2016-03-08 10:27:13 -08:00
Ilya Kreymer
b3372f64c3 timeutils: add datetime_to_iso_date 2016-03-08 08:39:45 -08:00
Ilya Kreymer
3af2979cf1 cdx: skip any fields starting with '_' when serializing 2016-03-08 08:38:55 -08:00
Ilya Kreymer
5ad01f7d64 statusandheaders: add a to_bytes() func for serializing header 2016-03-08 08:26:51 -08:00
Ilya Kreymer
c1895ae70f loaders: return full WARC record in response, no need for upstream response handler
add UpstreamAggIndexSource to simplify upstream aggregator config, add test for upstream config
bottle app: wrap in a ResAppAgg, allow multiple bottle apps
py2: non-gevent concurrency not supported
2016-03-06 23:12:14 -08:00
Ilya Kreymer
0823ff4bd0 added 'upstream' handler for connecting to another webagg when 'upstream_url' is set
output 'is_live' as string in live index
2016-03-06 09:10:17 -08:00
Ilya Kreymer
20ebccc13e handlers: return out_headers directly instead of setting bottle response, contains bottle dependency to app.py (to allow alternate impl not using bottle)
param parsing: instead of setting custom _src_params and _all_params, use a custom ParamFormatter which will check param dict for params with prefix and custom name
2016-03-05 16:49:26 -08:00
Ilya Kreymer
648e567805 statusandheaders: add __str__ func to reconstruct statusline + headers text 2016-03-04 12:48:36 -08:00
Mat Kelly
96da397456 Quick comment fix 2016-03-04 11:17:35 -05:00
Ilya Kreymer
bb806d7f26 Merge branch 'develop' into py3 2016-03-03 14:09:00 -08:00
Ilya Kreymer
5ddc843094 resolvingloader: use string_types instead of str for compat 2016-03-03 14:05:14 -08:00
Ilya Kreymer
bdda1b8c03 minor fixes for py2 support 2016-03-03 13:58:09 -08:00
Ilya Kreymer
a6dc57cf4a post query: ensure post query optional buffer is a byte not string buffer
exceptions: move LiveRequestException to wbexceptions
cdx query: support for 'alt_url' which, if set, is used to create start_key and end_key
2016-03-03 13:13:44 -08:00
Ilya Kreymer
896f81fd1c Add README.rst 2016-03-03 12:09:17 -08:00
Ilya Kreymer
ed1d3555c3 rename rezag -> webagg
rename aggindexsource -> aggregator
2016-03-03 11:55:43 -08:00
Ilya Kreymer
98830147b5 add memento headers to all response loaders, use BaseLoader base class, update tests
for memento headers
2016-03-03 11:04:28 -08:00
Ilya Kreymer
65e969a492 errors and timeouts reported back to the user via ResErrors header
add new /index, /resource access point system
2016-03-02 18:13:13 -08:00