mirror of https://github.com/webrecorder/pywb.git synced 2025-03-15 16:14:48 +01:00

91 Commits

Author SHA1 Message Date
Ilya Kreymer
4cf935abd1 directory agg: add CacheDirectoryAggregator to cache file listing, rescan dir only if changed 2016-03-19 20:34:09 -07:00
Ilya Kreymer
f5ee3c7bca inputreq: add reconstruct_request() to return a bytestring of the request, add test for inputreq 2016-03-19 20:32:37 -07:00
Ilya Kreymer
c96e419341 recorder: ensure filename is also tracked by the indexer, add tests
for redis file mapping
2016-03-19 10:24:28 -07:00
Ilya Kreymer
3452cf39e0 recorder: use more general MultiFileWARCWriter, supporting both keeping file open
and one-warc-per record use cases
2016-03-18 21:40:41 -07:00
Ilya Kreymer
e81457df5f rename WARCRecorder -> WARCWriter, add optional max_size to single warc recorder
per-record recorder combines http response/req into single file
2016-03-18 19:49:14 -07:00
Ilya Kreymer
b64be0dff1 recorder: add tests for single file writer, including file locking
dedup policy: support customizable dedup/skip/write policy plugins and add tests
2016-03-18 15:28:24 -07:00
Ilya Kreymer
cba8e4ee3a filters: more functional filter impl for header exclusion 2016-03-17 18:22:26 -07:00
Ilya Kreymer
58e8c709aa docker: add initial docker-compose, webagg Dockerfile 2016-03-16 18:42:15 -07:00
Ilya Kreymer
8dc59ef6bd webagg: add test for live server config 2016-03-13 16:53:39 -07:00
Ilya Kreymer
06978bd8d2 recorder: check for empty input stream (support for direct proxy?) 2016-03-13 11:17:52 -07:00
Ilya Kreymer
709d2b1ea2 reorg: move StreamIter to utils 2016-03-12 23:29:23 -08:00
Ilya Kreymer
7a828017d1 recorder: clean up logging, ReadFullyStream moves to utils, get_request_uri to inputreq 2016-03-12 22:18:01 -08:00
Ilya Kreymer
49b6ae78a8 live loader: remove liverec (doesn't work well with gevent), use regular requests
instead of overridden version.
reconstruct header block from httplib header pairs list
move ReadFullyStream to utils
2016-03-12 22:15:24 -08:00
Ilya Kreymer
9adb8da3b7 recorder: add support for filtering collections to record by regex (default: .*)
add support for excluding certain headers when writing WARCs
tests: add first batch of tests for recorder, using live upstream server
2016-03-11 11:12:25 -08:00
Ilya Kreymer
2003925b75 setup: fix pywb py3 version to 0.30.0, add coverage for recorder 2016-03-11 11:11:43 -08:00
Ilya Kreymer
3b3e190cf4 testing: use test mixins for class-scope temp directory, live server creation
use processes instead of threads for live server
2016-03-11 11:10:22 -08:00
Ilya Kreymer
46d013ab19 test redis: minor tweak to use @patch for fakeredis mock 2016-03-10 21:35:01 -08:00
Ilya Kreymer
c309637a3a tests: webagg test tweaks, create TempDirTests for sharing tests that require a temp dir 2016-03-10 16:04:27 -08:00
Ilya Kreymer
7b847311d5 dir agg: include filename in dir source name 2016-03-10 15:51:01 -08:00
Ilya Kreymer
31fb2f926f add recorder app, initial pass! 2016-03-09 14:33:36 -08:00
Ilya Kreymer
1499f0e611 add shared README.rst and coverage 2016-03-09 14:33:11 -08:00
Ilya Kreymer
34386578a5 shared setup: move webagg test to webagg/test 2016-03-09 14:29:14 -08:00
Ilya Kreymer
3477cb0bb5 drop process/thread mixin support (doesn't work as well on py2); could re-add processes only if need arises, but for now focusing on gevent
rename header Source-Coll -> WebAgg-Source-Coll
2016-03-08 10:56:03 -08:00
Ilya Kreymer
348fb133e0 add upstream/proxy tests 2016-03-08 10:29:59 -08:00
Ilya Kreymer
107ba9aabc add ProxyLiveIndexSource for proxying upstream conn directly w/o a second index query
liveloader: if 'memento_url' key is set, then memento-datetime header must be present or it's an error response
liveindexsource: add option to specify custom live path (e.g. prefix for caching)
fix test cases changed due to ia (todo: mock up all external data!)
2016-03-08 10:27:13 -08:00
Ilya Kreymer
c1895ae70f loaders: return full WARC record in response, no need for upstream response handler
add UpstreamAggIndexSource to simplify upstream aggregator config, add test for upstream config
bottle app: wrap in a ResAppAgg, allow multiple bottle apps
py2: non-gevent concurrency not supported
2016-03-06 23:12:14 -08:00
Ilya Kreymer
0823ff4bd0 added 'upstream' handler for connecting to another webagg when 'upstream_url' is set
output 'is_live' as string in live index
2016-03-06 09:10:17 -08:00
Ilya Kreymer
20ebccc13e handlers: return out_headers directly instead of setting bottle response, contains bottle dependency to app.py (to allow alternate impl not using bottle)
param parsing: instead of setting custom _src_params and _all_params, use a custom ParamFormatter which will check param dict for params with prefix and custom name
2016-03-05 16:49:26 -08:00
Ilya Kreymer
bdda1b8c03 minor fixes for py2 support 2016-03-03 13:58:09 -08:00
Ilya Kreymer
896f81fd1c Add README.rst 2016-03-03 12:09:17 -08:00
Ilya Kreymer
ed1d3555c3 rename rezag -> webagg
rename aggindexsource -> aggregator
2016-03-03 11:55:43 -08:00
Ilya Kreymer
98830147b5 add memento headers to all response loaders, use BaseLoader base class, update tests
for memento headers
2016-03-03 11:04:28 -08:00
Ilya Kreymer
65e969a492 errors and timeouts reported back to the user via ResErrors header
add new /index, /resource access point system
2016-03-02 18:13:13 -08:00
Ilya Kreymer
1f3763d02c misc fixes: add route listing, more not found tests, timemap use file:// with ranges 2016-03-01 14:46:05 -08:00
Ilya Kreymer
008e5284b1 seperate iter_sources from list_sources api
all errors returned as json block with error msg
tests for not found, invalid errors
2016-02-29 12:34:06 -08:00
Ilya Kreymer
68090d00c1 add routing setup via app.py
add full test suite for handlers and responseloaders, as well as timeouts
2016-02-28 14:33:08 -08:00
Ilya Kreymer
c88c5f4cca add new package setup!
add tests and testdata, splitting mem and dir agg tests
2016-02-26 18:25:10 -08:00
Ilya Kreymer
398e8f1a77 inputrequest: add input request handling (direct wsgi headers) or as a prepared post request
add timemap link output
rename source_name -> source
2016-02-24 14:22:29 -08:00
Ilya Kreymer
1a0b2fba17 add aggregate index source and tests! 2016-02-22 13:30:12 -08:00
Ilya Kreymer
37198767ed add utils, responseloader and liverec 2016-02-19 17:27:19 -08:00
Ilya Kreymer
baa02add69 add indexloader and tests, including file, redis, remote cdx, memento, and live sources 2016-02-19 17:25:54 -08:00