Ilya Kreymer
|
3452cf39e0
|
recorder: use the more general MultiFileWARCWriter, which supports both the
keep-file-open and one-WARC-per-record use cases
|
2016-03-18 21:40:41 -07:00 |
|
Ilya Kreymer
|
e81457df5f
|
rename WARCRecorder -> WARCWriter, add optional max_size to single-WARC recorder
per-record recorder combines HTTP response/request into a single file
|
2016-03-18 19:49:14 -07:00 |
|
Ilya Kreymer
|
b64be0dff1
|
recorder: add tests for single file writer, including file locking
dedup policy: support customizable dedup/skip/write policy plugins and add tests
|
2016-03-18 15:28:24 -07:00 |
|
Ilya Kreymer
|
cba8e4ee3a
|
filters: more functional filter impl for header exclusion
|
2016-03-17 18:22:26 -07:00 |
|
Ilya Kreymer
|
06978bd8d2
|
recorder: check for empty input stream (support for direct proxy?)
|
2016-03-13 11:17:52 -07:00 |
|
Ilya Kreymer
|
709d2b1ea2
|
reorg: move StreamIter to utils
|
2016-03-12 23:29:23 -08:00 |
|
Ilya Kreymer
|
7a828017d1
|
recorder: clean up logging, move ReadFullyStream to utils and get_request_uri to inputreq
|
2016-03-12 22:18:01 -08:00 |
|
Ilya Kreymer
|
9adb8da3b7
|
recorder: add support for filtering collections to record by regex (default: .*)
add support for excluding certain headers when writing WARCs
tests: add first batch of tests for recorder, using live upstream server
|
2016-03-11 11:12:25 -08:00 |
|
Ilya Kreymer
|
31fb2f926f
|
add recorder app, initial pass!
|
2016-03-09 14:33:36 -08:00 |
|