* master:
fix test in py<=3.4
fix failing test, and change response code from 500 to more appropriate 502
failing test for correct handling of "http.client.RemoteDisconnected: Remote end closed connection without response" from remote server
fix oops
better error message for bad WARCPROX_WRITE_RECORD request
fix mistakes in warc write thread profile aggregation
aggregate warc writer thread profiles much like we do for proxy threads
have --profile profile proxy threads as well as warc writer threads
hacky way to fix problem of benchmarks arguments getting stale
* trough-dedup:
py2 fix
automatic segment promotion every hour
move trough client into separate module
pypy and pypy3 are passing at the moment, so why not :)
more cleanly separate trough client code from the rest of TroughDedup
update payload_digest reference in trough dedup for changes in commit 3a0f6e0947
hopefully fix test that occasionally fails, apparently due to a race condition, by checking that the file we're waiting for has some content
fix payload digest by pulling calculation up one level where content has already been transfer-decoded
new failing test for correct calculation of payload digest
missed a spot handling case of no warc records written
eh, don't prefix sqlite filenames with 'warcprox-trough-'; logging tweaks
not gonna bother figuring out why pypy regex is not matching (see https://travis-ci.org/internetarchive/warcprox/jobs/299864258#L615)
fix failing test just committed, which involves running "listeners" for all urls, including those not archived; make adjustments accordingly
make test_crawl_log expect HEAD request to be logged
fix crawl log handling of WARCPROX_WRITE_RECORD request
modify test_crawl_log to expect crawl log to honor --base32 setting and add tests of WARCPROX_WRITE_RECORD request and HEAD request (not written to warc)
bump dev version number
add --crawl-log-dir option to fix failing test
* master:
hopefully fix test that occasionally fails, apparently due to a race condition, by checking that the file we're waiting for has some content
fix payload digest by pulling calculation up one level where content has already been transfer-decoded
new failing test for correct calculation of payload digest
missed a spot handling case of no warc records written
* master:
not gonna bother figuring out why pypy regex is not matching (see https://travis-ci.org/internetarchive/warcprox/jobs/299864258#L615)
fix failing test just committed, which involves running "listeners" for all urls, including those not archived; make adjustments accordingly
make test_crawl_log expect HEAD request to be logged
fix crawl log handling of WARCPROX_WRITE_RECORD request
modify test_crawl_log to expect crawl log to honor --base32 setting and add tests of WARCPROX_WRITE_RECORD request and HEAD request (not written to warc)
bump dev version number
add --crawl-log-dir option to fix failing test
create crawl log dir at startup if it doesn't exist
make test pass with py27
fix crawl log test to avoid any dedup collisions
fix crawl log test
heritrix-style crawl log support
disallow slash and backslash in warc-prefix
can't see any reason to split the main() like this (anymore?)
add missing dependency warcio to tests_require