Noah Levitt
45c06eab58
bump dev version number
2018-03-08 16:35:25 -08:00
Noah Levitt
c2172c6b5b
make sure to roll over idle warcs
even when warcprox is idle itself
2018-02-28 13:02:03 -08:00
Noah Levitt
8a7ed0cf57
bump dev version number after merge
2018-02-28 11:45:10 -08:00
Noah Levitt
d316569196
bump dev version after revert
2018-02-27 17:28:44 -08:00
Noah Levitt
d29a367db6
bump dev version number after PR merge
2018-02-27 10:33:02 -08:00
Noah Levitt
f3e270b796
make test_method_filter() pass by waiting
in test_limit_large_resource() for url processing to finish, to prevent
stats from affecting the subsequent test
2018-02-20 14:54:58 -08:00
Noah Levitt
6d6f2c9aa0
fix sqlite3 string escaping
2018-02-12 11:42:35 -08:00
Noah Levitt
b2a1f15bf6
clean up test infrastructure
- fix crufty, broken test in setup.py
- include tests in sdist tarball for pypi
2018-02-07 16:06:46 -08:00
Noah Levitt
688e53d889
bump version number after pull request
2018-02-07 15:49:35 -08:00
Noah Levitt
e68be9354d
back to dev version number
2018-02-07 15:48:42 -08:00
Noah Levitt
2ceedd3fd2
2.4b1 for pypi
2018-02-07 15:48:42 -08:00
Noah Levitt
322512dab6
bump version number after latest pull request
2018-02-07 15:48:42 -08:00
Noah Levitt
824c194142
make plugin api more flexible
2018-01-24 16:07:45 -08:00
Noah Levitt
5b414102ba
respect CA-related command line options
2018-01-24 10:27:40 -08:00
Noah Levitt
1cfb4d46c6
bump version number after pull request
2018-01-22 12:50:16 -08:00
Noah Levitt
41b531e398
use trick to avoid dns looking up local ip
2018-01-21 19:47:15 -08:00
Noah Levitt
de327450ea
close open warcs at shutdown
2018-01-21 19:46:31 -08:00
Noah Levitt
4b53c10132
bump minor version after these big changes
2018-01-19 14:37:53 -08:00
Noah Levitt
b43ab751f0
fix running_stats thing
2018-01-15 17:28:20 -08:00
Noah Levitt
c459812c93
roll over idle warcs on time
2018-01-12 11:46:44 -08:00
Noah Levitt
7fef2336e6
fix logging.notice/trace methods which were masking file/line/function of log message
2017-12-29 16:28:48 -08:00
Noah Levitt
f401b21958
update test_svcreg_status to expect new fields
2017-12-29 13:03:45 -08:00
Noah Levitt
5347cc92c3
change where RunningStats is initialized and fix tests
2017-12-29 11:06:46 -08:00
Noah Levitt
c966f7f6e8
more stats available from /status (and in rethinkdb services table)
2017-12-28 17:07:02 -08:00
Noah Levitt
a85c665ce9
timeouts for trough requests to prevent hanging
2017-12-27 16:32:54 -08:00
Noah Levitt
eacf070a2a
dropping claim of support for python 2.7 (not worth hacking around tempfile.TemporaryDirectory to make tests pass)
2017-12-21 15:45:39 -08:00
Noah Levitt
500ffad7e4
implementation of special prefix "-" which means "do not archive"
2017-12-21 14:33:30 -08:00
Noah Levitt
9784c91459
test for special warc prefix "-" which means "do not archive"
2017-12-21 14:31:54 -08:00
Noah Levitt
399853dea0
if --profile is enabled, dump results every ten minutes, as well as at shutdown
2017-12-21 11:13:37 -08:00
Noah Levitt
af6e5ea112
fix error logging in case of failure promoting trough segment
2017-12-20 12:24:28 -08:00
Noah Levitt
0e324eaecf
avoid unexpected error KeyError: ...
2017-12-20 12:07:14 -08:00
Noah Levitt
6b67f49da4
back to dev version number
2017-12-15 16:44:34 -08:00
Noah Levitt
0e46dd466c
2.3 for pypi
2017-12-15 16:43:08 -08:00
Noah Levitt
995a11f444
bump dev version number after big merge
2017-11-30 16:15:55 -08:00
jkafader
e5a3dd8b3e
Merge pull request #37 from nlevitt/trough-dedup
WIP: trough for deduplication initial proof-of-concept-ish code
2017-11-30 16:14:43 -08:00
Noah Levitt
330635c0a8
fix test in py<=3.4
2017-11-22 13:55:44 -08:00
Noah Levitt
5be289730f
fix failing test, and change response code from 500 to more appropriate 502
2017-11-22 13:11:26 -08:00
Noah Levitt
627ef5667b
failing test for correct handling of "http.client.RemoteDisconnected: Remote end closed connection without response" from remote server
2017-11-22 12:49:46 -08:00
Noah Levitt
b28f9b9fb7
fix oops
2017-11-22 11:08:34 -08:00
Noah Levitt
95b2b86487
better error message for bad WARCPROX_WRITE_RECORD request
2017-11-15 23:41:44 +00:00
Noah Levitt
5c2c21de07
aggregate warc writer thread profiles much like we do for proxy threads
2017-11-14 16:44:31 -08:00
Noah Levitt
c13fd9a40e
have --profile profile proxy threads as well as warc writer threads
2017-11-14 16:35:25 -08:00
Noah Levitt
9cce03dc16
hacky way to fix problem of benchmarks arguments getting stale
2017-11-14 14:40:50 -08:00
Noah Levitt
c40ad8391d
Merge branch 'master' into trough-dedup
* master:
  hopefully fix test failing occasionally apparently due to race condition by checking that the file we're waiting for has some content
  fix payload digest by pulling calculation up one level where content has already been transfer-decoded
  new failing test for correct calculation of payload digest
  missed a spot handling case of no warc records written
2017-11-13 11:47:04 -08:00
Noah Levitt
ffc8a268ab
hopefully fix test failing occasionally apparently due to race condition by checking that the file we're waiting for has some content
2017-11-13 11:45:06 -08:00
Noah Levitt
3a0f6e0947
fix payload digest by pulling calculation up one level where content has already been transfer-decoded
2017-11-10 17:18:22 -08:00
Noah Levitt
3c215b42b5
missed a spot handling case of no warc records written
2017-11-10 14:34:06 -08:00
Noah Levitt
b2adb778ee
Merge branch 'master' into trough-dedup
* master:
  not gonna bother figuring out why pypy regex is not matching https://travis-ci.org/internetarchive/warcprox/jobs/299864258#L615
  fix failing test just committed, which involves running "listeners" for all urls, including those not archived; make adjustments accordingly
  make test_crawl_log expect HEAD request to be logged
  fix crawl log handling of WARCPROX_WRITE_RECORD request
  modify test_crawl_log to expect crawl log to honor --base32 setting and add tests of WARCPROX_WRITE_RECORD request and HEAD request (not written to warc)
  bump dev version number
  add --crawl-log-dir option to fix failing test
  create crawl log dir at startup if it doesn't exist
  make test pass with py27
  fix crawl log test to avoid any dedup collisions
  fix crawl log test
  heritrix-style crawl log support
  disallow slash and backslash in warc-prefix
  can't see any reason to split the main() like this (anymore?)
  add missing dependency warcio to tests_require
2017-11-09 15:50:18 -08:00
Noah Levitt
700056cc04
fix failing test just committed, which involves running "listeners" for all urls, including those not archived; make adjustments accordingly
2017-11-09 13:10:57 -08:00
Noah Levitt
df6d7f1ce6
make test_crawl_log expect HEAD request to be logged
2017-11-09 13:09:07 -08:00