Noah Levitt
be6fe83c56
bump dev version number after merging pull requests
2017-09-28 14:37:30 -07:00
Noah Levitt
2e5f8a733a
Merge pull request #33 from vbanos/fix-unit-tests
Add missing packages from setup.py, add tox config.
2017-09-28 14:35:48 -07:00
Noah Levitt
9aa330ecb3
Merge pull request #34 from vbanos/remove-unused
Remove unused imports
2017-09-28 14:34:58 -07:00
Vangelis Banos
6fd687f2b6
Add missing "," in deps
2017-09-28 20:37:15 +00:00
Vangelis Banos
51a2178cbd
Remove tox.ini, move warcio to test_requires
2017-09-28 20:35:47 +00:00
Noah Levitt
faae23d764
allow very long request header lines, to support large warcprox-meta header values
2017-09-27 17:29:55 -07:00
Vangelis Banos
66b4c35322
Remove unused imports
2017-09-24 11:15:30 +00:00
Vangelis Banos
b1819c51b9
Add missing packages from setup.py, add tox config.
Add missing `requests` and `warcio` packages. They are used in unit tests but
they were not included in `setup.py`.
Add `tox` configuration in order to be able to run unit tests for py27,
py34 and py35 with 1 command.
2017-09-24 10:51:29 +00:00
Noah Levitt
8bfda9f4b3
fix python2 tests
2017-09-20 11:03:36 -07:00
Noah Levitt
1bca9d0324
don't use http.client.HTTPResponse.getheader() to get the content-type header, because it can return a comma-delimited string
2017-09-18 14:45:16 -07:00
Noah Levitt
a8adaaf527
Merge pull request #30 from trifle/master
allow zero warc_writer_threads
2017-09-12 13:46:12 -07:00
Noah Levitt
b89f834ce3
no SIGQUIT on windows, so no SIGQUIT handler
2017-09-07 12:01:51 -07:00
Noah Levitt
3003c46c10
https://github.com/internetarchive/warcprox/pull/32 warrants a version bump
2017-09-07 10:33:21 -07:00
Noah Levitt
c73fdd91f8
Merge pull request #32 from internetarchive/trough
hello --plugin, goodbye kafka feed
2017-09-07 10:31:42 -07:00
Noah Levitt
db0f36c745
fix --size option ( https://github.com/internetarchive/warcprox/issues/31 )
2017-09-05 12:43:55 -07:00
Noah Levitt
7e55568851
fix --playback-port option ( https://github.com/internetarchive/warcprox/issues/29 )
2017-09-05 12:20:22 -07:00
Pascal Jürgens
940af4e888
fix zero-indexing of warc_writer_threads so they can be disabled via empty list
2017-08-18 15:52:34 +02:00
Noah Levitt
c0cb59e5af
Merge branch 'master' into trough
* master:
  hidden argument --rethinkdb-big-table-name
  try to fix https://github.com/internetarchive/warcprox/issues/27
2017-08-03 11:22:27 -07:00
Noah Levitt
13ee68ce4a
hidden argument --rethinkdb-big-table-name
2017-07-20 12:53:59 -07:00
Noah Levitt
b1a8fecd9d
try to fix https://github.com/internetarchive/warcprox/issues/27
2017-07-07 14:54:55 -07:00
Noah Levitt
ad3e6f405d
call stop() at shutdown if present on plugins
2017-06-28 16:40:20 -07:00
Noah Levitt
9ea3540d63
fix misuse of +=
2017-06-28 14:19:06 -07:00
Noah Levitt
2c95a1f2ee
remove kafka feed code
2017-06-28 13:12:30 -07:00
Noah Levitt
4c32394256
new option --plugin
2017-06-28 12:53:34 -07:00
Noah Levitt
e31302a6e3
hide kafka options as first step toward removing them
2017-06-28 12:03:48 -07:00
Noah Levitt
5a8d1610e6
try to work around stupid travis build error, see https://blog.travis-ci.com/2017-06-21-trusty-updates-2017-Q2-launch
2017-06-23 14:12:04 -07:00
Noah Levitt
b23e485898
simplify recovery of stats batch in case of exception saving them (not sure what was wrong with summy_merge, but this is simpler)
2017-06-22 16:54:04 -07:00
Noah Levitt
c0ee9c6093
avoid holding the lock, which makes all warc writer threads block, while doing rethinkdb operations, in RethinkStatsDb
2017-06-22 16:17:25 -07:00
Noah Levitt
24082c2e8c
don't wait for queue to be empty to do idle rollovers, because sometimes warcprox can stay busy for a long, long time
2017-06-22 15:04:01 -07:00
Noah Levitt
2f0c4454ac
try not to let problems responding to kill -QUIT (which prints stack trace of each thread) kill the whole process
2017-06-12 16:51:50 -07:00
Noah Levitt
808950abb4
recover properly from exception updating stats in rethinkdb
2017-06-12 16:51:45 -07:00
Noah Levitt
1500341875
use %r instead of calling repr()
2017-06-07 16:05:47 -07:00
Noah Levitt
2f93cdcad9
use locking to ensure consistency and avoid this kind of test failure https://travis-ci.org/internetarchive/warcprox/jobs/235819316
2017-05-25 17:38:20 +00:00
Noah Levitt
00b982aa24
Merge pull request #25 from nlevitt/sqlite
get rid of dbm, switch to sqlite, for easier portability, clarity aro…
2017-05-24 14:25:45 -07:00
Noah Levitt
95dfa54968
get rid of dbm, switch to sqlite, for easier portability, clarity around threading
2017-05-24 13:57:09 -07:00
Noah Levitt
99dd840d20
use "ttl" for updated doublethink svc reg api
2017-05-23 10:37:39 -07:00
Noah Levitt
aca0b881c6
make sure records are written to warc in a predictable order to make tests pass consistently
2017-05-19 16:34:27 -07:00
Noah Levitt
ef5dd2e4ae
multiple warc writer threads (hacked in with little thought to code organization)
2017-05-19 16:10:44 -07:00
Noah Levitt
515dd84aed
lock to certauth < 1.2 until we port
2017-05-19 15:44:00 -07:00
Noah Levitt
a3dde3d97f
fix mistake (incorrect interpretation of concurrent.futures.ThreadPoolExecutor internals) that caused unnecessary waits, and unnecessarily long waits, before calling socket.accept()
2017-05-12 14:18:35 -07:00
Noah Levitt
fd770b71bc
revert stuff accidentally committed as part of eea582c6db9ed6d :(
2017-05-11 11:56:01 -07:00
Noah Levitt
621ebb91ea
use request count and payload size to specify length of benchmark run
2017-05-10 18:58:19 +00:00
Noah Levitt
2a0c8c28c9
improvements to run-benchmarks.py, primarily to actually make multiple requests in parallel
2017-05-10 18:01:56 +00:00
Noah Levitt
eea582c6db
rewrite run-benchmarks.py for aiohttp2
2017-05-08 20:56:32 -07:00
Noah Levitt
c87ff90bc1
move more stuff in do_COMMAND inside the try block so that exceptions result in a 500 response
2017-05-05 13:44:46 -07:00
Noah Levitt
c642565ad8
bump up the socket backlog argument to try to stop kernel closing attempted connections on linux
2017-05-05 18:49:56 +00:00
Noah Levitt
b2f08535ae
set method when creating ProxyingRecordingHTTPResponse so that it knows when to close the connection, and HEAD requests don't sit around trying to read more data until socket timeout
2017-05-04 12:54:04 -07:00
Noah Levitt
11e11f4e68
early trace-level logging of the requestline
2017-05-03 18:39:57 -07:00
Noah Levitt
c0e6c219ca
python2 fixes
2017-04-28 11:12:17 -07:00
Noah Levitt
338e5cd878
comment out debug logging thing
2017-04-28 11:08:41 -07:00