418 Commits

Author SHA1 Message Date
Noah Levitt
9aa330ecb3 Merge pull request #34 from vbanos/remove-unused
Remove unused imports
2017-09-28 14:34:58 -07:00
Noah Levitt
faae23d764 allow very long request header lines, to support large warcprox-meta header values 2017-09-27 17:29:55 -07:00
Vangelis Banos
66b4c35322 Remove unused imports 2017-09-24 11:15:30 +00:00
Noah Levitt
8bfda9f4b3 fix python2 tests 2017-09-20 11:03:36 -07:00
Noah Levitt
1bca9d0324 don't use http.client.HTTPResponse.getheader() to get the content-type header, because it can return a comma-delimited string 2017-09-18 14:45:16 -07:00
Noah Levitt
a8adaaf527 Merge pull request #30 from trifle/master
allow zero warc_writer_threads
2017-09-12 13:46:12 -07:00
Noah Levitt
b89f834ce3 no SIGQUIT on windows, so no SIGQUIT handler 2017-09-07 12:01:51 -07:00
Noah Levitt
3003c46c10 https://github.com/internetarchive/warcprox/pull/32 warrants a version bump 2017-09-07 10:33:21 -07:00
Noah Levitt
c73fdd91f8 Merge pull request #32 from internetarchive/trough
hello --plugin, goodbye kafka feed
2017-09-07 10:31:42 -07:00
Noah Levitt
db0f36c745 fix --size option (https://github.com/internetarchive/warcprox/issues/31) 2017-09-05 12:43:55 -07:00
Noah Levitt
7e55568851 fix --playback-port option (https://github.com/internetarchive/warcprox/issues/29) 2017-09-05 12:20:22 -07:00
Pascal Jürgens
940af4e888 fix zero-indexing of warc_writer_threads so they can be disabled via empty list 2017-08-18 15:52:34 +02:00
Noah Levitt
c0cb59e5af Merge branch 'master' into trough
* master:
  hidden argument --rethinkdb-big-table-name
  try to fix https://github.com/internetarchive/warcprox/issues/27
2017-08-03 11:22:27 -07:00
Noah Levitt
13ee68ce4a hidden argument --rethinkdb-big-table-name 2017-07-20 12:53:59 -07:00
Noah Levitt
b1a8fecd9d try to fix https://github.com/internetarchive/warcprox/issues/27 2017-07-07 14:54:55 -07:00
Noah Levitt
ad3e6f405d call stop() at shutdown if present on plugins 2017-06-28 16:40:20 -07:00
Noah Levitt
9ea3540d63 fix misuse of += 2017-06-28 14:19:06 -07:00
Noah Levitt
2c95a1f2ee remove kafka feed code 2017-06-28 13:12:30 -07:00
Noah Levitt
4c32394256 new option --plugin 2017-06-28 12:53:34 -07:00
Noah Levitt
e31302a6e3 hide kafka options as first step toward removing them 2017-06-28 12:03:48 -07:00
Noah Levitt
5a8d1610e6 try to work around stupid travis build error, see https://blog.travis-ci.com/2017-06-21-trusty-updates-2017-Q2-launch 2017-06-23 14:12:04 -07:00
Noah Levitt
b23e485898 simplify recovery of stats batch in case of exception saving them (not sure what was wrong with summy_merge, but this is simpler) 2017-06-22 16:54:04 -07:00
Noah Levitt
c0ee9c6093 avoid holding the lock, which makes all warc writer threads block, while doing rethinkdb operations, in RethinkStatsDb 2017-06-22 16:17:25 -07:00
Noah Levitt
24082c2e8c don't wait for queue to be empty to do idle rollovers, because sometimes warcprox can stay busy for a long, long time 2017-06-22 15:04:01 -07:00
Noah Levitt
2f0c4454ac try not to let problems responding to kill -QUIT (which prints stack trace of each thread) kill the whole process 2017-06-12 16:51:50 -07:00
Noah Levitt
808950abb4 recover properly from exception updating stats in rethinkdb 2017-06-12 16:51:45 -07:00
Noah Levitt
1500341875 use %r instead of calling repr() 2017-06-07 16:05:47 -07:00
Noah Levitt
2f93cdcad9 use locking to ensure consistency and avoid this kind of test failure https://travis-ci.org/internetarchive/warcprox/jobs/235819316 2017-05-25 17:38:20 +00:00
Noah Levitt
00b982aa24 Merge pull request #25 from nlevitt/sqlite
get rid of dbm, switch to sqlite, for easier portability, clarity aro…
2017-05-24 14:25:45 -07:00
Noah Levitt
95dfa54968 get rid of dbm, switch to sqlite, for easier portability, clarity around threading 2017-05-24 13:57:09 -07:00
Noah Levitt
99dd840d20 use "ttl" for updated doublethink svc reg api 2017-05-23 10:37:39 -07:00
Noah Levitt
aca0b881c6 make sure records are written to warc in a predictable order to make tests pass consistently 2017-05-19 16:34:27 -07:00
Noah Levitt
ef5dd2e4ae multiple warc writer threads (hacked in with little thought to code organization) 2017-05-19 16:10:44 -07:00
Noah Levitt
515dd84aed lock to certauth < 1.2 until we port 2017-05-19 15:44:00 -07:00
Noah Levitt
a3dde3d97f fix mistake (incorrect interpretation of concurrent.futures.ThreadPoolExecutor internals) that caused unnecessary waits, and unnecessarily long waits, before calling socket.accept() 2017-05-12 14:18:35 -07:00
Noah Levitt
fd770b71bc revert stuff accidentally committed as part of eea582c6db9ed6d :( 2017-05-11 11:56:01 -07:00
Noah Levitt
621ebb91ea use request count and payload size to specify length of benchmark run 2017-05-10 18:58:19 +00:00
Noah Levitt
2a0c8c28c9 improvements to run-benchmark.py, primarily to actually make multiple requests in parallel 2017-05-10 18:01:56 +00:00
Noah Levitt
eea582c6db rewrite run-benchmarks.py for aiohttp2 2017-05-08 20:56:32 -07:00
Noah Levitt
c87ff90bc1 move more stuff in do_COMMAND inside the try block so that exceptions result in a 500 response 2017-05-05 13:44:46 -07:00
Noah Levitt
c642565ad8 bump up the socket backlog argument to try to stop kernel closing attempted connections on linux 2017-05-05 18:49:56 +00:00
Noah Levitt
b2f08535ae set method when creating ProxyingRecordingHTTPResponse so that it knows when to close the connection, and HEAD requests don't sit around trying to read more data until socket timeout 2017-05-04 12:54:04 -07:00
Noah Levitt
11e11f4e68 early trace-level logging of the requestline 2017-05-03 18:39:57 -07:00
Noah Levitt
c0e6c219ca python2 fixes 2017-04-28 11:12:17 -07:00
Noah Levitt
338e5cd878 comment out debug logging thing 2017-04-28 11:08:41 -07:00
Noah Levitt
ca7625b18d set via header on request and response, record request via in warc (because it is sent to the remote site), do not record response via in warc (because it is not sent by the remote site) 2017-04-28 11:07:33 -07:00
Noah Levitt
47680cc17d let test_choose_a_port_for_me pass when service registry is missing, i.e. when not running with rethinkdb 2017-04-17 12:05:39 -07:00
Noah Levitt
3d87ed61be whoops, stop warcprox and join thread in test_choose_a_port_for_me 2017-04-17 11:47:22 -07:00
Noah Levitt
1900dfac08 test choosing port 0 which means, let the system choose one for me, and fix a bug in service registry reporting of the port 2017-04-17 11:45:37 -07:00
Noah Levitt
21a9a26f51 fix some obsolete calls 2017-04-17 11:00:43 -07:00
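
Note on commit 1bca9d0324: the message says http.client.HTTPResponse.getheader() can return a comma-delimited string. A minimal sketch of that behavior follows, assuming Python 3 and a response that repeats the Content-Type header. FakeSocket is a hypothetical helper used only to feed canned bytes to HTTPResponse without a network connection. It is not part of warcprox, and this is not necessarily how the commit itself reads the header.

```python
import io
from http.client import HTTPResponse


class FakeSocket:
    """Hypothetical stand-in for a socket; serves canned response bytes."""

    def __init__(self, data):
        self._file = io.BytesIO(data)

    def makefile(self, *args, **kwargs):
        # HTTPResponse only needs a readable binary file object.
        return self._file


# A response that (incorrectly, but it happens) sends Content-Type twice.
raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 0\r\n"
    b"\r\n"
)

resp = HTTPResponse(FakeSocket(raw))
resp.begin()  # parse the status line and headers

# getheader() joins repeated header values with ', '.
joined = resp.getheader("Content-Type")

# The underlying message object keeps the values separate.
values = resp.headers.get_all("Content-Type")

print(joined)
print(values)
```

Reading the values through resp.headers.get_all() lets the caller choose one value explicitly instead of parsing a joined string.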