433 Commits

Author SHA1 Message Date
Noah Levitt
5ed47b3871 cryptography lib version 2.1.1 is causing problems 2017-10-16 11:37:49 -07:00
Noah Levitt
ad8c1d0658 Merge pull request #40 from vbanos/bugfix-warcfilename
Replace invalid warcfilename variable in playback
2017-10-13 13:51:11 -07:00
Vangelis Banos
f7240a33d7 Replace invalid warcfilename variable in playback
A warcfilename variable which does not exist is used here. Replace it
with the current variable for filename.
2017-10-13 19:42:41 +00:00
Noah Levitt
9b8043d3a2 greatly simplify automated test setup by reusing initialization code from the command line executable; this also has the benefit of testing that initialization code 2017-10-06 17:00:35 -07:00
Noah Levitt
0cc68dd428 avoid TypeError: 'NoneType' object is not iterable exception at shutdown 2017-10-06 16:58:27 -07:00
Noah Levitt
908988c4f0 wait for rethinkdb indexes to be ready 2017-10-06 16:57:39 -07:00
Noah Levitt
0de10791aa Merge pull request #35 from vbanos/dedup-redundant-code
Remove redundant methods from dedup classes
2017-09-29 11:42:47 -07:00
Vangelis Banos
4e7d8fa917 Remove deleted `close` method call from test. 2017-09-29 06:36:37 +00:00
Noah Levitt
be6fe83c56 bump dev version number after merging pull requests 2017-09-28 14:37:30 -07:00
Noah Levitt
2e5f8a733a Merge pull request #33 from vbanos/fix-unit-tests
Add missing packages from setup.py, add tox config.
2017-09-28 14:35:48 -07:00
Noah Levitt
9aa330ecb3 Merge pull request #34 from vbanos/remove-unused
Remove unused imports
2017-09-28 14:34:58 -07:00
Vangelis Banos
6fd687f2b6 Add missing "," in deps 2017-09-28 20:37:15 +00:00
Vangelis Banos
51a2178cbd Remove tox.ini, move warcio to test_requires 2017-09-28 20:35:47 +00:00
Noah Levitt
faae23d764 allow very long request header lines, to support large warcprox-meta header values 2017-09-27 17:29:55 -07:00
Vangelis Banos
eb266f198d Remove redundant stop() & sync() dedup methods
As with my previous commits, these methods do nothing.

I think that the reason they are here is because the author uses the
same style in other places in the code (e.g.
``warcprox.stats.StatsDb``). Similar methods exist there.
2017-09-24 13:44:13 +00:00
Vangelis Banos
d035147e3e Remove redundant close method from DedupDb and RethinkDedupDb
I'm trying to implement another DedupDb interface and I looked into the
use of each method. The ``close`` method of ``dedup.DedupDb`` and
``dedup.RethinkDedupDb`` is empty.
It is also invoked from ``controller``.

Since it doesn't do anything and it won't in the foreseeable future,
let's remove it.
2017-09-24 13:36:12 +00:00
Vangelis Banos
66b4c35322 Remove unused imports 2017-09-24 11:15:30 +00:00
Vangelis Banos
b1819c51b9 Add missing packages from setup.py, add tox config.
Add missing `requests` and `warcio` packages. They are used in unit tests but
they were not included in `setup.py`.

Add `tox` configuration in order to be able to run unit tests for py27,
py34 and py35 with 1 command.
2017-09-24 10:51:29 +00:00
Noah Levitt
8bfda9f4b3 fix python2 tests 2017-09-20 11:03:36 -07:00
Noah Levitt
1bca9d0324 don't use http.client.HTTPResponse.getheader() to get the content-type header, because it can return a comma-delimited string 2017-09-18 14:45:16 -07:00
Noah Levitt
a8adaaf527 Merge pull request #30 from trifle/master
allow zero warc_writer_threads
2017-09-12 13:46:12 -07:00
Noah Levitt
b89f834ce3 no SIGQUIT on windows, so no SIGQUIT handler 2017-09-07 12:01:51 -07:00
Noah Levitt
3003c46c10 https://github.com/internetarchive/warcprox/pull/32 warrants a version bump 2017-09-07 10:33:21 -07:00
Noah Levitt
c73fdd91f8 Merge pull request #32 from internetarchive/trough
hello --plugin, goodbye kafka feed
2017-09-07 10:31:42 -07:00
Noah Levitt
db0f36c745 fix --size option (https://github.com/internetarchive/warcprox/issues/31) 2017-09-05 12:43:55 -07:00
Noah Levitt
7e55568851 fix --playback-port option (https://github.com/internetarchive/warcprox/issues/29) 2017-09-05 12:20:22 -07:00
Pascal Jürgens
940af4e888 fix zero-indexing of warc_writer_threads so they can be disabled via empty list 2017-08-18 15:52:34 +02:00
Noah Levitt
c0cb59e5af Merge branch 'master' into trough
* master:
  hidden argument --rethinkdb-big-table-name
  try to fix https://github.com/internetarchive/warcprox/issues/27
2017-08-03 11:22:27 -07:00
Noah Levitt
13ee68ce4a hidden argument --rethinkdb-big-table-name 2017-07-20 12:53:59 -07:00
Noah Levitt
b1a8fecd9d try to fix https://github.com/internetarchive/warcprox/issues/27 2017-07-07 14:54:55 -07:00
Noah Levitt
ad3e6f405d call stop() at shutdown if present on plugins 2017-06-28 16:40:20 -07:00
Noah Levitt
9ea3540d63 fix misuse of += 2017-06-28 14:19:06 -07:00
Noah Levitt
2c95a1f2ee remove kafka feed code 2017-06-28 13:12:30 -07:00
Noah Levitt
4c32394256 new option --plugin 2017-06-28 12:53:34 -07:00
Noah Levitt
e31302a6e3 hide kafka options as first step toward removing them 2017-06-28 12:03:48 -07:00
Noah Levitt
5a8d1610e6 try to work around stupid travis build error, see https://blog.travis-ci.com/2017-06-21-trusty-updates-2017-Q2-launch 2017-06-23 14:12:04 -07:00
Noah Levitt
b23e485898 simplify recovery of stats batch in case of exception saving them (not sure what was wrong with summy_merge, but this is simpler) 2017-06-22 16:54:04 -07:00
Noah Levitt
c0ee9c6093 avoid holding the lock, which makes all warc writer threads block, while doing rethinkdb operations, in RethinkStatsDb 2017-06-22 16:17:25 -07:00
Noah Levitt
24082c2e8c don't wait for queue to be empty to do idle rollovers, because sometimes warcprox can stay busy for a long, long time 2017-06-22 15:04:01 -07:00
Noah Levitt
2f0c4454ac try not to let problems responding to kill -QUIT (which prints stack trace of each thread) kill the whole process 2017-06-12 16:51:50 -07:00
Noah Levitt
808950abb4 recover properly from exception updating stats in rethinkdb 2017-06-12 16:51:45 -07:00
Noah Levitt
1500341875 use %r instead of calling repr() 2017-06-07 16:05:47 -07:00
Noah Levitt
2f93cdcad9 use locking to ensure consistency and avoid this kind of test failure https://travis-ci.org/internetarchive/warcprox/jobs/235819316 2017-05-25 17:38:20 +00:00
Noah Levitt
00b982aa24 Merge pull request #25 from nlevitt/sqlite
get rid of dbm, switch to sqlite, for easier portability, clarity aro…
2017-05-24 14:25:45 -07:00
Noah Levitt
95dfa54968 get rid of dbm, switch to sqlite, for easier portability, clarity around threading 2017-05-24 13:57:09 -07:00
Noah Levitt
99dd840d20 use "ttl" for updated doublethink svc reg api 2017-05-23 10:37:39 -07:00
Noah Levitt
aca0b881c6 make sure records are written to warc in a predictable order to make tests pass consistently 2017-05-19 16:34:27 -07:00
Noah Levitt
ef5dd2e4ae multiple warc writer threads (hacked in with little thought to code organization) 2017-05-19 16:10:44 -07:00
Noah Levitt
515dd84aed lock to certauth < 1.2 until we port 2017-05-19 15:44:00 -07:00
Noah Levitt
a3dde3d97f fix mistake (incorrect interpretation of concurrent.futures.ThreadPoolExecutor internals) that caused unnecessary waits, and unnecessarily long waits, before calling socket.accept() 2017-05-12 14:18:35 -07:00