442 Commits

Author SHA1 Message Date
Noah Levitt
d4b39f3fcc remove some debugging from .travis.yml and importantly, deactivate the trough virtualenv before installing warcprox and running tests (otherwise it uses the wrong version of python) 2017-10-18 09:45:06 -07:00
Noah Levitt
4c4f8ead09 missed an ampersand 2017-10-17 14:58:46 -07:00
Noah Levitt
73d4a19c0a banging (is the problem that we didn't start trough-read?) 2017-10-17 14:42:54 -07:00
Noah Levitt
994eda70a8 banging 2017-10-17 14:33:36 -07:00
Noah Levitt
ddc88cda0f more banging on travis-ci 2017-10-16 16:05:23 -07:00
Noah Levitt
0e78140d47 cryptography 2.1.1 seems to be the problem 2017-10-13 16:52:08 -07:00
Noah Levitt
166aaab3e5 banging on travis-ci 2017-10-13 16:40:08 -07:00
Noah Levitt
892960d41a first attempt to run trough on travis-ci 2017-10-13 16:26:33 -07:00
Noah Levitt
828a2c3dcf get all the tests to pass with ./tests/run-tests.sh 2017-10-13 15:54:05 -07:00
Noah Levitt
369dc5c124 install and run trough in docker container for testing 2017-10-11 17:28:47 -07:00
Noah Levitt
d177b3b80d change rethinkdb-related command line options to use "rethinkdb urls" (parser just added to doublethink) to reduce the proliferation of rethinkdb options, and add --rethinkdb-trough-db-url option 2017-10-11 12:06:19 -07:00
Noah Levitt
4eda89f232 trough for deduplication initial proof-of-concept-ish code 2017-10-06 17:03:56 -07:00
Noah Levitt
9b8043d3a2 greatly simplify automated test setup by reusing initialization code from the command line executable; this also has the benefit of testing that initialization code 2017-10-06 17:00:35 -07:00
Noah Levitt
0cc68dd428 avoid TypeError: 'NoneType' object is not iterable exception at shutdown 2017-10-06 16:58:27 -07:00
Noah Levitt
908988c4f0 wait for rethinkdb indexes to be ready 2017-10-06 16:57:39 -07:00
Noah Levitt
0de10791aa Merge pull request #35 from vbanos/dedup-redundant-code
Remove redundant methods from dedup classes
2017-09-29 11:42:47 -07:00
Vangelis Banos
4e7d8fa917 Remove deleted `close` method call from test. 2017-09-29 06:36:37 +00:00
Noah Levitt
be6fe83c56 bump dev version number after merging pull requests 2017-09-28 14:37:30 -07:00
Noah Levitt
2e5f8a733a Merge pull request #33 from vbanos/fix-unit-tests
Add missing packages from setup.py, add tox config.
2017-09-28 14:35:48 -07:00
Noah Levitt
9aa330ecb3 Merge pull request #34 from vbanos/remove-unused
Remove unused imports
2017-09-28 14:34:58 -07:00
Vangelis Banos
6fd687f2b6 Add missing "," in deps 2017-09-28 20:37:15 +00:00
Vangelis Banos
51a2178cbd Remove tox.ini, move warcio to test_requires 2017-09-28 20:35:47 +00:00
Noah Levitt
faae23d764 allow very long request header lines, to support large warcprox-meta header values 2017-09-27 17:29:55 -07:00
Vangelis Banos
eb266f198d Remove redundant stop() & sync() dedup methods
As with my previous commits, these methods do nothing.

I think they are here because the author uses the same style in other
places in the code (e.g. ``warcprox.stats.StatsDb``), where similar
methods exist.
2017-09-24 13:44:13 +00:00
Vangelis Banos
d035147e3e Remove redundant close method from DedupDb and RethinkDedupDb
I'm trying to implement another DedupDb interface and I looked into the
use of each method. The ``close`` method of ``dedup.DedupDb`` and
``dedup.RethinkDedupDb`` is empty.
It is also invoked from ``controller``.

Since it doesn't do anything and it won't in the foreseeable future,
let's remove it.
2017-09-24 13:36:12 +00:00
Vangelis Banos
66b4c35322 Remove unused imports 2017-09-24 11:15:30 +00:00
Vangelis Banos
b1819c51b9 Add missing packages from setup.py, add tox config.
Add missing `requests` and `warcio` packages. They are used in unit tests but
they were not included in `setup.py`.

Add `tox` configuration in order to be able to run unit tests for py27,
py34 and py35 with 1 command.
2017-09-24 10:51:29 +00:00
Noah Levitt
8bfda9f4b3 fix python2 tests 2017-09-20 11:03:36 -07:00
Noah Levitt
1bca9d0324 don't use http.client.HTTPResponse.getheader() to get the content-type header, because it can return a comma-delimited string 2017-09-18 14:45:16 -07:00
Noah Levitt
a8adaaf527 Merge pull request #30 from trifle/master
allow zero warc_writer_threads
2017-09-12 13:46:12 -07:00
Noah Levitt
b89f834ce3 no SIGQUIT on windows, so no SIGQUIT handler 2017-09-07 12:01:51 -07:00
Noah Levitt
3003c46c10 https://github.com/internetarchive/warcprox/pull/32 warrants a version bump 2017-09-07 10:33:21 -07:00
Noah Levitt
c73fdd91f8 Merge pull request #32 from internetarchive/trough
hello --plugin, goodbye kafka feed
2017-09-07 10:31:42 -07:00
Noah Levitt
db0f36c745 fix --size option (https://github.com/internetarchive/warcprox/issues/31) 2017-09-05 12:43:55 -07:00
Noah Levitt
7e55568851 fix --playback-port option (https://github.com/internetarchive/warcprox/issues/29) 2017-09-05 12:20:22 -07:00
Pascal Jürgens
940af4e888 fix zero-indexing of warc_writer_threads so they can be disabled via empty list 2017-08-18 15:52:34 +02:00
Noah Levitt
c0cb59e5af Merge branch 'master' into trough
* master:
  hidden argument --rethinkdb-big-table-name
  try to fix https://github.com/internetarchive/warcprox/issues/27
2017-08-03 11:22:27 -07:00
Noah Levitt
13ee68ce4a hidden argument --rethinkdb-big-table-name 2017-07-20 12:53:59 -07:00
Noah Levitt
b1a8fecd9d try to fix https://github.com/internetarchive/warcprox/issues/27 2017-07-07 14:54:55 -07:00
Noah Levitt
ad3e6f405d call stop() at shutdown if present on plugins 2017-06-28 16:40:20 -07:00
Noah Levitt
9ea3540d63 fix misuse of += 2017-06-28 14:19:06 -07:00
Noah Levitt
2c95a1f2ee remove kafka feed code 2017-06-28 13:12:30 -07:00
Noah Levitt
4c32394256 new option --plugin 2017-06-28 12:53:34 -07:00
Noah Levitt
e31302a6e3 hide kafka options as first step toward removing them 2017-06-28 12:03:48 -07:00
Noah Levitt
5a8d1610e6 try to work around stupid travis build error, see https://blog.travis-ci.com/2017-06-21-trusty-updates-2017-Q2-launch 2017-06-23 14:12:04 -07:00
Noah Levitt
b23e485898 simplify recovery of stats batch in case of exception saving them (not sure what was wrong with summy_merge, but this is simpler) 2017-06-22 16:54:04 -07:00
Noah Levitt
c0ee9c6093 avoid holding the lock, which makes all warc writer threads block, while doing rethinkdb operations, in RethinkStatsDb 2017-06-22 16:17:25 -07:00
Noah Levitt
24082c2e8c don't wait for queue to be empty to do idle rollovers, because sometimes warcprox can stay busy for a long, long time 2017-06-22 15:04:01 -07:00
Noah Levitt
2f0c4454ac try not to let problems responding to kill -QUIT (which prints stack trace of each thread) kill the whole process 2017-06-12 16:51:50 -07:00
Noah Levitt
808950abb4 recover properly from exception updating stats in rethinkdb 2017-06-12 16:51:45 -07:00