Barbara Miller
d87aa0ca57
Merge branch 'do_not_archive' into qa
2018-02-28 12:31:03 -08:00
Barbara Miller
289f4335ef
isinstance(controller._postfetch_chain[0], EarlyPlugin)
2018-02-28 12:28:18 -08:00
Barbara Miller
e65dee57d4
minor test edits
2018-02-28 12:28:18 -08:00
Barbara Miller
6ce5119a48
add test_do_not_archive
2018-02-28 12:28:18 -08:00
Barbara Miller
7f50ecab0a
[0] isinstance of parent class
2018-02-28 12:28:18 -08:00
Barbara Miller
1334b4a546
restore master test_warcprox.py
2018-02-28 12:28:18 -08:00
Barbara Miller
f5dd2fe03b
add test_do_not_archive, tweak early plugin name
2018-02-28 12:28:18 -08:00
Barbara Miller
3161793c5c
add test_do_not_archive
2018-02-27 22:23:40 -08:00
Barbara Miller
84e5110bcb
[0] isinstance of parent class
2018-02-27 21:36:00 -08:00
Barbara Miller
9e2f357bab
restore master test_warcprox.py
2018-02-27 19:49:12 -08:00
Barbara Miller
cb05fc0e09
test issubclass
2018-02-27 18:31:00 -08:00
Barbara Miller
f30fb40393
try tuple
2018-02-27 17:00:08 -08:00
Barbara Miller
97f7b2f3fd
type?
2018-02-27 16:44:36 -08:00
Barbara Miller
3ed551c3be
try not Foo
2018-02-27 16:22:38 -08:00
Barbara Miller
0c650e1158
try __name__...
2018-02-27 16:02:53 -08:00
Barbara Miller
b2672ab2f4
move test_do_not_archive
2018-02-27 15:38:55 -08:00
Barbara Miller
b554831179
add test_do_not_archive, tweak early plugin name
2018-02-27 15:20:24 -08:00
Barbara Miller
39b2fe86d9
test early plugin
2018-02-27 14:46:25 -08:00
Noah Levitt
f3e270b796
make test_method_filter() pass by waiting
...
in test_limit_large_resource() for url processing to finish, to prevent
stats from affecting the subsequent test
2018-02-20 14:54:58 -08:00
Vangelis Banos
985fdf1ac3
Add a unit test for --max-resource-size option
2018-02-19 14:23:22 +00:00
Noah Levitt
fd81190517
refactor the multithreaded warc writing
...
main functional change is that only as many warc files are created as are
needed to keep up with the throughput
2018-02-07 15:48:43 -08:00
Vangelis Banos
8d1df04797
Add socket-timeout unit test
...
Add socket-timeout=4 in ``warcprox_`` test fixture.
Create test URL `/slow-url` which returns after 6 sec.
Trying to access the target URL raises a ``socket.timeout`` and returns
HTTP status 502.
The new ``--socket-timeout`` option does not hurt any other test using
the ``warcprox_`` fixture because they are too fast anyway.
2018-02-07 15:48:42 -08:00
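The behaviour that the socket-timeout commit describes (an upstream that answers later than the timeout produces ``socket.timeout``, which is reported as HTTP 502) can be sketched with the standard library alone. This is a rough illustration, not the actual warcprox test: ``slow_server`` and ``fetch_status`` are hypothetical helpers, and the delays are scaled down from the 6 s / 4 s in the commit message so the sketch runs quickly.

```python
import socket
import threading
import time


def slow_server(delay):
    """Start a one-shot TCP server that answers after `delay` seconds.

    Hypothetical stand-in for the `/slow-url` test endpoint.
    Returns the (host, port) it is listening on.
    """
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    srv.listen(1)

    def serve():
        conn, _ = srv.accept()
        conn.recv(1024)  # read the request before stalling
        time.sleep(delay)
        try:
            conn.sendall(b"HTTP/1.0 200 OK\r\n\r\n")
        except OSError:
            pass  # the client gave up and closed the connection
        finally:
            conn.close()
            srv.close()

    threading.Thread(target=serve, daemon=True).start()
    return srv.getsockname()


def fetch_status(upstream_addr, timeout):
    """Return 502 if the upstream is silent for `timeout` seconds, else 200.

    Assumption: this mirrors only the timeout-to-502 mapping named in the
    commit message; a real proxy would parse the upstream response.
    """
    try:
        with socket.create_connection(upstream_addr, timeout=timeout) as sock:
            sock.sendall(b"GET /slow-url HTTP/1.0\r\n\r\n")
            sock.recv(1024)  # raises socket.timeout if nothing arrives in time
        return 200
    except socket.timeout:
        return 502
```

Calling ``fetch_status(slow_server(0.6), timeout=0.2)`` takes the timeout path, while a server with no delay is answered normally, which is why a short timeout need not affect tests whose upstream responds quickly.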
Noah Levitt
824c194142
make plugin api more flexible
2018-01-24 16:07:45 -08:00
Noah Levitt
d590dee59a
fix port conflict test failure on travis-ci
2018-01-18 12:00:27 -08:00
Noah Levitt
6cc6cf4f28
fix plugin loading and add a rudimentary test case
2018-01-18 11:38:24 -08:00
Noah Levitt
bed04af440
postfetch chain info for /status and service reg
...
including number of queued urls for each processor
2018-01-18 11:12:52 -08:00
Noah Levitt
a974ec86fa
fixes to make tests pass
2018-01-17 15:33:41 -08:00
Noah Levitt
9c5a5eda99
use batch postfetch processor for stats
2018-01-17 14:58:52 -08:00
Noah Levitt
5354648512
Merge branch 'master' into wip-postfetch-chain
...
* master:
fix running_stats thing
Update CdxServerDedup unit test
Check writer._fname in unit test
Configurable CdxServerDedup urllib3 connection pool size
Yet another unit test fix
Change the writer unit test
fix github problem with unit test
Another fix for the unit test
Fix writer unit test
Add WarcWriter warc_filename unit test
Fix warc_filename default value
Configurable WARC filenames
2018-01-16 16:01:40 -08:00
Noah Levitt
6ff9030e67
improve batching, make tests pass
2018-01-16 15:18:53 -08:00
Noah Levitt
d7208d89c6
Merge pull request #50 from vbanos/cdxserverdedup-maxsize
...
Configurable CdxServerDedup urllib3 connection pool size
2018-01-15 16:46:37 -08:00
Noah Levitt
c9a39958db
tests are passing
2018-01-15 14:49:13 -08:00
Vangelis Banos
4a165e5f77
Update CdxServerDedup unit test
...
To work correctly with the new way we init the
``CdxServerDedup.http_pool``. Use ``mock.MagicMock`` instead of
``mock.patch``. The unit test logic remains entirely the same.
2018-01-15 20:58:36 +00:00
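The switch from ``mock.patch`` to ``mock.MagicMock`` mentioned in the CdxServerDedup test commit might look roughly like the following. This is only a sketch under assumptions: ``FakeDedup`` and its ``lookup`` method are hypothetical stand-ins, not the real ``CdxServerDedup`` class, and only the idea of assigning a ``MagicMock`` to an ``http_pool`` attribute created at init time comes from the commit message.

```python
from unittest import mock


class FakeDedup:
    """Hypothetical stand-in for a class that owns an `http_pool` attribute."""

    def __init__(self):
        # The real class would build a urllib3 connection pool here.
        self.http_pool = None

    def lookup(self, url):
        # Hypothetical method: ask the pool and report whether it returned 200.
        response = self.http_pool.request("GET", url)
        return response.status == 200


# Instead of patching a module-level name with mock.patch, replace the
# instance attribute directly with a MagicMock after construction.
dedup = FakeDedup()
dedup.http_pool = mock.MagicMock()
dedup.http_pool.request.return_value = mock.Mock(status=200)

found = dedup.lookup("http://example.com/")
```

Because the pool is an instance attribute, replacing it on the object keeps the test independent of where or how the pool is constructed.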
Vangelis Banos
f73e625d6b
Check writer._fname in unit test
...
For some reason this test previously failed in github. Maybe it has to
do with the temporary files I need to create there... in any case, I
changed what we check and evaluate the ``writer._fname`` for the correct
filename format.
2018-01-15 20:17:22 +00:00
Vangelis Banos
47ea3110be
Yet another unit test fix
2018-01-10 20:55:31 +00:00
Vangelis Banos
b2c47142de
Change the writer unit test
...
To be able to run in github.
2018-01-10 20:38:06 +00:00
Vangelis Banos
e737a30ec1
fix github problem with unit test
2018-01-10 19:29:22 +00:00
Vangelis Banos
deddd4f850
Another fix for the unit test
2018-01-10 18:52:59 +00:00
Vangelis Banos
9d789cdae8
Fix writer unit test
2018-01-10 18:41:56 +00:00
Vangelis Banos
d2ce61aec9
Add WarcWriter warc_filename unit test
...
Use custom ``warc_filename`` option and check that the created WARC
filename follows the defined pattern.
2018-01-09 12:54:42 +00:00
Noah Levitt
f401b21958
update test_svcreg_status to expect new fields
2017-12-29 13:03:45 -08:00
Noah Levitt
5347cc92c3
change where RunningStats is initialized and fix tests
2017-12-29 11:06:46 -08:00
Noah Levitt
9784c91459
test for special warc prefix "-" which means "do not archive"
2017-12-21 14:31:54 -08:00
jkafader
e5a3dd8b3e
Merge pull request #37 from nlevitt/trough-dedup
...
WIP: trough for deduplication initial proof-of-concept-ish code
2017-11-30 16:14:43 -08:00
Noah Levitt
61a7c234e8
fix warcprox-ensure-rethinkdb-tables and add tests
2017-11-28 10:38:38 -08:00
Noah Levitt
330635c0a8
fix test in py<=3.4
2017-11-22 13:55:44 -08:00
Noah Levitt
5be289730f
fix failing test, and change response code from 500 to more appropriate 502
2017-11-22 13:11:26 -08:00
Noah Levitt
627ef5667b
failing test for correct handling of "http.client.RemoteDisconnected: Remote end closed connection without response" from remote server
2017-11-22 12:49:46 -08:00
Noah Levitt
f5351a43df
automatic segment promotion every hour
2017-11-13 14:22:17 -08:00
Noah Levitt
c40ad8391d
Merge branch 'master' into trough-dedup
...
* master:
hopefully fix test failing occasionally apparently due to race condition by checking that the file we're waiting for has some content
fix payload digest by pulling calculation up one level where content has already been transfer-decoded
new failing test for correct calculation of payload digest
missed a spot handling case of no warc records written
2017-11-13 11:47:04 -08:00