docker image python:3 is now using 3.7 and building pyyaml < 3.13 fails
yaml/pyyaml#126
also filed pull request to update trough's pyyaml dependency spec
internetarchive/trough#20
the fact that we always save a record to the big captures table,
partly by adding a new check that --dedup-min-*-size is respected even
if there is an entry in the dedup db for the sha1
`test_dedup_https` fails on travis-ci.
https://travis-ci.org/internetarchive/warcprox/jobs/370598950
We didn't touch that at all but worked on `test_dedup_min_size` which
runs just before that. We move `test_dedup_min_size` to the end of the
file hoping to resolve this.
Create two very small dummy responses (text, 2 bytes and binary, 4 bytes).
Use options --dedup-min-text-size=3 and --dedup-min-binary-size=5.
Ensure that due to the effects of these options, dedup is not happening.
Existing dedup unit tests are not affected at all.
As of fairly recently, warcprox does keepalive with the remote server
using the urllib3 connection pool. The test http server in
test_warcprox.py was acting as if it supported keepalive (sending
HTTP/1.1 and not sending "Connection: close"). But in fact it did not
support keepalive. It was closing the connection after each request.
Depending on the timing of what was happening in different threads,
sometimes the client thread would try to send another request on a
connection it still believed to be open for keepalive. Then the server
side would complete its request processing and close the connection.
This resulted in test failures with error messages like this (depending
on python version):
2018-04-03 21:20:06,555 12586 ERROR MainThread warcprox.mitmproxy.MitmProxyHandler.do_COMMAND(mitmproxy.py:389) error from remote server(?) None: BadStatusLine("''",)
2018-04-04 19:06:29,599 11632 ERROR MainThread warcprox.mitmproxy.MitmProxyHandler.do_COMMAND(mitmproxy.py:389) error from remote server(?) None: RemoteDisconnected('Remote end closed connection without response',)
For instance https://travis-ci.org/internetarchive/warcprox/jobs/362288603
Add socket-timeout=4 in ``warcprox_`` test fixture.
Create test URL `/slow-url` which returns after 6 sec.
Trying to access the target URL raises a ``socket.timeout`` and returns
HTTP status 502.
The new ``--socket-timeout`` option does not hurt any other test using
the ``warcprox_`` fixture because they are too fast anyway.
* master:
fix running_stats thing
Update CdxServerDedup unit test
Chec writer._fname in unit test
Configurable CdxServerDedup urllib3 connection pool size
Yet another unit test fix
Change the writer unit test
fix github problem with unit test
Another fix for the unit test
Fix writer unit test
Add WarcWriter warc_filename unit test
Fix warc_filename default value
Configurable WARC filenames
To work correctly with the new way we init the
``CdxServerDedup.http_pool``. Use ``mock.MagicMock`` instead of
``mock.patch``. The unit test logic remains entirely the same.
For some reason this test previously failed in github. Maybe it has to
do with the temporary files I need to create there... in any case, I
changed what we check and evaluate the ``write._fname`` for the correct
filename format.