6 Commits

Author SHA1 Message Date
Vangelis Banos
4a165e5f77 Update CdxServerDedup unit test
Make the test work with the new way ``CdxServerDedup.http_pool`` is
initialized: use ``mock.MagicMock`` instead of ``mock.patch``.
The unit test logic remains entirely the same.
2018-01-15 20:58:36 +00:00
Vangelis Banos
f6b1d6f408 Update CdxServerDedup lookup algorithm
Get only one item from CDX (``limit=-1``).

Update unit tests
2017-10-21 20:45:46 +00:00
Vangelis Banos
4fb44a7e9d Pass url instead of recorded_url obj to dedup lookup methods
2017-10-21 20:24:28 +00:00
Vangelis Banos
bc3d0cb4f6 Fix minor CdxServerDedup unit test
2017-10-19 22:57:33 +00:00
Vangelis Banos
960dda4c31 Add CdxServerDedup unit tests and improve its exception handling
Add multiple ``CdxServerDedup`` unit tests to simulate found, not found and
invalid responses from the CDX server. Use a different file
``tests/test_dedup.py`` because we test the CdxServerDedup component
individually and it belongs to the ``warcprox.dedup`` package.

Add ``mock`` package to dev requirements.

Rework the warcprox.dedup.CdxServerDedup class to have better exception
handling.
2017-10-19 22:11:22 +00:00
Vangelis Banos
fc5f39ffed Add CDX Server based deduplication
Add ``--cdxserver-dedup URL`` option.
Create ``warcprox.dedup.CdxServerDedup`` class.
Add dummy unit test (TODO)
2017-10-19 14:33:12 +00:00
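
Commits 4a165e5f77 and 960dda4c31 above describe testing the dedup lookup by substituting a ``mock.MagicMock`` for the HTTP pool and simulating found and not found responses from the CDX server. The sketch below illustrates that testing pattern only. ``FakeCdxDedup``, its ``lookup`` body, the response line format, and the ``make_pool`` helper are all assumptions made for illustration; they are not the actual ``warcprox.dedup.CdxServerDedup`` code.

```python
from unittest import mock


class FakeCdxDedup:
    """Hypothetical stand-in, NOT the real warcprox.dedup.CdxServerDedup."""

    def __init__(self, cdx_url):
        self.cdx_url = cdx_url
        # Assumption: the real class creates an HTTP pool object here.
        self.http_pool = None

    def lookup(self, digest_key, url):
        # Request a single item from the CDX server (limit=-1, as in the log).
        result = self.http_pool.request(
            "GET", self.cdx_url, fields={"url": url, "limit": -1})
        if result.status != 200:
            return None
        line = result.data.strip()
        if not line:
            return None
        # Assumed response format: b"<timestamp> <digest>"
        timestamp, digest = line.split(b" ")[:2]
        if digest != digest_key:
            return None
        return {"url": url, "date": timestamp}


def make_pool(status, data):
    """Build a MagicMock pool whose request() returns a canned response."""
    pool = mock.MagicMock()
    pool.request.return_value = mock.MagicMock(status=status, data=data)
    return pool


dedup = FakeCdxDedup("http://localhost/cdx")

# Simulate a "found" response.
dedup.http_pool = make_pool(200, b"20180115205836 sha1:ABC\n")
found = dedup.lookup(b"sha1:ABC", "http://example.com/")

# Simulate a "not found" (empty) response.
dedup.http_pool = make_pool(200, b"")
not_found = dedup.lookup(b"sha1:ABC", "http://example.com/")
```

Assigning the mock directly to the instance attribute avoids having to know the import path where the pool is created, which is one plausible reason to prefer ``mock.MagicMock`` over ``mock.patch`` in this situation.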