Check also that locking succeeds after the writer closes the WARC file.
Remove parametrize from ``test_warc_writer_locking`` and test only the
``no_warc_open_suffix=True`` option.
Change `1` to `OBTAINED LOCK` and `0` to `FAILED TO OBTAIN LOCK` in the
``lock_file`` method.
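
The locking check described above can be sketched roughly as follows. This is a minimal sketch, not the actual test: the child script, the helper name ``try_lock_from_child`` and the temporary file are hypothetical, and it assumes a POSIX system where ``fcntl.lockf`` is available. A separate process does the probing because ``fcntl(2)`` locks are held per process.

```python
import fcntl
import os
import subprocess
import sys
import tempfile

# Hypothetical child script: try to take an exclusive, non-blocking lock
# and report the outcome with the strings named in the commit message.
CHILD_SCRIPT = """
import fcntl, sys
with open(sys.argv[1], 'ab') as f:
    try:
        fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        print('OBTAINED LOCK')
    except OSError:
        print('FAILED TO OBTAIN LOCK')
"""

def try_lock_from_child(path):
    """Run the child script in a new process and return what it printed."""
    result = subprocess.run(
        [sys.executable, '-c', CHILD_SCRIPT, path],
        capture_output=True, text=True, check=True)
    return result.stdout.strip()

# Stand-in for a WARC file that a writer currently has open.
fd, path = tempfile.mkstemp(suffix='.warc')
os.close(fd)

writer = open(path, 'ab')
fcntl.lockf(writer, fcntl.LOCK_EX | fcntl.LOCK_NB)  # the writer holds the lock
while_open = try_lock_from_child(path)

writer.close()                                      # closing releases the lock
after_close = try_lock_from_child(path)
os.remove(path)
```

Here ``while_open`` holds the result seen while the writer has the file locked, and ``after_close`` the result once the writer has closed it.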
Replace the ``timestamp`` parameter with the more generic ``extra_response_headers={}``.
When the request has --header ``Warcprox-Meta: {"accept":["capture-metadata"]}``,
the response has the following header:
``Warcprox-Meta: {"capture-metadata":{"timestamp":"2017-10-31T10:47:50Z"}}``
Update unit test.
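
The request/response exchange above could be produced along these lines. This is only a sketch under stated assumptions: the function name ``capture_metadata_header`` and its arguments are hypothetical and are not taken from the warcprox code; only the header contents come from the commit message.

```python
import json
from datetime import datetime, timezone

def capture_metadata_header(request_meta, record_time):
    """Return the Warcprox-Meta response value, or None if not requested.

    request_meta: the JSON text of the request's Warcprox-Meta header.
    record_time:  datetime of the WARC record (assumed to be in UTC).
    """
    accepted = json.loads(request_meta).get('accept', [])
    if 'capture-metadata' not in accepted:
        return None
    timestamp = record_time.strftime('%Y-%m-%dT%H:%M:%SZ')
    # Compact separators reproduce the spacing shown in the commit message.
    return json.dumps({'capture-metadata': {'timestamp': timestamp}},
                      separators=(',', ':'))

header = capture_metadata_header(
    '{"accept":["capture-metadata"]}',
    datetime(2017, 10, 31, 10, 47, 50, tzinfo=timezone.utc))
not_requested = capture_metadata_header('{}', datetime(2017, 10, 31))
```

In a real handler the returned value would presumably be passed on through ``extra_response_headers``; how that dict is consumed is not shown here.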
When the client request has the HTTP header ``Warcprox-Meta: {"return-capture-timestamp": 1}``,
add the WARC record timestamp to the response in the following HTTP header:
``Warcprox-Meta: {"capture-timestamp": "%Y-%m-%d %H:%M:%S"}``.
Add unit test.
On Linux, `fcntl.flock` is implemented with `flock(2)`, and
`fcntl.lockf` is implemented with `fcntl(2)`; the two are not compatible.
Java's `lock()` appears to use `fcntl(2)`. So, other Java programs working
with these files work correctly only with `fcntl.lockf`.
`warcprox` MUST use `fcntl.lockf`.
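
The incompatibility can be observed with a short experiment. This is a sketch that assumes Linux and a local filesystem (on other platforms or on NFS the two lock types may interact); the helper ``can_lock`` and the temporary file are hypothetical.

```python
import fcntl
import os
import tempfile

def can_lock(lock_func, fileobj):
    """Try an exclusive, non-blocking lock with the given fcntl function."""
    try:
        lock_func(fileobj, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except OSError:
        return False

fd, path = tempfile.mkstemp()
os.close(fd)

# Two independent opens of the same file.
holder = open(path, 'ab')
other = open(path, 'ab')

fcntl.flock(holder, fcntl.LOCK_EX)   # flock(2) lock held through `holder`

# A second flock(2) attempt through another open of the file is refused...
flock_conflicts = not can_lock(fcntl.flock, other)
# ...but an fcntl(2) lock is granted, because it does not see the flock(2) lock.
lockf_unaffected = can_lock(fcntl.lockf, other)

holder.close()
other.close()
os.remove(path)
```

A program that locks with `fcntl(2)` would therefore not notice a file locked only via `fcntl.flock`, which is the failure mode the note above warns about.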
Replace ``_split_timestamp`` with ``datetime.strptime`` in
``warcprox.dedup``.
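
As an illustration of the ``datetime.strptime`` approach, here is a minimal sketch. It assumes the timestamp is a 14-digit ``YYYYMMDDhhmmss`` string, as CDX servers commonly return; the helper name ``parse_cdx_timestamp`` is hypothetical, and the exact behaviour of the removed ``_split_timestamp`` is not shown here.

```python
from datetime import datetime

def parse_cdx_timestamp(timestamp):
    """Parse an assumed 14-digit CDX timestamp into a naive datetime."""
    return datetime.strptime(timestamp, '%Y%m%d%H%M%S')

parsed = parse_cdx_timestamp('20171031104750')
```

``strptime`` raises ``ValueError`` on malformed input, so a caller can treat that as an invalid CDX response.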
Remove ``isinstance()`` and add optional ``record_url`` in the rest of
the dedup ``lookup`` methods.
Make the help text of the `--cdxserver-dedup` option more explanatory.
Add multiple ``CdxServerDedup`` unit tests to simulate found, not found and
invalid responses from the CDX server. Use a different file
``tests/test_dedup.py`` because we test the CdxServerDedup component
individually and it belongs to the ``warcprox.dedup`` package.
Add ``mock`` package to dev requirements.
Rework the warcprox.dedup.CdxServerDedup class to have better exception
handling.
Stop adding WarcRecord.REFERS_TO when building the WARC record. This affects
the methods ``warc.WarcRecordBuilder._build_response_principal_record`` and
``warc.WarcRecordBuilder.build_warc_record``.
Replace ``record_id`` (WarcRecord.REFERS_TO) with payload_digest in
``playback``.
The playback database now stores ``{'f': warcfile, 'o': offset, 'd':
payload_digest}`` instead of ``'i': record_id``.
Make all ``dedup`` classes return only `url` and `date`. Drop `id`.
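
For illustration, the new playback entry shape might look like the sketch below. The helper name, the example values and the use of JSON for serialization are assumptions made for this example; only the keys ``f``, ``o`` and ``d`` come from the description above.

```python
import json

def playback_value(warcfile, offset, payload_digest):
    """Serialize a playback entry keyed by file, offset and payload digest."""
    return json.dumps({'f': warcfile, 'o': offset, 'd': payload_digest})

# Hypothetical values, for illustration only.
value = playback_value('example.warc.gz', 1234, 'sha1:EXAMPLEDIGEST')
entry = json.loads(value)
```

The entry no longer carries an ``'i'`` key, so playback has to locate the record by file and offset and match it by payload digest.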
As in my previous commits, these methods do nothing.
I think they are here because the author uses the
same style in other places in the code (e.g.
``warcprox.stats.StatsDb``), where similar methods exist.