docs updates

Barbara Miller 2019-06-13 17:18:51 -07:00
parent d133565061
commit 8c52bd8442
2 changed files with 10 additions and 7 deletions

@@ -89,12 +89,13 @@ for deduplication works similarly to deduplication by `Heritrix
 4. If not found,
 a. Write ``response`` record with full payload
-b. Store new entry in deduplication database
+b. Store new entry in deduplication database (can be disabled, see
+`Warcprox-Meta HTTP request header <api.rst#warcprox-meta-http-request-header>`
 The deduplication database is partitioned into different "buckets". URLs are
 deduplicated only against other captures in the same bucket. If specified, the
-``dedup-bucket`` field of the `Warcprox-Meta HTTP request header
-<api.rst#warcprox-meta-http-request-header>`_ determines the bucket. Otherwise,
+``dedup-buckets`` field of the `Warcprox-Meta HTTP request header
+<api.rst#warcprox-meta-http-request-header>`_ determines the bucket(s). Otherwise,
 the default bucket is used.
 Deduplication can be disabled entirely by starting warcprox with the argument

api.rst

@@ -137,14 +137,16 @@ Example::
 Warcprox-Meta: {"warc-prefix": "special-warc"}
-``dedup-bucket`` (string)
+``dedup-buckets`` (string)
 ~~~~~~~~~~~~~~~~~~~~~~~~~
-Specifies the deduplication bucket. For more information about deduplication
+Specifies the deduplication bucket(s). For more information about deduplication
 see `<README.rst#deduplication>`_.
-Example::
-Warcprox-Meta: {"dedup-bucket":"my-dedup-bucket"}
+Examples::
+Warcprox-Meta: {"dedup-buckets":{"my-dedup-bucket":"rw"}}
+Warcprox-Meta: {"dedup-buckets":{"my-dedup-bucket":"rw", "my-read-only-dedup-bucket": "ro"}}
 ``blocks`` (list)
 ~~~~~~~~~~~~~~~~~