mirror of https://github.com/internetarchive/warcprox.git
synced 2025-01-18 13:22:09 +01:00

docs updates

parent d133565061
commit 8c52bd8442
@@ -89,12 +89,13 @@ for deduplication works similarly to deduplication by `Heritrix
 4. If not found,

    a. Write ``response`` record with full payload
-   b. Store new entry in deduplication database
+   b. Store new entry in deduplication database (can be disabled, see
+      `Warcprox-Meta HTTP request header <api.rst#warcprox-meta-http-request-header>`

 The deduplication database is partitioned into different "buckets". URLs are
 deduplicated only against other captures in the same bucket. If specified, the
-``dedup-bucket`` field of the `Warcprox-Meta HTTP request header
-<api.rst#warcprox-meta-http-request-header>`_ determines the bucket. Otherwise,
+``dedup-buckets`` field of the `Warcprox-Meta HTTP request header
+<api.rst#warcprox-meta-http-request-header>`_ determines the bucket(s). Otherwise,
 the default bucket is used.

 Deduplication can be disabled entirely by starting warcprox with the argument
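
Review note: the bucket behaviour described in this hunk can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not warcprox's implementation: the class ``BucketedDedupDb``, the function ``capture``, the bucket name ``__default__``, the use of a SHA-1 payload digest as the lookup key, and the ``"revisit"`` outcome for a found entry are all hypothetical. It also assumes that ``"rw"`` buckets are both consulted and written to, while ``"ro"`` buckets are only consulted.

```python
import hashlib


class BucketedDedupDb:
    """Toy in-memory stand-in for a deduplication database (not warcprox's class)."""

    def __init__(self):
        # bucket name -> {payload digest -> URL of the first capture}
        self._buckets = {}

    def lookup(self, bucket, digest):
        return self._buckets.get(bucket, {}).get(digest)

    def store(self, bucket, digest, url):
        self._buckets.setdefault(bucket, {})[digest] = url


def capture(db, url, payload, dedup_buckets=None):
    """Return the kind of record this capture would produce in the toy model."""
    # Assumption: with no buckets specified, a single writable default is used.
    buckets = dedup_buckets or {"__default__": "rw"}
    digest = hashlib.sha1(payload).hexdigest()

    # Captures are deduplicated only against the buckets listed for the request.
    for bucket in buckets:
        if db.lookup(bucket, digest) is not None:
            return "revisit"

    # Not found: a full response record is written, and a new entry is
    # stored only in buckets marked "rw" (assumed meaning of the modes).
    for bucket, mode in buckets.items():
        if mode == "rw":
            db.store(bucket, digest, url)
    return "response"


db = BucketedDedupDb()
first = capture(db, "http://example.com/a", b"hello", {"crawl-1": "rw"})
second = capture(db, "http://example.com/b", b"hello", {"crawl-1": "ro"})
third = capture(db, "http://example.com/c", b"hello", {"crawl-2": "ro"})
print(first, second, third)
```

In this model the third capture is not deduplicated, because ``crawl-2`` never saw the payload, and it leaves ``crawl-2`` empty, because that bucket was requested read-only.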
10
api.rst
@@ -137,14 +137,16 @@ Example::

     Warcprox-Meta: {"warc-prefix": "special-warc"}

-``dedup-bucket`` (string)
+``dedup-buckets`` (string)
 ~~~~~~~~~~~~~~~~~~~~~~~~~
-Specifies the deduplication bucket. For more information about deduplication
+Specifies the deduplication bucket(s). For more information about deduplication
 see `<README.rst#deduplication>`_.

-Example::
+Examples::

-    Warcprox-Meta: {"dedup-bucket":"my-dedup-bucket"}
+    Warcprox-Meta: {"dedup-buckets":{"my-dedup-bucket":"rw"}}
+
+    Warcprox-Meta: {"dedup-buckets":{"my-dedup-bucket":"rw", "my-read-only-dedup-bucket": "ro"}}

 ``blocks`` (list)
 ~~~~~~~~~~~~~~~~~
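
Review note: for clients that build the header programmatically, the new ``dedup-buckets`` value shown in the examples above can be produced with the standard ``json`` module. The helper name ``warcprox_meta_header`` is hypothetical and not part of warcprox. Restricting modes to ``"rw"`` and ``"ro"`` is an assumption based only on the two values that appear in the examples.

```python
import json


def warcprox_meta_header(dedup_buckets):
    """Build a (name, value) pair for the Warcprox-Meta request header.

    Hypothetical helper; `dedup_buckets` maps bucket name -> mode.
    """
    for bucket, mode in dedup_buckets.items():
        # Assumption: "rw" and "ro" are the only modes, as in the doc examples.
        if mode not in ("rw", "ro"):
            raise ValueError(f"unexpected mode {mode!r} for bucket {bucket!r}")
    return "Warcprox-Meta", json.dumps({"dedup-buckets": dedup_buckets})


name, value = warcprox_meta_header(
    {"my-dedup-bucket": "rw", "my-read-only-dedup-bucket": "ro"}
)
print(f"{name}: {value}")
```

The returned pair can be passed to whatever HTTP client sends requests through the proxy, for example as one entry of its request-headers mapping.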