From 8c52bd8442d75e0a0da628610e77ab5979266980 Mon Sep 17 00:00:00 2001
From: Barbara Miller
Date: Thu, 13 Jun 2019 17:18:51 -0700
Subject: [PATCH] docs updates

---
 README.rst |  7 ++++---
 api.rst    | 10 ++++++----
 2 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/README.rst b/README.rst
index b7b5c17..77e7e58 100644
--- a/README.rst
+++ b/README.rst
@@ -89,12 +89,13 @@ for deduplication works similarly to deduplication by `Heritrix
 4. If not found,

    a. Write ``response`` record with full payload
-   b. Store new entry in deduplication database
+   b. Store new entry in deduplication database (can be disabled, see
+      `Warcprox-Meta HTTP request header `

 The deduplication database is partitioned into different "buckets". URLs are
 deduplicated only against other captures in the same bucket. If specified, the
-``dedup-bucket`` field of the `Warcprox-Meta HTTP request header
-`_ determines the bucket. Otherwise,
+``dedup-buckets`` field of the `Warcprox-Meta HTTP request header
+`_ determines the bucket(s). Otherwise,
 the default bucket is used.

 Deduplication can be disabled entirely by starting warcprox with the argument
diff --git a/api.rst b/api.rst
index 1da1898..eee3219 100644
--- a/api.rst
+++ b/api.rst
@@ -137,14 +137,16 @@ Example::

     Warcprox-Meta: {"warc-prefix": "special-warc"}

-``dedup-bucket`` (string)
+``dedup-buckets`` (string)
 ~~~~~~~~~~~~~~~~~~~~~~~~~
-Specifies the deduplication bucket. For more information about deduplication
+Specifies the deduplication bucket(s). For more information about deduplication
 see ``_.

-Example::
+Examples::

-    Warcprox-Meta: {"dedup-bucket":"my-dedup-bucket"}
+    Warcprox-Meta: {"dedup-buckets":{"my-dedup-bucket":"rw"}}
+
+    Warcprox-Meta: {"dedup-buckets":{"my-dedup-bucket":"rw", "my-read-only-dedup-bucket": "ro"}}

 ``blocks`` (list)
 ~~~~~~~~~~~~~~~~~
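
Note on the new ``dedup-buckets`` examples: the value is a JSON object that maps bucket names to a mode string, even though the heading still labels the field ``(string)``. The following is a minimal sketch of how a client might build that header value using only the Python standard library. The helper name ``warcprox_meta_header`` is hypothetical, and restricting modes to ``"rw"`` and ``"ro"`` is an assumption based on the two values that appear in the examples above; it is not taken from warcprox itself.

```python
import json

# Assumption: "rw" and "ro" are the only modes, since they are the only
# values shown in the api.rst examples in this patch.
ALLOWED_MODES = ("rw", "ro")


def warcprox_meta_header(buckets):
    """Return a Warcprox-Meta header value for a {bucket_name: mode} mapping.

    Hypothetical helper for illustration; not part of warcprox.
    """
    for name, mode in buckets.items():
        if mode not in ALLOWED_MODES:
            raise ValueError(
                "unexpected mode %r for dedup bucket %r" % (mode, name))
    # Compact separators keep the header value on one short line.
    return json.dumps({"dedup-buckets": buckets}, separators=(",", ":"))


header_value = warcprox_meta_header({
    "my-dedup-bucket": "rw",
    "my-read-only-dedup-bucket": "ro",
})
print("Warcprox-Meta: " + header_value)
```

The resulting string can be sent as the ``Warcprox-Meta`` request header by any HTTP client that proxies through warcprox.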