mirror of
https://github.com/internetarchive/warcprox.git
synced 2025-01-18 13:22:09 +01:00
describe the last two remaining fields
parent
4a87a08230
commit
cd6e30fe36
31
api.rst
@ -142,7 +142,7 @@ It is the way url and data limits on jobs, seeds, and hosts are implemented,
|
|||||||
among other things.
|
among other things.
|
||||||
|
|
||||||
Warcprox-Meta fields
|
Warcprox-Meta fields
|
||||||
-------------------
|
--------------------
|
||||||
|
|
||||||
``warc-prefix`` (string)
|
``warc-prefix`` (string)
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~
|
~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
@ -280,6 +280,11 @@ Example::
|
|||||||
|
|
||||||
``metadata`` (dictionary)
|
``metadata`` (dictionary)
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~
|
~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
An arbitrary dictionary. Warcprox mostly ignores this. The one exception is
|
||||||
|
that if it has a ``seed`` entry and crawl logs are enabled via the
|
||||||
|
``--crawl-log-dir`` command line option, the value of ``seed`` is written to
|
||||||
|
the crawl log as the 11th field on the line, simulating heritrix's "source
|
||||||
|
tag".
|
||||||
|
|
||||||
Example::
|
Example::
|
||||||
|
|
||||||
@ -287,24 +292,28 @@ Example::
|
|||||||
|
|
||||||
``accept`` (list)
|
``accept`` (list)
|
||||||
~~~~~~~~~~~~~~~~~
|
~~~~~~~~~~~~~~~~~
|
||||||
|
Specifies fields that the client would like to receive in the ``Warcprox-Meta``
|
||||||
|
*response* header. Only one value is currently understood,
|
||||||
|
``capture-metadata``.
|
||||||
|
|
||||||
Example::
|
Example::
|
||||||
|
|
||||||
request_meta = {"accept": ["capture-metadata"]}
|
Warcprox-Meta: {"accept": ["capture-metadata"]}
|
||||||
|
|
||||||
|
The response will include a ``Warcprox-Meta`` response header with one field
|
||||||
|
also called ``capture-metadata``. Currently warcprox reports one piece of
|
||||||
|
capture metadata, ``timestamp``, which represents the time fetch began for the
|
||||||
|
resource and matches the ``WARC-Date`` written to the warc record. For
|
||||||
|
example::
|
||||||
|
|
||||||
|
Warcprox-Meta: {"capture-metadata":{"timestamp":"2018-05-30T00:22:49Z"}}
|
||||||
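A minimal client-side sketch of the ``accept`` round trip described above, assuming the header values are plain json as in the examples. The helper names ``build_warcprox_meta`` and ``parse_capture_timestamp`` are hypothetical and not part of warcprox, and the timestamp format is inferred from the single example value shown here (no fractional seconds, trailing ``Z``)::

```python
import json
from datetime import datetime, timezone


def build_warcprox_meta(accept):
    """Serialize a Warcprox-Meta request header value (hypothetical helper)."""
    return json.dumps({"accept": list(accept)})


def parse_capture_timestamp(header_value):
    """Return the capture timestamp from a Warcprox-Meta response header value
    as an aware UTC datetime, or None if no capture-metadata was reported."""
    meta = json.loads(header_value)
    capture = meta.get("capture-metadata")
    if not capture or "timestamp" not in capture:
        return None
    # Assumed format, taken from the example value "2018-05-30T00:22:49Z".
    ts = datetime.strptime(capture["timestamp"], "%Y-%m-%dT%H:%M:%SZ")
    return ts.replace(tzinfo=timezone.utc)


# Header to send with a request proxied through warcprox.
request_headers = {"Warcprox-Meta": build_warcprox_meta(["capture-metadata"])}

# Response header value copied from the example above.
response_value = '{"capture-metadata":{"timestamp":"2018-05-30T00:22:49Z"}}'

print(request_headers["Warcprox-Meta"])
print(parse_capture_timestamp(response_value).isoformat())
```

The parsed value should correspond to the ``WARC-Date`` of the record written for that fetch, which may be useful for locating the capture later.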
|
|
||||||
``Warcprox-Meta`` http response header
|
``Warcprox-Meta`` http response header
|
||||||
======================================
|
======================================
|
||||||
|
|
||||||
In some cases warcprox will add a ``Warcprox-Meta`` header in the http response
|
In some cases warcprox will add a ``Warcprox-Meta`` header in the http response
|
||||||
that it sends to the client. Like the request header, the value is a json blob.
|
that it sends to the client. Like the request header, the value is a json blob.
|
||||||
It is only included if something in the ``warcprox-meta`` request header calls
|
It is only included if something in the ``warcprox-meta`` request header calls
|
||||||
for it. Those cases are described above in the
|
for it. Those cases are described above in the
|
||||||
`#warcprox-meta-http-request-header`_ section.
|
`Warcprox-Meta http request header`_ section.
|
||||||
|
|
||||||
|
|
||||||
### - blocked-by-rule
|
|
||||||
### - reached-limit
|
|
||||||
### - reached-soft-limit
|
|
||||||
### - stats
|
|
||||||
### - capture-metadata
|
|
||||||
###
|
|
||||||
### Response codes 420, 430
|
|
||||||
|