mirror of
https://github.com/webrecorder/pywb.git
synced 2025-03-14 15:53:28 +01:00
Docs Update: OpenWayback -> pywb Transition Guide (#588)
* docs work on OpenWayback -> pywb transition, part 1
* docs: add config change examples, exclusions and deploy recommendations
* update with path index example
* update terms with collection info
* docs update:
  - add zipnum examples to owb-to-pywb config transition
  - add working docker compose examples for nginx subdirectory, apache subdirectory and outback cdx deployment in ./sample-deploy
  - update usage and owb-to-pywb deployment docs with updated subdirectory deployment info + sample-deploy links
* tweak exclusion info, deploy title
* add missing file uwsgi_subdir.ini
* Docs: fix typos and clarifications from review (thanks @ldko!)
  Co-authored-by: Lauren Ko <lauren.ko@unt.edu>
* docs: explain that existing cdx can be added to outbackcdx, explain reindexing is optional
* docs: elaborate on docker-compose examples
* minor tweaks
* update to latest wombat 3.0.2
* update CHANGES.rst
* bump version to 2.5.0 for release

Co-authored-by: Lauren Ko <lauren.ko@unt.edu>
This commit is contained in:
parent
7b51101b04
commit
9e09bcd2a7
@@ -4,6 +4,8 @@ karma-tests/
 tests_disabled/
 venv/
 collections/
+wombat/
+docs/
 
 .cache/
 .eggs/
10
CHANGES.rst
@@ -1,3 +1,13 @@
+pywb 2.5.0 changelist
+~~~~~~~~~~~~~~~~~~~~~
+
+* New OpenWayback->pywb Transition Guide: ``https://pywb.readthedocs.io/en/latest/manual/owb-transition.html``
+
+* Sample deployments with Docker Compose for running with Apache, Nginx and OutbackCDX in ``sample-deploy`` directory.
+
+* Update to latest gevent to fix issues with latest python `#583 <https://github.com/webrecorder/pywb/pull/583>`_
+
+
 pywb 2.4.2 changelist
 ~~~~~~~~~~~~~~~~~~~~~
 
@@ -20,6 +20,7 @@ A subset of features provides the basic functionality of a "Wayback Machine".
    manual/ui-customization
    manual/architecture
    manual/apis
+   manual/owb-transition
    code/pywb
 
 
@@ -34,6 +34,8 @@ To disable framed replay add:
 Note: pywb also supports HTTP/S **proxy mode** which requires additional setup. See :ref:`https-proxy` for more details.
 
 
+.. _dir_structure:
+
 Directory Structure
 -------------------
 
31
docs/manual/migrating-cdx.rst
Normal file
@@ -0,0 +1,31 @@
.. _migrating-cdx:

Migrating CDX
=============

If you are not using OutbackCDX, you may need to check the format of the CDX files that you are using.

Over the years, there have been many variations on the CDX (capture index) format which is used by OpenWayback and pywb to look up captures in WARC/ARC files.

When migrating CDX from OpenWayback, there are a few options.

pywb currently supports:

- 9 field CDX (surt-ordered)
- 11 field CDX (surt-ordered)
- CDXJ (surt-ordered)

pywb supports the 11-field and 9-field `CDX format <http://iipc.github.io/warc-specifications/specifications/cdx-format/cdx-2015/>`_ that is also used in OpenWayback.

Non-SURT ordered CDXs are not currently supported, though they may be supported in the future (see this `pending pull request <https://github.com/webrecorder/pywb/pull/586>`_).
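
SURT ordering means that index keys are built from a URL with the host labels reversed, so that captures from the same domain sort next to each other. The Python sketch below is a deliberately simplified illustration of that idea. ``simple_surt`` is a hypothetical helper, not part of pywb, and real SURT canonicalization applies further normalization rules that are omitted here.

```python
from urllib.parse import urlsplit


def simple_surt(url):
    """Return a simplified SURT-style key for a URL (illustration only)."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Reverse the host labels: example.com -> com,example
    reversed_host = ",".join(reversed(host.split(".")))
    path = parts.path or "/"
    key = reversed_host + ")" + path
    if parts.query:
        key += "?" + parts.query
    return key


# Sorting by key groups captures from the same host together.
urls = ["http://b.example.com/", "http://other.org/", "http://a.example.com/x"]
sorted_keys = sorted(simple_surt(u) for u in urls)
```

Because the key starts with the top-level domain, a plain lexical sort is enough to keep all captures for a host in one contiguous range of the index.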

CDXJ Conversion
---------------

The native format used by pywb is the SURT-ordered :ref:`cdxj-index`, which encodes most of the index fields as JSON. This makes the format more flexible and allows optional fields to be added as needed.

If your CDX files are not SURT-ordered 11 or 9 field CDX, or if they are a mix of formats, pywb also offers a conversion utility which will convert all CDX to the pywb native CDXJ: ::

    wb-manager cdx-convert <dir-of-cdx-files>

The converter will read the CDX files and create a corresponding .cdxj file for every .cdx file. Since the conversion works on the .cdx files themselves, it does not require reindexing the source WARC/ARC files and can happen fairly quickly. The converted CDXJ are guaranteed to be in the right format to work with pywb.
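
To make the CDXJ layout more concrete, here is a minimal Python sketch that splits one CDXJ line into its SURT key, timestamp and JSON fields. It assumes the usual layout of key, timestamp and a JSON object separated by single spaces. The sample line and the helper name ``parse_cdxj_line`` are made up for illustration and do not come from pywb.

```python
import json


def parse_cdxj_line(line):
    """Split a CDXJ line into (surt_key, timestamp, fields)."""
    # Only split on the first two spaces: the JSON part may contain spaces.
    key, timestamp, json_part = line.rstrip("\n").split(" ", 2)
    return key, timestamp, json.loads(json_part)


# Hypothetical sample line, for illustration only.
sample = (
    'com,example)/ 20200101000000 '
    '{"url": "http://example.com/", "mime": "text/html", '
    '"status": "200", "filename": "example.warc.gz"}'
)

key, timestamp, fields = parse_cdxj_line(sample)
```

Since everything after the timestamp is a JSON object, optional fields can be added without changing the position of existing ones.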
74
docs/manual/outbackcdx.rst
Normal file
@@ -0,0 +1,74 @@
.. _using-outback:


Using OutbackCDX with pywb
==========================

The recommended setup is to run `OutbackCDX <https://github.com/nla/outbackcdx>`_ alongside pywb.
OutbackCDX provides an index (CDX) server and can efficiently store and look up web archive data by URL.


Adding CDX to OutbackCDX
------------------------

To set up OutbackCDX, please follow the instructions on the `OutbackCDX README <https://github.com/nla/outbackcdx>`_.

Since pywb and OutbackCDX both default to port 8080, be sure to use a different port for OutbackCDX, e.g. ``java -jar outbackcdx*.jar -p 8084``.

OutbackCDX can generally ingest existing CDX used in OpenWayback simply by POSTing to OutbackCDX at a new index endpoint.

For example, assuming OutbackCDX is running on port 8084, to add CDX for ``index1.cdx``, ``index2.cdx``, run:

.. code:: console

    curl -X POST --data-binary @index1.cdx http://localhost:8084/mycoll
    curl -X POST --data-binary @index2.cdx http://localhost:8084/mycoll

The contents of each CDX file are added to the ``mycoll`` OutbackCDX index, which can correspond to the web archive collection ``mycoll``.
The index is created automatically if it does not exist.

See the `OutbackCDX Docs <https://github.com/nla/outbackcdx#loading-records>`_ for more info on ingesting CDX.
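
The same ingest step can also be scripted. The following Python sketch mirrors the ``curl`` commands using only the standard library. The helper names are hypothetical, and the port ``8084`` and index name ``mycoll`` are simply the values assumed in the example above.

```python
import urllib.request


def build_ingest_request(cdx_bytes, base_url="http://localhost:8084", collection="mycoll"):
    """Build a POST request carrying raw CDX data for an OutbackCDX index."""
    url = base_url.rstrip("/") + "/" + collection
    return urllib.request.Request(url, data=cdx_bytes, method="POST")


def ingest_cdx_file(path, **kwargs):
    """POST the contents of one CDX file and return the HTTP status code."""
    with open(path, "rb") as fh:
        request = build_ingest_request(fh.read(), **kwargs)
    # Network call: requires a running OutbackCDX instance.
    with urllib.request.urlopen(request) as response:
        return response.status
```

Calling ``ingest_cdx_file("index1.cdx")`` would then correspond to the first ``curl`` command, assuming OutbackCDX is reachable at the default ``base_url`` used here.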


(Re)generating CDX from WARCs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There are some exceptions where it may be useful to re-generate the CDX with pywb for existing WARCs:

- If your CDX is 9-field and does not include the compressed length, regenerating the CDX will result in more efficient HTTP range requests
- If you want to replay pages with POST requests, pywb generated CDX will soon be supported in OutbackCDX (see: `pywb Issue #585 <https://github.com/webrecorder/pywb/issues/585>`_, `OutbackCDX PR #91 <https://github.com/nla/outbackcdx/pull/91>`_ )


To generate the CDX, run the ``cdx-indexer`` command (adding the ``-p`` flag if POST request handling is needed) for each WARC or set of WARCs you wish to index:

.. code:: console

    cdx-indexer /path/to/mywarcs/my.warc.gz > ./index1.cdx
    cdx-indexer /path/to/all_warcs/*warc.gz > ./index2.cdx


Then, run the POST command as shown above to ingest to OutbackCDX.

The above can be repeated for each WARC file, or for a set of WARCs using the ``*.warc.gz`` wildcard.

If a CDX index is too big, OutbackCDX may fail to ingest it, and ingesting one index per WARC may be needed.


Configure pywb with OutbackCDX
------------------------------

The ``config.yaml`` should be configured to point to OutbackCDX.

Assuming a collection named ``mycoll``, the ``config.yaml`` can be configured as follows to use OutbackCDX:


.. code:: yaml

    collections:
      mycoll:
        index_paths: cdx+http://localhost:8084/mycoll
        archive_paths: /path/to/mywarcs/


The ``archive_paths`` can be configured to point to a directory of WARCs or a path index.

42
docs/manual/owb-pywb-terms.rst
Normal file
@@ -0,0 +1,42 @@
OpenWayback vs pywb Terms
=========================

pywb and OpenWayback use slightly different terms to describe the configuration options, as explained below.

Some differences are:

- The ``wayback.xml`` config file in OpenWayback is replaced with the ``config.yaml`` YAML file
- The terms ``Access Point`` and ``Wayback Collection`` are replaced with ``Collection`` in pywb. The collection configuration represents a unique path (access point) and the data that is accessed at that path.
- The ``Resource Store`` in OpenWayback is known in pywb as the archive paths, configured under ``archive_paths``
- The ``Resource Index`` in OpenWayback is known in pywb as the index paths, configurable under ``index_paths``
- The ``Exclusions`` in OpenWayback are replaced with general :ref:`access-control`



Pywb Collection Basics
----------------------

A pywb collection consists of at least three parts: the collection name, the ``index_paths`` (where to read the index), and the ``archive_paths`` (where to read the WARC files).

The collection is accessed by name, so there is no distinct access point.

The collections are configured in the ``config.yaml`` under the ``collections`` key.

For example, a basic collection definition can be specified via:

.. code:: yaml

    collections:
      wayback:
        index_paths: /archive/cdx/
        archive_paths: /archive/storage/warcs/


Pywb also supports a convention-based directory structure. Collections created in this structure can be detected automatically
and need not be specified in the ``config.yaml``. This structure is designed for smaller collections that are all stored locally in a subdirectory.

See the :ref:`dir_structure` for the default pywb directory structure.

However, for importing existing collections from OpenWayback, it is probably easier to specify the existing paths as shown above.


308
docs/manual/owb-to-pywb-config.rst
Normal file
@@ -0,0 +1,308 @@
Converting OpenWayback Config to pywb Config
============================================

OpenWayback includes many different types of configurations.

For most use cases, using OutbackCDX with pywb is the recommended approach, as explained in :ref:`using-outback`.

The following are a few specific examples of WaybackCollections gathered from active OpenWayback configurations
and how they can be configured for use with pywb.


Remote Collection / Access Point
--------------------------------

A collection configured with a remote index and WARC access can be converted to use OutbackCDX
for the remote index, while pywb can load WARCs directly from an HTTP endpoint.

For example, a configuration similar to:

.. code:: xml

    <bean name="standardaccesspoint" class="org.archive.wayback.webapp.AccessPoint">
      <property name="accessPointPath" value="/wayback/"/>
      <property name="collection" ref="remotecollection" />
      ...
    </bean>

    <bean id="remotecollection" class="org.archive.wayback.webapp.WaybackCollection">
      <property name="resourceStore">
        <bean class="org.archive.wayback.resourcestore.SimpleResourceStore">
          <property name="prefix" value="http://myarchive.example.com/RemoteStore/" />
        </bean>
      </property>
      <property name="resourceIndex">
        <bean class="org.archive.wayback.resourceindex.RemoteResourceIndex">
          <property name="searchUrlBase" value="http://myarchive.example.com/RemoteIndex" />
        </bean>
      </property>
    </bean>

can be converted to the following config, with OutbackCDX assumed to be running
at: ``http://myarchive.example.com/RemoteIndex``


.. code:: yaml

    collections:
      wayback:
        index_paths: cdx+http://myarchive.example.com/RemoteIndex
        archive_paths: http://myarchive.example.com/RemoteStore/


Local Collection / Access Point
-------------------------------

An OpenWayback configuration with a local collection and local CDX, for example:

.. code:: xml

    <bean id="collection" class="org.archive.wayback.webapp.WaybackCollection">
      <property name="resourceIndex">
        <bean class="org.archive.wayback.resourceindex.cdxserver.EmbeddedCDXServerIndex">
          ...
          <property name="cdxServer">
            <bean class="org.archive.cdxserver.CDXServer">
              <property name="cdxSource">
                <bean class="org.archive.format.cdx.MultiCDXInputSource">
                  <property name="cdxUris">
                    <list>
                      <value>/wayback/cdx/mycdx1.cdx</value>
                      <value>/wayback/cdx/mycdx2.cdx</value>
                    </list>
                  </property>
                </bean>
              </property>
              <property name="cdxFormat" value="cdx11"/>
              <property name="surtMode" value="true"/>
            </bean>
          </property>
          ...
        </bean>
      </property>
    </bean>


can be configured in pywb using the ``index_paths`` key.

Note that the CDX files should all be in the same format. See :ref:`migrating-cdx` for more info on converting
CDX to pywb native CDXJ format.


.. code:: yaml

    collections:
      wayback:
        index_paths: /wayback/cdx/
        archive_paths: ...


It's also possible to combine directories, individual CDX files, and even a remote index from OutbackCDX in a single collection
(as long as all CDX are in the same format).

pywb will query all the sources simultaneously to find the best match.

.. code:: yaml

    collections:
      wayback:
        index_group:
          cdx1: /wayback/cdx1/
          cdx2: /wayback/cdx2/mycdx.cdx
          remote: cdx+https://myarchive.example.com/outbackcdx

        archive_paths: ...

However, OutbackCDX is still recommended to avoid more complex CDX configurations.



WatchedCDXSource
^^^^^^^^^^^^^^^^

OpenWayback includes a 'Watched CDX Source' option which watches a directory for new CDX indexes.
This functionality is the default in pywb when specifying a directory for the index path.

For example, the config:

.. code:: xml

    <property name="source">
      <bean class="org.archive.wayback.resourceindex.WatchedCDXSource">
        <property name="recursive" value="false" />
        <property name="filters">
          <list>
            <value>^.+\.cdx$</value>
          </list>
        </property>
        <property name="path" value="/wayback/cdx-index/" />
      </bean>
    </property>

can be replaced with:

.. code:: yaml

    collections:
      wayback:
        index_paths: /wayback/cdx-index/
        archive_paths: ...


pywb will load all CDX from that directory.



ZipNum Cluster Index
--------------------

pywb also supports using a compressed :ref:`zipnum` instead of a plain text CDX. For example, the following OpenWayback configuration:

.. code:: xml

    <bean id="collection" class="org.archive.wayback.webapp.WaybackCollection">
      <property name="resourceIndex">
        <bean class="org.archive.wayback.resourceindex.LocalResourceIndex">
          ...
          <property name="source">
            <bean class="org.archive.wayback.resourceindex.ZipNumClusterSearchResultSource">
              <property name="cluster">
                <bean class="org.archive.format.gzip.zipnum.ZipNumCluster">
                  <property name="summaryFile" value="/webarchive/zipnum-cdx/all.summary"></property>
                  <property name="locFile" value="/webarchive/zipnum-cdx/all.loc"></property>
                </bean>
              </property>
              ...
            </bean>
          </property>
        </bean>
      </property>
    </bean>

can simply be converted to the pywb config:

.. code:: yaml

    collections:
      wayback:
        index_paths: /webarchive/zipnum-cdx

        # if the index is not surt ordered
        surt_ordered: false


pywb will automatically detect the ``.summary`` file and use the ``.loc`` file for the ZipNum Cluster if they are present in the directory.

Note that if the ZipNum index is **not** SURT ordered, the ``surt_ordered: false`` flag must be added to support this format.




Path Index Configuration
------------------------

OpenWayback supports a 'path index' that can be used to look up a WARC by filename and map to an exact path.
For compatibility, pywb supports the same path index lookup, as well as loading WARC files by path or URL prefix.


For example, an OpenWayback configuration that includes a path index:

.. code:: xml

    <bean id="resourcefilelocationdb" class="org.archive.wayback.resourcestore.locationdb.FlatFileResourceFileLocationDB">
      <property name="path" value="/archive/warc-paths.txt"/>
    </bean>

    <bean id="resourceStore" class="org.archive.wayback.resourcestore.LocationDBResourceStore">
      <property name="db" ref="resourcefilelocationdb" />
    </bean>


can be configured in the ``archive_paths`` field of pywb collection configuration:

.. code:: yaml

    collections:
      wayback:
        index_paths: ...
        archive_paths: /archive/warc-paths.txt


The path index is a tab-delimited text file for mapping WARC filenames to full file paths or URLs, e.g.:

.. code::

    example.warc.gz<tab>/some/path/to/example.warc.gz
    another.warc.gz<tab>/some-other/path/another.warc.gz
    remote.warc.gz<tab>http://warcstore.example.com/serve/remote.warc.gz


However, if all WARC files are stored in the same directory, or in a few directories, a path index is not needed and pywb will try loading the WARC by prefix.

The ``archive_paths`` can accept a list of entries. For example, given the config:

.. code:: yaml

    collections:
      wayback:
        index_paths: ...
        archive_paths:
          - /archive/warcs1/
          - /archive/warcs2/
          - https://myarchive.example.com/warcs/
          - /archive/warc-paths.txt


and the WARC file ``example.warc.gz``, pywb will try to find the WARC in the following order:

.. code::

    1. /archive/warcs1/example.warc.gz
    2. /archive/warcs2/example.warc.gz
    3. https://myarchive.example.com/warcs/example.warc.gz
    4. Looking up example.warc.gz in /archive/warc-paths.txt


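
The lookup order for ``archive_paths`` can be sketched in a few lines of Python. This is only an illustration of the behavior described in this section, not pywb's actual implementation: the helper names are hypothetical, and the rule that entries ending in ``/`` are prefixes while other entries are path index files is a simplifying assumption made for this example.

```python
def parse_path_index(text):
    """Parse tab-delimited 'filename<TAB>location' lines into a dict."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, location = line.split("\t", 1)
        mapping[name] = location
    return mapping


def candidate_locations(filename, archive_paths, path_indexes):
    """Yield the locations to try for a WARC filename, in config order.

    path_indexes maps a path index file name to its parsed contents.
    """
    for entry in archive_paths:
        if entry.endswith("/"):
            # Assumption: a trailing slash marks a directory or URL prefix.
            yield entry + filename
        else:
            # Assumption: anything else is a path index file.
            location = path_indexes.get(entry, {}).get(filename)
            if location:
                yield location


index_text = "example.warc.gz\t/some/path/to/example.warc.gz\n"
path_indexes = {"/archive/warc-paths.txt": parse_path_index(index_text)}
archive_paths = [
    "/archive/warcs1/",
    "/archive/warcs2/",
    "https://myarchive.example.com/warcs/",
    "/archive/warc-paths.txt",
]
candidates = list(candidate_locations("example.warc.gz", archive_paths, path_indexes))
```

The resulting ``candidates`` list follows the order of the ``archive_paths`` entries, with the path index consulted only when its entry is reached.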

Proxy Mode Access
-----------------

An OpenWayback configuration may include many beans to support proxy mode, e.g.:

.. code:: xml

    <bean id="proxyreplaydispatcher" class="org.archive.wayback.replay.SelectorReplayDispatcher">
      ...
      <property name="renderer">
        <bean class="org.archive.wayback.proxy.HttpsRedirectAndLinksRewriteProxyHTMLMarkupReplayRenderer">
          ...
          <property name="uriConverter">
            <bean class="org.archive.wayback.proxy.ProxyHttpsResultURIConverter"/>
          </property>
        </bean>
      </property>
    </bean>
    <bean name="proxy" class="org.archive.wayback.webapp.AccessPoint">
      <property name="internalPort" value="${proxy.port}"/>
      <property name="accessPointPath" value="${proxy.port}" />
      <property name="collection" ref="localcdxcollection" />
      ...
    </bean>


In pywb, the proxy mode can be enabled by adding to the main ``config.yaml`` the name of the collection
that should be served in proxy mode:

.. code:: yaml

    proxy:
      source_coll: wayback


There are some differences between OpenWayback and pywb proxy mode support.

In OpenWayback, proxy mode is configured using separate access points for different collections on different ports.
OpenWayback only supports HTTP proxy and attempts to rewrite HTTPS URLs to HTTP.

In pywb, proxy mode is enabled on the same port as regular access, and pywb supports HTTP and HTTPS proxy.
pywb does not attempt to rewrite HTTPS to HTTP, as most browsers disallow HTTP access as insecure for many sites.
pywb supports a default collection that is enabled for proxy mode, and a default timestamp accessed by the proxy mode.
(Switching the collection and date accessed is possible but not currently supported without extensions to pywb).

To support HTTPS access, pywb provides a certificate authority that can be trusted by a browser to rewrite HTTPS content.

See :ref:`https-proxy` for all of the options of pywb proxy mode configuration.

80
docs/manual/owb-to-pywb-deploy.rst
Normal file
@@ -0,0 +1,80 @@
Deploying pywb: Collection Paths and Routing with Nginx/Apache
==============================================================

In pywb, the collection name is also the access point, and each of the collections in ``config.yaml``
can be accessed by their name as the subpath:

.. code:: yaml

    collections:
      wayback:
        ...

      another-collection:
        ...

If pywb is deployed on port 8080, each collection will be available under:
``http://<hostname>:8080/wayback/*/https://example.com/`` and ``http://<hostname>:8080/another-collection/*/https://example.com/``

To make a collection available under the root, simply set its name to: ``$root``


.. code:: yaml

    collections:
      $root:
        ...

      another-collection:
        ...


Now, the first collection is available at: ``http://<hostname>:8080/*/https://example.com/``.
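
The URL scheme above can be summarized with a small Python sketch. ``collection_url`` is a hypothetical helper written for this guide, not a pywb function. It assumes the ``http://<host>/<collection>/<timestamp>/<url>`` layout shown in the examples, with ``*`` standing in for the timestamp and the ``$root`` collection contributing no path segment.

```python
def collection_url(host, collection, target_url, timestamp="*"):
    """Build the access URL for a captured page in a given collection."""
    # The $root collection is served directly under the host.
    prefix = "" if collection == "$root" else "/" + collection
    return "http://{0}{1}/{2}/{3}".format(host, prefix, timestamp, target_url)


named = collection_url("localhost:8080", "wayback", "https://example.com/")
rooted = collection_url("localhost:8080", "$root", "https://example.com/")
```

Here ``named`` points at the ``wayback`` collection, while ``rooted`` drops the collection segment entirely.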


To deploy pywb on a subdirectory, e.g. ``http://<hostname>/pywb/another-collection/*/https://example.com/``,
and in general for production use, it is recommended to deploy pywb behind an Nginx or Apache reverse proxy.


Nginx and Apache Reverse Proxy
------------------------------

The recommended deployment for pywb is with uWSGI and behind an Nginx or Apache frontend.

This configuration allows for a more robust deployment and lets these servers handle static files.


See the :ref:`nginx-deploy` and :ref:`apache-deploy` sections for more info on deploying with Nginx and Apache.


Working Docker Compose Examples
-------------------------------

The pywb `Deployment Examples <https://github.com/webrecorder/pywb/blob/docs/sample-deploy/>`_ include working examples of deploying pywb with Nginx, Apache and OutbackCDX
in Docker using Docker Compose, widely available container orchestration tools.

See `Installing Docker <https://docs.docker.com/get-docker/>`_ and `Installing Docker Compose <https://docs.docker.com/compose/install/>`_ for instructions on how to install these tools.

The examples are available in the ``sample-deploy`` directory of the pywb repo. The examples include:

- ``docker-compose-outback.yaml`` -- Docker Compose config to start OutbackCDX and pywb, and ingest sample data into OutbackCDX
- ``docker-compose-nginx.yaml`` -- Docker Compose config to launch pywb and latest Nginx, with pywb running on subdirectory ``/wayback`` and Nginx serving static files from pywb.
- ``docker-compose-apache.yaml`` -- Docker Compose config to launch pywb and latest Apache, with pywb running on subdirectory ``/wayback`` and Apache serving static files from pywb.


The examples are designed to be run one at a time, and assume port 8080 is available.

After installing Docker and Docker Compose, run one of:

- ``docker-compose -f docker-compose-outback.yaml up``
- ``docker-compose -f docker-compose-nginx.yaml up``
- ``docker-compose -f docker-compose-apache.yaml up``

This will download the standard Docker images and start all of the components in Docker.

If everything works correctly, you should be able to access: ``http://localhost:8080/pywb/https://example.com/`` to view the sample pywb collection.

Press CTRL+C to interrupt and stop the example in the console.

68
docs/manual/owb-to-pywb-exclusions.rst
Normal file
@@ -0,0 +1,68 @@
Migrating Exclusion Rules
=========================

pywb includes a new :ref:`access-control` system, which allows granular allow/block/exclude access control rules on paths and subpaths.

The rules are configured in .aclj files, and a command-line utility exists to import OpenWayback exclusions
into the pywb ACLJ format.

For example, given an OpenWayback exclusion list configuration for a static file:

.. code:: xml

    <bean id="excluder-factory-static" class="org.archive.wayback.accesscontrol.staticmap.StaticMapExclusionFilterFactory">
      <property name="file" value="/archive/exclusions.txt"/>
      <property name="checkInterval" value="600000" />
    </bean>


The exclusions file can be converted to an .aclj file by running: ::

    wb-manager acl importtxt /archive/exclusions.aclj /archive/exclusions.txt exclude


Then, in the pywb config, specify:

.. code:: yaml

    collections:
      wayback:
        index_paths: ...
        archive_paths: ...
        acl_paths: /archive/exclusions.aclj


It is possible to specify multiple access control files, which will all be applied.

Using ``block`` instead of ``exclude`` will result in pywb returning a 451 error, indicating that URLs are in the index but blocked.


CLI Tool
--------

After exclusions have been imported, it is recommended to use the ``wb-manager acl`` command-line tool to manage exclusions.


To add an exclusion, run: ::

    wb-manager acl add /archive/exclusions.aclj http://httpbin.org/anything/something exclude

To remove an exclusion, run: ::

    wb-manager acl remove /archive/exclusions.aclj http://httpbin.org/anything/something


For more options, see the full :ref:`access-control` documentation or run ``wb-manager acl --help``.


Not Yet Supported
-----------------

Some OpenWayback exclusion options are not yet supported in pywb.
The following are not yet supported in the access control system:

- Exclusions/Access Control by specific date range
- Regex-based exclusions
- Date Range Embargo on All URLs
- Robots.txt-based exclusions

21
docs/manual/owb-transition.rst
Normal file
@@ -0,0 +1,21 @@
.. _transition-openwayback:

OpenWayback Transition Guide
============================

This guide provides guidelines for transitioning from OpenWayback to pywb,
with additional recommendations. The main recommendation is to run pywb along
with OutbackCDX and nginx, and this configuration is covered below, along with additional options.


.. toctree::
    :maxdepth: 2

    owb-pywb-terms
    outbackcdx
    migrating-cdx
    owb-to-pywb-config
    owb-to-pywb-exclusions
    owb-to-pywb-deploy


@@ -7,7 +7,7 @@ pywb includes a sophisticated server and client-side rewriting systems, includin
 configuration for domain and content-specific rewriting rules, fuzzy index matching for replay,
 and a thorough client-side JS rewriting system.
 
-With pywb 2.3.0, the client-side rewriting system exists in a separate module at `https://github.com/webrecorder/wombat``
+With pywb 2.3.0, the client-side rewriting system exists in a separate module at ``https://github.com/webrecorder/wombat``
 
 
 URL Rewriting
@@ -230,6 +230,8 @@ To run pywb in Docker behind a local nginx (as shown below), port 8081 should al
 See :ref:`getting-started-docker` for more info on using pywb with Docker.
 
 
+.. _nginx-deploy:
+
 Sample Nginx Configuration
 ^^^^^^^^^^^^^^^^^^^^^^^^^^
 
@ -263,29 +265,55 @@ See the `Nginx Docs <https://nginx.org/en/docs/>`_ for a lot more details on how
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
|
.. _apache-deploy:
|
||||||
|
|
||||||
Sample Apache Configuration
|
Sample Apache Configuration
|
||||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
The following Apache configuration snippet can be used to deploy pywb *without* uwsgi. A configuration with uwsgi is also probably possible but this covers the simplest case of launching the `wayback` binary directly.
|
The recommended Apache configuration is to use pywb with ``mod_proxy`` and ``mod_proxy_uwsgi``.
|
||||||
|
|
||||||
The configuration assumes pywb is running on port 8080 on localhost, but it could be on a different machine as well.
|
To enable these, ensure that your httpd.conf includes:
|
||||||
|
|
||||||
|
.. code:: apache
|
||||||
|
|
||||||
|
LoadModule proxy_module modules/mod_proxy.so
|
||||||
|
LoadModule proxy_uwsgi_module modules/mod_proxy_uwsgi.so
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
Then, in your config, simply include:
|
||||||
|
|
||||||
.. code:: apache
|
.. code:: apache
|
||||||
|
|
||||||
<VirtualHost *:80>
|
<VirtualHost *:80>
|
||||||
ServerName proxy.example.com
|
ProxyPass / uwsgi://pywb:8081/
|
||||||
Redirect / https://proxy.example.com/
|
|
||||||
DocumentRoot /var/www/html/
|
|
||||||
</VirtualHost>
|
</VirtualHost>
|
||||||
|
|
||||||
<VirtualHost *:443>
|
The configuration assumes uwsgi is started with ``uwsgi uwsgi.ini``.
|
||||||
ServerName proxy.example.com
|
|
||||||
SSLEngine on
|
|
||||||
DocumentRoot /var/www/html/
|
Running on Subdirectory Path
|
||||||
ErrorDocument 404 /404.html
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||||
ProxyPreserveHost On
|
|
||||||
ProxyPass /.well-known/ !
|
To run pywb on a subdirectory, rather than at the root of the web server, the recommended configuration is to adjust the ``uwsgi.ini`` to include the subdirectory.
|
||||||
ProxyPass / http://localhost:8080/
|
For example, to deploy pywb under the ``/wayback`` subdirectory, the ``uwsgi.ini`` can be configured as follows:
|
||||||
ProxyPassReverse / http://localhost:8080/
|
|
||||||
RequestHeader set "X-Forwarded-Proto" expr=%{REQUEST_SCHEME}
|
.. code:: ini
|
||||||
</VirtualHost>
|
|
||||||
|
mount = /wayback=./pywb/apps/wayback.py
|
||||||
|
manage-script-name = true
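The effect of mounting on a prefix can be sketched in Python. This is only an illustration of the general WSGI convention (the mount prefix ends up in ``SCRIPT_NAME`` and the remainder of the path in ``PATH_INFO``), not code taken from pywb or uwsgi; the names ``split_mount`` and ``demo_app`` are hypothetical and exist only for this example.

```python
def split_mount(request_path, mount_point="/wayback"):
    """Mimic how a request path is divided when an app is mounted on a prefix.

    Returns (SCRIPT_NAME, PATH_INFO), or None when the path lies outside
    the mount point. Illustrative only; not uwsgi's actual implementation.
    """
    prefix = mount_point.rstrip("/")
    if request_path == prefix or request_path.startswith(prefix + "/"):
        return prefix, request_path[len(prefix):]
    return None


def demo_app(environ, start_response):
    """Tiny WSGI app that reports the prefix it was told it is mounted on."""
    body = "mounted at {0!r}, serving {1!r}".format(
        environ.get("SCRIPT_NAME", ""), environ.get("PATH_INFO", "")
    ).encode("utf-8")
    start_response(
        "200 OK",
        [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
    )
    return [body]
```

An application that builds its links from ``SCRIPT_NAME`` rather than assuming it lives at ``/`` will generate URLs under the subdirectory, which is why the mount prefix needs to reach the application at all.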
|
||||||
|
|
||||||
|
|
||||||
|
.. _example-deploy:
|
||||||
|
|
||||||
|
Deployment Examples
|
||||||
|
^^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
The ``sample-deploy`` directory includes working Docker Compose examples for deploying pywb with Nginx and Apache on the ``/wayback`` subdirectory.
|
||||||
|
|
||||||
|
See:
|
||||||
|
- `Docker Compose Nginx <https://github.com/webrecorder/pywb/blob/docs/sample-deploy/docker-compose-nginx.yaml>`_ for sample Nginx config.
|
||||||
|
- `Docker Compose Apache <https://github.com/webrecorder/pywb/blob/docs/sample-deploy/docker-compose-apache.yaml>`_ for sample Apache config.
|
||||||
|
- `uwsgi_subdir.ini <https://github.com/webrecorder/pywb/blob/docs/sample-deploy/uwsgi_subdir.ini>`_ for example subdirectory uwsgi config.
|
||||||
|
|
||||||
|
File diff suppressed because one or more lines are too long
@ -1,4 +1,4 @@
|
|||||||
__version__ = '2.4.2'
|
__version__ = '2.5.0'
|
||||||
|
|
||||||
if __name__ == '__main__':
|
if __name__ == '__main__':
|
||||||
print(__version__)
|
print(__version__)
|
||||||
|
34
sample-deploy/docker-compose-apache.yaml
Normal file
@ -0,0 +1,34 @@
|
|||||||
|
# This example demonstrates running pywb with apache frontend under a subpath /wayback
|
||||||
|
|
||||||
|
version: '3'
|
||||||
|
|
||||||
|
services:
|
||||||
|
# main pywb image
|
||||||
|
pywb:
|
||||||
|
image: webrecorder/pywb
|
||||||
|
volumes:
|
||||||
|
- ../config.yaml:/webarchive/config.yaml
|
||||||
|
- ../sample_archive/:/webarchive/sample_archive/
|
||||||
|
- ./uwsgi_subdir.ini:/uwsgi/uwsgi.ini
|
||||||
|
|
||||||
|
# optional volume to serve static assets from apache
|
||||||
|
- pywb-static:/pywb/pywb/static
|
||||||
|
|
||||||
|
apache:
|
||||||
|
image: httpd
|
||||||
|
ports:
|
||||||
|
- 8080:80
|
||||||
|
|
||||||
|
volumes:
|
||||||
|
#- ./nginx-default.conf:/etc/nginx/conf.d/default.conf
|
||||||
|
- ./httpd.conf:/usr/local/apache2/conf/httpd.conf
|
||||||
|
- ./pywb-apache.conf:/usr/local/apache2/conf/extra/pywb-apache.conf
|
||||||
|
|
||||||
|
# optional volume to serve static assets from apache
|
||||||
|
- pywb-static:/pywb/pywb/static
|
||||||
|
|
||||||
|
depends_on:
|
||||||
|
- pywb
|
||||||
|
|
||||||
|
volumes:
|
||||||
|
pywb-static:
|
32
sample-deploy/docker-compose-nginx.yaml
Normal file
@ -0,0 +1,32 @@
|
|||||||
|
# This example demonstrates running pywb with nginx frontend under a subpath /wayback
|
||||||
|
|
||||||
|
version: '3'
|
||||||
|
|
||||||
|
services:
|
||||||
|
# main pywb image
|
||||||
|
pywb:
|
||||||
|
image: webrecorder/pywb
|
||||||
|
volumes:
|
||||||
|
- ../config.yaml:/webarchive/config.yaml
|
||||||
|
- ../sample_archive/:/webarchive/sample_archive/
|
||||||
|
- ./uwsgi_subdir.ini:/uwsgi/uwsgi.ini
|
||||||
|
|
||||||
|
# optional volume to serve static assets from nginx
|
||||||
|
- pywb-static:/pywb/pywb/static
|
||||||
|
|
||||||
|
nginx:
|
||||||
|
image: nginx
|
||||||
|
ports:
|
||||||
|
- 8080:80
|
||||||
|
|
||||||
|
volumes:
|
||||||
|
- ./pywb-nginx.conf:/etc/nginx/conf.d/default.conf
|
||||||
|
|
||||||
|
# optional volume to serve static assets from nginx
|
||||||
|
- pywb-static:/pywb/pywb/static
|
||||||
|
|
||||||
|
depends_on:
|
||||||
|
- pywb
|
||||||
|
|
||||||
|
volumes:
|
||||||
|
pywb-static:
|
39
sample-deploy/docker-compose-outback.yaml
Normal file
@ -0,0 +1,39 @@
|
|||||||
|
version: '3'
|
||||||
|
|
||||||
|
services:
|
||||||
|
# outbackcdx image
|
||||||
|
outbackcdx:
|
||||||
|
image: nlagovau/outbackcdx
|
||||||
|
ports:
|
||||||
|
- 8084:8080
|
||||||
|
|
||||||
|
# use cdx-indexer to index and ingest into outbackcdx
|
||||||
|
ingest:
|
||||||
|
image: webrecorder/pywb
|
||||||
|
entrypoint: ["bash", "-c"]
|
||||||
|
command: /tmp/run.sh
|
||||||
|
|
||||||
|
depends_on:
|
||||||
|
- outbackcdx
|
||||||
|
|
||||||
|
volumes:
|
||||||
|
- ../config.yaml:/webarchive/config.yaml
|
||||||
|
- ./run.sh:/tmp/run.sh
|
||||||
|
- ../sample_archive/:/webarchive/sample_archive/
|
||||||
|
|
||||||
|
# main pywb image
|
||||||
|
pywb:
|
||||||
|
image: webrecorder/pywb
|
||||||
|
volumes:
|
||||||
|
- ../config.yaml:/webarchive/config.yaml
|
||||||
|
- ../sample_archive/:/webarchive/sample_archive/
|
||||||
|
|
||||||
|
ports:
|
||||||
|
- 8080:8080
|
||||||
|
|
||||||
|
depends_on:
|
||||||
|
- ingest
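Once the services above are running, the OutbackCDX index can be checked from the host with a plain HTTP query. The sketch below only builds the query URL. It assumes the host port ``8084`` mapped in this compose file and OutbackCDX's ``/<collection>?url=...`` query form; the collection name ``pywb`` is a placeholder, since the name actually used by ``run.sh`` is not shown here, and the helper name ``outbackcdx_query_url`` is hypothetical.

```python
from urllib.parse import urlencode


def outbackcdx_query_url(url, collection="pywb", host="localhost", port=8084):
    """Build a CDX lookup URL for an OutbackCDX collection.

    `collection` is a placeholder; substitute the index name that the
    ingest step actually posts to.
    """
    query = urlencode({"url": url})
    return "http://{0}:{1}/{2}?{3}".format(host, port, collection, query)


if __name__ == "__main__":
    # Print the URL so it can be pasted into curl or a browser.
    print(outbackcdx_query_url("http://example.com/"))
```

Fetching the resulting URL should return CDX lines for the captured page if the ingest step succeeded; an empty response suggests the index name or port differs from the assumptions above.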
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
555
sample-deploy/httpd.conf
Normal file
@ -0,0 +1,555 @@
|
|||||||
|
#
|
||||||
|
# This is the main Apache HTTP server configuration file. It contains the
|
||||||
|
# configuration directives that give the server its instructions.
|
||||||
|
# See <URL:http://httpd.apache.org/docs/2.4/> for detailed information.
|
||||||
|
# In particular, see
|
||||||
|
# <URL:http://httpd.apache.org/docs/2.4/mod/directives.html>
|
||||||
|
# for a discussion of each configuration directive.
|
||||||
|
#
|
||||||
|
# Do NOT simply read the instructions in here without understanding
|
||||||
|
# what they do. They're here only as hints or reminders. If you are unsure
|
||||||
|
# consult the online docs. You have been warned.
|
||||||
|
#
|
||||||
|
# Configuration and logfile names: If the filenames you specify for many
|
||||||
|
# of the server's control files begin with "/" (or "drive:/" for Win32), the
|
||||||
|
# server will use that explicit path. If the filenames do *not* begin
|
||||||
|
# with "/", the value of ServerRoot is prepended -- so "logs/access_log"
|
||||||
|
# with ServerRoot set to "/usr/local/apache2" will be interpreted by the
|
||||||
|
# server as "/usr/local/apache2/logs/access_log", whereas "/logs/access_log"
|
||||||
|
# will be interpreted as '/logs/access_log'.
|
||||||
|
|
||||||
|
#
|
||||||
|
# ServerRoot: The top of the directory tree under which the server's
|
||||||
|
# configuration, error, and log files are kept.
|
||||||
|
#
|
||||||
|
# Do not add a slash at the end of the directory path. If you point
|
||||||
|
# ServerRoot at a non-local disk, be sure to specify a local disk on the
|
||||||
|
# Mutex directive, if file-based mutexes are used. If you wish to share the
|
||||||
|
# same ServerRoot for multiple httpd daemons, you will need to change at
|
||||||
|
# least PidFile.
|
||||||
|
#
|
||||||
|
ServerRoot "/usr/local/apache2"
|
||||||
|
|
||||||
|
#
|
||||||
|
# Mutex: Allows you to set the mutex mechanism and mutex file directory
|
||||||
|
# for individual mutexes, or change the global defaults
|
||||||
|
#
|
||||||
|
# Uncomment and change the directory if mutexes are file-based and the default
|
||||||
|
# mutex file directory is not on a local disk or is not appropriate for some
|
||||||
|
# other reason.
|
||||||
|
#
|
||||||
|
# Mutex default:logs
|
||||||
|
|
||||||
|
#
|
||||||
|
# Listen: Allows you to bind Apache to specific IP addresses and/or
|
||||||
|
# ports, instead of the default. See also the <VirtualHost>
|
||||||
|
# directive.
|
||||||
|
#
|
||||||
|
# Change this to Listen on specific IP addresses as shown below to
|
||||||
|
# prevent Apache from glomming onto all bound IP addresses.
|
||||||
|
#
|
||||||
|
#Listen 12.34.56.78:80
|
||||||
|
Listen 80
|
||||||
|
|
||||||
|
#
|
||||||
|
# Dynamic Shared Object (DSO) Support
|
||||||
|
#
|
||||||
|
# To be able to use the functionality of a module which was built as a DSO you
|
||||||
|
# have to place corresponding `LoadModule' lines at this location so the
|
||||||
|
# directives contained in it are actually available _before_ they are used.
|
||||||
|
# Statically compiled modules (those listed by `httpd -l') do not need
|
||||||
|
# to be loaded here.
|
||||||
|
#
|
||||||
|
# Example:
|
||||||
|
# LoadModule foo_module modules/mod_foo.so
|
||||||
|
#
|
||||||
|
LoadModule mpm_event_module modules/mod_mpm_event.so
|
||||||
|
#LoadModule mpm_prefork_module modules/mod_mpm_prefork.so
|
||||||
|
#LoadModule mpm_worker_module modules/mod_mpm_worker.so
|
||||||
|
LoadModule authn_file_module modules/mod_authn_file.so
|
||||||
|
#LoadModule authn_dbm_module modules/mod_authn_dbm.so
|
||||||
|
#LoadModule authn_anon_module modules/mod_authn_anon.so
|
||||||
|
#LoadModule authn_dbd_module modules/mod_authn_dbd.so
|
||||||
|
#LoadModule authn_socache_module modules/mod_authn_socache.so
|
||||||
|
LoadModule authn_core_module modules/mod_authn_core.so
|
||||||
|
LoadModule authz_host_module modules/mod_authz_host.so
|
||||||
|
LoadModule authz_groupfile_module modules/mod_authz_groupfile.so
|
||||||
|
LoadModule authz_user_module modules/mod_authz_user.so
|
||||||
|
#LoadModule authz_dbm_module modules/mod_authz_dbm.so
|
||||||
|
#LoadModule authz_owner_module modules/mod_authz_owner.so
|
||||||
|
#LoadModule authz_dbd_module modules/mod_authz_dbd.so
|
||||||
|
LoadModule authz_core_module modules/mod_authz_core.so
|
||||||
|
#LoadModule authnz_ldap_module modules/mod_authnz_ldap.so
|
||||||
|
#LoadModule authnz_fcgi_module modules/mod_authnz_fcgi.so
|
||||||
|
LoadModule access_compat_module modules/mod_access_compat.so
|
||||||
|
LoadModule auth_basic_module modules/mod_auth_basic.so
|
||||||
|
#LoadModule auth_form_module modules/mod_auth_form.so
|
||||||
|
#LoadModule auth_digest_module modules/mod_auth_digest.so
|
||||||
|
#LoadModule allowmethods_module modules/mod_allowmethods.so
|
||||||
|
#LoadModule isapi_module modules/mod_isapi.so
|
||||||
|
#LoadModule file_cache_module modules/mod_file_cache.so
|
||||||
|
#LoadModule cache_module modules/mod_cache.so
|
||||||
|
#LoadModule cache_disk_module modules/mod_cache_disk.so
|
||||||
|
#LoadModule cache_socache_module modules/mod_cache_socache.so
|
||||||
|
#LoadModule socache_shmcb_module modules/mod_socache_shmcb.so
|
||||||
|
#LoadModule socache_dbm_module modules/mod_socache_dbm.so
|
||||||
|
#LoadModule socache_memcache_module modules/mod_socache_memcache.so
|
||||||
|
#LoadModule socache_redis_module modules/mod_socache_redis.so
|
||||||
|
#LoadModule watchdog_module modules/mod_watchdog.so
|
||||||
|
#LoadModule macro_module modules/mod_macro.so
|
||||||
|
#LoadModule dbd_module modules/mod_dbd.so
|
||||||
|
#LoadModule bucketeer_module modules/mod_bucketeer.so
|
||||||
|
#LoadModule dumpio_module modules/mod_dumpio.so
|
||||||
|
#LoadModule echo_module modules/mod_echo.so
|
||||||
|
#LoadModule example_hooks_module modules/mod_example_hooks.so
|
||||||
|
#LoadModule case_filter_module modules/mod_case_filter.so
|
||||||
|
#LoadModule case_filter_in_module modules/mod_case_filter_in.so
|
||||||
|
#LoadModule example_ipc_module modules/mod_example_ipc.so
|
||||||
|
#LoadModule buffer_module modules/mod_buffer.so
|
||||||
|
#LoadModule data_module modules/mod_data.so
|
||||||
|
#LoadModule ratelimit_module modules/mod_ratelimit.so
|
||||||
|
LoadModule reqtimeout_module modules/mod_reqtimeout.so
|
||||||
|
#LoadModule ext_filter_module modules/mod_ext_filter.so
|
||||||
|
#LoadModule request_module modules/mod_request.so
|
||||||
|
#LoadModule include_module modules/mod_include.so
|
||||||
|
LoadModule filter_module modules/mod_filter.so
|
||||||
|
#LoadModule reflector_module modules/mod_reflector.so
|
||||||
|
#LoadModule substitute_module modules/mod_substitute.so
|
||||||
|
#LoadModule sed_module modules/mod_sed.so
|
||||||
|
#LoadModule charset_lite_module modules/mod_charset_lite.so
|
||||||
|
#LoadModule deflate_module modules/mod_deflate.so
|
||||||
|
#LoadModule xml2enc_module modules/mod_xml2enc.so
|
||||||
|
#LoadModule proxy_html_module modules/mod_proxy_html.so
|
||||||
|
#LoadModule brotli_module modules/mod_brotli.so
|
||||||
|
LoadModule mime_module modules/mod_mime.so
|
||||||
|
#LoadModule ldap_module modules/mod_ldap.so
|
||||||
|
LoadModule log_config_module modules/mod_log_config.so
|
||||||
|
#LoadModule log_debug_module modules/mod_log_debug.so
|
||||||
|
#LoadModule log_forensic_module modules/mod_log_forensic.so
|
||||||
|
#LoadModule logio_module modules/mod_logio.so
|
||||||
|
#LoadModule lua_module modules/mod_lua.so
|
||||||
|
LoadModule env_module modules/mod_env.so
|
||||||
|
#LoadModule mime_magic_module modules/mod_mime_magic.so
|
||||||
|
#LoadModule cern_meta_module modules/mod_cern_meta.so
|
||||||
|
#LoadModule expires_module modules/mod_expires.so
|
||||||
|
LoadModule headers_module modules/mod_headers.so
|
||||||
|
#LoadModule ident_module modules/mod_ident.so
|
||||||
|
#LoadModule usertrack_module modules/mod_usertrack.so
|
||||||
|
#LoadModule unique_id_module modules/mod_unique_id.so
|
||||||
|
LoadModule setenvif_module modules/mod_setenvif.so
|
||||||
|
LoadModule version_module modules/mod_version.so
|
||||||
|
#LoadModule remoteip_module modules/mod_remoteip.so
|
||||||
|
LoadModule proxy_module modules/mod_proxy.so
|
||||||
|
#LoadModule proxy_connect_module modules/mod_proxy_connect.so
|
||||||
|
#LoadModule proxy_ftp_module modules/mod_proxy_ftp.so
|
||||||
|
#LoadModule proxy_http_module modules/mod_proxy_http.so
|
||||||
|
#LoadModule proxy_fcgi_module modules/mod_proxy_fcgi.so
|
||||||
|
#LoadModule proxy_scgi_module modules/mod_proxy_scgi.so
|
||||||
|
LoadModule proxy_uwsgi_module modules/mod_proxy_uwsgi.so
|
||||||
|
#LoadModule proxy_fdpass_module modules/mod_proxy_fdpass.so
|
||||||
|
#LoadModule proxy_wstunnel_module modules/mod_proxy_wstunnel.so
|
||||||
|
#LoadModule proxy_ajp_module modules/mod_proxy_ajp.so
|
||||||
|
#LoadModule proxy_balancer_module modules/mod_proxy_balancer.so
|
||||||
|
#LoadModule proxy_express_module modules/mod_proxy_express.so
|
||||||
|
#LoadModule proxy_hcheck_module modules/mod_proxy_hcheck.so
|
||||||
|
#LoadModule session_module modules/mod_session.so
|
||||||
|
#LoadModule session_cookie_module modules/mod_session_cookie.so
|
||||||
|
#LoadModule session_crypto_module modules/mod_session_crypto.so
|
||||||
|
#LoadModule session_dbd_module modules/mod_session_dbd.so
|
||||||
|
#LoadModule slotmem_shm_module modules/mod_slotmem_shm.so
|
||||||
|
#LoadModule slotmem_plain_module modules/mod_slotmem_plain.so
|
||||||
|
#LoadModule ssl_module modules/mod_ssl.so
|
||||||
|
#LoadModule optional_hook_export_module modules/mod_optional_hook_export.so
|
||||||
|
#LoadModule optional_hook_import_module modules/mod_optional_hook_import.so
|
||||||
|
#LoadModule optional_fn_import_module modules/mod_optional_fn_import.so
|
||||||
|
#LoadModule optional_fn_export_module modules/mod_optional_fn_export.so
|
||||||
|
#LoadModule dialup_module modules/mod_dialup.so
|
||||||
|
#LoadModule http2_module modules/mod_http2.so
|
||||||
|
#LoadModule proxy_http2_module modules/mod_proxy_http2.so
|
||||||
|
#LoadModule md_module modules/mod_md.so
|
||||||
|
#LoadModule lbmethod_byrequests_module modules/mod_lbmethod_byrequests.so
|
||||||
|
#LoadModule lbmethod_bytraffic_module modules/mod_lbmethod_bytraffic.so
|
||||||
|
#LoadModule lbmethod_bybusyness_module modules/mod_lbmethod_bybusyness.so
|
||||||
|
#LoadModule lbmethod_heartbeat_module modules/mod_lbmethod_heartbeat.so
|
||||||
|
LoadModule unixd_module modules/mod_unixd.so
|
||||||
|
#LoadModule heartbeat_module modules/mod_heartbeat.so
|
||||||
|
#LoadModule heartmonitor_module modules/mod_heartmonitor.so
|
||||||
|
#LoadModule dav_module modules/mod_dav.so
|
||||||
|
LoadModule status_module modules/mod_status.so
|
||||||
|
LoadModule autoindex_module modules/mod_autoindex.so
|
||||||
|
#LoadModule asis_module modules/mod_asis.so
|
||||||
|
#LoadModule info_module modules/mod_info.so
|
||||||
|
#LoadModule suexec_module modules/mod_suexec.so
|
||||||
|
<IfModule !mpm_prefork_module>
|
||||||
|
#LoadModule cgid_module modules/mod_cgid.so
|
||||||
|
</IfModule>
|
||||||
|
<IfModule mpm_prefork_module>
|
||||||
|
#LoadModule cgi_module modules/mod_cgi.so
|
||||||
|
</IfModule>
|
||||||
|
#LoadModule dav_fs_module modules/mod_dav_fs.so
|
||||||
|
#LoadModule dav_lock_module modules/mod_dav_lock.so
|
||||||
|
#LoadModule vhost_alias_module modules/mod_vhost_alias.so
|
||||||
|
#LoadModule negotiation_module modules/mod_negotiation.so
|
||||||
|
LoadModule dir_module modules/mod_dir.so
|
||||||
|
#LoadModule imagemap_module modules/mod_imagemap.so
|
||||||
|
#LoadModule actions_module modules/mod_actions.so
|
||||||
|
#LoadModule speling_module modules/mod_speling.so
|
||||||
|
#LoadModule userdir_module modules/mod_userdir.so
|
||||||
|
LoadModule alias_module modules/mod_alias.so
|
||||||
|
#LoadModule rewrite_module modules/mod_rewrite.so
|
||||||
|
|
||||||
|
<IfModule unixd_module>
|
||||||
|
#
|
||||||
|
# If you wish httpd to run as a different user or group, you must run
|
||||||
|
# httpd as root initially and it will switch.
|
||||||
|
#
|
||||||
|
# User/Group: The name (or #number) of the user/group to run httpd as.
|
||||||
|
# It is usually good practice to create a dedicated user and group for
|
||||||
|
# running httpd, as with most system services.
|
||||||
|
#
|
||||||
|
User daemon
|
||||||
|
Group daemon
|
||||||
|
|
||||||
|
</IfModule>
|
||||||
|
|
||||||
|
# 'Main' server configuration
|
||||||
|
#
|
||||||
|
# The directives in this section set up the values used by the 'main'
|
||||||
|
# server, which responds to any requests that aren't handled by a
|
||||||
|
# <VirtualHost> definition. These values also provide defaults for
|
||||||
|
# any <VirtualHost> containers you may define later in the file.
|
||||||
|
#
|
||||||
|
# All of these directives may appear inside <VirtualHost> containers,
|
||||||
|
# in which case these default settings will be overridden for the
|
||||||
|
# virtual host being defined.
|
||||||
|
#
|
||||||
|
|
||||||
|
#
|
||||||
|
# ServerAdmin: Your address, where problems with the server should be
|
||||||
|
# e-mailed. This address appears on some server-generated pages, such
|
||||||
|
# as error documents. e.g. admin@your-domain.com
|
||||||
|
#
|
||||||
|
ServerAdmin you@example.com
|
||||||
|
|
||||||
|
#
|
||||||
|
# ServerName gives the name and port that the server uses to identify itself.
|
||||||
|
# This can often be determined automatically, but we recommend you specify
|
||||||
|
# it explicitly to prevent problems during startup.
|
||||||
|
#
|
||||||
|
# If your host doesn't have a registered DNS name, enter its IP address here.
|
||||||
|
#
|
||||||
|
#ServerName www.example.com:80
|
||||||
|
|
||||||
|
#
|
||||||
|
# Deny access to the entirety of your server's filesystem. You must
|
||||||
|
# explicitly permit access to web content directories in other
|
||||||
|
# <Directory> blocks below.
|
||||||
|
#
|
||||||
|
<Directory />
|
||||||
|
AllowOverride none
|
||||||
|
Require all denied
|
||||||
|
</Directory>
|
||||||
|
|
||||||
|
#
|
||||||
|
# Note that from this point forward you must specifically allow
|
||||||
|
# particular features to be enabled - so if something's not working as
|
||||||
|
# you might expect, make sure that you have specifically enabled it
|
||||||
|
# below.
|
||||||
|
#
|
||||||
|
|
||||||
|
#
|
||||||
|
# DocumentRoot: The directory out of which you will serve your
|
||||||
|
# documents. By default, all requests are taken from this directory, but
|
||||||
|
# symbolic links and aliases may be used to point to other locations.
|
||||||
|
#
|
||||||
|
DocumentRoot "/usr/local/apache2/htdocs"
|
||||||
|
<Directory "/usr/local/apache2/htdocs">
|
||||||
|
#
|
||||||
|
# Possible values for the Options directive are "None", "All",
|
||||||
|
# or any combination of:
|
||||||
|
# Indexes Includes FollowSymLinks SymLinksifOwnerMatch ExecCGI MultiViews
|
||||||
|
#
|
||||||
|
# Note that "MultiViews" must be named *explicitly* --- "Options All"
|
||||||
|
# doesn't give it to you.
|
||||||
|
#
|
||||||
|
# The Options directive is both complicated and important. Please see
|
||||||
|
# http://httpd.apache.org/docs/2.4/mod/core.html#options
|
||||||
|
# for more information.
|
||||||
|
#
|
||||||
|
Options Indexes FollowSymLinks
|
||||||
|
|
||||||
|
#
|
||||||
|
# AllowOverride controls what directives may be placed in .htaccess files.
|
||||||
|
# It can be "All", "None", or any combination of the keywords:
|
||||||
|
# AllowOverride FileInfo AuthConfig Limit
|
||||||
|
#
|
||||||
|
AllowOverride None
|
||||||
|
|
||||||
|
#
|
||||||
|
# Controls who can get stuff from this server.
|
||||||
|
#
|
||||||
|
Require all granted
|
||||||
|
</Directory>
|
||||||
|
|
||||||
|
#
|
||||||
|
# DirectoryIndex: sets the file that Apache will serve if a directory
|
||||||
|
# is requested.
|
||||||
|
#
|
||||||
|
<IfModule dir_module>
|
||||||
|
DirectoryIndex index.html
|
||||||
|
</IfModule>
|
||||||
|
|
||||||
|
#
|
||||||
|
# The following lines prevent .htaccess and .htpasswd files from being
|
||||||
|
# viewed by Web clients.
|
||||||
|
#
|
||||||
|
<Files ".ht*">
|
||||||
|
Require all denied
|
||||||
|
</Files>
|
||||||
|
|
||||||
|
#
|
||||||
|
# ErrorLog: The location of the error log file.
|
||||||
|
# If you do not specify an ErrorLog directive within a <VirtualHost>
|
||||||
|
# container, error messages relating to that virtual host will be
|
||||||
|
# logged here. If you *do* define an error logfile for a <VirtualHost>
|
||||||
|
# container, that host's errors will be logged there and not here.
|
||||||
|
#
|
||||||
|
ErrorLog /proc/self/fd/2
|
||||||
|
|
||||||
|
#
|
||||||
|
# LogLevel: Control the number of messages logged to the error_log.
|
||||||
|
# Possible values include: debug, info, notice, warn, error, crit,
|
||||||
|
# alert, emerg.
|
||||||
|
#
|
||||||
|
LogLevel warn
|
||||||
|
|
||||||
|
<IfModule log_config_module>
|
||||||
|
#
|
||||||
|
# The following directives define some format nicknames for use with
|
||||||
|
# a CustomLog directive (see below).
|
||||||
|
#
|
||||||
|
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
|
||||||
|
LogFormat "%h %l %u %t \"%r\" %>s %b" common
|
||||||
|
|
||||||
|
<IfModule logio_module>
|
||||||
|
# You need to enable mod_logio.c to use %I and %O
|
||||||
|
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %I %O" combinedio
|
||||||
|
</IfModule>
|
||||||
|
|
||||||
|
#
|
||||||
|
# The location and format of the access logfile (Common Logfile Format).
|
||||||
|
# If you do not define any access logfiles within a <VirtualHost>
|
||||||
|
# container, they will be logged here. Contrariwise, if you *do*
|
||||||
|
# define per-<VirtualHost> access logfiles, transactions will be
|
||||||
|
# logged therein and *not* in this file.
|
||||||
|
#
|
||||||
|
CustomLog /proc/self/fd/1 common
|
||||||
|
|
||||||
|
#
|
||||||
|
# If you prefer a logfile with access, agent, and referer information
|
||||||
|
# (Combined Logfile Format) you can use the following directive.
|
||||||
|
#
|
||||||
|
#CustomLog "logs/access_log" combined
|
||||||
|
</IfModule>
|
||||||
|
|
||||||
|
<IfModule alias_module>
|
||||||
|
#
|
||||||
|
# Redirect: Allows you to tell clients about documents that used to
|
||||||
|
# exist in your server's namespace, but do not anymore. The client
|
||||||
|
# will make a new request for the document at its new location.
|
||||||
|
# Example:
|
||||||
|
# Redirect permanent /foo http://www.example.com/bar
|
||||||
|
|
||||||
|
#
|
||||||
|
# Alias: Maps web paths into filesystem paths and is used to
|
||||||
|
# access content that does not live under the DocumentRoot.
|
||||||
|
# Example:
|
||||||
|
# Alias /webpath /full/filesystem/path
|
||||||
|
#
|
||||||
|
# If you include a trailing / on /webpath then the server will
|
||||||
|
# require it to be present in the URL. You will also likely
|
||||||
|
# need to provide a <Directory> section to allow access to
|
||||||
|
# the filesystem path.
|
||||||
|
|
||||||
|
#
|
||||||
|
# ScriptAlias: This controls which directories contain server scripts.
|
||||||
|
# ScriptAliases are essentially the same as Aliases, except that
|
||||||
|
# documents in the target directory are treated as applications and
|
||||||
|
# run by the server when requested rather than as documents sent to the
|
||||||
|
# client. The same rules about trailing "/" apply to ScriptAlias
|
||||||
|
# directives as to Alias.
|
||||||
|
#
|
||||||
|
ScriptAlias /cgi-bin/ "/usr/local/apache2/cgi-bin/"
|
||||||
|
|
||||||
|
</IfModule>
|
||||||
|
|
||||||
|
<IfModule cgid_module>
|
||||||
|
#
|
||||||
|
# ScriptSock: On threaded servers, designate the path to the UNIX
|
||||||
|
# socket used to communicate with the CGI daemon of mod_cgid.
|
||||||
|
#
|
||||||
|
#Scriptsock cgisock
|
||||||
|
</IfModule>
|
||||||
|
|
||||||
|
#
|
||||||
|
# "/usr/local/apache2/cgi-bin" should be changed to whatever your ScriptAliased
|
||||||
|
# CGI directory exists, if you have that configured.
|
||||||
|
#
|
||||||
|
<Directory "/usr/local/apache2/cgi-bin">
|
||||||
|
AllowOverride None
|
||||||
|
Options None
|
||||||
|
Require all granted
|
||||||
|
</Directory>
|
||||||
|
|
||||||
|
<IfModule headers_module>
|
||||||
|
#
|
||||||
|
# Avoid passing HTTP_PROXY environment to CGI's on this or any proxied
|
||||||
|
# backend servers which have lingering "httpoxy" defects.
|
||||||
|
# 'Proxy' request header is undefined by the IETF, not listed by IANA
|
||||||
|
#
|
||||||
|
RequestHeader unset Proxy early
|
||||||
|
</IfModule>
|
||||||
|
|
||||||
|
<IfModule mime_module>
|
||||||
|
#
|
||||||
|
# TypesConfig points to the file containing the list of mappings from
|
||||||
|
# filename extension to MIME-type.
|
||||||
|
#
|
||||||
|
TypesConfig conf/mime.types
|
||||||
|
|
||||||
|
#
|
||||||
|
# AddType allows you to add to or override the MIME configuration
|
||||||
|
# file specified in TypesConfig for specific file types.
|
||||||
|
#
|
||||||
|
#AddType application/x-gzip .tgz
|
||||||
|
#
|
||||||
|
# AddEncoding allows you to have certain browsers uncompress
|
||||||
|
# information on the fly. Note: Not all browsers support this.
|
||||||
|
#
|
||||||
|
#AddEncoding x-compress .Z
|
||||||
|
#AddEncoding x-gzip .gz .tgz
|
||||||
|
#
|
||||||
|
# If the AddEncoding directives above are commented-out, then you
|
||||||
|
# probably should define those extensions to indicate media types:
|
||||||
|
#
|
||||||
|
AddType application/x-compress .Z
|
||||||
|
AddType application/x-gzip .gz .tgz
|
||||||
|
|
||||||
|
#
|
||||||
|
# AddHandler allows you to map certain file extensions to "handlers":
|
||||||
|
# actions unrelated to filetype. These can be either built into the server
|
||||||
|
# or added with the Action directive (see below)
|
||||||
|
#
|
||||||
|
# To use CGI scripts outside of ScriptAliased directories:
|
||||||
|
# (You will also need to add "ExecCGI" to the "Options" directive.)
|
||||||
|
#
|
||||||
|
#AddHandler cgi-script .cgi
|
||||||
|
|
||||||
|
# For type maps (negotiated resources):
|
||||||
|
#AddHandler type-map var
|
||||||
|
|
||||||
|
#
|
||||||
|
# Filters allow you to process content before it is sent to the client.
|
||||||
|
#
|
||||||
|
# To parse .shtml files for server-side includes (SSI):
|
||||||
|
# (You will also need to add "Includes" to the "Options" directive.)
|
||||||
|
#
|
||||||
|
#AddType text/html .shtml
|
||||||
|
#AddOutputFilter INCLUDES .shtml
|
||||||
|
</IfModule>
|
||||||
|
|
||||||
|
#
|
||||||
|
# The mod_mime_magic module allows the server to use various hints from the
# contents of the file itself to determine its type. The MIMEMagicFile
# directive tells the module where the hint definitions are located.
#
#MIMEMagicFile conf/magic

#
# Customizable error responses come in three flavors:
# 1) plain text 2) local redirects 3) external redirects
#
# Some examples:
#ErrorDocument 500 "The server made a boo boo."
#ErrorDocument 404 /missing.html
#ErrorDocument 404 "/cgi-bin/missing_handler.pl"
#ErrorDocument 402 http://www.example.com/subscription_info.html
#

#
# MaxRanges: Maximum number of Ranges in a request before
# returning the entire resource, or one of the special
# values 'default', 'none' or 'unlimited'.
# Default setting is to accept 200 Ranges.
#MaxRanges unlimited

#
# EnableMMAP and EnableSendfile: On systems that support it,
# memory-mapping or the sendfile syscall may be used to deliver
# files. This usually improves server performance, but must
# be turned off when serving from networked-mounted
# filesystems or if support for these functions is otherwise
# broken on your system.
# Defaults: EnableMMAP On, EnableSendfile Off
#
#EnableMMAP off
#EnableSendfile on

# Supplemental configuration
#
# The configuration files in the conf/extra/ directory can be
# included to add extra features or to modify the default configuration of
# the server, or you may simply copy their contents here and change as
# necessary.

# Server-pool management (MPM specific)
#Include conf/extra/httpd-mpm.conf

# Multi-language error messages
#Include conf/extra/httpd-multilang-errordoc.conf

# Fancy directory listings
#Include conf/extra/httpd-autoindex.conf

# Language settings
#Include conf/extra/httpd-languages.conf

# User home directories
#Include conf/extra/httpd-userdir.conf

# Real-time info on requests and configuration
#Include conf/extra/httpd-info.conf

# Virtual hosts
#Include conf/extra/httpd-vhosts.conf

# Local access to the Apache HTTP Server Manual
#Include conf/extra/httpd-manual.conf

# Distributed authoring and versioning (WebDAV)
#Include conf/extra/httpd-dav.conf

# Various default settings
#Include conf/extra/httpd-default.conf

# Configure mod_proxy_html to understand HTML4/XHTML1
<IfModule proxy_html_module>
Include conf/extra/proxy-html.conf
</IfModule>

# Secure (SSL/TLS) connections
#Include conf/extra/httpd-ssl.conf
#
# Note: The following must be present to support
# starting without SSL on platforms with no /dev/random equivalent
# but a statically compiled-in mod_ssl.
#
<IfModule ssl_module>
SSLRandomSeed startup builtin
SSLRandomSeed connect builtin
</IfModule>


Include conf/extra/pywb-apache.conf

17
sample-deploy/pywb-apache.conf
Normal file
@ -0,0 +1,17 @@
<VirtualHost *:80>
  # optional: optimization to have apache serve static assets
  Alias /wayback/static "/pywb/pywb/static"
  ProxyPass /wayback/static !

  <Directory "/pywb/pywb/static">
    Options None
    AllowOverride None
    Order allow,deny
    Allow from all
    Require all granted
  </Directory>

  # required: proxy pass to pywb
  ProxyPass /wayback uwsgi://pywb:8081/

</VirtualHost>
21
sample-deploy/pywb-nginx.conf
Normal file
@ -0,0 +1,21 @@
# nginx config for running under /wayback/ prefix

server {
    listen 80;

    # optional: optimization to have nginx serve static assets
    location /wayback/static {
        alias /pywb/pywb/static;
    }

    # required: pywb with prefix
    location /wayback/ {
        resolver 127.0.0.1;

        uwsgi_pass pywb:8081;

        include uwsgi_params;
        uwsgi_param UWSGI_SCHEME $scheme;
    }
}

4
sample-deploy/run.sh
Executable file
@ -0,0 +1,4 @@
#!/bin/bash
cdx-indexer /webarchive/sample_archive/warcs/example.warc.gz > /tmp/index.cdx
curl -X POST --data-binary @/tmp/index.cdx http://outbackcdx:8080/pywb

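The run.sh script above carries on even if indexing fails, so a failed `cdx-indexer` run would still be followed by an upload. A more defensive variant might look like the sketch below. It assumes the same WARC path and OutbackCDX endpoint as the commit; the function names, the overridable variables, the `DRY_RUN` switch and the extra `-fsS` curl flags are illustrative additions, not part of the commit.

```shell
#!/bin/bash
# Sketch only: a defensive variant of run.sh. The function names, the
# variables and the DRY_RUN switch are hypothetical, not part of the commit.
set -euo pipefail

# Defaults mirror the paths used in run.sh; override via the environment.
WARC="${WARC:-/webarchive/sample_archive/warcs/example.warc.gz}"
CDX="${CDX:-/tmp/index.cdx}"
OUTBACK_URL="${OUTBACK_URL:-http://outbackcdx:8080/pywb}"

# Print the curl command line that uploads the index to OutbackCDX.
post_command() {
    printf 'curl -fsS -X POST --data-binary @%s %s\n' "$CDX" "$OUTBACK_URL"
}

# Index the WARC, then upload the resulting CDX file.
# With DRY_RUN=1 the two commands are only printed, not executed.
index_and_post() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf 'cdx-indexer %s > %s\n' "$WARC" "$CDX"
        post_command
        return 0
    fi
    cdx-indexer "$WARC" > "$CDX"
    # -f makes curl exit non-zero on an HTTP error response.
    curl -fsS -X POST --data-binary "@$CDX" "$OUTBACK_URL"
}
```

Calling `index_and_post` runs both steps and stops at the first failure; `DRY_RUN=1 index_and_post` only prints the commands it would run.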
29
sample-deploy/uwsgi_subdir.ini
Normal file
@ -0,0 +1,29 @@
[uwsgi]
if-not-env = PORT
http-socket = :8080
socket = :8081
endif =

master = true
buffer-size = 65536
die-on-term = true

if-env = VIRTUAL_ENV
venv = $(VIRTUAL_ENV)
endif =

gevent = 100

#Not available until uwsgi 2.1
#monkey-patching manually in pywb.apps.wayback
#gevent-early-monkey-patch =
# for uwsgi<2.1, set env when using gevent
env = GEVENT_MONKEY_PATCH=1

# specify config file here
env = PYWB_CONFIG_FILE=config.yaml
#wsgi = pywb.apps.wayback

# config to run pywb from a prefix
mount = /wayback=/pywb/pywb/apps/wayback.py
manage-script-name = true
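The `mount` and `manage-script-name` settings above are what let pywb run under the `/wayback` prefix: uWSGI rewrites `SCRIPT_NAME` and `PATH_INFO` so the application sees paths relative to its mount point. The sketch below only illustrates that split in plain shell. The function `split_mounted_path` is hypothetical, and its handling of paths outside the prefix is a simplification rather than a description of uWSGI's behaviour.

```shell
#!/bin/bash
# Sketch only: illustrates how a request path divides into SCRIPT_NAME and
# PATH_INFO for an app mounted under a prefix. split_mounted_path is a
# hypothetical helper, not a uWSGI or pywb command.
split_mounted_path() {
    local prefix="$1" path="$2"
    case "$path" in
        "$prefix"|"$prefix"/*)
            # The prefix becomes SCRIPT_NAME; the remainder is PATH_INFO.
            printf 'SCRIPT_NAME=%s PATH_INFO=%s\n' "$prefix" "${path#"$prefix"}"
            ;;
        *)
            # Outside the mount point: left unchanged in this sketch.
            printf 'SCRIPT_NAME= PATH_INFO=%s\n' "$path"
            ;;
    esac
}
```

For example, `split_mounted_path /wayback /wayback/pywb/` prints `SCRIPT_NAME=/wayback PATH_INFO=/pywb/`, which is why pywb can route `/pywb/` as if it were deployed at the root.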
2
wombat
@ -1 +1 @@
-Subproject commit 3f04dcdcb071042d498c4912599454a15c11f0e4
+Subproject commit 5ede99b6ffb3e0e3c240f2403a9f58189edda543