docs work:
- remove old doc folder
- generate new sphinx docs
- rewrite: fix existing docstrings for rst
- add 'make apidoc' to rerun apidoc on pywb root
- apidocs in docs/code
- first pass on usage manual in docs/manual
# Minimal makefile for Sphinx documentation
# You can set these variables from the command line.
SPHINXBUILD = python -msphinx
BUILDDIR = _build
# Put it first so that "make" without argument is like "make help".
@sphinx-apidoc -f -T -o code ../pywb/ "../*test*" "../*git_hash*"
.PHONY: help Makefile
# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
@ -0,0 +1,78 @@
pywb\.apps package
pywb\.apps\.cli module
.. automodule:: pywb.apps.cli
pywb\.apps\.frontendapp module
.. automodule:: pywb.apps.frontendapp
pywb\.apps\.live module
.. automodule:: pywb.apps.live
pywb\.apps\.rewriterapp module
.. automodule:: pywb.apps.rewriterapp
pywb\.apps\.static\_handler module
.. automodule:: pywb.apps.static_handler
pywb\.apps\.warcserverapp module
.. automodule:: pywb.apps.warcserverapp
pywb\.apps\.wayback module
.. automodule:: pywb.apps.wayback
pywb\.apps\.wbrequestresponse module
.. automodule:: pywb.apps.wbrequestresponse
Module contents
.. automodule:: pywb.apps
@ -0,0 +1,30 @@
pywb\.indexer package
pywb\.indexer\.archiveindexer module
.. automodule:: pywb.indexer.archiveindexer
pywb\.indexer\.cdxindexer module
.. automodule:: pywb.indexer.cdxindexer
Module contents
.. automodule:: pywb.indexer
@ -0,0 +1,38 @@
pywb\.manager package
pywb\.manager\.autoindex module
.. automodule:: pywb.manager.autoindex
pywb\.manager\.manager module
.. automodule:: pywb.manager.manager
pywb\.manager\.migrate module
.. automodule:: pywb.manager.migrate
Module contents
.. automodule:: pywb.manager
@ -0,0 +1,46 @@
pywb\.recorder package
pywb\.recorder\.filters module
.. automodule:: pywb.recorder.filters
pywb\.recorder\.multifilewarcwriter module
.. automodule:: pywb.recorder.multifilewarcwriter
pywb\.recorder\.recorderapp module
.. automodule:: pywb.recorder.recorderapp
pywb\.recorder\.redisindexer module
.. automodule:: pywb.recorder.redisindexer
Module contents
.. automodule:: pywb.recorder
@ -0,0 +1,142 @@
pywb\.rewrite package
pywb\.rewrite\.content\_rewriter module
.. automodule:: pywb.rewrite.content_rewriter
pywb\.rewrite\.cookie\_rewriter module
.. automodule:: pywb.rewrite.cookie_rewriter
pywb\.rewrite\.cookies module
.. automodule:: pywb.rewrite.cookies
pywb\.rewrite\.default\_rewriter module
.. automodule:: pywb.rewrite.default_rewriter
pywb\.rewrite\.header\_rewriter module
.. automodule:: pywb.rewrite.header_rewriter
pywb\.rewrite\.html\_insert\_rewriter module
.. automodule:: pywb.rewrite.html_insert_rewriter
pywb\.rewrite\.html\_rewriter module
.. automodule:: pywb.rewrite.html_rewriter
pywb\.rewrite\.jsonp\_rewriter module
.. automodule:: pywb.rewrite.jsonp_rewriter
pywb\.rewrite\.regex\_rewriters module
.. automodule:: pywb.rewrite.regex_rewriters
pywb\.rewrite\.rewrite\_amf module
.. automodule:: pywb.rewrite.rewrite_amf
pywb\.rewrite\.rewrite\_dash module
.. automodule:: pywb.rewrite.rewrite_dash
pywb\.rewrite\.rewrite\_hls module
.. automodule:: pywb.rewrite.rewrite_hls
pywb\.rewrite\.rewriteinputreq module
.. automodule:: pywb.rewrite.rewriteinputreq
pywb\.rewrite\.templateview module
.. automodule:: pywb.rewrite.templateview
pywb\.rewrite\.url\_rewriter module
.. automodule:: pywb.rewrite.url_rewriter
pywb\.rewrite\.wburl module
.. automodule:: pywb.rewrite.wburl
Module contents
.. automodule:: pywb.rewrite
@ -7,13 +7,12 @@ Subpackages
.. toctree::
Module contents
@ -1,68 +1,67 @@
pywb.utils package
pywb\.utils package
pywb.utils.binsearch module
pywb\.utils\.binsearch module
.. automodule:: pywb.utils.binsearch
pywb.utils.bufferedreaders module
.. automodule:: pywb.utils.bufferedreaders
pywb.utils.canonicalize module
pywb\.utils\.canonicalize module
.. automodule:: pywb.utils.canonicalize
:special-members: __call__
pywb.utils.dsrules module
pywb\.utils\.format module
.. automodule:: pywb.utils.dsrules
.. automodule:: pywb.utils.format
pywb.utils.loaders module
pywb\.utils\.geventserver module
.. automodule:: pywb.utils.geventserver
pywb\.utils\.io module
.. automodule:: pywb.utils.io
pywb\.utils\.loaders module
.. automodule:: pywb.utils.loaders
pywb.utils.statusandheaders module
.. automodule:: pywb.utils.statusandheaders
pywb.utils.timeutils module
pywb\.utils\.memento module
.. automodule:: pywb.utils.timeutils
.. automodule:: pywb.utils.memento
pywb.utils.wbexception module
pywb\.utils\.wbexception module
.. automodule:: pywb.utils.wbexception
@ -0,0 +1,70 @@
pywb\.warcserver\.index package
pywb\.warcserver\.index\.aggregator module
.. automodule:: pywb.warcserver.index.aggregator
pywb\.warcserver\.index\.cdxobject module
.. automodule:: pywb.warcserver.index.cdxobject
pywb\.warcserver\.index\.cdxops module
.. automodule:: pywb.warcserver.index.cdxops
pywb\.warcserver\.index\.fuzzymatcher module
.. automodule:: pywb.warcserver.index.fuzzymatcher
pywb\.warcserver\.index\.indexsource module
.. automodule:: pywb.warcserver.index.indexsource
pywb\.warcserver\.index\.query module
.. automodule:: pywb.warcserver.index.query
pywb\.warcserver\.index\.zipnum module
.. automodule:: pywb.warcserver.index.zipnum
Module contents
.. automodule:: pywb.warcserver.index
@ -0,0 +1,46 @@
pywb\.warcserver\.resource package
pywb\.warcserver\.resource\.blockrecordloader module
.. automodule:: pywb.warcserver.resource.blockrecordloader
pywb\.warcserver\.resource\.pathresolvers module
.. automodule:: pywb.warcserver.resource.pathresolvers
pywb\.warcserver\.resource\.resolvingloader module
.. automodule:: pywb.warcserver.resource.resolvingloader
pywb\.warcserver\.resource\.responseloader module
.. automodule:: pywb.warcserver.resource.responseloader
Module contents
.. automodule:: pywb.warcserver.resource
@ -0,0 +1,70 @@
pywb\.warcserver package
.. toctree::
pywb\.warcserver\.basewarcserver module
.. automodule:: pywb.warcserver.basewarcserver
pywb\.warcserver\.handlers module
.. automodule:: pywb.warcserver.handlers
pywb\.warcserver\.http module
.. automodule:: pywb.warcserver.http
pywb\.warcserver\.inputrequest module
.. automodule:: pywb.warcserver.inputrequest
pywb\.warcserver\.upstreamindexsource module
.. automodule:: pywb.warcserver.upstreamindexsource
pywb\.warcserver\.warcserver module
.. automodule:: pywb.warcserver.warcserver
Module contents
.. automodule:: pywb.warcserver
@ -0,0 +1,181 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# pywb documentation build configuration file, created by
# sphinx-quickstart on Thu Sep 21 01:58:55 2017.
# This file is execfile()d with the current directory set to its
# containing dir.
# Note that not all possible configuration values are present in this
# autogenerated file.
# All configuration values have a default; values that are commented out
# serve to show the default.
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
# import os
# import sys
# sys.path.insert(0, os.path.abspath('.'))
# -- General configuration ------------------------------------------------
# If your documentation needs a minimal Sphinx version, state it here.
# needs_sphinx = '1.0'
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = ['sphinx.ext.autodoc',
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# The suffix(es) of source filenames.
# You can specify multiple suffix as a list of string:
# source_suffix = ['.rst', '.md']
source_suffix = '.rst'
# The master toctree document.
master_doc = 'index'
# General information about the project.
project = 'pywb'
copyright = 'A Webrecorder Project, Ilya Kreymer, Rhizome'
author = 'Ilya Kreymer'
# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
# The short X.Y version.
version = '2.0'
# The full version, including alpha/beta/rc tags.
release = '2.0'
# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
language = None
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# These patterns also affect html_static_path and html_extra_path
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']
# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'sphinx'
# If true, `todo` and `todoList` produce output, else they produce nothing.
todo_include_todos = True
# -- Options for HTML output ----------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
html_theme = 'default'
# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
# html_theme_options = {}
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
# Custom sidebar templates, must be a dictionary that maps document names
# to template names.
# This is required for the alabaster theme
# refs: http://alabaster.readthedocs.io/en/latest/installation.html#sidebars
#html_sidebars = {
# '**': [
# 'about.html',
# 'navigation.html',
# 'relations.html', # needs 'show_related': True theme option to display
# 'searchbox.html',
# 'donate.html',
# ]
# -- Options for HTMLHelp output ------------------------------------------
# Output file base name for HTML help builder.
htmlhelp_basename = 'pywbdoc'
# -- Options for LaTeX output ---------------------------------------------
latex_elements = {
# The paper size ('letterpaper' or 'a4paper').
# 'papersize': 'letterpaper',
# The font size ('10pt', '11pt' or '12pt').
# 'pointsize': '10pt',
# Additional stuff for the LaTeX preamble.
# 'preamble': '',
# Latex figure (float) alignment
# 'figure_align': 'htbp',
# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title,
# author, documentclass [howto, manual, or own class]).
latex_documents = [
(master_doc, 'pywb.tex', 'pywb Documentation',
'Ilya Kreymer', 'manual'),
# -- Options for manual page output ---------------------------------------
# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
(master_doc, 'pywb', 'pywb Documentation',
[author], 1)
# -- Options for Texinfo output -------------------------------------------
# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
# dir menu entry, description, category)
texinfo_documents = [
(master_doc, 'pywb', 'pywb Documentation',
author, 'pywb', 'One line description of project.',
# Example configuration for intersphinx: refer to the Python standard library.
intersphinx_mapping = {'https://docs.python.org/': None}
@ -0,0 +1,27 @@
.. pywb documentation master file, created by
sphinx-quickstart on Thu Sep 21 01:58:55 2017.
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
Webrecorder pywb documentation!
The Webrecorder :mod:`pywb` toolkit is a full-featured, advanced web archiving capture and replay framework for Python.
It provides command-line tools and an extensible framework for high-fidelity web archive access and creation.
.. toctree::
:maxdepth: 2
Indices and tables
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
@ -0,0 +1,36 @@
pushd %~dp0
REM Command file for Sphinx documentation
if "%SPHINXBUILD%" == "" (
set SPHINXBUILD=python -msphinx
set BUILDDIR=_build
if "%1" == "" goto help
if errorlevel 9009 (
echo.The Sphinx module was not found. Make sure you have Sphinx installed,
echo.then set the SPHINXBUILD environment variable to point to the full
echo.path of the 'sphinx-build' executable. Alternatively you may add the
echo.Sphinx directory to PATH.
echo.If you don't have Sphinx installed, grab it from
exit /b 1
goto end
@ -0,0 +1,269 @@
Configuring the Web Archive
pywb offers an extensible YAML-based configuration format via a main ``config.yaml`` at the root of each web archive.
Framed vs Frameless Replay vs HTTPS proxy
pywb supports several modes for serving archived web content.
With **framed replay**, the archived content is loaded into an iframe, and a top frame UI provides info and metadata.
In this mode, the top frame url is, for example, ``http://my-archive.example.com/<coll name>/http://example.com/``, while
the actual content is served at ``http://my-archive.example.com/<coll name>/mp_/http://example.com/``.
With **frameless replay**, the archived content is loaded directly, and a banner UI is injected into the page.
In this mode, the content is served directly at ``http://my-archive.example.com/<coll name>/http://example.com/``.
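The URL layout of the two modes can be summarized in a few lines of Python. This is only a sketch of the addressing scheme described above: ``replay_urls`` is a hypothetical helper, not part of pywb, and the collection name ``my-coll`` is made up for the example.

```python
def replay_urls(archive_host, coll, url, framed=True):
    """Return (top_level_url, content_url) for an archived page.

    Hypothetical helper that mirrors the URL layout described in this
    section; it does not call into pywb.
    """
    top = "{0}/{1}/{2}".format(archive_host, coll, url)
    if not framed:
        # Frameless replay: the content is served at the top-level URL itself.
        return top, top
    # Framed replay: the iframe content carries the ``mp_`` modifier.
    content = "{0}/{1}/mp_/{2}".format(archive_host, coll, url)
    return top, content


top, content = replay_urls("http://my-archive.example.com", "my-coll",
                           "http://example.com/")
print(top)
print(content)
```

With ``framed=False`` both values are identical, which reflects that frameless replay has no separate outer frame.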
(pywb also supports HTTP/S **proxy mode**, which requires additional setup. See :ref:`https-proxy` for more details.)
For security reasons, we recommend running pywb in framed mode, because a malicious site
`could tamper with the banner <http://labs.rhizome.org/presentations/security.html#/13>`_.
However, in certain situations, frameless replay may be appropriate.
To disable framed replay add:
``framed_replay: false`` to your config.yaml
Directory Structure
The pywb system assumes the following default directory structure for a web archive::
  +-- config.yaml (optional)
  +-- templates (optional)
  +-- static (optional)
  +-- collections
      +-- <coll name>
          +-- archive
          |   |
          |   +-- (WARC or ARC files here)
          +-- indexes
          |   |
          |   +-- (CDXJ index files here)
          +-- templates
          |   |
          |   +-- (optional html templates here)
          +-- static
              +-- (optional custom static assets here)
If running with default settings, the ``config.yaml`` can be omitted.
It is possible to configure these paths in the ``config.yaml``.
The following are some of the implicit default settings which can be customized::
collections_root: collections
archive_paths: archive
index_paths: indexes
(For a complete list of defaults, see the ``pywb/default_config.yaml`` file for reference)
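As a rough illustration, the default layout can be expressed as a set of paths derived from these three settings. The function below is a hypothetical sketch, not pywb code; the parameter names simply reuse the config keys listed above, and the collection name is an example.

```python
from pathlib import Path


def default_layout(root, coll_name, collections_root="collections",
                   archive_paths="archive", index_paths="indexes"):
    """Return the paths a collection is expected to use under `root`.

    Hypothetical helper: the keyword defaults copy the implicit settings
    listed above, everything else is plain path joining.
    """
    root = Path(root)
    coll = root / collections_root / coll_name
    return {
        "config": root / "config.yaml",
        "archive": coll / archive_paths,
        "indexes": coll / index_paths,
        "templates": coll / "templates",
        "static": coll / "static",
    }


for name, path in default_layout(".", "my-web-archive").items():
    print(name, path)
```

Overriding one of the keyword arguments shows how changing the corresponding key in ``config.yaml`` would move the directory.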
Index Paths
The ``index_paths`` key defines the subdirectory for index files (usually CDXJ) and determines the contents of each archive collection.
The index files usually contain a pointer to a WARC file, but not the absolute path.
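To make the "pointer to a WARC file" concrete, here is a small sketch that parses one CDXJ line. It assumes the usual CDXJ layout of a sortable URL key, a timestamp and a JSON object separated by spaces; the sample line and all of its field values are invented for illustration and do not come from a real index.

```python
import json


def parse_cdxj_line(line):
    """Split a CDXJ line into (urlkey, timestamp, fields).

    Assumes the layout "<urlkey> <timestamp> <json object>".
    """
    urlkey, timestamp, json_part = line.rstrip("\n").split(" ", 2)
    return urlkey, timestamp, json.loads(json_part)


# Made-up sample line; real indexes are produced by ``wb-manager``.
sample = ('com,example)/ 20170921015855 '
          '{"url": "http://example.com/", "filename": "example.warc.gz", '
          '"offset": "0", "length": "1024"}')

urlkey, timestamp, fields = parse_cdxj_line(sample)
# Only the file name is stored, not an absolute path.
print(fields["filename"])
```

Note that the ``filename`` field holds a bare file name, which is why a separate setting is needed to locate the file on disk.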
Archive Paths
The ``archive_paths`` key indicates how pywb will resolve WARC files listed in the index.
For example, it is possible to configure multiple archive paths::
  archive_paths:
    - archive
    - http://remote-backup.example.com/collections/
When resolving ``example.warc.gz``, pywb will then check (in order):
* First, ``collections/<coll name>/example.warc.gz``
* Then, ``http://remote-backup.example.com/collections/<coll name>/example.warc.gz`` (if first lookup unsuccessful)
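The "first match wins" order can be sketched as follows. This is not pywb's resolver: ``resolve_first`` is a hypothetical function, the existence check is passed in as a callable so the example needs no files or network, and ``my-coll`` stands in for ``<coll name>``.

```python
def resolve_first(candidates, exists):
    """Return the first candidate location for which exists() is true, else None."""
    for location in candidates:
        if exists(location):
            return location
    return None


# The two locations from the list above, for a collection named "my-coll".
candidates = [
    "collections/my-coll/example.warc.gz",
    "http://remote-backup.example.com/collections/my-coll/example.warc.gz",
]

# Pretend that only the remote copy is available.
available = {candidates[1]}
print(resolve_first(candidates, available.__contains__))
```

Because the list is ordered, a local copy always takes precedence over the remote one when both exist.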
Custom Defined Collections
While pywb can automatically detect collections following the above directory structure,
it may be useful to declare custom collections explicitly.
In addition, several "special" collection definitions are possible.
All custom defined collections are placed under the ``collections`` key in ``config.yaml``
Live Web Collection
The live web collection proxies all data to the live web, and can be defined as follows::
  collections:
      live: $live
This configures the ``/live/`` route to point to the live web.
(As a shortcut, ``wayback --live`` adds this collection from the command line without modifying the config.yaml)
This collection can be useful for testing, and becomes even more powerful when combined with recording.
Auto "All" Aggregate Collection
The aggregate ``all`` collection automatically aggregates data from all collections in the ``collections`` directory::
  collections:
      all: $all
Accessing ``/all/<url>`` will cause an aggregate lookup within the collections directory.
Note: It is not (yet) possible to exclude collections from the ``all`` collection, although "special" collections are not included.
Generic Collection Definitions
The collection definition syntax allows for explicitly setting the index and archive paths,
as well as any templates, per collection, for example::
index: ./path/to/indexes
resource: ./some/other/path/to/archive/
query_html: ./path/to/templates/query.html
This configuration supports the full Warcserver config syntax, including
remote archives, aggregation and fallback sequences (link)
This format also makes it easier to move legacy collections that have unique path requirements.
Root Collection
It is also possible to define a "root" collection, for example, accessible at ``http://my-archive.example.com/<url>``
Such a collection must be defined explicitly using ``$root`` as the collection name::
  collections:
      $root:
          index: ./path/to/indexes
          resource: ./path/to/archive/
Note: When a root collection is set, all other collections are ignored and are currently not accessible.
Recording Mode
.. _https-proxy:
HTTP/S Proxy Mode
UI Customizations
pywb supports UI customizations, either for an entire archive,
or per-collection.
Static Files
The replay server will automatically support static files placed under the following directories:
* Files under the root ``static`` directory can be accessed via ``http://my-archive.example.com/static/<filename>``
* Files under the per-collection ``./collections/<coll name>/static`` directory can be accessed via ``http://my-archive.example.com/static/_/<coll name>/<filename>``
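The two static URL forms map to files on disk as sketched below. ``static_file_path`` is a hypothetical helper written only to illustrate the mapping; pywb's own static handler may differ in details, and the file names are examples.

```python
from pathlib import PurePosixPath


def static_file_path(url_path):
    """Map a /static/... URL path to a file path relative to the archive root.

    Returns None for paths that do not follow either documented form.
    """
    parts = PurePosixPath(url_path).parts  # e.g. ('/', 'static', 'logo.png')
    if len(parts) < 3 or parts[0] != "/" or parts[1] != "static":
        return None
    if ".." in parts:
        # Refuse parent-directory references; a real server must guard this too.
        return None
    if parts[2] == "_":
        # Per-collection form: /static/_/<coll name>/<filename>
        if len(parts) < 5:
            return None
        return PurePosixPath("collections", parts[3], "static", *parts[4:])
    # Root form: /static/<filename>
    return PurePosixPath("static", *parts[2:])


print(static_file_path("/static/logo.png"))
print(static_file_path("/static/_/my-coll/style.css"))
```

The ``_`` path segment is what distinguishes a per-collection asset from a file in the root ``static`` directory.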
pywb uses Jinja2 templates to render the HTML for all aspects of the application.
A template with the same name placed in the ``templates`` directory, either in the root or per collection, will override the default template.
To copy the default pywb template to the template directory run:
``wb-manager template --add search_html``
The following templates are available:
* ``home.html`` -- Home Page Template, used for ``http://my-archive.example.com/``
* ``search.html`` -- Collection Template, used for each collection page ``http://my-archive.example.com/<coll name>/``
* ``query.html`` -- Capture Query Page for a given url, used for ``http://my-archive.example.com/<coll name>/*/<url>``
Error Pages:
* ``not_found.html`` -- Page to show when a url is not found in the archive
* ``error.html`` -- Generic Error Page for any error (except not found)
Replay and Banner templates:
* ``frame_insert.html`` -- Top-frame for framed replay mode (not used with frameless mode)
* ``head_insert.html`` -- Rewriting code injected into ``<head>`` of each replayed page.
This template includes the banner template and itself should generally not need to be modified.
* ``banner.html`` -- The banner used for frameless replay. Can be set to blank to disable the banner.
Custom Outer Replay Frame
The top-frame used for framed replay can be replaced or augmented
by modifying the ``frame_insert.html`` template.
To start from the default outer page, you can add it to the current
templates directory by running ``wb-manager template --add frame_insert_html``
To initialize the replay, the outer page should include ``wb_frame.js``,
create an ``<iframe>`` element and pass the id (or element itself) to the ``ContentFrame`` constructor:
.. code-block:: html

  <script src='{{ host_prefix }}/{{ static_path }}/wb_frame.js'> </script>
  <script>
  // Attach the replay to the iframe with id "replay_iframe"
  var cframe = new ContentFrame({"url": "{{ url }}" + window.location.hash,
                                 "prefix": "{{ wb_prefix }}",
                                 "request_ts": "{{ wb_url.timestamp }}",
                                 "iframe": "#replay_iframe"});
  </script>
The outer frame can receive notifications of changes to the replay via ``postMessage``.
For example, to detect when the content frame has changed and log the new url and timestamp,
add the following script to the outer frame html:
.. code-block:: javascript

  window.addEventListener("message", function(event) {
    // React to both initial page loads and in-page url changes
    if (event.data.wb_type == "load" || event.data.wb_type == "replace-url") {
      console.log("New Url: " + event.data.url);
      console.log("New Timestamp: " + event.data.ts);
    }
  });
The ``load`` message is sent when a new page is first loaded, while ``replace-url`` is used
for url changes caused by content frame History navigation.
@ -0,0 +1,10 @@
.. toctree::
:maxdepth: 2
@ -0,0 +1,14 @@
New Features
The 2.0 release of :mod:`pywb` is a significant refactoring over previous versions,
and introduces many new features, including:
* WARC Server and API
* WARC Recorder
* Improved replay fidelity
* Dynamic Collections
* Memento Aggregation Chains
* Customizable Rewriting System
@ -0,0 +1,4 @@
WARC Recorder
@ -0,0 +1,127 @@
Getting Started
At its core, pywb includes a fully featured web archive replay system, sometimes known as a 'wayback machine', to provide the ability to replay,
or view, archived web content in the browser.
If you have existing web archive (WARC or legacy ARC) files, here's how to make them accessible using :mod:`pywb`.
(If not, see :ref:`creating-warc` for instructions on how to easily create a WARC file right away)
By default, pywb provides a directory-based collection system to run your own web archive directly from archive collections on disk.
Two command line utilities are provided:
* ``wb-manager`` is a command line tool for managing common collection operations.
* ``wayback`` starts a web server that provides access to web archives.
(For more details, run ``wb-manager -h`` and ``wayback -h``)
For example, to install pywb, create a new collection "my-web-archive" in ``./collections/my-web-archive``,
add a WARC file to it and start the server:

.. code:: console

  pip install pywb
  wb-manager init my-web-archive
  wb-manager add my-web-archive <path/to/my_warc.warc.gz>
  wayback
Point your browser to ``http://localhost:8080/my-web-archive/<url>/`` where ``<url>`` is a url you recorded before into your WARC/ARC file.
If all worked well, you should see your archived version of ``<url>``. Congrats, you are now running your own web archive!
Using Existing Web Archive Collections
Existing archives of WARC/ARC files can be used with pywb with a minimal amount of setup. By using ``wb-manager add``,
WARC/ARC files will automatically be placed in the collection archive directory and indexed.
By default, ``wb-manager`` places new collections in the ``collections/<coll name>`` subdirectory of the current working directory. To specify a different root directory, use ``wb-manager -d <dir>``. Other options can be set in the config file.
If you have a large number of existing CDX index files, pywb will be able to read them as well after running through a simple conversion process.
It is recommended that any index files be converted to the latest CDXJ format, which can be done by running:
``wb-manager cdx-convert <path/to/cdx>``
To set up a collection with existing ARC/WARCs and CDX index files, you can:
1. Run ``wb-manager init <coll name>``. This will initialize all the required collection directories.
2. Copy any archive files (WARCs and ARCs) to ``collections/<coll name>/archive/``
3. Copy any existing cdx indexes to ``collections/<coll name>/indexes/``
4. Run ``wb-manager cdx-convert collections/<coll name>/indexes/``. This is strongly recommended, as it will
ensure that any legacy indexes are updated to the latest CDXJ format.
This will fully migrate your archives and indexes into the collection.
Any new WARCs added with ``wb-manager add`` will be indexed and added to the existing collection.
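Steps 2 and 3 above amount to copying files into the right subdirectories, which could be scripted roughly like this. The sketch is an assumption-laden illustration, not a pywb tool: ``copy_into_collection`` is hypothetical, and it sorts files purely by file extension.

```python
import shutil
from pathlib import Path

ARCHIVE_SUFFIXES = (".warc", ".warc.gz", ".arc", ".arc.gz")
INDEX_SUFFIXES = (".cdx", ".cdxj")


def copy_into_collection(files, coll_dir):
    """Copy archive files to <coll_dir>/archive and index files to <coll_dir>/indexes.

    Returns the list of destination paths; other files are skipped.
    """
    coll_dir = Path(coll_dir)
    copied = []
    for src in map(Path, files):
        name = src.name.lower()
        if name.endswith(ARCHIVE_SUFFIXES):
            dest_dir = coll_dir / "archive"
        elif name.endswith(INDEX_SUFFIXES):
            dest_dir = coll_dir / "indexes"
        else:
            continue  # neither an archive nor an index file
        # ``wb-manager init`` normally creates these directories already.
        dest_dir.mkdir(parents=True, exist_ok=True)
        copied.append(Path(shutil.copy2(str(src), str(dest_dir / src.name))))
    return copied
```

After copying, step 4 (``wb-manager cdx-convert``) would still need to be run on the ``indexes`` directory.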
Dynamic Collections and Automatic Indexing
Collections created via ``wb-manager init`` are fully dynamic, and new collections can be added without restarting pywb.
When adding WARCs with ``wb-manager add``, the indexes are also updated automatically. No restart is required, and the
content is instantly available for replay.
For more complex use cases, :mod:`pywb` also includes a background indexer that checks the archives directory and automatically
updates the indexes, if any files have changed or were added.
(Of course, indexing will take some time if adding a large amount of data all at once, but is quite useful for smaller archive updates).
To enable auto-indexing, run ``wayback -a``. To adjust the frequency of auto-indexing, run ``wayback -a --auto-interval <seconds>`` (default is 30 seconds).
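The idea behind interval-based auto-indexing can be illustrated with a simple polling loop. This is a generic sketch of the technique (comparing file modification times between passes), not pywb's actual indexer; ``find_changed``, ``watch`` and the ``reindex`` callback are hypothetical names.

```python
import os
import time


def find_changed(archive_dir, seen):
    """Return files that are new or modified since the last call.

    `seen` maps path -> last observed mtime (ns) and is updated in place.
    """
    changed = []
    for name in sorted(os.listdir(archive_dir)):
        path = os.path.join(archive_dir, name)
        if not os.path.isfile(path):
            continue
        mtime = os.stat(path).st_mtime_ns
        if seen.get(path) != mtime:
            seen[path] = mtime
            changed.append(path)
    return changed


def watch(archive_dir, reindex, interval=30):
    """Poll `archive_dir` forever, calling reindex(path) for each changed file."""
    seen = {}
    while True:
        for path in find_changed(archive_dir, seen):
            reindex(path)
        time.sleep(interval)
```

A shorter interval makes new captures visible sooner at the cost of scanning the directory more often.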
.. _creating-warc:
Creating a Web Archive
Using Webrecorder
If you do not have a web archive to test, one easy way to create one is to use `Webrecorder <https://webrecorder.io>`_.
After recording, you can click ``Stop`` and then click ``Download Collection`` to receive a WARC (``.warc.gz``) file.
You can then use this file with pywb.
Using pywb Recorder
The core recording functionality in Webrecorder is also part of :mod:`pywb`. If you want to create a WARC locally, this can be
done by directly recording into your pywb collection:
1. Edit ``config.yaml`` to add ``recorder: live``
2. Create a collection: ``wb-manager init my-web-archive`` (if you haven't already created a web archive collection)
3. Run: ``wayback --live -a --auto-interval 10``
4. Point your browser to ``http://localhost:8080/my-web-archive/record/<url>``
For example, to record ``http://example.com/``, visit ``http://localhost:8080/my-web-archive/record/http://example.com/``
In this configuration, indexing happens every 10 seconds. After 10 seconds, the recorded url will be accessible for replay, e.g. at
``http://localhost:8080/my-web-archive/http://example.com/``
(Note: this recorder is still experimental)
HTTP/S Proxy Mode Access
It is also possible to access any pywb collection via HTTP/S proxy mode, providing possibly better replay
without client-side url rewriting.
At this time, a single collection for proxy mode access can be specified with the ``--proxy`` flag.
For example, ``wayback --proxy my-web-archive`` will start pywb and enable proxy mode access.
You can then set the proxy host and port in your browser's proxy settings to ``localhost:8080``. Loading any url, e.g. ``http://example.com/``, should then
load the latest copy from the ``my-web-archive`` collection.
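Proxy mode can also be exercised from a script instead of a browser. The sketch below uses only the Python standard library and assumes pywb was started with ``wayback --proxy my-web-archive`` and is listening on ``localhost:8080``; the helper names are made up for this example, and only plain ``http`` URLs are covered.

```python
import urllib.request


def make_proxy_opener(proxy="http://localhost:8080"):
    """Build a urllib opener that sends http requests through the given proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy})
    return urllib.request.build_opener(handler)


def fetch_archived(url, proxy="http://localhost:8080"):
    """Fetch `url` via the proxy and return the response body as bytes.

    Requires a running pywb instance in proxy mode; not called here.
    """
    with make_proxy_opener(proxy).open(url, timeout=10) as resp:
        return resp.read()


# Example usage, once pywb is running in proxy mode:
#   body = fetch_archived("http://example.com/")
```

Because the request goes through the proxy, the script asks for the original url rather than a ``/my-web-archive/...`` path.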
@ -0,0 +1,11 @@
WARC Server
CDX Server API
@ -136,7 +136,7 @@ class JSLinkRewriterMixin(object):
class JSLocationRewriterMixin(object):
JS Rewriter mixin which rewrites location and domain to the
specified prefix (default: 'WB_wombat_')
specified prefix (default: ``WB_wombat_``)
def __init__(self, rewriter, rules=[], prefix='WB_wombat_'):
@ -9,27 +9,30 @@ with the wayback machine.
There WbUrl may represent one of the following forms:
query form: [/modifier]/[timestamp][-end_timestamp]*/<url>
query form: ``[/modifier]/[timestamp][-end_timestamp]*/<url>``
modifier, timestamp and end_timestamp are optional
modifier, timestamp and end_timestamp are optional::
url query form: used to indicate query across urls
same as query form but with a final *
same as query form but with a final ``*``::
replay form:
replay form::
latest_replay: (no timestamp)
latest_replay: (no timestamp)::
Additionally, the BaseWbUrl provides the base components
(url, timestamp, end_timestamp, modifier, type) which