This is useful for HTML or other text resources that are normally rewritten when using the default (``mp_`` modifier).
Note that certain HTTP headers (hop-by-hop or cookie-related) may still be prefixed with ``X-Orig-Archive-``, as they may affect the transmission,
so the exact original headers are not guaranteed.
No Modifier
"""""""""""
The 'canonical' replay url is one without the modifier and represents the url that a user will see and enter into the browser.
The behavior for the canonical (no-modifier) archival url differs only if framed replay is used (see :ref:`framed_vs_frameless`):
* If framed replay, this url serves the top level frame
* If frameless replay, this url serves the content and is equivalent to the ``mp_`` modifier.
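The distinction above can be sketched in a few lines of Python. This is only an illustration, not pywb's own URL parser: the ``REPLAY_RE`` pattern, the ``describe_replay`` helper and the assumed ``/<collection>/<timestamp><modifier>/<url>`` layout are hypothetical names and assumptions introduced for this example.

```python
import re

# Assumed layout: /<collection>/<optional timestamp><optional modifier>/<url>
# (hypothetical pattern for illustration; not pywb's actual parser)
REPLAY_RE = re.compile(
    r'^/(?P<coll>[^/]+)/'
    r'(?:(?P<ts>\d{1,14})?(?P<mod>[a-z]{2}_)?/)?'
    r'(?P<url>.+)$'
)


def describe_replay(path, framed=True):
    """Return (modifier, target url, what the url serves)."""
    m = REPLAY_RE.match(path)
    if m is None:
        raise ValueError('not a replay path: ' + path)
    mod = m.group('mod') or ''
    if mod == '':
        # Canonical url: its role depends on framed vs frameless replay
        serves = 'top-level frame' if framed else 'content (same as mp_)'
    else:
        serves = 'content'
    return mod, m.group('url'), serves
```

For instance, ``describe_replay('/my-coll/http://example.com/', framed=False)`` reports that the canonical url serves the content directly, as it would with ``mp_``.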
Main Page Modifier (``mp_``)
""""""""""""""""""""""""""""
This modifier is used to indicate 'main page' content replay, generally HTML pages. Since pywb also performs content type detection, this modifier can
be used for any resource that is being loaded for replay, and pywb will generally render it correctly. Binary resources can also be rendered with this modifier.
JS and CSS Hint Modifiers (``js_`` and ``cs_``)
"""""""""""""""""""""""""""""""""""""""""""""""
These modifiers are useful as a 'hint' to pywb that a certain resource is being treated as a JS or CSS file. This only makes a difference where there is an ambiguity.
For example, if a resource has type ``text/html`` but is loaded in a ``<script>`` tag with the ``js_`` modifier, it will be rewritten as JS instead of as HTML.
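As a rough sketch of how a hint modifier can override an ambiguous content type, consider the following. The ``pick_rewrite_mode`` function and the exact list of JavaScript MIME types are assumptions made for this example and do not reflect pywb's internal API.

```python
# Hypothetical helper: decide which rewriter to apply to a resource.
JS_MIMES = ('application/javascript', 'text/javascript', 'application/x-javascript')


def pick_rewrite_mode(modifier, content_type):
    """Return 'js', 'css', 'html' or None (no rewriting)."""
    # Strip parameters such as "; charset=utf-8" before comparing
    mime = content_type.split(';', 1)[0].strip().lower()

    # An explicit hint wins over the declared content type
    if modifier == 'js_':
        return 'js'
    if modifier == 'cs_':
        return 'css'

    # Otherwise fall back to content-type analysis
    if mime == 'text/html':
        return 'html'
    if mime == 'text/css':
        return 'css'
    if mime in JS_MIMES:
        return 'js'
    return None
```

With this sketch, a ``text/html`` response requested with ``js_`` is treated as JS, while the same response requested with ``mp_`` is treated as HTML.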
Other Modifiers
"""""""""""""""
For compatibility and historical reasons, the pywb HTML parser also adds the following special hints:
* ``im_`` -- hint that this resource is being used as an image
* ``oe_`` -- hint that this resource is being used as an object or embed
* ``if_`` -- hint that this resource is being used as an iframe
* ``fr_`` -- hint that this resource is being used as a frame
However, these modifiers are essentially treated the same as ``mp_``, deferring to content-type analysis to determine if rewriting is needed.
The CSS rewriter rewrites any urls found in ``<style>`` blocks in HTML, as well as any files determined to be CSS
(based on ``text/css`` content type or ``cs_`` modifier).
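A minimal sketch of this kind of url rewriting is shown below. It assumes that rewriting means resolving each ``url(...)`` reference against the stylesheet's own url and prepending a replay prefix. The ``rewrite_css`` function, the regular expression and the prefix used are illustrative assumptions, not pywb's implementation.

```python
import re
from urllib.parse import urljoin

# Matches url(...) with optional single or double quotes around the target
CSS_URL_RE = re.compile(r"""url\(\s*(['"]?)(.*?)\1\s*\)""")


def rewrite_css(text, replay_prefix, base_url):
    """Prefix every url(...) in a stylesheet with a (hypothetical) replay prefix."""
    def repl(match):
        quote, url = match.group(1), match.group(2)
        if url.startswith('data:'):
            # Inline data needs no archival lookup; leave untouched
            return match.group(0)
        absolute = urljoin(base_url, url)
        return 'url({q}{p}{u}{q})'.format(q=quote, p=replay_prefix, u=absolute)

    return CSS_URL_RE.sub(repl, text)
```

For example, a relative ``url("img/bg.png")`` in ``http://example.com/css/site.css`` would become an absolute url under the replay prefix, so the browser requests the archived copy.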
JS Rewriting
~~~~~~~~~~~~
The JS rewriter is applied to inline ``<script>`` blocks, inline attribute JS, and any files determined to be JavaScript (based on content type and ``js_`` modifier).
The default JS rewriter does not rewrite any links. Instead, it performs limited regular-expression rewriting of the following:
* ``postMessage`` calls
* certain ``this`` property accessors
* specific ``location =`` assignments
Then, the entire script block is wrapped in a special code block to be executed client side. The result is that client-side access to ``location``, ``window``, ``top`` and other top-level objects goes through a client-side proxy object. The client-side rewriting is handled by ``wombat.js``.
The server-side rewriting is to aid the client-side execution of wrapped code.
For more information, see :py:mod:`pywb.rewrite.regex_rewriters.JSWombatProxyRewriterMixin`
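The general shape of this approach (a small regular-expression substitution followed by wrapping the script in a block that shadows top-level names) might be sketched as follows. The names ``__hypothetical_pm`` and ``__hypothetical_proxy`` are placeholders invented for this example. The actual identifiers and the exact wrapper emitted by pywb differ, so treat this as an illustration of the idea only.

```python
import re

# Rewrite only .postMessage( calls; links in the script are left untouched
POSTMESSAGE_RE = re.compile(r'\.postMessage\s*\(')

# Top-level objects that should resolve to a client-side proxy
PROXIED_NAMES = ('window', 'self', 'document', 'location', 'top', 'parent')


def wrap_script(source):
    """Apply a limited regex rewrite, then wrap the script in a shadowing block."""
    patched = POSTMESSAGE_RE.sub('.__hypothetical_pm(self).postMessage(', source)
    # Block-scoped declarations shadow the real globals inside the wrapper
    decls = '\n'.join(
        '  let {0} = __hypothetical_proxy("{0}");'.format(name)
        for name in PROXIED_NAMES
    )
    return '{\n' + decls + '\n' + patched + '\n}'
```

Because the wrapped script sees the shadowed names, most of the work is left to the client-side library, which is why the server-side pass can stay small.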
For example, a requested url might be ``/my-coll/http://example.com?callback=jQuery123`` but the returned content might be:
``jQuery456(...)`` due to fuzzy matching, which matched this inexact response to the requested url.
To ensure the JSONP callback works as expected, the callback name in the content is rewritten from ``jQuery456(...)`` to ``jQuery123(...)``, matching the requested url.
For more information, see :py:mod:`pywb.rewrite.jsonp_rewriter`
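The callback substitution described above can be sketched as follows. This is a simplified illustration that assumes the callback name is passed in a ``callback`` query parameter and that the response body starts directly with the callback name. The ``rewrite_jsonp`` function and its regular expression are hypothetical and not taken from pywb.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Callback name at the start of the body, followed by "("
JSONP_RE = re.compile(r'^\s*([\w.$]+)\s*\(')


def rewrite_jsonp(requested_url, body):
    """Replace the callback in the body with the one named in the requested url."""
    query = parse_qs(urlsplit(requested_url).query)
    wanted = query.get('callback')
    if not wanted:
        return body  # no callback requested; nothing to fix up

    match = JSONP_RE.match(body)
    if match is None:
        return body  # body does not look like JSONP

    # Swap only the callback name, keeping the payload unchanged
    return body[:match.start(1)] + wanted[0] + body[match.end(1):]
```

Calling ``rewrite_jsonp('http://example.com?callback=jQuery123', 'jQuery456({"a": 1})')`` yields ``jQuery123({"a": 1})``.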
DASH and HLS Rewriting
~~~~~~~~~~~~~~~~~~~~~~
To support recording and replaying adaptive streaming formats (DASH and HLS), pywb can perform special rewriting on the manifests for these formats to remove all but one possible resolution/format. As a result, the non-deterministic format selection is reduced to a single consistent format.
For more information, see :py:mod:`pywb.rewrite.rewrite_hls` and :py:mod:`pywb.rewrite.rewrite_dash` and the tests in ``pywb/rewrite/test/test_content_rewriter.py``
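To illustrate the idea for HLS, the sketch below reduces a master playlist to a single variant. It assumes the variant with the highest ``BANDWIDTH`` is the one to keep and that each ``#EXT-X-STREAM-INF`` tag is immediately followed by its variant uri. The selection policy and the ``keep_single_variant`` helper are assumptions made for this example, not a description of pywb's exact behavior.

```python
import re

# Match BANDWIDTH= but not AVERAGE-BANDWIDTH=
BANDWIDTH_RE = re.compile(r'(?:^|[:,])BANDWIDTH=(\d+)')


def keep_single_variant(manifest):
    """Drop every variant stream from an HLS master playlist except one."""
    lines = manifest.splitlines()

    # Collect (bandwidth, line index) for each variant declaration
    variants = []
    for i, line in enumerate(lines):
        if line.startswith('#EXT-X-STREAM-INF'):
            m = BANDWIDTH_RE.search(line)
            variants.append((int(m.group(1)) if m else 0, i))

    if not variants:
        return manifest  # not a master playlist; leave as-is

    _, keep = max(variants)

    # Remove each other declaration together with the uri line after it
    drop = set()
    for _, i in variants:
        if i != keep:
            drop.update((i, i + 1))

    return '\n'.join(l for i, l in enumerate(lines) if i not in drop)
```

With only one variant left in the manifest, the player has no choice to make, so recording and replay request the same stream.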