Mirror of https://github.com/webrecorder/pywb.git (synced 2025-03-24 06:59:52 +01:00)

update subpackage READMEs

parent a09dec4b3e
commit 7c1ac10d6f
@@ -1,30 +1,20 @@
-## PyWb CDX v0.2
+### pywb.cdx package

-[](https://travis-ci.org/ikreymer/pywb_cdx)


 This package contains the CDX processing suite of the pywb wayback tool suite.

 The CDX Server loads, filters and transforms cdx from multiple sources in response
 to a given query.

-### Installation and Tests
+#### Sample App

-`pip install -r requirements` -- to install

-`python run-tests.py` -- to run all tests


-### Sample App

 A very simple reference WSGI app is included.

-Run: `python -m pywb_cdx.wsgi_cdxserver` to start the app, keyboard interrupt to stop.
+Run: `python -m pywb.cdx.wsgi_cdxserver` to start the app, keyboard interrupt to stop.

 The default [config.yaml](pywb_cdx/config.yaml) points to the sample data directory
 and uses port 8080

-### CDX Server API Reference
+#### CDX Server API Reference

 Goal is to provide compatiblity with this feature set and more:
 https://github.com/internetarchive/wayback/tree/master/wayback-cdx-server
@@ -1,6 +1,4 @@
-## PyWb Rewrite v0.2
+### pywb.rewrite

-[](https://travis-ci.org/ikreymer/pywb_rewrite)

 This package includes the content rewriting component of the pywb wayback tool suite.
@@ -11,23 +9,19 @@ An additional domain-specific rewritin is planned, especially for JS, to allow f
 replay of difficult pages.


-### Command-Line Rewriter
+#### Command-Line Rewriter

 To enable easier testing of rewriting, this package includes a command-line rewriter
 which will fetch a live url and apply the registered rewriting rules to that url:

-After installing with:

-`pip install -r requirements.txt`

 Run:

-`python ./pywb_rewrite/rewrite_live.py http://example.com`
+`python ./pywb.rewrite/rewrite_live.py http://example.com`

 To specify custom timestamp and prefix:

 ```
-python ./pywb_rewrite/rewrite_live.py http://example.com /mycoll/20141026000102/http://mysite.example.com/path.html
+python ./pywb.rewrite/rewrite_live.py http://example.com /mycoll/20141026000102/http://mysite.example.com/path.html
 ```

 This will print to stdout the content of `http://example.com` with all urls rewritten relative to
@@ -37,11 +31,12 @@ Headers are also rewritten, for further details, consult the `get_rewritten` fun
 [pywb_rewrite/rewrite_live.py](pywb_rewrite/rewrite_live.py)


-### Tests
+#### Tests

 Rewriting doctests as well as live rewriting tests (subject to change) are provided.
-To run full test suite: `python run-tests.py`
+pywb.rewrite is part of a full test suite that can be executed via
+`python run-tests.py`
@@ -1,16 +1,17 @@
-## PyWb Utils v0.2 ##
+### pywb.utils

-[](https://travis-ci.org/ikreymer/pywb_utils)
+This package contains a utils used by pywb wayback tool suite.

-This is a standalone module contains a variety of utils used by pywb wayback tool suite.

-`python run-tests.py` will run all tests

 #### Modules

-[binsearch.py](pywb_utils/binsearch.py) -- Binary search implementation over text files
+* [binsearch.py](pywb.utils/binsearch.py) -- Binary search implementation over text files

-[loaders.py](pywb_utils/loaders.py) -- Loading abstraction for http, local file system, as well as buffered and seekable file readers
+* [loaders.py](pywb.utils/loaders.py) -- Loading abstraction for loading via http or local file system.

-[timeutils.py](pywb_utils/timeutils.py) -- Utility functions for converting between standard datetime formats 14-digit timestamp
+* [bufferedreaders.py](pywb.utils/bufferedreaders.py) -- Buffering wrappers for file-like object, also provide gzip decompression and
+de-chunking facilities.

+* [statusandheaders.py](pywb.utils/statusandheaders.py) -- Represent http status line + headers and parsing them out from a stream

+* [timeutils.py](pywb.utils/timeutils.py) -- Utility functions for converting between standard datetime formats 14-digit timestamp
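The `timeutils.py` entry in the module list above mentions conversion between standard datetime formats and the 14-digit timestamp, the `YYYYMMDDhhmmss` form that also appears in the `/mycoll/20141026000102/...` example earlier in this diff. The following is a minimal sketch of such a conversion using only the standard library. The names `TIMESTAMP_14`, `to_timestamp14` and `from_timestamp14` are hypothetical and chosen for illustration; they are not taken from `timeutils.py`, whose actual functions may differ.

```python
from datetime import datetime

# Layout of a 14-digit timestamp: YYYYMMDDhhmmss
# (hypothetical constant name, not taken from pywb)
TIMESTAMP_14 = "%Y%m%d%H%M%S"


def to_timestamp14(dt):
    """Format a datetime as a 14-digit timestamp string."""
    return dt.strftime(TIMESTAMP_14)


def from_timestamp14(ts):
    """Parse a full 14-digit timestamp string into a naive datetime."""
    # Reject partial or non-numeric input up front; handling of shorter
    # timestamps is deliberately left out of this sketch.
    if len(ts) != 14 or not ts.isdigit():
        raise ValueError("expected 14 digits, got %r" % ts)
    return datetime.strptime(ts, TIMESTAMP_14)


if __name__ == "__main__":
    dt = from_timestamp14("20141026000102")
    print(dt.isoformat())      # 2014-10-26T00:01:02
    print(to_timestamp14(dt))  # 20141026000102
```

The sketch returns naive datetimes and leaves any time zone handling to the caller.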
@@ -1,6 +1,4 @@
-## PyWb Warc v0.2
+### pywb.warc

-[](https://travis-ci.org/ikreymer/pywb_warc)

 This is the WARC/ARC record loading component of pywb wayback tool suite.
@@ -16,7 +14,17 @@ This package provides the following facilities:

 ### Tests

-This package will include a test suite for different WARC and ARC loading formats.
+This package will includes a test suite for loading a variety of WARC and ARC records.

-To run: `python run-tests.py`
+Tests so far:

+* Compressed WARC, ARC Records
+* Uncompressed ARC Records
+* Compressed WARC created by wget 1.14
+* Same Url revisit record resolving


+TODO:

+* Different url revisit record resolving (TODO)
+* File type detection (no .warc, .arc extensions)