Mirror of https://github.com/webrecorder/pywb.git, synced 2025-03-15 00:03:28 +01:00
update subpackage READMEs
parent a09dec4b3e
commit 7c1ac10d6f
@ -1,30 +1,20 @@
## PyWb CDX v0.2

[Build status](https://travis-ci.org/ikreymer/pywb_cdx)

### pywb.cdx package

This package contains the CDX processing suite of the pywb wayback tool suite.

The CDX Server loads, filters and transforms CDX from multiple sources in response
to a given query.

### Installation and Tests

`pip install -r requirements` -- to install

`python run-tests.py` -- to run all tests

### Sample App
#### Sample App

A very simple reference WSGI app is included.

Run: `python -m pywb_cdx.wsgi_cdxserver` to start the app, keyboard interrupt to stop.
Run: `python -m pywb.cdx.wsgi_cdxserver` to start the app, keyboard interrupt to stop.

The default [config.yaml](pywb_cdx/config.yaml) points to the sample data directory
and uses port 8080.

### CDX Server API Reference
#### CDX Server API Reference

Goal is to provide compatibility with this feature set and more:
https://github.com/internetarchive/wayback/tree/master/wayback-cdx-server
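
As a rough illustration of querying the sample app once it is running, the sketch below builds a query URL and fetches the matching CDX lines. The port 8080 comes from the default config mentioned above and the `url` parameter follows the wayback-cdx-server feature set; the `/cdx` path and the helper names `build_cdx_query` and `fetch_cdx_lines` are assumptions made for this example, so check the sample app for the actual route.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_cdx_query(url, host="localhost", port=8080, path="/cdx", **params):
    """Build a query URL for a locally running CDX server.

    `path` is a hypothetical endpoint, not confirmed by the README.
    Extra keyword arguments are appended as additional query parameters.
    """
    query = {"url": url}
    query.update(params)
    return "http://{0}:{1}{2}?{3}".format(host, port, path, urlencode(query))


def fetch_cdx_lines(query_url):
    """Fetch the response body and return its non-empty lines as text."""
    with urlopen(query_url) as response:
        body = response.read().decode("utf-8")
    return [line for line in body.splitlines() if line]


if __name__ == "__main__":
    # Only prints the URL; call fetch_cdx_lines() once the server is running.
    print(build_cdx_query("http://example.com/"))
```

The URL construction is kept separate from the network call so that it can be checked without a running server.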
@ -1,6 +1,4 @@
## PyWb Rewrite v0.2

[Build status](https://travis-ci.org/ikreymer/pywb_rewrite)
### pywb.rewrite

This package includes the content rewriting component of the pywb wayback tool suite.

@ -11,23 +9,19 @@ An additional domain-specific rewritin is planned, especially for JS, to allow f
replay of difficult pages.

### Command-Line Rewriter
#### Command-Line Rewriter

To enable easier testing of rewriting, this package includes a command-line rewriter
which will fetch a live url and apply the registered rewriting rules to that url:

After installing with:

`pip install -r requirements.txt`

Run:

`python ./pywb_rewrite/rewrite_live.py http://example.com`
`python ./pywb.rewrite/rewrite_live.py http://example.com`

To specify custom timestamp and prefix:

```
python ./pywb_rewrite/rewrite_live.py http://example.com /mycoll/20141026000102/http://mysite.example.com/path.html
python ./pywb.rewrite/rewrite_live.py http://example.com /mycoll/20141026000102/http://mysite.example.com/path.html
```

This will print to stdout the content of `http://example.com` with all urls rewritten relative to
@ -37,11 +31,12 @@ Headers are also rewritten, for further details, consult the `get_rewritten` fun
[pywb_rewrite/rewrite_live.py](pywb_rewrite/rewrite_live.py)
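
To make the idea of rewriting urls relative to a prefix and timestamp more concrete, here is a minimal sketch. It is not the `get_rewritten` implementation; the function name `rewrite_url` and its defaults are hypothetical, taken from the example command above, and it only handles plain links, not HTML, CSS, JS or headers.

```python
from urllib.parse import urljoin


def rewrite_url(link, base_url, prefix="/mycoll/", timestamp="20141026000102"):
    """Resolve `link` against `base_url`, then point it back into the archive.

    The result has the form <prefix><timestamp>/<absolute url>.
    """
    absolute = urljoin(base_url, link)
    return "{0}{1}/{2}".format(prefix, timestamp, absolute)


if __name__ == "__main__":
    base = "http://mysite.example.com/path.html"
    for link in ("other.html", "/img/logo.png", "http://example.org/"):
        print(rewrite_url(link, base))
```

Relative and absolute links end up in the same form, so a replayed page keeps resolving its resources through the archive prefix.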

### Tests
#### Tests

Rewriting doctests as well as live rewriting tests (subject to change) are provided.
To run the full test suite: `python run-tests.py`

pywb.rewrite is part of a full test suite that can be executed via
`python run-tests.py`

@ -1,16 +1,17 @@
## PyWb Utils v0.2 ##
### pywb.utils

[Build status](https://travis-ci.org/ikreymer/pywb_utils)

This is a standalone module containing a variety of utils used by the pywb wayback tool suite.

`python run-tests.py` will run all tests
This package contains utils used by the pywb wayback tool suite.

#### Modules

[binsearch.py](pywb_utils/binsearch.py) -- Binary search implementation over text files
* [binsearch.py](pywb.utils/binsearch.py) -- Binary search implementation over text files

[loaders.py](pywb_utils/loaders.py) -- Loading abstraction for http, local file system, as well as buffered and seekable file readers
* [loaders.py](pywb.utils/loaders.py) -- Loading abstraction for loading via http or local file system.

[timeutils.py](pywb_utils/timeutils.py) -- Utility functions for converting between standard datetime formats and 14-digit timestamps
* [bufferedreaders.py](pywb.utils/bufferedreaders.py) -- Buffering wrappers for file-like objects that also provide gzip decompression and
de-chunking facilities.

* [statusandheaders.py](pywb.utils/statusandheaders.py) -- Represents the http status line and headers, and parses them out of a stream

* [timeutils.py](pywb.utils/timeutils.py) -- Utility functions for converting between standard datetime formats and 14-digit timestamps
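
The 14-digit timestamp mentioned for timeutils.py has the form `YYYYMMDDhhmmss`, as in `20141026000102`. A minimal sketch of the conversion using only the standard library follows; the function names are chosen for this example and are not necessarily those exposed by timeutils.py.

```python
from datetime import datetime

# Year, month, day, hour, minute, second with no separators: 14 digits.
TIMESTAMP_FORMAT = "%Y%m%d%H%M%S"


def datetime_to_timestamp(dt):
    """Format a datetime as a 14-digit timestamp string."""
    return dt.strftime(TIMESTAMP_FORMAT)


def timestamp_to_datetime(ts):
    """Parse a full 14-digit timestamp string into a datetime."""
    return datetime.strptime(ts, TIMESTAMP_FORMAT)


if __name__ == "__main__":
    print(datetime_to_timestamp(datetime(2014, 10, 26, 0, 1, 2)))
```

This sketch expects complete 14-digit input; handling of shorter, partial timestamps is left out.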
@ -1,6 +1,4 @@
## PyWb Warc v0.2

[Build status](https://travis-ci.org/ikreymer/pywb_warc)
### pywb.warc

This is the WARC/ARC record loading component of pywb wayback tool suite.

@ -16,7 +14,17 @@ This package provides the following facilities:

### Tests

This package will include a test suite for different WARC and ARC loading formats.
This package includes a test suite for loading a variety of WARC and ARC records.

To run: `python run-tests.py`
Tests so far:

* Compressed WARC, ARC Records
* Uncompressed ARC Records
* Compressed WARC created by wget 1.14
* Same URL revisit record resolving

TODO:

* Different URL revisit record resolving (TODO)
* File type detection (no .warc, .arc extensions)