update readme (and trigger travis ci build?)

Noah Levitt 2013-12-04 17:44:08 -08:00
parent 20c25da48d
commit 235e0dce45


@@ -82,33 +82,33 @@ Usage
 To do
 ~~~~~
-- integration tests, unit tests
-- [STRIKEOUT:url-agnostic deduplication]
-- unchunk and/or ungzip before storing payload, or alter request to
+* (partly done) integration tests, unit tests
+* (done) url-agnostic deduplication
+* unchunk and/or ungzip before storing payload, or alter request to
   discourage server from chunking/gzipping
-- check certs from proxied website, like browser does, and present
+* check certs from proxied website, like browser does, and present
   browser-like warning if appropriate
-- keep statistics, produce reports
-- write cdx while crawling?
-- performance testing
-- [STRIKEOUT:base32 sha1 like heritrix?]
-- configurable timeouts and stuff
-- evaluate ipv6 support
-- [STRIKEOUT:more explicit handling of connection closed exception
-  during transfer? other error cases?]
+* keep statistics, produce reports
+* write cdx while crawling?
+* performance testing
+* (done) base32 sha1 like heritrix?
+* configurable timeouts and stuff
+* evaluate ipv6 support
+* (done) more explicit handling of connection closed exception
+  during transfer
-- dns cache?? the system already does a fine job I'm thinking
-- keepalive with remote servers?
-- python3
-- special handling for 304 not-modified (write nothing or write revisit
+* dns cache?? the system already does a fine job I'm thinking
+* keepalive with remote servers?
+* (done) python3
+* special handling for 304 not-modified (write nothing or write revisit
   record... and/or modify request so server never responds with 304)
-- [STRIKEOUT:instant playback on a second proxy port]
-- special url for downloading ca cert e.g. http(s)://warcprox./ca.pem
-- special url for other stuff, some status info or something?
-- browser plugin for warcprox mode
+* (done) instant playback on a second proxy port
+* special url for downloading ca cert e.g. http(s)://warcprox./ca.pem
+* special url for other stuff, some status info or something?
+* browser plugin for warcprox mode
   - accept warcprox CA cert only when in warcprox mode
   - separate temporary cookie store, like incognito
   - "careful! your activity is being archived" banner
   - easy switch between archiving and instant playback proxy port
 To not do
 ^^^^^^^^^
@@ -118,8 +118,8 @@ belong here, since this is a proxy, not a crawler/robot. It can be used
 by a human with a browser, or by something automated, i.e. a robot. My
 feeling is that it's more appropriate to implement these in the robot.
-- politeness, i.e. throttle requests per server
-- fetch and obey robots.txt
-- alter user-agent, maybe insert something like "warcprox mitm
+* politeness, i.e. throttle requests per server
+* fetch and obey robots.txt
+* alter user-agent, maybe insert something like "warcprox mitm
   archiving proxy; +http://archive.org/details/archive.org\_bot"