mirror of
https://github.com/internetarchive/warcprox.git
synced 2025-01-18 13:22:09 +01:00
change trough dedup date
type to varchar
This is a backwards-compatible change whose purpose is to clarify the existing usage. In sqlite (and therefore trough), the datatypes of columns are just suggestions. In fact the values can have any type. See https://sqlite.org/datatype3.html. `datetime` isn't even a real sqlite type. Warcprox stores a string formatted like '2019-11-19T01:23:45Z' in that field. When it pulls it out of the database and writes a revisit record, it sticks the raw value in the `WARC-Date` header of that record. Warcprox never parses the string value. Since we use the raw textual value of the field, it makes sense to use a textual datatype to store it.
This commit is contained in:
parent
f77c152037
commit
ac959c6db5
@ -500,7 +500,7 @@ class TroughDedupDb(DedupDb, DedupableMixin):
|
||||
SCHEMA_SQL = ('create table dedup (\n'
|
||||
' digest_key varchar(100) primary key,\n'
|
||||
' url varchar(2100) not null,\n'
|
||||
' date datetime not null,\n'
|
||||
' date varchar(100) not null,\n'
|
||||
' id varchar(100));\n') # warc record id
|
||||
WRITE_SQL_TMPL = ('insert or ignore into dedup\n'
|
||||
'(digest_key, url, date, id)\n'
|
||||
|
Loading…
x
Reference in New Issue
Block a user