61 Commits

Author SHA1 Message Date
Łukasz Langa
ad04f72da6
Move freeze_support to the main file, remove unused bin/bitrot 2023-08-02 13:38:31 +02:00
Łukasz Langa
87e15913a5
Move to pyproject.toml, drop Python 2 2023-08-02 13:00:30 +02:00
Łukasz Langa
7f9a2e2efc
Remove unused 'wait' import 2020-06-18 20:06:53 +02:00
p1r473
6168723f5b
Unused variable deletion (#42) 2020-06-18 20:05:08 +02:00
Łukasz Langa
67e7b8c904
v1.0.0 2020-05-18 00:15:24 +02:00
Łukasz Langa
0dc3390b7f
Use a process pool to calculate hashes and perform stat()
Fixes #23
2020-05-17 22:55:35 +02:00
Łukasz Langa
52677d2b5d
Optimization: don't SELECT the path twice if it's not there 2020-05-17 21:58:30 +02:00
Łukasz Langa
45ab4501ee
Make handle_unknown_path more readable 2020-05-17 21:50:09 +02:00
Łukasz Langa
8ee84344e8
Simplify normalization and Unicode handling 2020-05-17 21:18:48 +02:00
Łukasz Langa
7608b56ea6
Remove trailing whitespace 2020-05-17 21:17:19 +02:00
Łukasz Langa
8ec9ea9629
Use NFKD instead of NFKC because that's what macOS uses by default 2020-05-17 18:33:23 +02:00
Stan Senotrusov
74f043b3ca
Normalize unicode paths in the database (#37)
* Normalize unicode paths in the database to enable use of the same database across different platforms

* Check if unicode normalization should be applied without regexp
2020-05-17 18:27:05 +02:00
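A minimal sketch of why normalizing paths helps when one database is shared across platforms. The helper name `normalize_path` and the sample file names are assumptions for illustration, not taken from the project's source; the form used here is NFKD, the one named in the follow-up commit.

```python
import unicodedata


def normalize_path(path):
    # Hypothetical helper: NFKD decomposes characters, so "é" becomes
    # "e" followed by the combining acute accent U+0301.
    return unicodedata.normalize("NFKD", path)


composed = "caf\u00e9.txt"     # "é" stored as one code point (U+00E9)
decomposed = "cafe\u0301.txt"  # "e" plus combining acute accent (U+0301)

# The two spellings look identical on screen but compare unequal as strings,
# so a lookup by path would miss unless both sides are normalized first.
same_before = composed == decomposed
same_after = normalize_path(composed) == normalize_path(decomposed)
```

Normalizing both the stored path and the path read from the filesystem to the same form makes such lookups independent of which spelling a given platform returns.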
Reid Williams
4ea0a57e0a
Add and remove unnecessary / needed decodes (#38)
* remove unnecessary decode

* add needed decode

Co-authored-by: Reid Williams <reid@computable.io>
2020-05-17 17:07:47 +02:00
p1r473
a043402114 Vacuuming (#34)
Added database vacuuming to reclaim the disk space left behind by old hashes of files that went missing
2017-06-13 13:34:44 -07:00
liloman
a8e52626ef Swap sqlite cursor with dictionary and set data structures (#24)
1. Use 2 new data structures:
   - paths (set): contains all the files in the actual filesystem
   - hashes (dictionary): replaces the sqlite query with dict[hash] = set(db paths)

2. Minimal unit tests created with bats (bash script)

See https://github.com/ambv/bitrot/issues/23 for details.
2017-03-03 10:16:46 -08:00
Lukasz Langa
58aa762e5c 0.9.2, updated README and benchmarks 2016-11-01 12:02:34 -07:00
benshep
53b1a12301 Corrected assumed 'utf-8' encoding
Was failing on my machine - traced it to line 189 using a hard-coded 'utf-8' encoding. Everywhere else uses FSENCODING (which on my machine is 'mbcs'), so replaced it here as well.
2016-10-31 14:03:59 +00:00
Łukasz Langa
d192fa0175 Fixed -s 2016-10-29 19:27:18 -07:00
Łukasz Langa
5e66b772d2 More robust filename encoding during stdout handling 2016-10-29 19:12:33 -07:00
Łukasz Langa
8c871b1319 Merge pull request #18 from vain/fix-encoding-warning
Print full path when file name decoding fails
2016-10-29 15:04:46 -07:00
Peter Hofmann
313347dd61 Show warnings about unreadable files even with verbosity == 0
I had a semi-corrupt encfs (which I detected, thanks to this tool!). A
file was only partially readable. Somewhere in the middle, an IOError
occurred. Essentially, this is a corrupt file system (which this tool
is meant to help detect), so this class of errors shouldn't be
suppressed by "-q".
2016-09-21 17:45:29 +02:00
Peter Hofmann
4e3c840eb0 Show warnings about un-stat-able files even with verbosity == 0
I get that this is some kind of a grey area due to the underlying race
condition (files vanishing after they have been scanned). However, if we
can't stat() a file, it can have many different causes -- the file having
vanished is just one of them. Since this tool is meant to help detect
bit rot and corrupt file systems, I'd rather be informed about
un-stat-able files.
2016-09-21 17:45:24 +02:00
Peter Hofmann
4bd293f024 Print full path when file name decoding fails 2016-09-19 18:16:55 +02:00
Lukasz Langa
5ed89d8b1a [0.9.0] Python 3 compat, --quiet obeyed for bitrot.db checksum checks 2016-08-09 14:51:57 -07:00
Philip Lundrigan
6405beaeba Fix if condition 2016-07-14 08:28:05 -06:00
Philip Lundrigan
18bf67317e Make error files available in exception 2016-07-13 12:58:20 -06:00
Łukasz Langa
e4efbc290c bitrot 0.8.0, fsencoding and self-integrity check 2016-05-02 17:52:20 -07:00
Łukasz Langa
a09f0b0ad6 Bump version after fixing #13 2015-11-02 16:27:00 -08:00
Łukasz Langa
539c277bd8 Open file in binary mode for SHA1 computation
Fixes #13.
2015-11-02 16:23:18 -08:00
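A minimal sketch of hashing a file in binary mode, under stated assumptions: the function name `sha1_of_file` and the default `chunk_size` are hypothetical and not taken from the project's source. In general, text mode can decode bytes and translate line endings, so the digest would no longer describe the raw bytes on disk.

```python
import hashlib
import os
import tempfile


def sha1_of_file(path, chunk_size=16384):
    # Hypothetical helper: read raw bytes in fixed-size chunks so large
    # files are hashed without being loaded into memory at once.
    digest = hashlib.sha1()
    with open(path, "rb") as f:  # "rb": no decoding, no newline translation
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


payload = b"line one\r\nline two\n"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)

result = sha1_of_file(tmp.name)
expected = hashlib.sha1(payload).hexdigest()
os.unlink(tmp.name)
```

Here `result` matches the digest of the exact bytes written, including the `\r\n` sequence.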
Łukasz Langa
13b0067ac8 [0.7.0] Multiple bug fixes and refactors 2015-06-22 18:08:26 -07:00
Łukasz Langa
08c6d436bf Merge pull request #7 from msloth/hotfix-missing-dbase-at-test
catch missing database file when running 'test' instead of crashing
2014-12-30 16:37:33 -08:00
Jean-Louis Fuchs
4d1ca47777 Fixed possible unique constraint exception
When renaming a file, its hash can't be used in the WHERE
condition of the UPDATE statement, since there _can_ be more
than one file with the same hash, and only the one that no
longer exists has been renamed. So we need to use the old
path (now non-existent) to specify the record to update.

To make the code clearer, I also added setting the hash
explicitly in the UPDATE statement.
2014-09-10 15:48:33 +02:00
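The reasoning in the commit above can be illustrated with a minimal sketch. The table name `files`, its columns, and the helper `record_rename` are assumptions for illustration, not the project's actual schema or code.

```python
import sqlite3

# Hypothetical schema: path is the primary key, so paths must be unique.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (path TEXT PRIMARY KEY, hash TEXT)")
conn.executemany(
    "INSERT INTO files VALUES (?, ?)",
    [("a.txt", "abc"), ("copy_of_a.txt", "abc")],  # two files, same hash
)


def record_rename(conn, old_path, new_path, file_hash):
    # Select the row by its old (now non-existent) path, not by hash.
    # Matching on hash would hit every duplicate and try to give them all
    # the same new path, which violates the uniqueness constraint.
    conn.execute(
        "UPDATE files SET path = ?, hash = ? WHERE path = ?",
        (new_path, file_hash, old_path),
    )


record_rename(conn, "a.txt", "renamed.txt", "abc")
rows = sorted(conn.execute("SELECT path, hash FROM files"))
```

After the call, only the row for `a.txt` has changed; `copy_of_a.txt` keeps its own record even though it shares the hash.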
Marcus Linderoth
a6e1bb9b4c catch missing database file when running 'test' instead of crashing 2014-08-10 18:03:28 +02:00
Łukasz Langa
e5f737b09d PEP 8 2013-11-11 00:43:22 -08:00
Łukasz Langa
1b8a582e34 Add --follow-links, skip files with ENOACCES et al. 2013-11-11 00:38:05 -08:00
Łukasz Langa
1f94944f87 commit at exit, in case of interrupted execution 2013-10-27 06:49:55 +01:00
Łukasz Langa
9521bdea00 minor formatting fixes, bumped to 0.6.0 2013-10-27 06:45:25 +01:00
Łukasz Langa
a8faff93e1 Merge branch 'periodic-commits' of git://github.com/yang/bitrot into yang-periodic-commits
Conflicts:
	src/bitrot.py
2013-10-19 22:17:17 +02:00
Łukasz Langa
dbdf7cf99b Merge branch 'master' of github.com:ambv/bitrot
Conflicts:
	src/bitrot.py
2013-10-19 22:15:51 +02:00
Łukasz Langa
f0e2d61fc3 Merge pull request #2 from yang/ignore-broken-symlinks
Ignore broken symlinks/files that disappear
2013-10-19 13:00:04 -07:00
Łukasz Langa
2cf550d6a3 Merge pull request #1 from yang/div-by-zero
Fix division-by-zero bug
2013-10-19 12:58:01 -07:00
Yang Zhang
af81b67d58 Use proper defined constant ENOENT 2013-10-17 11:42:23 -07:00
Yang Zhang
11e94f663c Clean up throttling and sha1 from feedback 2013-10-17 11:40:01 -07:00
Yang Zhang
b6faaf94fa Make chunk size configurable 2013-08-29 15:51:06 -07:00
Yang Zhang
fc46cb7c53 Add optional commit throttling 2013-08-29 15:39:33 -07:00
Yang Zhang
0afdaddd0a Create index on hash 2013-08-26 19:08:57 -07:00
Yang Zhang
a9b57b5814 Remove dangling pdb usage 2013-08-23 14:05:24 -07:00
Yang Zhang
37104d7b78 Clean up division by zero fix 2013-08-23 14:04:09 -07:00
Yang Zhang
3b3770d46a Ignore broken symlinks/files that disappear 2013-08-18 20:16:36 -07:00
Yang Zhang
f2c37cae26 Fix division-by-zero bug 2013-08-18 20:09:14 -07:00