All source code listed below is under the MIT license unless a LICENSE file states otherwise.
isspam
A lightning-fast evaluator that summarizes specific details of text files, such as word, sentence, number, and forbidden-word counts.
This repository contains multiple versions of the same(-ish) algorithm.
Versions
- C (isspam) written by @retoor
- Rust (risspam) written by @12bitfloat
- C++ (isspam_cpp) written by @BordedDev
- Rust (jisspam) written by @jestdotty
Building
Build all versions to the repo root:
make build_all
Build isspam (C) with memory check (requires valgrind to be installed):
make valgrind
Benchmarking
After all binaries have been built to the repo root, you can benchmark them like this:
make benchmark
or without extracting books again:
make benchmark_only
Running
Using files as parameter
./(r)isspam ./spam/*.txt
./(r)isspam ./not_spam/*.txt
Using stdin
Useful for automation. Only the C version (isspam) supports this.
cat ./spam/example_spam1.txt | ./isspam
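The two input modes above can be sketched in C as follows. This is a minimal illustration of the general pattern, not the code isspam actually uses: `process_stream` and `run` are hypothetical names, and the byte count stands in for the real analysis.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the real analysis: counts the bytes on a stream. */
static size_t process_stream(FILE *in) {
    assert(in != NULL);
    char buf[4096];
    size_t total = 0, got;
    while ((got = fread(buf, 1, sizeof buf, in)) > 0) {
        total += got;
    }
    return total;
}

/* Assumed behaviour: with no file arguments, read stdin; otherwise process
 * every path given. Returns the number of files that could not be opened. */
int run(int argc, char **argv) {
    if (argc < 2) {
        printf("File: <stdin>\nBytes: %zu\n", process_stream(stdin));
        return 0;
    }
    int failures = 0;
    for (int i = 1; i < argc; i++) {
        FILE *f = fopen(argv[i], "r");
        if (f == NULL) {
            perror(argv[i]); /* report the failure and continue with the rest */
            failures++;
            continue;
        }
        printf("File: %s\nBytes: %zu\n", argv[i], process_stream(f));
        fclose(f);
    }
    return failures;
}
```

Calling `run(argc, argv)` from `main` would give both behaviours: the shell expands `./spam/*.txt` into separate arguments, while a pipe leaves the argument list empty so the input comes from stdin.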
Example output
Example output produced by isspam.
File: ./spam/example_spam3.txt
Capitalized words: 39
Sentences: 20
Words: 420
Numbers: 1
Forbidden words: 15
<0:recovery>
<1:techie>
<2:https>
<3:digital>
<4:hack>
<5://>
<6:com>
<7:@>
<8:crypto>
<9:bitcoin>
<10:whatsapp>
<11:cryptocurrency>
<12:stolen>
<13:contact>
<14:understanding>
Word count per sentence: 21
Memory usage: 1 MB, 6.460 (re)allocated, 4.222 unqiue free'd, 0 in use.
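To show how counters like these might be derived, here is a rough sketch in C. The tokenization rules are assumptions made for illustration and may differ from what any of the versions in this repository do: a token is a run of non-whitespace characters, a token starting with a digit counts as a number, a word starting with an uppercase letter counts as capitalized, and a token ending in `.`, `!` or `?` closes a sentence. The names `text_stats`, `analyze` and `words_per_sentence` are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical result type; field names are illustrative only. */
typedef struct {
    size_t words;
    size_t capitalized;
    size_t sentences;
    size_t numbers;
} text_stats;

/* Single pass over the text using the assumed rules described above. */
text_stats analyze(const char *text) {
    assert(text != NULL);
    text_stats s = {0, 0, 0, 0};
    const unsigned char *p = (const unsigned char *)text;
    while (*p) {
        while (*p && isspace(*p)) p++; /* skip whitespace between tokens */
        if (!*p) break;
        unsigned char first = *p, last = *p;
        while (*p && !isspace(*p)) last = *p++; /* consume one token */
        if (isdigit(first)) {
            s.numbers++;
        } else {
            s.words++;
            if (isupper(first)) s.capitalized++;
        }
        if (last == '.' || last == '!' || last == '?') s.sentences++;
    }
    return s;
}

/* Integer average of words per sentence; 0 when no sentence was found. */
size_t words_per_sentence(const text_stats *s) {
    return s->sentences ? s->words / s->sentences : 0;
}
```

Under these assumptions, the integer division is consistent with the example output above: 420 words over 20 sentences gives 21 words per sentence.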
Valgrind status
Valgrind output for the isspam (C) version.
Rust variant thinks it's too cool for memory checks afterwards.
Date: 2024-11-30
==58062==
==58062== HEAP SUMMARY:
==58062== in use at exit: 0 bytes in 0 blocks
==58062== total heap usage: 6,490 allocs, 6,490 frees, 2,343,156 bytes allocated
==58062==
==58062== All heap blocks were freed -- no leaks are possible
==58062==
==58062== For lists of detected and suppressed errors, rerun with: -s
==58062== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)