Paperwork - a simple way to convert paper documents using a scanner and OCR
Paperwork
Description
Paperwork is a personal document manager for scanned documents (and PDFs).
It's designed to be easy and fast to use. The idea behind Paperwork is "scan & forget": You should be able to just scan a new document and forget about it until the day you need it again.
In other words, let the machine do most of the work for you.
Screenshots
- Main window
- Search suggestions
- Labels
- Settings window
Main features
- Scan
- Automatic detection of page orientation
- OCR
- Document labels
- Automatic guessing of the labels to apply to new documents
- Search
- Keyword suggestions
- Quick edit of scans
- PDF support
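To give a rough idea of how the "Search" and "Keyword suggestions" features fit together, here is a minimal sketch using only the Python standard library. It is not Paperwork's code and does not use any real indexing library's API: the `TinyIndex` class and its `add`, `search` and `suggest` methods are hypothetical names chosen for illustration, and suggestions are assumed to be simple prefix matches.

```python
import re
from collections import defaultdict


class TinyIndex:
    """Hypothetical in-memory word index (illustration only)."""

    def __init__(self):
        # Maps each word to the set of document ids that contain it.
        self._postings = defaultdict(set)

    def add(self, doc_id, text):
        """Index the words found in the (OCR'd) text of a document."""
        for word in re.findall(r"\w+", text.lower()):
            self._postings[word].add(doc_id)

    def search(self, query):
        """Return the ids of documents containing every word of the query."""
        words = re.findall(r"\w+", query.lower())
        if not words:
            return set()
        matches = [self._postings.get(word, set()) for word in words]
        return set.intersection(*matches)

    def suggest(self, prefix, limit=5):
        """Return indexed words starting with the prefix, in alphabetical order."""
        prefix = prefix.lower()
        return sorted(w for w in self._postings if w.startswith(prefix))[:limit]
```

For example, after indexing two documents with `add`, `search("water invoice")` returns only the ids of documents that contain both words, and `suggest("inv")` lists the indexed words that could complete what the user is typing.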
Installation
Contact/Help
Details
Papers are organized into documents. Each document contains pages.
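The document/page relationship can be pictured with a small data model. This is only a sketch: the class and field names (`Document`, `Page`, `image_path`, `text`, `labels`) are assumptions made for illustration and are not taken from Paperwork's source.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Page:
    """One scanned page: the image file and the text extracted by OCR."""
    image_path: str  # hypothetical field: where the scanned image is stored
    text: str = ""   # hypothetical field: OCR output, empty until OCR has run


@dataclass
class Document:
    """A document groups an ordered list of pages and optional labels."""
    doc_id: str
    pages: List[Page] = field(default_factory=list)
    labels: List[str] = field(default_factory=list)

    def full_text(self) -> str:
        """Concatenate the text of all pages, one page per line group."""
        return "\n".join(page.text for page in self.pages)
```

A search index or a label guesser would then typically work on `full_text()` rather than on individual pages.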
Paperwork mainly uses:
- Sane/Pyinsane: To scan the pages
- Tesseract/Pyocr: To extract the words from the pages (OCR)
- GTK: For the user interface
- Whoosh: To index and search documents, and provide keyword suggestions
- Simplebayes: To guess the labels
- Pillow: Image manipulation
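As an illustration of how labels can be guessed from the words of a document, here is a small naive Bayes classifier written with the standard library only. It is a stand-in for the general technique, not the Simplebayes API and not Paperwork's implementation. The `LabelGuesser` class, its `train` and `guess` methods, and the choice to return a single best label are all assumptions made to keep the sketch short.

```python
import math
import re
from collections import Counter, defaultdict


class LabelGuesser:
    """Hypothetical multinomial naive Bayes classifier (illustration only)."""

    def __init__(self):
        self._word_counts = defaultdict(Counter)  # label -> word -> occurrences
        self._doc_counts = Counter()              # label -> number of training texts
        self._vocab = set()                       # every word seen during training

    @staticmethod
    def _tokens(text):
        return re.findall(r"\w+", text.lower())

    def train(self, label, text):
        """Record the words of a text that the user tagged with this label."""
        tokens = self._tokens(text)
        self._word_counts[label].update(tokens)
        self._doc_counts[label] += 1
        self._vocab.update(tokens)

    def guess(self, text):
        """Return the most probable label for the text, or None if untrained."""
        if not self._doc_counts:
            return None
        tokens = self._tokens(text)
        total_docs = sum(self._doc_counts.values())
        # One extra slot for unseen words keeps the denominator non-zero.
        vocab_size = len(self._vocab) + 1
        best_label, best_score = None, -math.inf
        for label, n_docs in self._doc_counts.items():
            counts = self._word_counts[label]
            total_words = sum(counts.values())
            # Log prior plus the sum of add-one smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for token in tokens:
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

In this sketch, the text produced by OCR for already-labelled documents is fed to `train`, and `guess` is called on the text of a newly scanned document. Working with log probabilities avoids numeric underflow on long documents.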
Licence
GPLv3 or later. See COPYING.
Archives
GitHub can automatically provide .tar.gz and .zip files if required. However, they are not needed to install Paperwork. They are listed here as a convenience for package maintainers.
- Paperwork 0.3.0.1
- Paperwork 0.3.0
- Paperwork 0.2.5
- Paperwork 0.2.4
- Paperwork 0.2.3
- Paperwork 0.2.2
- Paperwork 0.2.1
- Paperwork 0.2
- Paperwork 0.1.3
- Paperwork 0.1.2
- Paperwork 0.1.1
- Paperwork 0.1
Development
All the information can be found on the wiki.