Posted by jopen
Published 10 years ago

JSearch: a Java library for extracting text from documents (Office, PDF, HWP)

Download & Installation

JSearch.jar
Add JSearch.jar to your project's classpath.

Requirements

  1. It should work with various document types, e.g. HWP, PDF, and Office formats.
  2. It should extract text from documents and find keywords in them quickly.
  3. It is distributed as a JAR library.
  4. All functions are synchronous.
  5. An extraction result contains the full text.
  6. A search result contains the word count.
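
Requirements 5 and 6 can be illustrated with a small sketch that counts keyword occurrences in an already extracted string. The class and method names below are hypothetical and are not part of JSearch. The methods listed in this post return a boolean or a list, so how JSearch itself exposes the word count is not shown here. The sketch also assumes case-sensitive, non-overlapping matching.

```java
// Hypothetical helper for illustration only; not part of the JSearch API.
final class KeywordCountSketch {

    /** Counts non-overlapping, case-sensitive occurrences of keyword in text. */
    static int countOccurrences(String text, String keyword) {
        if (text == null || keyword == null || keyword.isEmpty()) {
            return 0; // nothing to search for
        }
        int count = 0;
        int from = 0;
        // Jump past each match so overlapping matches are not counted twice.
        while ((from = text.indexOf(keyword, from)) != -1) {
            count++;
            from += keyword.length();
        }
        return count;
    }

    public static void main(String[] args) {
        // Stands in for the full string returned by an extraction call.
        String extracted = "PDF text, HWP text and Office text";
        System.out.println(countOccurrences(extracted, "text"));
    }
}
```

Because extraction returns the full string (requirement 5), a count like this could be computed on the caller's side from any extraction result.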

Class

public class JSearch

JSearch supports various document types by using open source engines.
The library provides three kinds of functions: extract...(), isContainsKeyword...(), and getFileList...().

The supported types are HWP, DOC, PPT, EXCEL, TEXT, PDF, and UNKNOWN.
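
The seven type names above could be modelled as an enum. The sketch below is an assumption for illustration: the enum name, the method, and the mapping from file extensions to types are all hypothetical, and the post does not say how JSearch actually detects a document's type.

```java
import java.util.Locale;

// Hypothetical enum; the constant names come from the list above,
// the extension mapping is an assumption and not taken from JSearch.
enum DocTypeSketch {
    HWP, DOC, PPT, EXCEL, TEXT, PDF, UNKNOWN;

    /** Guesses a type from the file name's extension, ignoring case. */
    static DocTypeSketch fromFileName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return UNKNOWN; // no extension at all
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        switch (ext) {
            case "hwp":
                return HWP;
            case "doc":
            case "docx":
                return DOC;
            case "ppt":
            case "pptx":
                return PPT;
            case "xls":
            case "xlsx":
                return EXCEL;
            case "txt":
                return TEXT;
            case "pdf":
                return PDF;
            default:
                return UNKNOWN;
        }
    }
}
```

An UNKNOWN fallback like this gives callers one place to decide what to do with unrecognised files.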

| Modifier and Type | Method | Description |
| --- | --- | --- |
| static java.lang.String | extractContentsFromFile(java.io.File target) | Extracts the text content of the file. |
| static java.lang.String | extractContentsFromFile(java.lang.String filePath) | Extracts the text content of the file at the given path. |
| static java.util.List | getFileListContainsKeywordFromDirectory(java.lang.String dirPath, java.lang.String keyword) | Returns a list of the files in the directory that contain the keyword. |
| static java.util.List | getFileListContainsKeywordFromDirectory(java.lang.String dirPath, java.lang.String keyword, boolean recursive) | Returns a list of the files in the directory that contain the keyword. |
| static boolean | isContainsKeywordFromFile(java.io.File file, java.lang.String keyword) | Returns true if the file contains the keyword, false otherwise. |
| static boolean | isContainsKeywordFromFile(java.lang.String filePath, java.lang.String keyword) | Returns true if the file at the given path contains the keyword, false otherwise. |
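
To make the shape of the directory search concrete, here is a minimal JDK-only sketch that mirrors the (dirPath, keyword, recursive) signature for plain UTF-8 text files. It is not JSearch's implementation: the class and method names are hypothetical, it does not parse HWP, PDF, or Office files, and reading `recursive` as "descend into subdirectories" is an assumption based on the parameter name.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical stand-in for illustration; handles plain text only.
final class PlainTextSearchSketch {

    /** Returns true if the file, decoded as UTF-8, contains the keyword. */
    static boolean containsKeyword(Path file, String keyword) {
        try {
            String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            return text.contains(keyword);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Lists regular files under dirPath whose text contains the keyword. */
    static List<Path> filesContaining(String dirPath, String keyword, boolean recursive) {
        // A depth of 1 visits only the directory's direct children.
        int maxDepth = recursive ? Integer.MAX_VALUE : 1;
        try (Stream<Path> paths = Files.walk(Paths.get(dirPath), maxDepth)) {
            return paths.filter(Files::isRegularFile)
                        .filter(p -> containsKeyword(p, keyword))
                        .sorted()
                        .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The sketch is blocking, in line with the requirement that all functions are synchronous: the call returns only after every candidate file has been read.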

Project home page: http://www.open-open.com/lib/view/home/1439124196411
