IAES International Journal of Artificial Intelligence (IJ-AI)
Vol 3, No 3: September 2014
Effect of feature selection on small and large document summarization
Sakhare, Dipti Yashodhan (MIT AOE, Pune, Alandi, Maharashtra)
Article Info
Published date: 30 Nov 2014

ABSTRACT
As the amount of textual information increases, the need for automatic text summarizers grows. In automatic summarization, a text document is reduced to a short set of words or a paragraph that conveys the main meaning of the text. This paper focuses on the extraction-based summarization approach, whose goal is sentence selection. The first step in summarization by extraction is the identification of important features. In our approach, short stories and biographies are used as test documents. Each document is prepared by pre-processing: sentence segmentation, tokenization, stop word removal, case folding, lemmatization, and stemming. Using the important features, sentence filtering, data compression, and sentence scoring are performed. In this paper we propose various features for summary extraction and analyze which features should be applied depending on the size of the document. The experimentation is performed with the DUC 2002 dataset. Comparative results of the proposed approach and of MS-Word are also presented. The concept-based features are given more weight. We propose that the use of concept-based features helps improve the quality of the summary for large documents.
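
The pre-processing and sentence scoring steps named in the abstract can be illustrated with a minimal Python sketch. This is not the paper's method: the stop-word list, the `crude_stem` suffix stripper, and the average term-frequency score are all assumptions chosen for brevity, and lemmatization and the concept-based features the paper emphasizes are omitted.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real systems use larger lists).
STOP_WORDS = {"a", "an", "the", "of", "in", "is", "are", "to", "and",
              "that", "it", "on", "for", "with", "as", "by", "this"}


def split_sentences(text):
    """Naive sentence segmentation: split after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def crude_stem(word):
    """Hypothetical suffix stripper standing in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(sentence):
    """Case folding, tokenization, stop-word removal, then stemming."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


def summarize(text, max_sentences=2):
    """Pick the highest-scoring sentences and return them in document order."""
    sentences = split_sentences(text)
    processed = [preprocess(s) for s in sentences]
    # Document-wide term frequencies act as a simple importance feature.
    freq = Counter(t for tokens in processed for t in tokens)
    # Score each sentence by the average frequency of its terms.
    scores = [
        sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0
        for tokens in processed
    ]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:max_sentences])]


if __name__ == "__main__":
    sample = "Cats chase mice. Cats sleep all day. The weather was mild."
    print(summarize(sample, max_sentences=2))
```

Averaging the term frequencies rather than summing them keeps long sentences from winning purely on length; a feature set tuned to document size, as the paper proposes, would replace this single score.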
Copyrights © 2014