Universal text preprocessing and postprocessing for PPM using Alphabet Adjustment

K. Alhawiti, W.J. Teahan

    Research output: Contribution to conference, Paper

    Abstract

    In this paper, we introduce several new universal preprocessing techniques to improve Prediction by Partial Matching (PPM) compression of UTF-8 encoded natural language text. These methods 'adjust' the alphabet in some manner (for example, by expanding or reducing it) before the compression algorithm is applied to the amended text.
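
    As an illustration of what an alphabet-expanding adjustment with a matching postprocessing step might look like, here is a minimal Python sketch. It is not the authors' method: the bigraph-substitution strategy, the use of Unicode Private Use Area code points as new symbols, and the names `preprocess`, `postprocess` and `max_subs` are all assumptions made for this example. The PPM compressor itself is not shown; it would run between the two steps.

```python
from collections import Counter


def preprocess(text: str, max_subs: int = 16):
    """Expand the alphabet by replacing frequent bigraphs with new symbols.

    Assumption for this sketch: new symbols are taken from the Unicode
    Private Use Area, skipping any that already occur in the text.
    """
    used = set(text)
    spare = (chr(c) for c in range(0xE000, 0xF900) if chr(c) not in used)
    table = {}  # new symbol -> the bigraph it stands for, in insertion order
    for _ in range(max_subs):
        # Count every (overlapping) pair of adjacent symbols.
        counts = Counter(text[i:i + 2] for i in range(len(text) - 1))
        if not counts:
            break
        pair, freq = counts.most_common(1)[0]
        if freq < 2:
            break  # nothing repeats, so a substitution would not help
        sym = next(spare)
        text = text.replace(pair, sym)
        table[sym] = pair
    return text, table


def postprocess(text: str, table: dict) -> str:
    """Undo preprocess(): expand the symbols in reverse order of creation."""
    for sym, pair in reversed(list(table.items())):
        text = text.replace(sym, pair)
    return text
```

    Calling `preprocess` on a string returns the amended text together with a substitution table, and `postprocess(amended, table)` restores the original string exactly. In a complete pipeline the table would have to be sent to the decoder along with the compressed data.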
    Original language: English
    Pages: 395
    Publication status: Published - 26 Mar 2014
    Event: Proceedings of the Data Compression Conference, Snowbird, Utah, 26 - 28 March 2014

