Tools

December 3rd, 2010

These are some of the tools I’ve found for Japanese text analysis:

  • CaboCha: “Yet Another Japanese Dependency Structure Analyzer” – The dependency parser used by the Japanese FrameNet project.
  • MeCab: “Yet Another Part-of-Speech and Morphological Analyzer” – The part-of-speech tagger used by CaboCha.
  • GoSen: A part-of-speech tagger and morphological analyzer for Japanese, written in Java. It is a fork of Sen, which was a Java rewrite of MeCab, and is part of the Itadaki project.
  • Kakasi – A tool to convert kanji to hiragana, katakana, or romaji.
  • ChaSen – A morphological analysis tool for Japanese.
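
To give a feel for the simplest kind of script conversion mentioned above, here is a minimal sketch of hiragana-to-katakana conversion in plain Python. This is not Kakasi and does not use its API. The function name `hiragana_to_katakana` is made up for illustration, and the sketch relies only on the fact that the two kana blocks sit at a fixed offset from each other in Unicode. Converting kanji needs dictionary-based readings, which is the job the tools above do.

```python
# Illustrative sketch only: not Kakasi, and not any listed tool's API.
# Hiragana (U+3041..U+3096) and katakana (U+30A1..U+30F6) are laid out
# in parallel, so a fixed code point offset maps one onto the other.
KANA_OFFSET = ord("ァ") - ord("ぁ")  # 0x60


def hiragana_to_katakana(text: str) -> str:
    """Shift each hiragana character to its katakana counterpart.

    Anything outside the hiragana range (kanji, Latin letters,
    punctuation) is passed through unchanged.
    """
    return "".join(
        chr(ord(ch) + KANA_OFFSET) if "ぁ" <= ch <= "ゖ" else ch
        for ch in text
    )


print(hiragana_to_katakana("にほんご"))  # prints ニホンゴ
```

Kanji in the input come out untouched, which shows why a real converter has to run morphological analysis first.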
  1. Nathan Glenn
    October 26th, 2010 at 21:37 | #1

    Add GoSen to the list. It works great. I haven’t tried its parent project, Itadaki.
    http://itadaki.org/wiki/index.php/GoSen

  2. October 26th, 2010 at 21:52 | #2

    @Nathan Glenn

    Man, that would have made my research much easier, what with it being in Java and all. Thanks for the great link!
