Tools
These are some of the tools I’ve found for Japanese text analysis:
- CaboCha: “Yet Another Japanese Dependency Structure Analyzer” – The dependency parser used by the Japanese FrameNet project.
- MeCab: “Yet Another Part-of-Speech and Morphological Analyzer” – The part-of-speech tagger used by CaboCha.
- GoSen: A part-of-speech tagger and morphological analyzer for Japanese, written in Java. It is a fork of Sen, which was a Java rewrite of MeCab, and is part of the Itadaki project.
- Kakasi: A tool to convert kanji to hiragana, katakana, or romaji.
- ChaSen: A morphological analysis tool for Japanese.
Add GoSen to the list. It works great. I haven’t tried its parent project, Itadaki.
http://itadaki.org/wiki/index.php/GoSen
@Nathan Glenn
Man, that would have made my research much easier, what with it being in Java and all. Thanks for the great link!