Archive for the ‘Research’ Category

Japanese Dependency Vectors

December 3rd, 2009

I’ve been working on a new project I call “Japanese Dependency Vectors”, or “jpdv” for short. It’s a program that generates dependency-based semantic vector spaces for Japanese text. (There’s already an excellent tool for doing this with English, written by Sebastian Padó.)

However, jpdv still has a way to go before it works as promised. So far the tool can parse CaboCha-formatted XML and produce two things: a vector space based on word co-occurrence, and a slightly modified XML representation that better shows the dependency relationships between the words in the text. The next step is to use that dependency information to produce the dependency-based vector space that I actually need. Unfortunately, I only have until the end of next week to finish it, because this is my final project for my NLP class this semester. I also plan to use the vector spaces created by the tool to do word sense disambiguation for the SemEval-2 shared task on Japanese WSD.
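To make the two stages above concrete, here is a minimal Python sketch. It is not jpdv's code. It assumes input shaped like CaboCha's XML output, with `chunk` elements carrying `id` and `link` attributes and `tok` children, and the sample sentence, the function names, and the window size are all made up for illustration. Treating the first token of each chunk as its content word is also a simplifying assumption.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# Hypothetical sample in the style of CaboCha's XML output (attributes trimmed).
SAMPLE = """<sentence>
 <chunk id="0" link="2">
  <tok id="0">太郎</tok>
  <tok id="1">は</tok>
 </chunk>
 <chunk id="1" link="2">
  <tok id="2">本</tok>
  <tok id="3">を</tok>
 </chunk>
 <chunk id="2" link="-1">
  <tok id="4">読む</tok>
 </chunk>
</sentence>"""


def read_chunks(xml_text):
    """Return a list of (chunk id, id of the chunk it depends on, tokens)."""
    root = ET.fromstring(xml_text)
    chunks = []
    for chunk in root.iter("chunk"):
        tokens = [tok.text for tok in chunk.iter("tok")]
        chunks.append((int(chunk.get("id")), int(chunk.get("link")), tokens))
    return chunks


def cooccurrence_vectors(chunks, window=2):
    """Count, for each word, the words within `window` positions of it."""
    words = [w for _, _, toks in chunks for w in toks]
    vectors = defaultdict(Counter)
    for i, word in enumerate(words):
        lo = max(0, i - window)
        hi = min(len(words), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                vectors[word][words[j]] += 1
    return vectors


def dependency_pairs(chunks):
    """Pair each chunk's first token with the first token of its head chunk.

    Assumption: the first token stands in for the chunk's content word.
    A link of -1 marks the root chunk, which has no head and is skipped.
    """
    first = {cid: toks[0] for cid, _, toks in chunks}
    return [(first[cid], first[link]) for cid, link, _ in chunks if link in first]
```

Calling `dependency_pairs(read_chunks(SAMPLE))` pairs 太郎 and 本 with 読む, which is the kind of relation a dependency-based space would count instead of plain window neighbours.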

(The image included here was generated by jpdv as a LaTeX file from one of the sentences I’m using for testing.)

Emacs, Clojure, and Japanese

November 28th, 2009

This might be proof that I’m crazy:

Read more…