Natural Language Processing

First Release of Japanese Dependency Vectors

At the end of last semester I finished the first version of Japanese Dependency Vectors (jpdv). I had to give up on using Clojure at the last minute because it was taking me too long to make progress and I needed to have some sort of a working system to turn in for my NLP final project. To accomplish this I rewrote jpdv in Java. It took me about 18

Japanese Dependency Vectors

I've been working on a new project I call “Japanese Dependency Vectors”, or “jpdv” for short. It's a program that generates dependency-based semantic vector spaces for Japanese text. (There's already an excellent tool for doing this with English, which was written by Sebastian Pado.) However, jpdv still has a way to go before it works as promised. So far the tool can parse CaboCha-formatted XML and produce two outputs: a vector space based on word co-occurrence, and a slightly modified XML representation that better demonstrates the dependency relationships between the words in the text.
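To make the two outputs a bit more concrete, here is a rough Python sketch of the general idea. It is not jpdv's code. The sample XML is a hand-written, simplified stand-in for CaboCha's XML output (I'm assuming `chunk` elements with `id` and `link` attributes that contain `tok` elements, with `link="-1"` marking the root), and the function names `cooccurrence_vectors` and `dependency_pairs`, along with the window size of 2, are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# Hand-written sample; the element and attribute names are an assumption
# about CaboCha's XML output, simplified for this sketch.
SAMPLE = """<sentence>
<chunk id="0" link="2" rel="D">
<tok id="0">太郎</tok>
<tok id="1">は</tok>
</chunk>
<chunk id="1" link="2" rel="D">
<tok id="2">本</tok>
<tok id="3">を</tok>
</chunk>
<chunk id="2" link="-1" rel="D">
<tok id="4">読む</tok>
</chunk>
</sentence>"""


def cooccurrence_vectors(xml_text, window=2):
    """Count, for each token, the tokens within `window` positions of it."""
    sentence = ET.fromstring(xml_text)
    tokens = [tok.text for tok in sentence.iter("tok")]
    vectors = defaultdict(Counter)
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                vectors[word][tokens[j]] += 1
    return vectors


def dependency_pairs(xml_text):
    """Return (dependent chunk text, head chunk text) pairs from `link`."""
    sentence = ET.fromstring(xml_text)
    # Map each chunk id to the concatenated surface text of its tokens.
    text_by_id = {
        chunk.get("id"): "".join(tok.text for tok in chunk.iter("tok"))
        for chunk in sentence.iter("chunk")
    }
    pairs = []
    for chunk in sentence.iter("chunk"):
        head_id = chunk.get("link")
        if head_id in text_by_id:  # the root chunk (link="-1") has no head
            pairs.append((text_by_id[chunk.get("id")], text_by_id[head_id]))
    return pairs


vectors = cooccurrence_vectors(SAMPLE)
pairs = dependency_pairs(SAMPLE)
```

Each row of `vectors` is a sparse count vector for one word, and `pairs` lists which chunk depends on which. A dependency-based space would presumably count contexts along those dependency links rather than within a flat window, which is the part jpdv still has to get right.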