September 15, 2013
Some parts of the code are still a mystery to me, so I took notes while porting the C code to Python, and I am sharing them here. Here is the notebook and the source on my GitHub. The current version:
1. has NOT been tested, so mistakes and bugs are expected
2. only reflects my own understanding of the code, so all the blame should fall on me : )
3. only covers certain parts such as exp_table, unigram_table, hashed_vocab, and cbow – I plan to finish skip-gram soon
4. makes certain simplifications so that the focus stays on understanding the code
5. is by no means optimal Python, though I am thinking about a parallel implementation in the future.
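To give a flavor of two of the pieces listed above, here is a minimal sketch of exp_table (a precomputed sigmoid lookup) and unigram_table (a table for drawing negative samples). It assumes the C code in question is word2vec.c, and the constants (1000 entries, a range of [-6, 6), counts raised to the power 0.75) are the values as I understand them from that code. The function names, the toy counts, and the small table size are hypothetical choices for illustration, not necessarily what the notebook uses.

```python
import math
import random

EXP_TABLE_SIZE = 1000  # assumed number of precomputed sigmoid values
MAX_EXP = 6            # assumed range: sigmoid is tabulated on [-6, 6)


def build_exp_table(size=EXP_TABLE_SIZE, max_exp=MAX_EXP):
    """Precompute sigmoid(x) at `size` evenly spaced points in [-max_exp, max_exp)."""
    table = []
    for i in range(size):
        x = (i / size * 2 - 1) * max_exp  # map index i to a point in the range
        e = math.exp(x)
        table.append(e / (e + 1))         # sigmoid(x) = e^x / (e^x + 1)
    return table


def fast_sigmoid(x, table, max_exp=MAX_EXP):
    """Approximate sigmoid(x) by table lookup; saturate outside the range."""
    if x >= max_exp:
        return 1.0
    if x <= -max_exp:
        return 0.0
    index = int((x + max_exp) * len(table) / (2 * max_exp))
    return table[min(index, len(table) - 1)]  # clamp against rounding at the edge


def build_unigram_table(counts, table_size=1000, power=0.75):
    """Fill a table in which word i occupies a share of slots
    roughly proportional to counts[i] ** power."""
    norm = sum(c ** power for c in counts)
    table = []
    word = 0
    cumulative = counts[0] ** power / norm  # share of slots covered so far
    for slot in range(table_size):
        table.append(word)
        # Move to the next word once its predecessors' share is used up.
        if slot / table_size > cumulative and word < len(counts) - 1:
            word += 1
            cumulative += counts[word] ** power / norm
    return table


# Toy usage: word ids 0, 1, 2 with made-up counts, most frequent first.
exp_table = build_exp_table()
unigram_table = build_unigram_table([100, 10, 1])
negative_sample = random.choice(unigram_table)  # a word id drawn for negative sampling
```

Raising the counts to a power below 1 flattens the distribution, so rare words are drawn a bit more often than their raw frequency would suggest. The lookup table trades a little accuracy for avoiding a call to exp in the inner training loop.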
Anyone is more than welcome to comment, discuss, share, and contribute to the notebook, to make the whole project feel more “open source”. If my code or my understanding doesn’t make sense, just let me know, because I am also a newbie on the journey of understanding the original code.
Enjoy!
See also: my post in the Google group and another Python implementation by Radim Řehůřek.