Java or Python distributed compute job (on a student budget)?


I have a large dataset (c. 40 GB) that I want to use for some NLP (largely embarrassingly parallel) across a couple of computers in the lab, on which I have no root access and only 1 GB of user space. I experimented with Hadoop, but of course that was dead in the water: the data is stored on an external USB hard drive, and I cannot load it onto the DFS because of the 1 GB user space cap. I have been looking into a couple of Python-based alternatives (as I would rather use NLTK than a Java NLP library if I can help it), and it seems the distributed compute options are:

  • IPython
  • Disco

After my experience with Hadoop, I want to make an informed choice. Any help on which of these might be more suitable would be greatly appreciated.

Amazon's EC2 and similar services are not really an option because I do not have a budget.
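To make the shape of the workload concrete, here is a minimal sketch of the kind of job I mean. It assumes the corpus is a directory of plain-text files that can be read in place from the external drive, and it uses only the standard library's multiprocessing on a single machine, not IPython or Disco. The names count_tokens and process_corpus are hypothetical, and the token count is just a stand-in for the real NLTK step.

```python
import multiprocessing as mp
import os
import tempfile


def count_tokens(path):
    """Stand-in for the real NLP step: count whitespace-separated tokens."""
    total = 0
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:  # stream line by line; the file is never fully loaded
            total += len(line.split())
    return path, total


def process_corpus(root, workers=2):
    """Apply count_tokens to every file under root, one independent task per file."""
    paths = sorted(
        os.path.join(dirpath, name)
        for dirpath, _, names in os.walk(root)
        for name in names
    )
    with mp.Pool(workers) as pool:
        # Files are independent, so result order does not matter.
        return dict(pool.imap_unordered(count_tokens, paths))


if __name__ == "__main__":
    # Tiny throwaway corpus so the sketch runs anywhere (assumption: text files).
    with tempfile.TemporaryDirectory() as root:
        for name, text in [("a.txt", "one two three\n"), ("b.txt", "four five\n")]:
            with open(os.path.join(root, name), "w", encoding="utf-8") as out:
                out.write(text)
        for path, count in sorted(process_corpus(root).items()):
            print(os.path.basename(path), count)
```

Each file is read where it sits and only a small result comes back, so nothing large has to be copied into user space. Whichever framework I end up with would need to support that same pattern across several machines.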

    Speak with the IT department at your school (especially if you are in college). If this is for an assignment or for research, I am sure they will be happy to give you more disk space.

