Hey Guys,
I have a few questions about MPI and people's ideas of the scope of this project more generally.
1.) Is it possible, using MPI, to spawn processes on specific servers? The tutorials I have seen online appear to indicate that you can specify the number of processes to spawn but not precisely where they will be spawned (aside from the mpd.hosts file). Can this be set programmatically?
2.) How good are MPI's error handling and failure resilience? If a node fails mid-computation, what happens? And before the computation even starts, is there an online way of knowing which nodes are available, or will MPI simply not attempt to launch on servers it perceives as down?
3.) Are we expecting to support persistence of distributed objects? This is important since "Big Data" is seldom ephemeral: people want to load their tables / data-cubes once. Of course, we should also support a mechanism for temporary distributed objects, for constructs like ifft(fft(X)) and for ad-hoc analysis / prototyping. So I vote yes, we should support persistent distributed objects, but I wanted to get other people's views. This also motivates my thought that slave processes should launch as close to the data as possible, à la Hadoop.
4.) What are people's opinions on other transport layer technologies like Google's Protocol Buffers or Thrift?
Regards,
Bipin