DHT data exchanges

What am I looking at?
On the right you see an animation of MaRVIN peers exchanging data (you may need to zoom in). The horizontal axis shows 100 different peers; the vertical axis shows 100 different keys. A white dot at (x, y) means that peer x holds data with key y. The brighter the dot, the more data with that key the peer holds.
What is the data exchange strategy?
In this animation, peers exchange data much as in a DHT (distributed hash table): based on a hash of the data, a peer picks the peer responsible for a given key and gives it the pieces of its data that "belong" to that peer.
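As a rough illustration of this kind of exchange, here is a minimal Python sketch. It is not MaRVIN's actual code: the function names, the use of SHA-1, and the "hash modulo number of peers" mapping are all assumptions made for the example, and each data item is simplified to a (key, value) pair.

```python
import hashlib


def responsible_peer(key: str, num_peers: int) -> int:
    """Map a key to the index of the peer 'responsible' for it.

    Assumption: SHA-1 of the key, reduced modulo the number of peers.
    """
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_peers


def exchange(stores, num_peers):
    """Idealised exchange: every peer hands each (key, value) item
    to the peer responsible for that item's key."""
    new_stores = [[] for _ in range(num_peers)]
    for items in stores:
        for key, value in items:
            new_stores[responsible_peer(key, num_peers)].append((key, value))
    return new_stores


if __name__ == "__main__":
    # Three peers, each starting with arbitrary items.
    stores = [[("a", 1), ("b", 2)], [("a", 3)], [("b", 4), ("c", 5)]]
    for peer, items in enumerate(exchange(stores, 3)):
        print(peer, items)
```

After the exchange, all items that share a key sit at the same peer. The sketch moves everything in one step; in the animation, how much a peer can hand over per exchange is limited by its bandwidth.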
What are the advantages of this strategy?
Such a DHT-like exchange is very efficient for our task. We want to reason with triples, which means that triples with a shared key must meet at one peer to derive a consequence. Because these targeted exchanges send all triples that share a key to the same peer, their chance of meeting is maximal. Within a limited number of exchanges (depending on the bandwidth peers have for sending and receiving data), all data items will be at the peer "responsible" for them and will have met their "buddy" triples to produce a consequence.
What are the disadvantages of this strategy?
This approach ignores load balancing. Since keys in triples are very unevenly distributed (some terms are much more popular than others), some peers will be "responsible" for many more triples than others. This means that some peers will be overloaded while others are underutilised.
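The imbalance can be made concrete with a small simulation. The sketch below is illustrative only and rests on assumptions: the key popularity follows a made-up Zipf-like distribution (weight 1/rank), the key names are synthetic, and keys are mapped to peers by SHA-1 modulo the number of peers. None of this is taken from MaRVIN's data or code.

```python
import hashlib
import random
from collections import Counter


def responsible_peer(key: str, num_peers: int) -> int:
    """Assumed mapping: SHA-1 of the key modulo the number of peers."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_peers


def skewed_keys(num_keys: int, num_items: int, seed: int = 0):
    """Draw keys with Zipf-like popularity: the key at rank i has weight 1/(i+1)."""
    rng = random.Random(seed)
    names = [f"term{i}" for i in range(num_keys)]
    weights = [1.0 / (i + 1) for i in range(num_keys)]
    return rng.choices(names, weights=weights, k=num_items)


def peer_loads(keys, num_peers: int):
    """Count how many items each peer ends up responsible for."""
    counts = Counter(responsible_peer(k, num_peers) for k in keys)
    return [counts.get(p, 0) for p in range(num_peers)]


if __name__ == "__main__":
    loads = peer_loads(skewed_keys(100, 10_000), 10)
    mean = sum(loads) / len(loads)
    print("loads per peer:", loads)
    print("max:", max(loads), "min:", min(loads), "mean:", mean)
```

Under these assumptions, the peer that happens to be responsible for the most popular term receives far more items than the average peer, no matter how evenly the hash function spreads the distinct keys.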