Monday, January 5, 2015

Ceph Pi - ... and now for some production numbers!

Note: This is a follow-up to this article.  It will make a lot more sense if you read the previous one first.

Copying to the Ceph Cluster


So here we are.  The Ceph client and monitor are installed on a micro x64 system, and two OSDs are installed on two Raspberry Pis.  And here are the numbers.

We are running consistently in the 30 Mbit/s range copying to the Raspberry Pis.
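For reference, here is how a figure like that falls out of a timed copy.  This is a hedged sketch: the byte count and duration below are illustrative examples, not the actual measurements from this test.

```python
# Sketch: convert an observed copy (bytes over seconds) to Mbit/s.
# The numbers plugged in below are made up for illustration only.
def to_mbit_per_s(bytes_copied: int, seconds: float) -> float:
    """Convert a byte count over a time window to megabits per second."""
    return bytes_copied * 8 / seconds / 1_000_000

# e.g. 225 MB copied in 60 seconds works out to the 30 Mbit/s range
rate = to_mbit_per_s(225_000_000, 60)
print(f"{rate:.1f} Mbit/s")  # prints "30.0 Mbit/s"
```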

The limiting factor is the CPU on the Raspberry Pis.  Here they are.

They are below 100%, though, which is slightly puzzling.

For completeness, here are the network utilization numbers from the OSD Raspberry Pis...

...as well as the HDD utilization

Copying From the Ceph Cluster

Copying from the cluster is moving at a decent 40 Mbit/s.

The CPUs of the Ceph nodes are not pegged.  I am not sure why we are not getting better performance.  There really does not seem to be a bottleneck anywhere in sight.

The disks are doing well too.

Conclusion

While we are getting decent numbers, we are not pegging anything, and I am not sure why we are not doing better.  Any clues would be REALLY appreciated.

The only clue I have is that there is pretty massive I/O wait time going on in the ceph-client node.  Pretty much a whole core is pegged in wait.
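One way to confirm that is to watch the iowait share per core rather than the overall average, since an average across cores can hide a single pegged one.  Here is a minimal sketch, assuming a Linux `/proc/stat` (which both the ceph-client box and the Raspberry Pis have):

```python
# Hedged sketch (assumes Linux /proc/stat): sample per-core CPU counters
# one second apart and report the iowait share of each core, to spot a
# single core stuck waiting on I/O while the rest sit mostly idle.
import time

def cpu_samples():
    """Return {cpu_name: (total_jiffies, iowait_jiffies)} for each core."""
    samples = {}
    with open("/proc/stat") as f:
        for line in f:
            # per-core lines look like "cpu0 ..."; skip the aggregate "cpu  ..."
            if line.startswith("cpu") and not line.startswith("cpu "):
                name, *vals = line.split()
                vals = [int(v) for v in vals]
                samples[name] = (sum(vals), vals[4])  # field 4 is iowait
    return samples

before = cpu_samples()
time.sleep(1)
after = cpu_samples()
for cpu, (t0, w0) in before.items():
    t1, w1 = after[cpu]
    share = 100 * (w1 - w0) / (t1 - t0) if t1 > t0 else 0.0
    print(f"{cpu}: {share:.0f}% iowait")
```

The same numbers are also visible in the `%wa` column of top or via iostat from the sysstat package.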


Next Step

We are going to add one more Raspberry Pi OSD and observe the impact.  Hopefully we will see an increase in throughput.

Again - comments and suggestions are very welcome!