Tuesday, January 6, 2015

Ceph Pi - Adding an OSD, and more performance

So one of the most important features of a distributed storage array is that, being distributed, you should be able to expand it quickly: more capacity, and better throughput and performance, as you add more nodes.

We are going to add another Raspberry Pi with a USB-attached HDD (pi1) as an OSD.

This will make our cluster look like this:

  1. Monitor, Admin, Ceph-client (x64, gigabit port, hostname: charlie)
  2. OSD (Raspberry Pi, 100Mbps Ethernet, USB-attached 1TB HDD, pi2)
  3. OSD (Raspberry Pi, 100Mbps Ethernet, USB-attached 1TB HDD, pi3)
  4. New OSD (Raspberry Pi, 100Mbps Ethernet, USB-attached 1TB HDD, pi1)
Attaching another OSD is pretty easy! (Note: I am using pi1, which was already initialized in the previous guide, so some of the preparation steps are missing here.)

http://millibit.blogspot.com/2014/12/ceph-pi-installing-ceph-on-raspberry-pi.html

From the Monitor/Admin node (now running on my x64 box) we run:
 ceph-deploy install pi1  
 ceph-deploy osd prepare pi1:/mnt/sda1  
 ceph-deploy osd activate pi1:/mnt/sda1  
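
Once the activate step completes, the new OSD should join the cluster and data will start rebalancing onto it. A few commands worth running from the admin node while that happens, just as a sanity check (generic ceph CLI, not something captured in the graphs below):

 ceph osd tree   # the new osd should show up under host pi1
 ceph -s         # health will report backfill/recovery while data rebalances
 ceph -w         # watch the recovery progress live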



Here is the performance of the cluster while the OSD was being added.  It's a pretty complex graph, but it does cover all the KPIs we are tracking.

Saving data to the ceph-cluster

We are getting measurably better performance from the 3 OSD ceph-cluster.  The throughput is at about 44Mbps.
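
(Side note: if you want to throw a comparable synthetic write load directly at the cluster, rados bench can generate one. This is just a sketch, assuming the default rbd pool exists; it is not how the graphs here were produced.)

 # write 4MB objects for 60 seconds; keep the objects around for a read test later
 rados bench -p rbd 60 write --no-cleanup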

That 44Mbps compares very favorably to the 2-OSD test, which ran at about 30Mbps.  Here is the data as a reminder:

The Raspberry Pis had their CPUs pegged pretty much flat out, so I am not sure we can squeeze a whole lot more out of them.

Here is the Ethernet port utilization on all the OSD Raspberry Pis.


Loading data from the ceph cluster

Performance when loading data from the ceph cluster was, surprisingly, lower than when writing the data!  It was stuck at 40Mbps.
As a matter of fact, we are running at exactly the same speed as we did when we had only 2 OSDs involved (see the previous article).
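
(The same rados bench tool can read back the objects left behind by the write run sketched earlier, if you want a synthetic read number to compare against; again just a sketch, not the source of these graphs.)

 # sequential reads of the objects written earlier with --no-cleanup
 rados bench -p rbd 60 seq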

Here are the graphs of the CPU on the Raspberry Pi OSDs.

The utilization of the CPUs is pretty low, and we have quite a bit of headroom.

Finally, here is the individual network utilization of every OSD.



And just because it looks cool and matches the throughput of the ceph-client host perfectly, here is the stacked line graph of the same data.

Conclusion.

Adding an extra OSD gave us a measurable boost on the writing end of the equation, but it did nothing for reading performance.  We did observe lower load on both the OSD CPUs and their network ports, yet the numbers on the client were unmoved.

I am at a loss as to how to explain this.  There is a core on the client that is doing an inordinate amount of I/O wait...  but why?  Maybe it is network latency of some sort?  Though I am on a gigabit switch that has all devices plugged in directly...  I do not know.
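
Not an answer, but here is a sketch of the client-side checks I would run during the read test to see what that core is actually waiting on (mpstat, iostat and sar all come from the sysstat package):

 mpstat -P ALL 1   # per-core breakdown: how much of that core is %iowait
 iostat -x 1       # is the local disk on the client the real bottleneck?
 sar -n DEV 1      # per-interface throughput, to rule out the client NIC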

The next step (and probably the last in this series) will be measuring concurrent access.  See where that takes us.  

Ideas and requests are welcome.