Introduction
What I'm using
My place of employment is getting rid of a bunch of old Dell Optiplex 780s in a
computer refresh. Typically these would just go to our surplus department to be sold for
cheap to anyone who wants one. Since none of this money ever makes it back to
our department, it's of little consequence to my higher ups whether they are sold or
repurposed.
So I have free rein over several hundred EoL, but still modestly powerful, desktop
computers, and I've grabbed four of them to work my way through the Ceph
evaluation instructions. Maybe this will be a valid way to re-purpose some otherwise
in-the-trash hardware, or maybe it will just be a learning tool for me.
Optiplex 780 Specs:
- They came with 1TB drives, but our replacements, if a drive ever failed, were
often not 1TB
- One disappointing thing is that the power supply in these units only has one
SATA power connector, so I can't hook up a second drive - at least not easily.
Setup Process
I'm writing this as I go, and may or may not feel like editing it later, so bear with me
- this is very much a train-of-thought.
Note from the future: Setup has not been as quick as the quick setup guide would
lead you to believe, so I'm splitting into multiple posts. This post gets through the
very basic setup - getting a cluster with two OSDs to an "active+clean" state.
Further information on expanding the cluster, and setting up file shares coming
soon (now available here). I'm giving this a quick once-over now but, barring any
glaring errors, it will remain largely as it is.
OS Install
I'm using CentOS 6.5 x86_64 - minimal installer. I'm using CentOS because that's
what I'm most familiar with. However, I hope to use btrfs (because this is an
experiment, and what's an experiment without experimental software), which
requires an updated kernel, so I'm going to figure out how to do the kernel upgrade
as well - which I've never done before, so that should be fun.
I'm using the minimal installer because GUIs are for jerks, etc. But mostly because I
don't want a bunch of unnecessary programs chewing up resources. I'm using 64-bit
because, seriously, who uses 32-bit stuff any more? The processor is 64-bit, but I'm
not positive the Dell MoBo/BIOS are truly 64-bit. Either way it should be fine.
The only thing special I'm doing in the install process is leaving a large portion of
the drive unformatted to become the btrfs partition later. I'm also not creating a
swap partition because using HDD as RAM for a storage device seems a bit silly.
(CentOS ended up creating a tmpfs partition anyway against my will - I'll probably
remove that when I can be bothered.)
Partition table ended up looking like this:
10 GB / ext4
8 GB /home ext4
OS setup
Most of these steps are fairly routine, but I figured I'd include them here for
posterity's sake - and maybe so it's more evident what I screwed up when
something goes wrong later.
vi /etc/sysconfig/network-scripts/ifcfg-eth0
#disabled netmanager on eth0
#set onboot to yes
service network restart
#eth0 is now up and has an ip
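For reference, the relevant lines of my ifcfg-eth0 ended up looking roughly like this (BOOTPROTO is a guess at reconstructing it - I was on DHCP at this point; the rest of the file keeps its installer defaults):

```
DEVICE=eth0
BOOTPROTO=dhcp
ONBOOT=yes
NM_CONTROLLED=no
```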
#get all the latest things
yum upgrade
#get dependencies for kernel upgrade
yum install gcc ncurses ncurses-devel
#create myself a user
useradd myuser
passwd myuser
#remove root from ssh permission
vi /etc/ssh/sshd_config
#change PermitRootLogin from "yes" to "no"
service sshd restart
#so I don't have to download and scp
yum install wget
#download kernel source
wget https://www.kernel.org/pub/linux/kernel/v3.x/linux-3.6.11.tar.bz2
#I'm using 3.6.11 here because that is what Ceph currently recommends -- "latest
in the 3.6 stable"
#to avoid redundancy I'll just post the link to the steps I'm following for updating
the kernel
#http://www.tecmint.com/kernel-3-5-released-install-compile-in-redhat-centos-and-fedora/
#Note: I had to install perl to get compiler to complete
yum install perl
#Once the new kernel is installed, reboot and press a key during the "Booting
CentOS in ..." screen to show the new 3.6.11 boot option
#I edited grub.conf to make 3.6.11 the default so I don't have to remember to
select it each boot
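The grub.conf change is just pointing default at the new entry's index. The stanza below is illustrative, not copied from my machine - the title and image paths will match whatever the kernel install wrote:

```
default=0
timeout=5
title CentOS (3.6.11)
        root (hd0,0)
        kernel /vmlinuz-3.6.11 ro root=/dev/sda1
        initrd /initramfs-3.6.11.img
```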
#Add the ceph repo to yum; follow the instructions on the ceph website - under
"Redhat Package Manager"
#http://ceph.com/docs/master/start/quick-start-preflight/
Cloning image
So, obviously, I don't want to have to do all the above on each machine (3
minimum), so I want to clone the disk. But it's a 500GB drive, and I don't want to
wait for dd to run on each machine. So I found a possibility here that I'm going to
try. The theory is to fill the
empty part of the drive with zeros so that it can be compressed with gzip. This will
be easy with my existing partitions, but I guess I'll have to create a temporary
partition to zero-out the unused space. If I had thought of this before I could have
zero'd the disk before install, but live and learn I suppose.
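The trick works because gzip collapses long runs of identical bytes, and zeroed free space is one giant run. A quick sanity check on a small scratch file (paths are mine, just for illustration):

```shell
# Make a 64 MB file of nothing but zeros...
dd if=/dev/zero of=/tmp/zeros.img bs=1M count=64 2>/dev/null
# ...and compress it. The .gz comes out a tiny fraction of the size.
gzip -c /tmp/zeros.img > /tmp/zeros.img.gz
ls -l /tmp/zeros.img /tmp/zeros.img.gz
```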
So I created a partition, formatted to ext4, mounted to /temp and then issued
cat /dev/zero | tee -a /zero.txt /home/zero.txt /temp/zero.txt
to zero out all unused space on each partition. This took a long time. After that I
ran
rm /zero.txt /home/zero.txt /temp/zero.txt
dd if=/dev/sda bs=4M | gzip > /external/CephImage.gz
Where /external is an external drive I've attached to the machine to hold the image.
This also takes a long time. A little over 3 hours to be precise. But I ended up with
an image that was 3.3GB rather than 400GB - a significant savings. Seriously, that's
some ridiculous compression, I'm a little worried it's going to corrupt on the image
write... we'll see I guess.
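For what it's worth, the pipeline itself is lossless - gzip either round-trips the bytes exactly or errors out. Here's the same compress/restore dance on a small scratch "disk" with checksums on both ends (file paths are just illustrative):

```shell
# Stand-in for the real disk: 8 MB of random data.
dd if=/dev/urandom of=/tmp/disk.img bs=1M count=8 2>/dev/null
# Compress the same way as the real image...
dd if=/tmp/disk.img bs=4M 2>/dev/null | gzip > /tmp/disk.img.gz
# ...restore it, and compare checksums - they should be identical.
gunzip -c /tmp/disk.img.gz > /tmp/disk.restored
md5sum /tmp/disk.img /tmp/disk.restored
```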
Now I plug in a bare drive and begin the opposite process
dd if=CephImage.gz bs=4M | gunzip > /dev/sdc
where /dev/sdc is an unformatted bare drive I plugged in. This, again, will take a
while. I'm actually wondering if this will take longer than a standard dd, because it
now has to uncompress the whole thing and write it. But it's still worth it if it
means having a 3GB image rather than a 400GB one.
A little longer, but not by much.
...and it boots!
It's not a great clone method; ~3 hours does not make for rapid deployment, but it
should suit my purposes here. I don't know that this actually saved any time over
the standard "dd if=/dev/sda of=/dev/sdc" but this does at least give me an image
backup in case something happens.
Setting Up Ceph
Many hours later, I've got some cloned hard drives.
I install ceph-deploy on my main machine now
yum install ceph-deploy
Boot up the first node (Ceph-Node1) and follow the Preflight Checklist to get it
ready. I've moved to a private network so I set the hosts up manually in the hosts
file. Then used ceph-deploy to install to each node
ceph-deploy new Ceph-Node1
ceph-deploy new Ceph-Node2
ceph-deploy new Ceph-Node3
At this point, as I went to set up the partitions with btrfs, I noticed I hadn't installed
the btrfs userspace programs. While support is built into the kernel for btrfs, the
programs to actually use it are not, so I had to install that on each node (since I'm
on a private/non-internetted network now, I downloaded the rpm from pkgs.org and
used a flash drive to get it to each node).
Recreated /dev/sda4 by deleting/re-adding it with the full space of each drive
(again, this varies drive-to-drive based on what I had lying around). Then used
mkfs.btrfs to format it, and edited /etc/fstab to make it mount on boot.
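The per-node sequence was roughly this. The partition number is from my layout, and the /ceph mount point is just an illustrative choice, not something from the quick start guide:

```
# format the new partition
mkfs.btrfs /dev/sda4
# /etc/fstab line so it mounts on boot (mount point is my choice)
/dev/sda4   /ceph   btrfs   defaults   0 0
```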
Hmmm, so looks like I followed the wrong page before. "Ceph-Deploy new" installs a
monitor node, so I purged everything and started over via instructions at the start of
the storage cluster quick start guide which I will be subsequently following.
So, now correctly, I do:
ceph-deploy new Ceph-Node1
#It knows the correct user for Ceph-Node1 via the ~/.ssh/config file
This creates a monitor node on Node1.
I'm not sure I'll ever understand how Linux user context works. Ceph-Deploy doesn't
like being run as root or with sudo, so I had to log in with a non-root account, then
run "su" to get permission to run it, but not "su -" so I'm still the other user but with
root permissions. Trying to just run as root gives errors (paradoxically) saying the
command must be run as root. This does actually make sense, it's the remote
machine that needs root, and for whatever reason ceph doesn't run remotely as
root if it's root locally.... anyway so now running:
ceph-deploy install Ceph-Node1 Ceph-Node2
Gives me an error that it can't get a valid baseurl for the repo. Fantastic. I'm trying
to set this up on a private, non-internetted network, and now it wants internet.
After some trying, I've decided a proxy is probably the way to go for this. Trying to
resolve all dependencies and download all requisite .rpm files myself is proving too
tiresome. Luckily I've set up proxy servers (with squid) before, so hopefully this
won't be too bad. I'm not going to post all the steps involved with that; there are
squid guides elsewhere, and it would just clutter this already cluttered post.
With the proxy server set up, I've found that the ceph-deploy install does not
appear to respect http_proxy settings in ~/.bash_profile (I say this because I can
wget things from the internet, but when ceph-deploy tries, it fails). So I've had to set