Storage Server

Discussion in 'ISPConfig 3 Priority Support' started by BobGeorge, Jul 26, 2017.

  1. BobGeorge

    BobGeorge Member

    Okay, thanks to Till's help, I've set up ISPConfig but my desired configuration is somewhere between the multi-server setup and the mirroring setup in the manual's installation instructions. A bit unusual, perhaps.

    I've got a storage server with a RAID array, which is also serving as the DB server for the LAN. Following Till's advice, I've mounted a directory from the RAID to /var/lib/mysql, so that the databases of this DB server will have their files stored on the RAID.

    I'd like to do the same sort of thing with websites and emails too. To that end, I've mounted another directory from the RAID to /var/www, so that the website files are stored on the RAID array, and I've mounted yet another directory to /var/vmail to do the same with emails.
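
    (For concreteness, the mounts on each processing node look roughly like the fstab lines below - the hostname "storage" and the export paths are placeholders, and it assumes the storage server simply exports those directories over NFS. On the storage server itself, /var/lib/mysql is just a bind mount from the RAID.)

        # /etc/fstab on a processing node ("storage" and the export paths are placeholders)
        storage:/raid/www      /var/www        nfs    defaults    0 0
        storage:/raid/vmail    /var/vmail      nfs    defaults    0 0

        # /etc/fstab on the storage server: keep MySQL's datadir on the RAID via a bind mount
        /raid/mysql            /var/lib/mysql  none   bind        0 0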

    The overall concept is that the storage server is a "data server" for other "processing" nodes in the LAN.

    An Internet request comes in and HAProxy load balances it between, currently, two nodes (although much of the point of the architecture I'm going for is to make it easy to slide in more processing nodes later - 6 nodes, 10 nodes, etc. - and have it all tick along just fine), and then these two nodes get their data - websites, vhost configuration, emails, databases, etc. - from the storage server.

    The "processing" nodes run the code, the "data server" provides the data.

    (Well, again, I say "data server" but the other idea in separating storage from processing is that it'll eventually become "data servers" - an array of storage servers or SAN - and that's kind of why things have to be done this way, as the storage required will eventually become too much for the local storage of the processing nodes - of any singular server - and I'm planning ahead with this slightly more complex arrangement.)

    I'm conscious, though, that ISPConfig is not really rigged up by default to operate this way. For example, when you create a new website, you're asked which web server to create it on, but as all the web servers and mail servers - the "processing" nodes - use the shared storage, then the website is effectively created on all of them (or none of them, depending on your perspective).

    The website files are stored in the shared /var/www directory and then when a web server needs to serve up a website, it goes to this shared storage to find the files to serve up.

    Of course, there's also the vhost configuration for Apache too. And what I've done is create a shared "etc" directory on the RAID - called "netc" for "net etc" - and then, on the local "etc", I've got symbolic links to the "netc" directory.

    I've already done this with the "hosts" file. On every local node, "/etc/hosts" links to "/netc/hosts", so they all see the exact same "hosts" file (and this, by the way, Till, was how I knew for sure that the IP addresses in the "hosts" file for every node in the LAN were correct and identical, as there is only one "hosts" file that the whole LAN shares).

    Also, as I've mounted a shared "/home" on every node, I also made the "/etc/skel" local directory link to the "/netc/skel" network directory. Mapping the "/home" directory has been handy, as I've only downloaded ISPConfig once into the admin home directory and then can access it from any node.
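
    (For anyone curious, the symlinking is nothing clever. Roughly this on each node, assuming the RAID's shared directory is mounted at "/netc" - the ".orig" backups are just a habit:)

        # replace the local copies with links into the shared "netc" directory
        mv /etc/hosts /etc/hosts.orig
        ln -s /netc/hosts /etc/hosts

        mv /etc/skel /etc/skel.orig
        ln -s /netc/skel /etc/skel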

    Hopefully, you can understand what I'm going for. Via the storage server, you can access the websites, emails, databases, home directories, etc. from any node. Any of the processing nodes can serve up any of the websites or emails. I'm decoupling processing from storage, so that we can add more processing nodes later, if needed, and it'll still work. Or if we need more storage, I can expand the storage array and add more storage servers to build up a LAN.

    It's modest to begin with, but we have a 42U cabinet and the intention is to eventually fill it. We already are running a modest little single server - I used Virtualmin for that - and, basically, our demand has justified going multi-server. And I want to plan ahead so that this setup can go all the way to a whole 42U cabinet full of servers before I need to revisit the architecture.

    That's the plan. The question is how to pull it off. Are there any other directories - like "/etc/apache2/sites-*" for the vhosts - that I also need to account for and move to the storage server? How do I get ISPConfig to play ball with this kind of setup?

    (Really, I could put it together manually, but it needs a public face, as we'll be having admin people adding clients and resellers (web designers) using the system to cater for their clients, and that's where ISPConfig comes into things. The nice and simple interface for everyone else to use, that hides all my horrible hacking behind the scenes. If it were just me dealing with this system, then I could do it the ugly way, but the general public needs to see "pretty".)

    Yeah, I know. I don't ask for much, do I? ;D
     
  2. till

    till Super Moderator Staff Member ISPConfig Developer

    Using shared storage is fine for the data, but you should not use it for the configuration, as the configuration is handled on each node individually. So sharing /var/www and /var/vmail is absolutely fine, but do not share other directories like /etc. ISPConfig takes care of writing the config files on the nodes; using shared storage for the config will simply mess things up.

    If you wanted to share e.g. /etc, then ISPConfig must not be installed on any of the slaves. Such a setup might be possible as well, but it's not an ISPConfig setup then. In that case, you would simply have a single-server ISPConfig installation, and this single server would store everything on a shared filesystem so that the other processing nodes can access it. The problem with such a setup is that your processing nodes don't know when service reloads are needed to apply config changes.
     
  3. Croydon

    Croydon ISPConfig Developer ISPConfig Developer

    I would highly recommend NOT using a setup like the one you planned. You will surely suffer from very poor MySQL performance. Database storage is not suited to being served over the network, due to the high latency.
     
  4. BobGeorge

    BobGeorge Member

    No, it's fine, I will not share "/etc" if it's a bad idea to do so.

    Really, the only reason I was contemplating it was because I want to ensure that any and every processing node can access any website or email account.

    But when I create a website in ISPConfig, there's a "server" field to choose which web server to create the website on. Except that, with my setup, I want it to create the website on the shared storage and be accessible to all the processing nodes.

    And I guess what I was worried about is that if I chose "node0" as the web server when creating the website, then only node0's vhost configuration would be updated, so node1 would not know about the new website?

    Is there a way to tell ISPConfig that this is a network setup, so that it understands that websites are not created on specific nodes, but are actually created on the storage server that all the web server nodes can access?

    (There's also the "ipv4 address" and "ipv6 address" fields on the website creation form. Again, this is not relevant here.)

    What I'm concerned with is that when you create a website, that this is reflected in the vhost configuration of all the nodes, so that any processing node can be chosen by HAProxy to serve up any of the websites on the storage server. And the same thing with emails too.

    It's that "server" and "IP address" fields when I create a website putting me off, I guess. They suggest that the website is only being created and configured on a single server, while I actually need it created on the storage server and the configuration accessible to all the nodes.
     
  5. till

    till Super Moderator Staff Member ISPConfig Developer

    What you are doing is a mirrored setup, and in a mirrored setup you cannot select any of the mirror nodes in the server field once you have configured the slaves as mirrors in ISPConfig. Try it.
     
  6. BobGeorge

    BobGeorge Member

    Till has warned me.

    The issue is that if, say, a client creates a WordPress site, then most of the website data is actually in the MySQL database, not in the /var/www website files (those only really contain the theme and the WordPress files themselves). So that all of the website data is stored together, I also want the databases to be on the storage server.

    This is partly because the storage server's RAID array is going to be backed up elsewhere over the network. It wouldn't be much of a backup, though, if I couldn't fully restore all the website data - website files AND databases - to recover a website.
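
    (Sketch of what I mean by backing up both halves together - hostnames and paths are placeholders, not the final script, and it assumes mysqldump can authenticate via ~/.my.cnf:)

        #!/bin/bash
        # Dump all databases on the storage/DB server, then push the dump plus the
        # shared website and mail data to a separate backup host over the network.
        BACKUP_HOST="backup.lan"          # placeholder
        DUMP_DIR="/raid/backups/mysql"

        mkdir -p "$DUMP_DIR"
        mysqldump --all-databases --single-transaction | gzip > "$DUMP_DIR/all-databases.sql.gz"

        rsync -a --delete /raid/www/   "$BACKUP_HOST:/backups/www/"
        rsync -a --delete /raid/vmail/ "$BACKUP_HOST:/backups/vmail/"
        rsync -a          "$DUMP_DIR/" "$BACKUP_HOST:/backups/mysql/"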

    The other aspect is that the processing nodes only have fast but small SAS drives, whereas the storage server has a large storage capacity (24 slots, or 12 hard drives' worth of data once you account for the mirroring). There will come a point when the local storage of the processing nodes just can't physically cope with it all. I'm planning ahead so that it doesn't matter from the off, as nothing of importance is permanently stored on the (perfectly expendable) processing nodes.

    Viewing the whole system as "a computer made of computers", the processing nodes are the CPUs, the storage server is the hard drive and the network is the bus.

    But, yes, MySQL is the headache here. Because, with the website files and emails, I am caching the NFS share locally using "cachefilesd". So, when a website file is accessed, it's cached locally and subsequent access goes to the local copy.

    The idea is a kind of "moving window". The processing nodes do not have the storage capacity for everything on the storage server, but when they access files on the storage server, it's cached locally. Thus, if there's, say, 2 or 3 websites that are very popular and accessed all the time, then the files for those will actually end up in the local cache (and, by "least recently used", stick around every time they're accessed). But if there's a cache miss then it goes to the network and does the full round-trip to grab the files from the storage server. Though these are then stored locally, so subsequent accesses will be local (unless the file changes on the storage server - "cachefilesd" checks that with faster checksums - or it gets bumped out of the local cache by "least recently used").

    Hopefully you can grasp the idea. Eventually, there will simply be more data on the storage server than any of the processing nodes could store locally. The storage server is dedicated to storage and has 24 slots (or 12 hard drives worth of data, once you account for mirroring), which the processing nodes simply can't match.

    This is why I'm going, from the off, for a system where the data is all on the storage server and the local storage of the processing nodes is just OS, applications, temporary storage and cache.
     
  7. BobGeorge

    BobGeorge Member

    Ah, right. Yes, of course. That's obviously how it should be, now that you say it.
     
  8. till

    till Super Moderator Staff Member ISPConfig Developer

    Regarding MySQL: why don't you use one dedicated large MySQL server for this setup instead, which the websites on all nodes connect to? I suggested that in the other thread already. A dedicated server which uses a local SSD to store the data will be several times faster than your shared-storage solution for MySQL.
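
    The idea is just this (the hostname, address and user below are placeholders; ISPConfig creates the real database users itself once you assign a website's database to that DB server):

        # on the dedicated MySQL server (e.g. "db1", 10.0.0.20): listen on the LAN interface
        # in my.cnf / mysqld.cnf:
        #   bind-address = 10.0.0.20

        # a database user that the web nodes (10.0.0.0/24 here) may connect with remotely
        mysql -e "CREATE USER 'webapp'@'10.0.0.%' IDENTIFIED BY 'changeme';"
        mysql -e "GRANT ALL PRIVILEGES ON webapp_db.* TO 'webapp'@'10.0.0.%';"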
     
  9. BobGeorge

    BobGeorge Member

    Basically, I'm still working on it. Not set up yet. :D
     
  10. BobGeorge

    BobGeorge Member

    Okay, the backend appears to be working.

    I created a client, created a website, created a database for the website, used the APS installer to install WordPress, created an FTP user, used FileZilla to access it and, yes, there were the WordPress files. Then deleted it all.

    (And I did directly look on the server itself to see that these files were appearing in the expected places, which they were.)

    Such a simple thing - forgetting to choose "mirror". That's all it was.

    Granted, I need to do a lot more testing than that. But the preliminary tests sailed through smoothly and without issue, so that's a good sign.

    Excellent. Thanks a bunch.

    (Now I only have to tackle HAProxy and Heartbeat on the frontend. Wish me luck.)
     
  11. BobGeorge

    BobGeorge Member

    I've got it all up and running now. Thanks.

    Just one minor issue, though.

    I've got two "frontend" nodes that run HAProxy to load balance the work between the processing nodes. I've also got Heartbeat running on them, so that if the first node goes down, the second node can take over.

    (By the way, the "howtoforge" tutorial on this is a bit out-of-date in that the syntax it gives for the HAProxy config file has been changed and HAProxy considered it invalid, refusing to start. But I was able to fathom out, from the HAProxy documentation on the 'Net, what I needed to change it to make it work.)

    I also thought that these "frontend" nodes would be the place to install the ISPConfig interface, as these are the "front end" in the sense that they are the outward-facing part of the system. They take in the Internet requests and distribute them to the relevant parts of the cluster. So, essentially, the "manager" nodes, as it were.

    But in testing HAProxy and Heartbeat - turning off nodes and confirming that, yes, the second "frontend" node takes over from the first and that, if one of the processing nodes is taken down, then HAProxy detects this with a health check and stops distributing work to it - I noticed that when the second "frontend" node takes over from the first, it is a "slave" interface as far as ISPConfig is concerned.

    That is, when I go to "Monitor" in ISPConfig, it only lists itself, not the other nodes in the network.

    I guess this makes some sense, in that when I set up ISPConfig, I made the first node the "master" and then, for the others, added them to the existing network. And, in terms of operation, I guess it makes little difference, in the sense that websites and emails will continue to be served.

    I tried making the second frontend node a mirror of the first. But this doesn't make it into a "co-master".

    Is it even possible to have "co-master" interfaces with ISPConfig?

    I don't suppose it matters, though, in that if I see that it's the second node in control, that means the first one has gone down and I need to be fixing that to restore everything back to normal. So running from the second node is never meant to be a permanent situation anyway.

    But I thought, if it is possible to have "co-master" nodes like that, then I'd do that. But, if not, then I can live with it.
     
    Last edited: Jul 29, 2017
  12. till

    till Super Moderator Staff Member ISPConfig Developer

    Yes, you may run as many ispconfig interfaces as you like. Just ensure that they connect to the master ispconfig DB.
     
