Hello dear ISPConfig developer team. Recently, after updating ISPConfig from a slightly older version to 3.0.4.2, one of my clients changed a site's quota to 0 and instantly lost ALL files in that site without any possibility of recovery. A huge community site was lost, along with many other things. Years of work. The project's management team might be responsible for not making decent backups, and I am not trying to be an a**hole here, but seriously? Is what happened here even possible, and are there any chances of getting anything back? I know how Linux file recovery works, and getting files back without file names is not much of a solution. This is bad beyond belief...
Setting the quota to 0 cannot delete any files, so something else must have happened on your server, maybe a problem with the hard disk, the filesystem, etc. Feel free to install an ISPConfig server and set the quota to any value you like if you don't believe me; it will not delete any files. Besides, there are tens of thousands of servers running ISPConfig, and we have received no report of data loss on a quota change. Changing the quota does nothing other than set a new quota limit in the Linux quota database of the partition; it neither moves, copies, nor deletes files.
Yes, I know it should not do that, but it did. Nothing besides changing the quota was done. That is why I am sharing my frustration in this forum. It would suck if it happened to anyone else, and I do understand that if the files were deleted, there is not much chance of recovery. Right after changing the quota, the user directory looked like it had just been created. All I am trying to understand here is why. edit: there is a possibility that my client is lying, but the thing is, he is not completely stupid. I don't think he deleted those files. If there is even a possibility of a bug doing that, it should be raided with tanks.
Did you set the quota yourself, or did someone tell you that he just changed the quota? I have to ask that before we check this in more detail, as I have heard more than once that clients said they did one thing because they were ashamed that they had done something else which caused a problem. I have no idea what might have caused that, but I am sure that it cannot be caused by the quota change. I will try to reproduce it, but I am sure that it cannot be reproduced by changing the quota. As I explained above, there are many installs in production, about 15 to 20 thousand new installs per month, so if ISPConfig deleted files on a quota change, you would find a few thousand posts about that here in the forum. Have you checked all subfolders in /var/www to see whether the site is somewhere there, especially all subfolders in the client's directories? Maybe someone changed the path of the site so that the domain symlink points to a new empty directory while the old site is still there.
As I wrote, I did not change it myself. Rather, it was done by a client of mine whom I know personally and who is reasonably intelligent. Can any quota change trigger this at all? For example, changing the quota to -1? There can be millions of active installs not suffering from this, but in reality no two deployments are configured identically.
A quota change cannot delete files. If you, as root, change the quota of a Linux user to a value that is lower than the space the existing files already use, nothing will happen except that repquota reports the user as over quota and the user will not be able to upload new files. There is no function in the Linux quota system or in ISPConfig that removes files when a user is over quota. Please check the server, e.g. with a full file search, to see whether the site is in a different folder where you might not expect it. An empty site does not necessarily mean that the site has been deleted. It can also mean that the site has been recreated in a new location because the site path was changed in a way that ISPConfig could not recognize, e.g. on the shell. In that case the quota change triggered an update of the site, the scripts did not find the site in the directory that is set as the site path, and they created a new empty site so that apache won't fail, while the original site still exists in another path. Did your client tell you why he changed the quota? In most cases users change the quota when they receive some kind of error, because they think that the quota change will fix it. If there was such an error, it might indicate what happened to the site.
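The full file search suggested above can be sketched in Python. This is only an illustration, not part of ISPConfig: the function names, the /var/www search root and the marker file name are assumptions chosen for the example. It lists every directory that contains a file known to belong to the lost site, and shows where a domain symlink really points.

```python
import os

def find_dirs_containing(root, marker_names):
    """Return all directories under root that contain at least one of the
    given file names (e.g. a script known to belong to the lost site)."""
    hits = []
    # followlinks=False keeps the walk from descending into symlinked dirs
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        if any(name in filenames for name in marker_names):
            hits.append(dirpath)
    return sorted(hits)

def link_target(path):
    """Resolve a (possibly symlinked) site path to the real directory."""
    return os.path.realpath(path)
```

For example, `find_dirs_containing("/var/www", ["index.php"])` would list candidate directories, and `link_target("/var/www/test.tld")` would show which directory the domain symlink currently resolves to. Both paths are placeholders taken from this thread.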
Thank you. Unfortunately, the files are nowhere to be found. I assume they are gone. It's either diabolical interference or the client actually deleted his whole site. A sad day for me in either case. I will keep you up to date on what really happened.
So we tested and replicated the situation. It actually happens when traffic (not disk) quota is changed to -1. Is this normal behavior? It is 100% tested and replicated.
I will check that and try to reproduce it here. Changing the traffic quota does nothing directly after the change, as only the value in the database gets changed. All actions related to this field happen at 0:30 AM; if the traffic quota is exceeded, the symlink to the vhost file in /etc/apache2/sites-enabled gets removed.
I've just run some tests and am not able to reproduce that. Please see the test steps below:

- Installed ISPConfig 3.0.4.2 on Debian 6
- Created a new website with domain "test.tld", left all as default (traffic quota -1 is the default in ISPConfig).

Then I created a file test.html so that I am able to verify whether the site gets deleted:

    cd /var/www/test.tld/web
    touch test.html
    chown web2:client0 test.html
    ls -la /var/www/test.tld/web

    total 36
    drwx--x--- 4 web2 client0 4096 Jan 11 11:19 .
    drwxr-x--x 6 web2 client0 4096 Jan 11 11:18 ..
    drwxr-xr-x 2 web2 client0 4096 Jan 11 11:18 error
    -rwxr-xr-- 1 web2 client0 7358 Jan 11 11:18 favicon.ico
    -rwxr-xr-- 1 web2 client0 26 Jan 11 11:18 .htaccess
    -rwxr-xr-- 1 web2 client0 1861 Jan 11 11:18 index.html
    -rwxr-xr-- 1 web2 client0 24 Jan 11 11:18 robots.txt
    drwxr-xr-x 2 root root 4096 Jan 11 11:18 stats
    -rw-r--r-- 1 web2 client0 0 Jan 11 11:19 test.html

Then I changed the traffic quota to a different value, I used 1000, and checked the site again:

    ls -la /var/www/test.tld/web

    total 36
    drwx--x--- 4 web2 client0 4096 Jan 11 11:19 .
    drwxr-x--x 6 web2 client0 4096 Jan 11 11:18 ..
    drwxr-xr-x 2 web2 client0 4096 Jan 11 11:18 error
    -rwxr-xr-- 1 web2 client0 7358 Jan 11 11:18 favicon.ico
    -rwxr-xr-- 1 web2 client0 26 Jan 11 11:18 .htaccess
    -rwxr-xr-- 1 web2 client0 1861 Jan 11 11:18 index.html
    -rwxr-xr-- 1 web2 client0 24 Jan 11 11:18 robots.txt
    drwxr-xr-x 2 web2 client0 4096 Jan 11 11:18 stats
    -rw-r--r-- 1 web2 client0 0 Jan 11 11:19 test.html

Then I changed the traffic limit to -1, waited a few minutes and tested again:

    ls -la /var/www/test.tld/web

    total 36
    drwx--x--- 4 web2 client0 4096 Jan 11 11:19 .
    drwxr-x--x 6 web2 client0 4096 Jan 11 11:18 ..
    drwxr-xr-x 2 web2 client0 4096 Jan 11 11:18 error
    -rwxr-xr-- 1 web2 client0 7358 Jan 11 11:18 favicon.ico
    -rwxr-xr-- 1 web2 client0 26 Jan 11 11:18 .htaccess
    -rwxr-xr-- 1 web2 client0 1861 Jan 11 11:18 index.html
    -rwxr-xr-- 1 web2 client0 24 Jan 11 11:18 robots.txt
    drwxr-xr-x 2 web2 client0 4096 Jan 11 11:18 stats
    -rw-r--r-- 1 web2 client0 0 Jan 11 11:19 test.html

The site is still there and the test file has not been deleted. Please provide me with detailed steps on how to reproduce that problem.
And please send me the debug output of the operation that removes a site on your server: http://www.faqforge.com/linux/debugging-ispconfig-3-server-actions-in-case-of-a-failure/
Please watch this video. It shows exactly what happens. And note that this was on an updated ISPConfig installation. I will post the log soon. edit: it looks like it only happens to already existing vhosts which have a traffic quota other than -1. Newly created vhosts have the traffic quota set to -1 by default, and changing it to some value and then back to -1 does not cause this. This is the output of server.sh when the files get deleted:
It seems that your server uses the apache and nginx plugins at the same time, while only one of these two should be enabled at a time. Additionally, the client which owns the website was changed, which caused a change of the path. The path change in conjunction with the two enabled server plugins caused the problem, so the problem is not directly related to the traffic quota. Please post the output of:

    netstat -tap

and

    ls -la /usr/local/ispconfig/server/plugins-enabled/
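The same check of the plugins-enabled directory can also be scripted. The following is a minimal sketch, not ISPConfig code: the function name is hypothetical, and the two plugin file names are an assumption about how the apache and nginx plugins are named in that directory, so compare them with the actual `ls` output first.

```python
import os

# Assumed file names of the two web server plugins; verify against the
# real directory listing before relying on them.
CONFLICTING_PLUGINS = ("apache2_plugin.inc.php", "nginx_plugin.inc.php")

def conflicting_plugins_enabled(plugin_dir, conflicting=CONFLICTING_PLUGINS):
    """Return the conflicting plugin names found in plugin_dir when more
    than one of them is present, otherwise an empty list."""
    try:
        present = set(os.listdir(plugin_dir))  # includes symlinks
    except FileNotFoundError:
        return []
    found = sorted(name for name in conflicting if name in present)
    return found if len(found) > 1 else []
```

Calling `conflicting_plugins_enabled("/usr/local/ispconfig/server/plugins-enabled")` returns an empty list when at most one of the two plugins is enabled, and both names when they are enabled together.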
OK, so the remaining question is how this plugin got activated. Did you or your client activate it, or did the ISPConfig updater activate it? The ISPConfig updater and installer contain code that prevents both plugins from being activated at the same time, even when apache and nginx are both installed. I tested this here with Debian by installing an nginx server beside the existing apache server, and the updater enabled only the apache plugin and skipped the nginx plugin. So only one of the plugins was enabled at a time, never both together. I will redo the test with CentOS to see if that makes any difference.
The plugin was not activated by me or the client. I have no idea what brought it up. For me the real question is why such a situation causes such terrible damage.
I tested it on CentOS and I am not able to trick the ISPConfig 3.0.4.x updater into enabling both plugins, so I cannot reproduce this on current versions. The only explanation I can think of is that someone tried to downgrade ISPConfig on the server, so that the updater from an ISPConfig version < 3.0.4 (e.g. 3.0.3.3) was run again on a server that had already been updated to ISPConfig > 3.0.4. In your case, several things came together in a combination that is not common for an ISPConfig setup, plus there must have been some kind of ISPConfig downgrade attempt, which explains why no other users have had this problem yet. First, nginx and apache are installed on the server at the same time. This is a combination of services that ISPConfig does not normally use and that is not covered by an ISPConfig perfect setup guide; ISPConfig uses either nginx or apache, but not both. Nevertheless, we took care of this situation in the installer and updater of ISPConfig > 3.0.4 so that both plugins are not activated at the same time. The problem itself is caused by the fact that the client of the website was changed while the nginx and apache plugins were both enabled. Both plugins then tried to do the same thing one after the other: they moved the website from the old path to the new path on the hard disk, and that failed for the second plugin because the site had already been moved to the new location. I will add a check to ISPConfig that ensures at runtime, and not just in the installer and updater, that both plugins are not enabled together.
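To illustrate the failure mode described above, here is a minimal Python sketch of a move that is safe to run twice. It is not the ISPConfig implementation and not the fix that was actually made: the function name and return values are hypothetical. The point is that a second handler finding the source already gone should treat the move as done, and that a non-empty destination is never overwritten.

```python
import os
import shutil

def move_site_once(old_path, new_path):
    """Move a site directory from old_path to new_path.

    Returns "moved" when the move was performed and "already-moved" when
    the source is gone but the destination exists (for example because
    another handler ran first). Refuses to touch a non-empty destination.
    """
    if not os.path.exists(old_path):
        if os.path.isdir(new_path):
            return "already-moved"
        raise FileNotFoundError(f"neither {old_path} nor {new_path} exists")
    if os.path.exists(new_path):
        if not os.path.isdir(new_path) or os.listdir(new_path):
            raise FileExistsError(f"{new_path} exists and is not an empty directory")
        os.rmdir(new_path)  # remove the empty placeholder so the move is a rename
    os.makedirs(os.path.dirname(os.path.abspath(new_path)), exist_ok=True)
    shutil.move(old_path, new_path)
    return "moved"
```

With this guard, two handlers calling `move_site_once` with the same arguments one after the other leave the site intact in the new location: the first call moves it, the second only reports that it has already been moved.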