Specifications:
Linux distro: CentOS 7
CPU: 2 CPUs, 6 cores per CPU = 12 cores total
RAM: 32 GB
Samba: version 4.6.2
Users: 10 connected to the SMB share at any one time

Hi Gurus,

I have a very strange situation and I am currently at a loss as to where to go or how to troubleshoot it. Basically, my CentOS server seems to run really slowly, but only on Wednesdays.

The only applications I'm running on it are:
Samba - file sharing
CrashPlan - online backups
VNC Server

On Wednesdays, users connected to the Samba share sometimes get disconnected from it, but more often access to the files is just really slow.

I noticed that on Wednesdays, when I get a call from an end user saying it's slow, the load average fluctuates and can go up to 2.40. But seeing that I have 12 cores, and that this is only about 20% of the total number of cores, I can't see why this would be a factor. Am I wrong?

On every other day the load average is between 0.5 and 1.1 and I get no reports of it being slow.

The online backup (CrashPlan) doesn't run during the day, so it cannot be causing the issue. There are no cron jobs and no automated scripts running on the server, as far as I can tell.

The only thing I can think of is that the Samba service is slowing the server down on Wednesdays for some reason. How can I find out which process is using up the CPU, and why does it spike only after midday on Wednesdays?

Many thanks in advance.

Kind Regards
GMSS
Hi Experts,

Any advice on this conundrum? It doesn't make sense that the server would slow down when the load average is only 2.5 and there are 12 cores (2 CPUs x 6 cores).

Kind Regards
GMSS
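On the load-average question: dividing the load by the core count, as done above, is a reasonable first check for CPU saturation. One caveat is that on Linux the load average counts tasks in uninterruptible sleep (usually waiting on disk I/O) as well as runnable tasks, so a low figure per core does not rule out a storage bottleneck. The snippet below is only a minimal sketch of that arithmetic; `load_per_core` is a hypothetical helper name, not part of any tool mentioned in this thread.

```python
import os


def load_per_core(loadavg_text, cores):
    """Return the 1-, 5- and 15-minute load averages divided by the core count."""
    # /proc/loadavg starts with three floats: the 1, 5 and 15 minute averages.
    one, five, fifteen = (float(x) for x in loadavg_text.split()[:3])
    return tuple(round(v / cores, 3) for v in (one, five, fifteen))


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    try:
        with open("/proc/loadavg") as f:
            print(load_per_core(f.read(), cores))
    except OSError:
        print("/proc/loadavg is not available on this system")
```

A value well below 1.0 per core suggests the CPUs are not the bottleneck, which is a hint to look at disk or network activity next.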
Install monitoring software such as Munin on your server to analyze which resources are used over time.
Hi Till,

Many thanks for your reply. I will attempt to get this done. I was hoping you could guide me in analysing my server live, to see why the SMB process runs slowly for connected users when the 1-minute load average is only 2.4 on a machine with 12 cores.

Kind Regards
GMSS
I don't offer remote login support, but you might want to contact Florian and ask him if he can take a look at your server directly: http://www.ispconfig.org/get-support/?type=ispconfig
Hi Till,

Thanks for this, that's good to know. Sorry for the confusion: what I meant by "guide me" and "live" was whether you could let me know which commands or steps (apart from installing Munin) I could use to troubleshoot this conundrum myself.

Kind Regards
GMSS
Munin draws graphs of many relevant system parameters, such as load and I/O usage. Let it run for a few days, then check the graphs to see whether any of them (e.g. the I/O graph) show spikes at the times when you experience the slowdown.
Hi Till,

OK, I will attempt to do so later in the week. Today is the day when users connecting to the SMB share tell me it is running slowly, so in the interim I will monitor "top" and see what is eating the CPU.

Kind Regards
GMSS
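When watching top during a slowdown, the "wa" (I/O wait) figure in the CPU summary is worth a look alongside the per-process CPU column, because a system can feel slow while the CPUs are mostly idle and waiting on disk. As a rough sketch of the same measurement, the code below compares two samples of the aggregate "cpu" line from /proc/stat; the function names and the one-second interval are illustrative assumptions, not something used in this thread.

```python
import time


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into integer counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Order: user, nice, system, idle, iowait, irq, softirq, steal
    return [int(v) for v in fields[1:9]]


def iowait_percent(before_line, after_line):
    """Percentage of CPU time spent in iowait between two samples."""
    before = parse_cpu_line(before_line)
    after = parse_cpu_line(after_line)
    deltas = [a - b for a, b in zip(after, before)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return round(100.0 * deltas[4] / total, 1)  # index 4 is iowait


def read_cpu_line(path="/proc/stat"):
    with open(path) as f:
        return f.readline()


if __name__ == "__main__":
    try:
        first = read_cpu_line()
        time.sleep(1)  # sampling interval, chosen arbitrarily
        second = read_cpu_line()
        print("iowait %:", iowait_percent(first, second))
    except OSError:
        print("/proc/stat is not available on this system")
```

A consistently high iowait share during the slow period would point towards the disks or the RAID controller rather than a CPU-hungry process.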
Hi Till,

We've now managed to find out what's causing our server to slow down on Wednesdays. Here is what we did.

We typed the following at the command prompt:

systemctl list-timers

and looked at what was returned (the output was not preserved here).

It looks like there is a process set up to clean up the tmp files, which starts at 8.50pm on Tuesday and doesn't finish until about 24 hours later, at 8.50pm on Wednesday. So this is what is slowing down the server and making things slow for the end users.

I've now stopped this process by running the command:

systemctl stop systemd-tmpfiles-clean.timer

The load is looking much better and users are much happier with the performance.

Obviously this schedule is important to run on the server, but can you tell me how I can reschedule it so that it runs on a Friday night instead?

Kind Regards
GMSS
Hi Till,

Just to inform you and everyone else: my previous post was a red herring. The tmp file clean-up is NOT what is causing our server to slow down on Wednesdays.

I've now found out what the real cause is. It relates to "Patrol Read Mode" on a Dell server. By default, "Patrol Read Mode" on our Dell R510 is set to "Automatic", and it runs between 12pm and 10pm on Wednesdays. The process recurs 7 days later and is definitely the reason why the server crawls only on Wednesdays.

Here is a link that explains what this process does: http://www.dell.com/support/article/us/en/19/sln292303/dell-perc-controller-disk-patrol-read?lang=en

I found this out by searching the logs (/var/log/messages) and running the command:

more messages | grep "Server_Administrator" | more

It showed me that "Patrol Read" starts running at 12pm and ends at 10pm. At that point I knew I'd nailed this conundrum.

I've now changed Patrol Read to run manually instead of automatically. To do this you will need to install Dell OMSA, which has an option for it (the screenshot of the option was not preserved here).

This took me a long time to figure out, and I hope it helps someone out there who may be having the same issue.

Kind Regards
GMSS
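For anyone wanting to repeat the log search, here is a minimal Python sketch of the same filtering step as the grep pipeline above. It assumes, based on the description in the post, that the relevant entries contain both "Server_Administrator" and the phrase "Patrol Read"; the exact message format is not shown in this thread, so treat those keywords as assumptions. `find_lines` is a hypothetical helper name.

```python
def find_lines(log_text, *keywords):
    """Return the lines that contain every keyword (case-insensitive)."""
    wanted = [k.lower() for k in keywords]
    return [
        line
        for line in log_text.splitlines()
        if all(k in line.lower() for k in wanted)
    ]


if __name__ == "__main__":
    # Path taken from the post; reading it usually requires root.
    try:
        with open("/var/log/messages", errors="replace") as f:
            text = f.read()
    except OSError as exc:
        print("could not read /var/log/messages:", exc)
    else:
        # Keywords are assumptions based on the post, not a known log format.
        for line in find_lines(text, "Server_Administrator", "Patrol Read"):
            print(line)
```

Comparing the timestamps of the matching lines with the times users report slowness is what ties the two together, as described in the post.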