Heya all!

For a customer I have installed MySQL cluster in a three node setup. One management node and 2 data/sql nodes. The OS is CentOS 6.0. This is the first Linux experience I have with MySQL cluster, before this I only did configurations like these on FreeBSD.

For whatever reason the load on the SQL/data nodes is quite high, even when it should be idle. Can you guys help me figure out why the load is this high?

The SQL/Data nodes config:

Code:
[MYSQLD]
server-id=6
ndbcluster
ndb-connectstring=10.0.0.11 # IP management server

[MYSQL_CLUSTER]
ndb-connectstring=10.0.0.11 # location of management server

[NDBD]
bind-address=10.0.0.1

The config of the management node:

Code:
# Options affecting ndbd processes on all data nodes:
[NDBD DEFAULT]
NoOfReplicas=2 # Number of replicas
LockPagesInMainMemory=1
ODirect=1
DataMemory=48000M # How much memory to allocate for data storage
IndexMemory=12000M # How much memory to allocate for index storage
# For DataMemory and IndexMemory, we have used the
# default values. Since the "world" database takes up
# only about 500KB, this should be more than enough for
# this example Cluster setup.
TimeBetweenLocalCheckpoints=6
NoOfFragmentLogFiles=128
MaxNoOfOrderedIndexes=1024
MaxNoOfTables=512
MaxNoOfAttributes=5000
MaxNoOfConcurrentOperations=400000
datadir=/var/db/mysql

# TCP/IP options:
[TCP DEFAULT]
#portnumber=2202
SendBufferMemory=2M
ReceiveBufferMemory=2M

[ndbd]
HostName=10.0.0.1
NodeId=2

[ndbd]
HostName=10.0.0.2
NodeId=3

[NDB_MGMD DEFAULT]
DataDir=/usr/mysql-cluster

# Management process options:
#[NDB_MGMD]
#HostName=10.0.0.10
#NodeId=1
#ArbitrationRank=1

# Management process options:
[NDB_MGMD]
HostName=10.0.0.11
NodeId=1
ArbitrationRank=1

# SQL node options:
[MYSQLD]
NodeId=4
hostname=10.0.0.10

[MYSQLD]
NodeId=5
hostname=10.0.0.11

[MYSQLD]
NodeId=6
hostname=10.0.0.1

[MYSQLD]
NodeId=7
hostname=10.0.0.2

Output of 'top' on one of the data nodes:

Code:
top - 15:21:55 up 11 days, 59 min, 1 user, load average: 2.10, 1.85, 1.84
Tasks: 115 total, 1 running, 114 sleeping, 0 stopped, 0 zombie
Cpu(s): 2.9%us, 0.2%sy, 0.0%ni, 81.2%id, 15.8%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 65334968k total, 33668620k used, 31666348k free, 177080k buffers
Swap: 4094968k total, 0k used, 4094968k free, 677680k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
31477 root 20 0 31.9g 29g 7360 S 12.3 48.1 726:06.24 ndbmtd
25920 root 20 0 201m 5332 1896 S 0.3 0.0 2:06.42 snmpd
1 root 20 0 19116 1444 1188 S 0.0 0.0 0:01.11 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
3 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
4 root 20 0 0 0 0 S 0.0 0.0 0:05.95 ksoftirqd/0
5 root RT 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/0
6 root RT 0 0 0 0 S 0.0 0.0 0:00.01 migration/1
7 root 20 0 0 0 0 S 0.0 0.0 0:02.99 ksoftirqd/1
8 root RT 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/1
9 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/2
10 root 20 0 0 0 0 S 0.0 0.0 1:35.34 ksoftirqd/2
11 root RT 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/2
12 root RT 0 0 0 0 S 0.0 0.0 0:00.02 migration/3
13 root 20 0 0 0 0 S 0.0 0.0 0:03.15 ksoftirqd/3
14 root RT 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/3
15 root 20 0 0 0 0 S 0.0 0.0 0:00.51 events/0
16 root 20 0 0 0 0 S 0.0 0.0 0:01.44 events/1
17 root 20 0 0 0 0 S 0.0 0.0 0:06.41 events/2
18 root 20 0 0 0 0 S 0.0 0.0 0:00.74 events/3
19 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuset
20 root 20 0 0 0 0 S 0.0 0.0 0:00.00 khelper
21 root 20 0 0 0 0 S 0.0 0.0 0:00.00 netns
22 root 20 0 0 0 0 S 0.0 0.0 0:00.00 async/mgr
23 root 20 0 0 0 0 S 0.0 0.0 0:00.00 pm
24 root 20 0 0 0 0 S 0.0 0.0 0:00.02 sync_supers
25 root 20 0 0 0 0 S 0.0 0.0 0:00.06 bdi-default
26 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kintegrityd/0
27 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kintegrityd/1
28 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kintegrityd/2
29 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kintegrityd/3
30 root 20 0 0 0 0 S 0.0 0.0 0:03.29 kblockd/0
31 root 20 0 0 0 0 S 0.0 0.0 0:02.76 kblockd/1
32 root 20 0 0 0 0 S 0.0 0.0 0:04.01 kblockd/2
33 root 20 0 0 0 0 S 0.0 0.0 0:02.86 kblockd/3
34 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kacpid
35 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kacpi_notify
36 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kacpi_hotplug
37 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ata/0
38 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ata/1
39 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ata/2
40 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ata/3
41 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ata_aux
42 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ksuspend_usbd
43 root 20 0 0 0 0 S 0.0 0.0 0:00.00 khubd
44 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kseriod
49 root 20 0 0 0 0 S 0.0 0.0 0:00.00 khungtaskd
50 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kswapd0
51 root 25 5 0 0 0 S 0.0 0.0 0:00.00 ksmd
52 root 39 19 0 0 0 S 0.0 0.0 0:00.22 khugepaged
53 root 20 0 0 0 0 S 0.0 0.0 0:00.00 aio/0
54 root 20 0 0 0 0 S 0.0 0.0 0:00.00 aio/1

Snippet of a strace on the ndbmtd process:

Code:
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {{EPOLLIN, {u32=7, u64=7}}}, 6, 1) = 1
recvfrom(30, "$\t\0\24\273\2 \0\242\17\1\1\377\377\377\377>\2\5\0\7\0\242\17\1\0\0\0\23\0\0\0"..., 2097152, 0, NULL, NULL) = 36
futex(0xaf5e18, FUTEX_WAKE, 1) = 1
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {{EPOLLIN, {u32=6, u64=6}}}, 6, 1) = 1
recvfrom(29, "$\t\0\24\273\2 \0\242\17\1\1\377\377\377\377>\2\5\0\6\0\242\17\1\0\0\0\23\0\0\0"..., 2097152, 0, NULL, NULL) = 36
futex(0xaf5e18, FUTEX_WAKE, 1) = 1
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {{EPOLLIN, {u32=5, u64=5}}}, 6, 1) = 1
recvfrom(31, "$\t\0\24\273\2 \0\242\17\1\1\377\377\377\377>\2\5\0\5\0\242\17\1\0\0\0\23\0\0\0"..., 2097152, 0, NULL, NULL) = 36
futex(0xaf5e18, FUTEX_WAKE, 1) = 1
epoll_wait(3, {{EPOLLIN, {u32=3, u64=3}}}, 6, 1) = 1
recvfrom(14, "$\t\0\24\273\2 \0\1\1\1\1{@7\6>\2\5\0\3\0\1\1\1\0\0\0\23\0\0\0"..., 2097152, 0, NULL, NULL) = 36
futex(0xaf5e18, FUTEX_WAKE, 1) = 1
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0

Any help is appreciated!
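For anyone reading a longer trace than the snippet above, here is a rough Python sketch that tallies strace lines by syscall and counts how many `epoll_wait` calls returned 0 (timed out with no events). The `SAMPLE` text is a handful of lines copied from the snippet, and `summarize_strace` is a made-up helper name. It only summarizes what the trace shows (the process waking up on a 1 ms timeout far more often than it receives data) and does not by itself say why the load is high.

```python
import re
from collections import Counter

# A few lines copied from the strace snippet in the post (illustrative only).
SAMPLE = """\
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {}, 6, 1) = 0
epoll_wait(3, {{EPOLLIN, {u32=7, u64=7}}}, 6, 1) = 1
futex(0xaf5e18, FUTEX_WAKE, 1) = 1
epoll_wait(3, {}, 6, 1) = 0
"""

# Matches "name(args...) = N" and captures the syscall name and return value.
LINE_RE = re.compile(r"^(\w+)\(.*\)\s*=\s*(-?\d+)")


def summarize_strace(text):
    """Return (calls per syscall, number of epoll_wait calls that returned 0)."""
    calls = Counter()
    empty_polls = 0
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that is not a completed syscall line
        name, ret = match.group(1), int(match.group(2))
        calls[name] += 1
        if name == "epoll_wait" and ret == 0:
            empty_polls += 1
    return calls, empty_polls


calls, empty_polls = summarize_strace(SAMPLE)
print(dict(calls))
print("epoll_wait calls with no events:", empty_polls)
```

To use it on a real capture, read the saved strace output from a file and pass the text to `summarize_strace` instead of `SAMPLE`.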
This issue has been resolved. It turns out that with 4 cores the multithreaded data node (ndbmtd) is not quite stable. I switched back to the single-threaded ndbd and all is good.
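If you want to compare a node before and after a switch like this, a small Python sketch along these lines pulls the load averages and the CPU breakdown out of a `top` header. The header text is copied from the output in the first post, `parse_top_header` is a made-up helper name, and the core count of 4 is taken from the post above. Keep in mind that on Linux the load average also counts tasks blocked on disk I/O, so a load near 2 with 81% idle CPU and about 16% iowait may point at I/O wait more than at CPU use.

```python
import re

# Header lines copied from the 'top' output in the original post.
TOP_HEADER = """\
top - 15:21:55 up 11 days, 59 min, 1 user, load average: 2.10, 1.85, 1.84
Cpu(s): 2.9%us, 0.2%sy, 0.0%ni, 81.2%id, 15.8%wa, 0.0%hi, 0.0%si, 0.0%st
"""

CORES = 4  # assumption: the 4 cores mentioned in the follow-up post


def parse_top_header(text):
    """Extract the three load averages and the Cpu(s) percentages."""
    load = re.search(r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", text)
    if load is None:
        raise ValueError("no 'load average' line found")
    # Each Cpu(s) field looks like '15.8%wa'; map the label to the number.
    cpu = {label: float(value) for value, label in re.findall(r"([\d.]+)%(\w+)", text)}
    return {"load": tuple(float(x) for x in load.groups()), "cpu": cpu}


info = parse_top_header(TOP_HEADER)
per_core = info["load"][0] / CORES
print("1-minute load per core:", round(per_core, 3))
print("iowait %:", info["cpu"]["wa"], "idle %:", info["cpu"]["id"])
```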