Greetings,
I imagine that you guys have plans to improve this subsystem of the
package, but I have been having problems with it for months now and have
seen nothing in the changelog regarding chunk allocator improvements.
Perhaps you could use some profile data from a MUSH that is experiencing
reproducible issues? I admin on such a game.
The first problem we were experiencing is jittery CPU usage, varying
(according to top) at intervals of less than 1 second between never less
than 15% and often as much as 80%, but averaging 30%-60%. On occasion, it would
jump up to >90% and subside a few minutes later, but during this time
players cannot send or receive data with the game. The cause of this was a
mystery for a while, until the server admin became concerned about resource
use and I started an investigation. I found that the cause was my setting
the chunk_cache_memory option to 2000000000 as indicated by the comments
for that option in mush.cnf to effectively 'incapacitate' the chunk
allocator. I did this originally to save myself from the headache of other
chunk allocator bugs that I've been introduced to in the past.
As a remedy, I changed the chunk_cache_memory option back to the default,
1000000, and the CPU freakage problem was immediately solved -- now the
game averages at most 0.9% according to top, as I expect it should.
Granted some would say this is not the most reliable way to examine
performance, but I hope that anyone would agree that a measure of 0.9% is
better than 60%.
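For reference, the change described above would look roughly like this in
mush.cnf. This is only a sketch: the one-directive-per-line "name value"
layout and the "#" comment syntax are assumed from the usual style of that
file, and the two values are the ones quoted in this message.

```
# Value that produced the jittery CPU usage:
# chunk_cache_memory 2000000000

# Default value, which brought CPU usage back down:
chunk_cache_memory 1000000
```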
However, where the game's CPU usage would jump up and down and occasionally
spike when the chunk_cache_memory option was set to a really high value,
now it just spikes on occasion (every 5 to 7 days) and never comes back
down! When this happens, data cannot be sent or received from the game and
the only way to recover is to kill it and start over. Players have waited
as long as 10 hours for a response from the server before someone with
shell access could kill and restart it. This is a separate mystery from the
first, because I can't actually reproduce it, and now I feel compelled to
predict when it's going to happen. So far I've noted that it mostly
happens during the weekend (or as late as Monday). I wonder whether it has
something to do with player activity, but I can't be sure. The average
number of player connections for a weekend evening is 25-35, I suppose.
Aside from providing you with some actual information that may help to
fix problems related to the chunk allocator, is there some sort of tool
I can use to monitor response times for commands sent to a
MUSH -- one that would support sending me a notification if the response
time is over a certain limit and/or kill and restart the game? This would
be different from a network monitor tool because there is nothing wrong
with the host or its connection -- the problem is in the game process.
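To make the request concrete, here is a minimal sketch in Python of the
kind of monitor I have in mind. It is not an existing tool. The names
measure_response_time and check_once are hypothetical, and the command to
send and the marker text to wait for (the "expect" argument) are
assumptions that would have to be chosen to match whatever the game
actually prints. Because a MUSH normally sends a welcome screen on
connect, the probe waits for the marker rather than for the first bytes.

```python
import socket
import time


def measure_response_time(host, port, command, expect, timeout=10.0):
    """Send one command and time how long the reply takes.

    Returns the seconds elapsed between sending `command` and seeing the
    byte string `expect` in the data received, or None if the connection
    fails, the server closes it, or `timeout` seconds pass first.
    """
    deadline = time.monotonic() + timeout
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            start = time.monotonic()
            sock.sendall(command + b"\r\n")
            received = b""
            while expect not in received:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    return None  # game accepted the connection but is hung
                sock.settimeout(remaining)
                chunk = sock.recv(4096)
                if not chunk:
                    return None  # server closed the connection
                received += chunk
            return time.monotonic() - start
    except OSError:
        # Covers refused connections and socket timeouts alike.
        return None


def check_once(probe, limit_seconds, on_slow):
    """Run one probe and report a slow or missing reply.

    `probe` is a callable returning elapsed seconds or None.
    `on_slow` is called with that value when the reply is missing or
    slower than `limit_seconds`. Returns True if the game looks healthy.
    """
    elapsed = probe()
    if elapsed is None or elapsed > limit_seconds:
        on_slow(elapsed)
        return False
    return True
```

Run from cron every few minutes, check_once could be given a probe that
calls measure_response_time against the game's port, and an on_slow
callback that sends mail or kills and restarts the game process. The
timeout matters here: a hung process can still have connections accepted
by the kernel, so the probe has to give up waiting on its own.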
Sholevi (mrdyg@yahoo.com)