2006-06-21 Lon Hohberger
	* src/daemons/nodeevent.c: Don't use the rg thread refcount in
	node event handling (#194491)

2006-06-16 Lon Hohberger
	* src/daemons/fo_domain.c, groups.c: Get rid of compiler warnings
	* src/daemons/rg_state.c: Change clu_lock_verbose to use the NULL
	lock/convert mechanism offered by DLM to work around #193128
	* src/daemons/restree.c: Apply patch from Navid Sheikhol-Eslami
	(navid at redhat.com) to fix #193859
	* src/resources/fs.sh, clusterfs.sh, nfsexport.sh, nfsclient.sh,
	service.sh, svclib_nfslock: Finish up initial NFS workaround.

2006-05-23 Lon Hohberger
	* src/daemons/members.c: Zap pad fields on copy-out
	* src/daemons/main.c: Give notice if skipping an event because of
	locked services. Call the self-watchdog init function
	* src/daemons/watchdog.c: Add Stanko Kupcevic's self-watchdog from
	CVS head (fixes #193247)
	* src/daemons/groups.c: Add debug messages. Actually count
	resgroups during node transition handling
	* src/daemons/rg_state.c: allow failover of stopping services if
	the owner died (#193255)
	* src/utils/clustat.c: fix typo, misc. usability problems (#192999)

2006-05-16 Lon Hohberger
	* src/resources/nfsclient.sh: Fix 189218 - nfsclient not matching
	wildcards correctly when checking status. Allow disabling of
	recovery for services where the nfs clients are ordered (this will
	cause a full service restart, but works)
	* src/resources/clusterfs.sh, fs.sh, svclib_nfslock, service.sh:
	Implement rudimentary atomic bomb-style NFS lock reclaim handling.
	Needs compatible and correctly configured version of nfs-utils
	installed and running on the system. For clusterfs.sh, ensure
	that we flush buffers during service tear-down - regardless of
	whether or not we unmount the file system.
	* src/utils/clunfslock.sh: HA-callout program
	(/usr/sbin/clunfslock) for use with the rpc.statd -H parameter.
	Copies the client to all cluster-managed mounted file systems so
	that it will get lock reclaim notification on failover.

2006-05-09 Lon Hohberger
	* include/list.h: Prevent dereferencing curr if it's null for some
	reason
	* include/resgroup.h: Clean up alignment, add rgmanager lock/unlock
	message types
	* src/daemons/Makefile: Add nodeevent.o to the build for rgmanager
	* src/clulib/msgsimple.c: Misc code path cleanups
	* src/clulib/vft.c: Add local reads for fast clustat operation.
	* src/daemons/groups.c: Count all resource groups for all nodes
	in one pass, rather than one node per pass. Split queueing of
	status checks off so we never block the main thread. Mark services
	which have autostart=0 in the config as "disabled" to help remove
	confusion between "disabled", "stopped", and the no-longer-needed
	"stopped but behave like disabled" states.
	bz #182454 / #190234 / #190408
	* src/daemons/fo_domain.c: Add patch from Josef Whiter to
	implement no-failback option for a given FO domain - bz #189841
	* src/daemons/main.c: Queue node events for another thread to
	handle, so we never block the main thread. Also, implement
	cluster-wide service lock/unlock feature from clumanager 1.2.x
	- bz #175010
	* src/daemons/nodeevent.c: Split out node event queueing / handling
	into a separate thread so the main thread does not block
	* src/daemons/rg_state.c: Return error codes if resource groups
	are locked.
	* src/daemons/rg_thread.c: Fix assertion failure causing segfault
	in extremely rare cases. Quash the rg queue during shutdown.
	- bz #181539
	* src/daemons/rg_state.c: Add fast local service state query to
	reduce unnecessary lock contention
	* src/daemons/groups.c: Handle request for expedited information
	from clustat.
	* src/daemons/main.c: Pass arg1 to send_rg_states() to enable fast
	clustat operation.
	* src/resources/fs.sh: Implement user/group quota support if
	enabled in the file system options
	* src/utils/clustat.c: Misc. error handling. Add single service /
	member output and add -Q to the help information. #185952.
	Added -f flag.
	* src/utils/clusvcadm.c: Implement client-side of #175010
	* src/utils/clustat.c: show transition time in clustat -x
	- bz #191398
	* src/resources/fs.sh: enable user/group quotas if enabled in the
	options attribute - bz #191182
	* init.d/rgmanager: fix typo - bz #191205

-------------

2005-03-21 Lon Hohberger
	* init.d/rgmanager, Makefile: Fix up init script and add Makefile
	so that the init script is properly installed #142754
	* src/daemons/*: Fixes for #150344, #151187: Relocate to same
	node returns failure, hang during shutdown if user relocate is
	in-flight. Fix service state getting stuck in "recoverable" on
	fail-to-start scenarios where other nodes failed (or no other node
	was available). Rename "resourcegroup" to "service" to be
	consistent with UI
	* src/resources/fs.sh, clusterfs.sh: Fix #151077: Force unmount
	broken
	* src/resources/netfs.sh: Fix #151091: netfs status broken
	* src/resources/resourcegroup.sh, service.sh: Remove resourcegroup,
	rename to service.sh

2005-03-14 Lon Hohberger
	* src/resources/clusterfs.sh, fs.sh: Make clusterfs actually work.
	Clean up fs.sh + clusterfs.sh "status" when mount reports
	none/devpts/usbdev/etc.
	* src/daemons/test.c: Add a 'rules' test function for printing
	resource rules to stdout.
	* src/daemons/reslist.c: Fix 151095

2005-03-07 Lon Hohberger
	* include/resgroup.h: Add STOP_USER so we can handle user STOP
	(instead of just DISABLE) requests. #150333
	* src/resources/fs.sh: umount should umount mount points, not
	devices. Handle symlinks to file system block devices. #150481
	* src/clulib/rg_strings.c: Add user stop string.
	* src/clulib/gettid.c: errno fix from trunk
	* src/clulib/vft.c: Connect timeout extension for VF
	* src/daemons/main.c: Separate connect + login. GuLM doesn't
	know about SGs.
	* src/daemons/rg_state.c: Change stop handling. Add generic
	recover function.
	* src/daemons/rg_thread.c: Add support for RESTART, USER_STOP.
	#150330, #150333
	* src/utils/clusvcadm.c: Use USER_STOP to signal a user-called stop.
	#150333

2005-03-02 Lon Hohberger
	* include/clulog.h: Change default log level to INFO
	* include/resgroup.h: Add proto for "best_target_node"
	* src/clulib/clulog.c: Change log facility to LOG_DAEMON to match
	other cluster daemons (e.g. lock_gulmd)
	* src/daemons/groups.c: Add best_target_node,
	count_resource_groups. Implement missing autostart-disable feature
	and requested exclusive resource group feature. Store
	configuration view number so we can tell when the configuration
	changes.
	* src/daemons/main.c: Print node state transition messages before
	calling node_event(). Use do_status_checks() so we don't try to
	check services we're not running. Bump periodic status event
	queueing to 10 seconds instead of 5. Poll ccsd for config updates
	since we have no other way to find them. Fix bug preventing status
	checks when clustat -i 1 is running.
	* src/daemons/rg_state.c: Fix handle_relocate_req so that it uses
	best_target_node() correctly. Leave services which failed on all
	current nodes as 'stopped', so the next node transition will cause
	us to try to restart it automagically. Consider recovery policy
	when taking recovery action.
	* src/daemons/rg_thread.c: Use recovery routine instead of start.
	* src/daemons/restree.c: Fix tree delta updates.
	* src/resources/resourcegroup.sh: Add 'exclusive' parameter.
	Change 'autostart' to a boolean instead of string. Add recovery
	policy parameter.
	* src/utils/clusvcadm.c: Make "relocate to node X" work.

2005-02-28 Lon Hohberger
	* errors.txt: Remove random whitespace at the bottom.
	* include/resgroup.h: Add do_status_checks proto
	* include/rg_queue.h: Remove __ definitions so as not to conflict
	with glibc internals.
	* include/vf.h: Increase VF_COORD_TIMEOUT to something reasonable.
	* src/daemons/groups.c: Add do_status_checks(). We were
	previously queueing status checks for RGs that we didn't own.
	Not useful.
	* src/daemons/main.c: Fix for #149410.
	* src/daemons/rg_state.c: Fix various failover service problems.
	* src/resources/script.sh: Remove "recover" from generic script
	wrapper.

=======================================================================

2004-09-23 Lon Hohberger
	* include/reslist.h: Add needstart/needstop/common flags for
	reconfiguration. Added RS_CONDSTART/CONDSTOP to perform "stop if
	needed/start if needed" operations after a resource [group]
	reconfiguration. Cleaned up structures. Added NO_CCS define for
	testing.
	* src/daemons/fo_domain.c: Added NO_CCS defines for testing.
	* src/daemons/main.c: Added reconfigure() stub function. Added
	testing support.
	* src/daemons/reslist.c: Added comparison + primary-attr functions
	for resources. Add printout of needstart/stop in resource dump.
	* src/daemons/restree.c: Added resource list comparison + resource
	tree comparison functions. Added condstart/stop to ops list.
	Added CONDSTART/CONDSTOP handling in res_op.
	* src/daemons/rg_locks.c: Added NO_CCS support for testing.

2004-09-13 Lon Hohberger
	* include/resgroup.h: Add a default check interval.
	* include/reslist.h: Add a recover operation, and put operations
	and checks together in each instantiated resource structure.
	* src/daemons/groups.c: Don't use the old rg_status() func -- its
	internals have changed.
	* src/daemons/reslist.c: Duplicate the action structure of a parent
	resource type into an instantiated resource.
	* src/daemons/resrules.c: Find the actions with the correct path.
	* src/daemons/restree.c: Add depth parameter to res_exec. Add
	do_status - find the highest check/status level to perform given
	the elapsed time since another status operation was performed.
	Add a reference count each time a resource is started on a node.
	* src/daemons/rg_thread.c: Implement periodic status checks.
	Currently (in contrast to clumanager 1.2), these status checks are
	automatic and not configurable.
	* src/resources/*: Misc updates re: check intervals, new
	parameters, etc.

2004-09-07 Lon Hohberger
	* src/resources/group.sh: Add 'autostart' parameter to group entity
	* src/daemons/*: Add support for OCF 'action' specifications.

2004-08-30 Lon Hohberger
	* src/resources/*: Add status/monitor actions to metadata
	* include/list.h: Update to fix compiler warnings. This is not
	complete; it's better to add a 'field' to structures requiring
	list specs.
	* src/clulib/vft.c: Remove unnecessary pthread locks.
	* src/daemons/*: Misc. code cleanups.

2004-08-12 Lon Hohberger
	* global: prepare for RPM build