[jira] [Created] (HBASE-18215) some advises about refactoring of rsgroup

JIRA jira@apache.org
chenxu created HBASE-18215:

             Summary: some advises about refactoring of rsgroup
                 Key: HBASE-18215
                 URL: https://issues.apache.org/jira/browse/HBASE-18215
             Project: HBase
          Issue Type: Improvement
          Components: Balancer
            Reporter: chenxu

Recently we integrated rsgroup into our cluster and, in doing so, found some points that could be refactored. They may not all be right, but I think they are worth sharing with you.
# When hbase.balancer.tablesOnMaster is configured, RSGroupBasedLoadBalancer should consider assignment to the master server first in balanceCluster, roundRobinAssignment, retainAssignment and randomAssignment,
  the same way BaseLoadBalancer does.
# Why not use a local file as the persistence layer instead of the rsgroup table?
In our implementation, we first modify the local rsgroup file, then load the group info into memory, and after that execute the balancer command; everything works.
When loading, we do some sanity checks:
(1) one server cannot be owned by multiple groups
(2) one table cannot be owned by multiple groups
(3) if a group has tables, it must also have servers
(4) the default group must have servers in it
If the sanity checks do not pass, the rest of the process is abandoned. Working this way can greatly reduce the complexity of the rsgroup implementation: there is no need to wait for the rsgroup table to come online, and methods like moveServers, moveTables, addRSGroup, removeRSGroup and moveServersAndTables can be removed from RSGroupAdminService. Only a refresh method is needed (modify the persistence layer first, then refresh the in-memory state).
# We should add some group information to the master web UI.
To do this, RSGroupBasedLoadBalancer should move to the hbase-server module, because MasterStatusTmpl.jamon depends on it.
# There may be an issue in RSGroupBasedLoadBalancer.roundRobinAssignment.
If the results for two groups both include BOGUS_SERVER_NAME, assignments.putAll will overwrite the earlier data.
# There may be an issue in RSGroupBasedLoadBalancer.randomAssignment.
When the return value is BOGUS_SERVER_NAME, the AssignmentManager (AM) cannot handle this case. We should return null instead of BOGUS_SERVER_NAME.
# When RSGroupBasedLoadBalancer.balanceCluster executes, groups are balanced one by one. If there are too many groups, we could do this in parallel.
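
To make the four sanity checks from the second point concrete, here is a minimal sketch of how they might look when the group file is loaded. It is not code from our patch or from HBase: GroupDef, GroupSanityCheck and validate are hypothetical names, servers and tables are plain strings, and the default group name "default" is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified group definition; not HBase's RSGroupInfo.
final class GroupDef {
    final String name;
    final Set<String> servers;
    final Set<String> tables;

    GroupDef(String name, Set<String> servers, Set<String> tables) {
        this.name = name;
        this.servers = servers;
        this.tables = tables;
    }
}

final class GroupSanityCheck {
    // Assumed name of the default group.
    static final String DEFAULT_GROUP = "default";

    /** Returns all violations found; an empty list means the group info may be loaded. */
    static List<String> validate(List<GroupDef> groups) {
        List<String> errors = new ArrayList<>();
        Map<String, String> serverOwner = new HashMap<>();
        Map<String, String> tableOwner = new HashMap<>();
        boolean defaultHasServers = false;

        for (GroupDef g : groups) {
            // (1) one server cannot be owned by multiple groups
            for (String server : g.servers) {
                String other = serverOwner.putIfAbsent(server, g.name);
                if (other != null) {
                    errors.add("server " + server + " is in both " + other + " and " + g.name);
                }
            }
            // (2) one table cannot be owned by multiple groups
            for (String table : g.tables) {
                String other = tableOwner.putIfAbsent(table, g.name);
                if (other != null) {
                    errors.add("table " + table + " is in both " + other + " and " + g.name);
                }
            }
            // (3) a group that has tables must also have servers
            if (!g.tables.isEmpty() && g.servers.isEmpty()) {
                errors.add("group " + g.name + " has tables but no servers");
            }
            if (DEFAULT_GROUP.equals(g.name) && !g.servers.isEmpty()) {
                defaultHasServers = true;
            }
        }
        // (4) the default group must have servers in it
        if (!defaultHasServers) {
            errors.add("default group has no servers");
        }
        return errors;
    }
}
```

A refresh would call validate on the freshly parsed file and replace the in-memory group info only if the returned list is empty, so a bad file leaves the running state untouched.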

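For the roundRobinAssignment point, the overwrite comes from Map.putAll replacing the value of a key that is already present. A minimal sketch of merging per-key lists instead is below; AssignmentMerge and mergeInto are hypothetical names, and the generic key and value types stand in for the server and region types used by the balancer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class AssignmentMerge {
    /**
     * Adds every entry of groupResult to total. When a key already exists,
     * the new values are appended to its list instead of replacing it,
     * which is what a plain putAll would do.
     */
    static <K, V> void mergeInto(Map<K, List<V>> total, Map<K, List<V>> groupResult) {
        for (Map.Entry<K, List<V>> entry : groupResult.entrySet()) {
            total.computeIfAbsent(entry.getKey(), k -> new ArrayList<>())
                 .addAll(entry.getValue());
        }
    }
}
```

With this kind of merge, regions that two different groups both map to the same placeholder key are all kept in the combined result.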
This message was sent by Atlassian JIRA