Wednesday, September 2, 2015

Cloudera Manager agent failed to connect to previous supervisor

Continuing from an earlier article, I hit another roadblock while installing the Cloudera Manager agent.

[10/Sep/2015 09:10:54 +0000] 19017 MainThread agent        ERROR    Failed to connect to previous supervisor.
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/src/cmf/agent.py", line 1524, in find_or_start_supervisor
    self.get_supervisor_process_info()
  File "/usr/lib64/cmf/agent/src/cmf/agent.py", line 1725, in get_supervisor_process_info
    self.identifier = self.supervisor_client.supervisor.getIdentification()
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1199, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1489, in __request
    verbose=self.__verbose
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/supervisor-3.0-py2.6.egg/supervisor/xmlrpc.py", line 460, in request
    self.connection.request('POST', handler, request_body, self.headers)
  File "/usr/lib64/python2.6/httplib.py", line 914, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib64/python2.6/httplib.py", line 951, in _send_request
    self.endheaders()
  File "/usr/lib64/python2.6/httplib.py", line 908, in endheaders
    self._send_output()
  File "/usr/lib64/python2.6/httplib.py", line 780, in _send_output
    self.send(msg)
  File "/usr/lib64/python2.6/httplib.py", line 739, in send
    self.connect()
  File "/usr/lib64/python2.6/httplib.py", line 720, in connect
    self.timeout)
  File "/usr/lib64/python2.6/socket.py", line 567, in create_connection
    raise error, msg
error: [Errno 111] Connection refused

If you see an error like this, along with complaints that the Cloudera Manager agent is failing to heartbeat, you have probably run into a hostname issue. You may want to fix the hostname entry by referencing this link. Check the hostname on the server and compare it with the one shown in the installer web GUI; if they differ, you probably want to follow the procedure below to refresh the cached hostname (ref: link). It took me about an hour to figure out this painful workaround.
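The hostname comparison can be scripted. Below is a minimal sketch, not part of Cloudera Manager: `hostname_report` is a hypothetical helper, and the `expected` argument is whatever name you copy by hand from the installer web GUI. It only uses the standard `socket` module to show what the host itself reports.

```python
import socket


def hostname_report(expected):
    """Compare the names this host reports with the name shown in the installer.

    `expected` is an assumption: the hostname copied manually from the
    Cloudera Manager installer web GUI.
    """
    short_name = socket.gethostname()  # what `hostname` would print
    fqdn = socket.getfqdn()            # fully qualified name, if resolvable
    candidates = (short_name.lower(), fqdn.lower())
    return {
        "hostname": short_name,
        "fqdn": fqdn,
        "expected": expected,
        "matches": expected.lower() in candidates,
    }


if __name__ == "__main__":
    # Replace the argument with the name shown in the installer web GUI.
    report = hostname_report(socket.gethostname())
    for key, value in report.items():
        print("%s: %s" % (key, value))
```

If `matches` comes back `False`, the installer is probably holding a different (cached) name for the host than the one the server reports.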

Installing on AWS, you must use private EC2 hostnames.
When installing on an AWS instance, and adding hosts using their public names, the installation will fail when the hosts fail to heartbeat.

Severity: Med

Workaround:

Use the Back button in the wizard to return to the original screen, where it prompts for a license.

Rerun the wizard, but choose "Use existing hosts" instead of searching for hosts. Now those hosts show up with their internal EC2 names.

Continue through the wizard and the installation should succeed.
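For the AWS case above, it can help to check up front whether the names you are about to feed the wizard look public or private. This is a rough sketch under an assumption: that EC2 public DNS names look like `ec2-A-B-C-D.<something>.amazonaws.com` and private ones like `ip-A-B-C-D.ec2.internal` or `ip-A-B-C-D.<region>.compute.internal`. The function name `classify_ec2_hostname` is made up for illustration, and custom DNS setups will not match these patterns.

```python
import re

# Assumed naming patterns for default EC2 DNS names (not an official API).
PUBLIC_EC2 = re.compile(r"^ec2-\d+-\d+-\d+-\d+\..*amazonaws\.com(\.cn)?$")
PRIVATE_EC2 = re.compile(
    r"^ip-\d+-\d+-\d+-\d+\.(ec2\.internal|.*\.compute\.internal)$"
)


def classify_ec2_hostname(name):
    """Return 'public', 'private', or 'unknown' for a hostname string."""
    name = name.strip().lower()
    if PUBLIC_EC2.match(name):
        return "public"
    if PRIVATE_EC2.match(name):
        return "private"
    return "unknown"


if __name__ == "__main__":
    for host in (
        "ec2-54-1-2-3.compute-1.amazonaws.com",
        "ip-10-0-0-1.ec2.internal",
        "myhost.example.com",
    ):
        print("%s -> %s" % (host, classify_ec2_hostname(host)))
```

Any host that comes back as `public` is a candidate for the heartbeat failure described in the known issue.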
