2011-10-13

Strange MySQL (5.0) issue with authentication

Posted in Computers, PlanetDebian at 07:47 UTC (+0000) by sven

Dear Lazyweb….

I ran into a very strange issue with MySQL authentication. The initial situation is as follows:
Two MySQL servers, let’s name them a0.my.do.main and a1.my.do.main. Their my.cnf configuration is identical except for the server_id (equals 1 on a0 and 2 on a1).
Since I wanted to set up a master< ->master replication, I created a user “repl@%.my.do.main” on a0, then shut down the mysql server on both hosts, copied over the mysql database on the filesystem layer (i.e. rsync of the data directory), then started mysql on both hosts again. So far, so good, both servers came up nicely, and local (unix socket) login as a root user worked fine as expected. However:

Logging in via TCP socket only works from a1 (to both a0 and a1). Logging in via TCP socket from a0 doesn’t work to either host. I also created a test user “sven”@”%.my.do.main” (I also tried with @”%”) which has all privileges.
Here is what the commands output:

sven@a0 ~ > mysql -u sven -ptestpass -h a0.my.do.main
ERROR 1045 (28000): Access denied for user 'sven'@'a0.my.do.main' (using password: YES)
sven@a0 ~ > mysql -u sven -ptestpass -h a1.my.do.main
ERROR 1045 (28000): Access denied for user 'sven'@'a0.my.do.main' (using password: YES)

sven@a1 ~ > mysql -u sven -ptestpass -h a0.my.do.main
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 32
Server version: 5.0.77-log Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> Bye
sven@a1 ~ > mysql -u sven -ptestpass -h a1.my.do.main
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 24
Server version: 5.0.77-log Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql>

All packages on both machines are up to date (CentOS5 with all updates available). The server variables shown with “show variables” also seem identical (except for the hostname and the server id). Besides, it looks more like a client problem to me than a server problem.

Last datapoint: If I define the mysql user “sven” with an explicit host (a0.my.do.main and a1.my.do.main) instead of a wildcard, everything works nicely, from and to both hosts.

So, dear lazyweb: Does anyone have an idea what might be causing this issue?

Update:

Due to some comments, let me add a few points of information:

  1. flush hosts doesn’t help. Host resolution looks OK, according to documentation, the hostname in the error message is what the server things is right, not what the client specified, and that documentation seems to be consistent with my experiences.
  2. The “db” and “user” tables look as expected. And I’m quite sure they are not the issue here, as I can connect from other hosts that match the host pattern.
  3. There is no ~/.my.cnf on either host
  4. Copying the database files on the filesystem layer is one of the documented ways of initializing the database on the slave for replication. Anyway, I had the same issue when trying to set up the database on the slave by loading a dump.
  5. DNS looks fine, both forward and reverse mappings, both servers use the same DNS server.
  6. IPv6 is not involved, it is disabled on these servers

Thanks for the hint at serverfault.com, I will try that if I don’t get sufficient info here.

Update 2

Another strange data point:
When I disable the (non-privileged) anonymous access user in the mysql.user table (i.e. drop the rows with user=”"), everything works as expected from both servers. However, why does it work from one host (client) to both DBs if the anonymous access is allowed (with no privileges except for being able to see that the test and mysql databases exist), but not from the other host (client)? It really looks like an issue on the client side, but I still don’t get what exactly goes wrong.