SQL Server Agent Starts and Stops Immediately (Event ID 103)

If you are facing an issue where SQL Server Agent Service starts and stops almost immediately, then here are these following steps that you might want to follow or look at-

1) Verify the SQL Server and Agent log file, and look for errors. Most of the time, you will get some clue where to start troubleshooting from.
2) Verify the Service account under which your SQL Server Agent service is running.
3) Have a look at the permissions set at folder level (A.k.a ACL Permissions) for this particular account. Give full control to service account on the directory where MSDB is hosted. In ideal conditions, it should be given full admin privileges to the complete folder directory where MS SQL Server setup/DB’s are hosted.
4) Have a look at MSDB Size, and try to shrink the DB if there is a possibility to gain some space. Also, look at the number of Virtual Log Files (DBCC LogInfo). Based on the past experiences, having large number of VLF’s could also contribute to such errors (Agent starts and stops etc.).
5) Agent XP’s server level configuration parameter should also be enabled via sp_configure.

When all these steps are of no help to you, then verify the Network Level Settings within SQL Server Configuration Manager.
Ensure that TCP/IP protocol is enabled, and if Shared Memory and Named Pipes protocol is disabled due to constraints in your respective environment, then have a close look at the Aliases Configuration.

In my case, there was a problem with the Aliases Configuration Setup. Named Pipes protocol was set for it, while due to environment and security constraints team had earlier disabled Named Pipes Protocol.

I changed it to TCP/IP and mentioned the customized port and BINGO, SQL Agent started and then resumed to be in consistent/healthy state.

Hope this helps you. Happy learning!

Cheers!
Aman Kharbanda

Advertisements

Repairing MySQL Replication (Error 1396)

There are times when MySQL replication stops when you run certain updates on the master and the slave fails. If you have set up MySQL replication, then you probably know this problem. At times there are invalid MySQL queries which cause the replication to stop and you start getting these issues-

Sample-
mysql> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: *******
Master_User: repl_usr
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: jprd-binlog.000047
Read_Master_Log_Pos: 521059438
Relay_Log_File: replication-relay-bin.000040
Relay_Log_Pos: 466229228
Relay_Master_Log_File: jprd-binlog.000047
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB: facebook,wordpress,yahoo,google
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1396
Last_Error: Error ‘Operation CREATE USER failed for ‘test_usr’@’10.10.10.10” on query. Default database: ”. Query: ‘CREATE USER ‘test_usr’@’10.10.10.10’ IDENTIFIED BY ‘test123”
Skip_Counter: 0
Exec_Master_Log_Pos: 466229077
Relay_Log_Space: 521060120
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1396
Last_SQL_Error: Error ‘Operation CREATE USER failed for ‘test_usr’@’10.10.10.10” on query. Default database: ”. Query: ‘CREATE USER ‘test_usr’@’10.10.10.10’ IDENTIFIED BY ‘test123”
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
1 row in set (0.00 sec)

We have to follow the below mentioned steps to overcome such issues-
1) Skipping the Error Msg in Slave’s CNF file (Never suppress or skip all message codes).This is just to skip duplicate errors.
2) Setting the Global SQL Slave Skip Counter Value to 1.

Locate the my.cnf file, and include the parameter (–slave-skip-errors=1396) and save it again.
Next step is to Stop the slave (Stop Slave;) and then run this statement – (SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;).
Start the slave again (Start Slave;)

Verify the slave status, and you should be good now. You will notice that slave has now slowly and gradually started catching up with the Master.

For more details on Replication Slave Options and Variables, you can browse thru’ this link 

Happy learning..

Cheers!
Aman Kharbanda

Starting MySQL…….Manager of pid-file quit without update[FAILED]

Imagine on a Friday afternoon, when you want to leave office a bit early and you are informed of an unimaginative error such as the subjected one. Your MySQL DB Server was running smoothly and out of nowhere, you get a ticket stating MySQL Service is unavailable. There could be many reasons why the service is not starting at the first place. There was sufficient space on all the file systems so space constraint was not an issue here. Then, you need to verify if all the permissions are intact or not. Next helpline could be your error log and when you read the error log, you get following details-

InnoDB: Unable to lock ./ibdata1, error: 11
InnoDB: Check that you do not already have another mysqld process
InnoDB: using the same InnoDB data or log files.
141106 20:41:20 InnoDB: Retrying to lock the first data file
InnoDB: Unable to lock ./ibdata1, error: 11
InnoDB: Check that you do not already have another mysqld process
InnoDB: using the same InnoDB data or log files.
InnoDB: Unable to lock ./ibdata1, error: 11
InnoDB: Check that you do not already have another mysqld process
InnoDB: using the same InnoDB data or log files.
141106 20:41:23 InnoDB: Shutdown completed; log sequence number 882 2873506614
141106 20:41:23 [Note] /usr/sbin/mysqld: Shutdown complete

141106 20:41:23 mysqld_safe mysqld from pid file /var/lib/mysql/*******.pid ended
141106 20:41:23 InnoDB: Started; log sequence number 882 2873506614
/usr/sbin/mysqld: File ” not found (Errcode: 2)
141106 20:41:23 [ERROR] Failed to open log (file ”, errno 2)
141106 20:41:23 [ERROR] Could not open log file
141106 20:41:23 [ERROR] Can’t init tc log
141106 20:41:23 [ERROR] Aborting
Referring to these logs, it looked to be an issue with the log or data files. I decided to delete the ib_logfile0 and ib_logfile1 files, and tried restarting MySQL Service once again but it was still throwing the same error.
(Note that once MySQL Service is restarted, these log files shall be created again).

Another area to look for is to rename your my.cnf file and try restarting the service again. It was of no luck for me.

Next, I tried renaming the data file (Be very cautious here- you need to take a backup of the original file, later rename the actual file). Voila! I was able to restart MySQL without any issues.

Hope this helps you.. Happy learning.

Cheers!
Aman Kharbanda