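With respect to hard drives, the acronym “SMART” stands for Self-Monitoring, Analysis and Reporting Technology. It is built into many ATA-3 and later ATA/IDE drives and SCSI-3 drives, so practically any drive made after about 2005 should have it. The smartmontools package provides the tools for working with it; install it with your distribution's package manager.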

Ubuntu/Debian:

sudo apt-get install smartmontools

CentOS/Fedora/RH:

sudo yum install smartmontools

Gentoo:

sudo emerge sys-apps/smartmontools

Wiki: http://sourceforge.net/apps/trac/smartmontools/wiki

smartctl

The program smartctl is used to interface with the SMART features in the drive firmware. Here are a couple of easy things to get started with (note that some versions do not have the --scan option):


$ smartctl --scan -d ata
/dev/hda -d ata # /dev/hda, ATA device
/dev/hdc -d ata # /dev/hdc, ATA device
$ sudo smartctl --info /dev/hdc
smartctl 5.42 2011-10-20 r3458 [i686-linux-2.6.33.1-xedvia] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.7 and 7200.7 Plus
Device Model:     ST3160023A
Serial Number:    5JS9MDKW
Firmware Version: 8.01
User Capacity:    160,041,885,696 bytes [160 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   6
ATA Standard is:  ATA/ATAPI-6 T13 1410D revision 2
Local Time is:    Thu Feb  7 09:27:18 2013 PST
SMART support is: Available - device has SMART capability.
SMART support is: Disabled

Note that the “SMART support” is listed as available but disabled. To enable full diagnostic checking turn it on with something like this:


$ sudo smartctl --smart=on --offlineauto=on --saveauto=on /dev/hdc
=== START OF ENABLE/DISABLE COMMANDS SECTION ===
SMART Enabled.
SMART Attribute Autosave Enabled.
SMART Automatic Offline Testing Enabled every four hours.

In theory this only needs to be done once, since drives normally keep these settings across power cycles. The saveauto option makes the drive save its attribute values automatically, and offlineauto schedules automatic offline testing every four hours. The drive should wait “nicely” if it is already busy, so performance should not be seriously impacted.

Testing

Here’s a way to run a “short” off-line test. This tests electrical and mechanical performance of the drive and does read testing.


$ sudo smartctl --test=short /dev/hda
=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 1 minutes for test to complete.
Test will complete after Thu Feb  7 10:13:19 2013
Use smartctl -X to abort test.

$ sudo smartctl --log=selftest /dev/hda
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     43398        -

$ sudo smartctl --log=selftest /dev/hdc
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%     37994         7234643

The first command starts the test and tells you to come back in a minute or so. The second command shows how to query the self-test log to see if anything bad came up. In this case hda was fine (“Completed without error”), but hdc reported a very serious “read failure”. Replace that drive ASAP!
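If you want to act on the self-test log from a script, the check can be automated. This is only a minimal sketch: it assumes the log layout shown above (entries start with "# <number>") and that failed runs carry the word "failure" in the Status column. The helper name selftest_has_failure is made up for this example and is not part of smartmontools.

```shell
# Hypothetical helper (not part of smartmontools): reads the output of
# `smartctl --log=selftest` on stdin and exits 0 if any log entry
# reports a failure, non-zero otherwise.
selftest_has_failure() {
    # Log entries start with "# <number>"; failed runs show up as
    # "Completed: <kind> failure" in the Status column.
    grep -Eq '^# *[0-9]+ .*failure'
}

# Possible usage (needs root and a real drive, so it is left commented out):
#   sudo smartctl --log=selftest /dev/hdc | selftest_has_failure \
#       && echo "hdc: self-test failure logged"
```

Matching on the word "failure" is a heuristic; check the Status column by hand before trusting it on a drive that matters.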

[1] Install SYSSTAT


[root@dlp ~]# yum -y install sysstat
[root@dlp ~]# /etc/rc.d/init.d/sysstat start

Enable the sysstat service, which works with the system activity data collector (sadc), at boot:


[root@dlp ~]# chkconfig sysstat on

[2] The jobs that collect system information are defined in the cron file below. The binary data files are stored as “/var/log/sa/sa**” and the daily text reports as “/var/log/sa/sar**”, where ** is the day of the month.


[root@dlp ~]# cat /etc/cron.d/sysstat


# Run system activity accounting tool every 10 minutes
*/10 * * * * root /usr/lib64/sa/sa1 1 1
# 0 * * * * root /usr/lib64/sa/sa1 600 6 &
# Generate a daily summary of process accounting at 23:53
53 23 * * * root /usr/lib64/sa/sa2 -A

View RAM usage

# sar -r
12:00:01 AM kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit
12:10:01 AM    419648   1502728     78.17     66732    397336   1376580     38.26
12:20:01 AM    419276   1503100     78.19     66984    397428   1376580     38.26
12:30:01 AM    419152   1503224     78.20     67196    397488   1376580     38.26
12:40:01 AM    418904   1503472     78.21     67316    397528   1376580     38.26
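In the sysstat version that produced this output, kbmemused includes buffers and page cache, so %memused looks higher than what applications are really holding. The sketch below subtracts kbbuffers and kbcached to estimate the "real" usage per sample. It assumes the column names shown above (newer sysstat releases may calculate kbmemused differently), and sar_real_mem_used is just an illustrative name.

```shell
# Hypothetical helper: reads `sar -r` output on stdin and prints, per sample,
# the time and the percentage of memory used excluding buffers and cache.
sar_real_mem_used() {
    awk '
        # Header line: remember which column holds which counter.
        /kbmemfree/ { for (i = 1; i <= NF; i++) col[$i] = i; next }
        # Data lines: skip the "Average:" row, whose columns are shifted.
        col["kbmemfree"] && NF >= col["kbcached"] && $1 != "Average:" {
            used  = $(col["kbmemused"]) - $(col["kbbuffers"]) - $(col["kbcached"])
            total = $(col["kbmemfree"]) + $(col["kbmemused"])
            printf "%s %.2f\n", $1, 100 * used / total
        }'
}

# Possible usage:
#   sar -r | sar_real_mem_used
```

For the 12:10:01 sample above this gives about 54% instead of the 78.17% that sar reports.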


View swap usage

# sar -S
12:00:01 AM kbswpfree kbswpused  %swpused  kbswpcad   %swpcad
12:10:01 AM   1607192     68068      4.06      8468     12.44
12:20:01 AM   1607196     68064      4.06      8464     12.44
12:30:01 AM   1607200     68060      4.06      8460     12.43
12:40:01 AM   1607208     68052      4.06      8464     12.44

The “sar” command can report both the current and the past state of the system.
[3] Output past CPU usage from a log file.


[root@dlp ~]# sar -u -f /var/log/sa/sa24


Linux 2.6.32-358.6.2.el6.x86_64 (dlp.server.world)      06/24/2013      _x86_64_        (2 CPU)

04:45:47 AM       LINUX RESTART

01:50:01 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle
01:59:03 PM     all      0.00      0.00      0.02      0.09      0.00     99.89
02:00:01 PM     all      0.02      0.00      0.07      0.29      0.02     99.60
02:10:01 PM     all      0.00      0.00      0.01      0.05      0.00     99.94
02:20:01 PM     all      0.00      0.00      0.00      0.01      0.00     99.99
02:30:01 PM     all      0.00      0.00      0.00      0.01      0.00     99.99
02:40:01 PM     all      0.00      0.30      0.50      0.36      0.00     98.84
02:50:01 PM     all      0.00      0.00      0.00      0.02      0.00     99.98
03:00:01 PM     all      0.00      0.00      0.00      0.02      0.00     99.98
03:10:01 PM     all      0.00      0.00      0.00      0.04      0.00     99.95
Average:        all      0.00      0.04      0.07      0.08      0.00     99.82

03:19:29 PM       LINUX RESTART
...
...
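A day's log can be long, so it helps to pull out the busiest interval automatically. Here is a rough sketch that reads sar -u output and reports the sample with the lowest %idle. It assumes the column header contains CPU and %idle as in the output above; the function name sar_busiest_interval is invented for this example.

```shell
# Hypothetical helper: reads `sar -u` output on stdin and prints the
# timestamp and %idle value of the sample with the lowest %idle.
sar_busiest_interval() {
    awk '
        # Header line: map column names to positions.
        /%idle/ { for (i = 1; i <= NF; i++) col[$i] = i; nf = NF; next }
        # Data lines have as many fields as the header; "Average:" and
        # "LINUX RESTART" lines do not and are skipped.
        nf && NF == nf && $1 != "Average:" {
            idle = $(col["%idle"]) + 0
            if (!seen || idle < min) {
                min = idle
                when = $1
                # Keep an AM/PM field if one sits before the CPU column.
                for (i = 2; i < col["CPU"]; i++) when = when " " $i
                seen = 1
            }
        }
        END { if (seen) printf "%s idle=%.2f\n", when, min }'
}

# Possible usage:
#   sar -u -f /var/log/sa/sa24 | sar_busiest_interval
```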

[4] Output the current CPU usage 3 times at 1-second intervals.


[root@dlp ~]# sar -u 1 3


Linux 2.6.32-358.6.2.el6.x86_64 (dlp.server.world)      06/24/2013      _x86_64_        (2 CPU)

05:29:12 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle
05:29:13 PM     all      0.00      0.00      0.00      0.00      0.00    100.00
05:29:14 PM     all      0.00      0.00      0.00      0.00      0.50     99.50
05:29:15 PM     all      0.50      0.00      0.50      0.00      0.00     99.00
Average:        all      0.17      0.00      0.17      0.00      0.17     99.50

[5] Output the current disk I/O transfer rates 3 times at 1-second intervals.


[root@dlp ~]# sar -b 1 3


Linux 2.6.32-358.6.2.el6.x86_64 (dlp.server.world)      06/25/2013      _x86_64_        (2 CPU)

06:02:27 AM       tps      rtps      wtps   bread/s   bwrtn/s
06:02:28 AM      0.00      0.00      0.00      0.00      0.00
06:02:29 AM      0.00      0.00      0.00      0.00      0.00
06:02:30 AM      0.00      0.00      0.00      0.00      0.00
Average:         0.00      0.00      0.00      0.00      0.00

[6] Output the current memory usage 3 times at 1-second intervals.


[root@dlp ~]# sar -r 1 3


Linux 2.6.32-358.6.2.el6.x86_64 (dlp.server.world)      06/25/2013      _x86_64_        (2 CPU)

03:06:52 PM kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit
03:06:53 PM   1832648     89916      4.68      5796     24552     53452      0.88
03:06:54 PM   1832648     89916      4.68      5796     24552     53452      0.88
03:06:55 PM   1832648     89916      4.68      5796     24552     53452      0.88
Average:      1832648     89916      4.68      5796     24552     53452      0.88


[7] Output the current packets sent and received per network interface 3 times at 1-second intervals.


[root@dlp ~]# sar -n DEV 1 3


Linux 2.6.32-358.6.2.el6.x86_64 (dlp.server.world)      06/25/2013      _x86_64_        (2 CPU)

03:20:05 PM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
03:20:06 PM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
03:20:06 PM      eth2      0.00      0.00      0.00      0.00      0.00      0.00      0.00

03:20:06 PM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
03:20:07 PM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
03:20:07 PM      eth2      0.00      0.00      0.00      0.00      0.00      0.00      0.00

03:20:07 PM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
03:20:08 PM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
03:20:08 PM      eth2      0.00      0.00      0.00      0.00      0.00      0.00      0.00

Average:        IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
Average:           lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         eth2      0.00      0.00      0.00      0.00      0.00      0.00      0.00

[8] Output the current paging statistics 3 times at 1-second intervals.


[root@dlp ~]# sar -B 1 3


Linux 2.6.32-358.6.2.el6.x86_64 (dlp.server.world)      06/25/2013      _x86_64_        (2 CPU)

03:13:12 PM  pgpgin/s pgpgout/s   fault/s  majflt/s  pgfree/s pgscank/s pgscand/s pgsteal/s    %vmeff
03:13:13 PM      0.00      0.00     80.00      0.00     65.00      0.00      0.00      0.00      0.00
03:13:14 PM      0.00      0.00     40.00      0.00     63.00      0.00      0.00      0.00      0.00
03:13:15 PM      0.00      0.00     32.00      0.00     63.00      0.00      0.00      0.00      0.00
Average:         0.00      0.00     50.67      0.00     63.67      0.00      0.00      0.00      0.00

The “mpstat” command reports the usage of each CPU core.
[9] Output the current usage of all CPU cores 3 times at 1-second intervals.


[root@dlp ~]# mpstat -P ALL 1 3


Linux 2.6.32-358.6.2.el6.x86_64 (dlp.server.world)      06/25/2013      _x86_64_        (2 CPU)

03:55:12 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
03:55:13 PM  all    0.00    0.00    0.00    0.00    0.50    0.00    0.00    0.00   99.50
03:55:13 PM    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
03:55:13 PM    1    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00

03:55:13 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
03:55:14 PM  all    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
03:55:14 PM    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
03:55:14 PM    1    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00

03:55:14 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
03:55:15 PM  all    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
03:55:15 PM    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
03:55:15 PM    1    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00

Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
Average:     all    0.00    0.00    0.00    0.00    0.17    0.00    0.00    0.00   99.83
Average:       0    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
Average:       1    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00

The “iostat” command reports disk I/O usage.
[10] Output extended I/O statistics in megabytes 3 times at 1-second intervals.


[root@dlp ~]# iostat -mx 1 3


Linux 2.6.32-358.6.2.el6.x86_64 (dlp.server.world)      06/25/2013      _x86_64_        (2 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.01    0.00    0.03    0.08    0.00   99.87

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
vda               0.38     0.15    0.67    0.10     0.01     0.00    29.46     0.03   41.78   4.44   0.35
dm-0              0.00     0.00    0.76    0.25     0.01     0.00    20.34     0.06   54.84   3.25   0.33
dm-1              0.00     0.00    0.08    0.00     0.00     0.00     8.00     0.00    1.05   0.53   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    0.50    0.00    0.00   99.50

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
vda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    0.50    0.00    0.00   99.50

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
vda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
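To spot a saturated disk quickly, you can filter the iostat output on the %util column. The following is a sketch only: it assumes the extended (-x) layout shown above with a header line starting with "Device", and both the function name iostat_busy and the default limit of 80 are arbitrary choices for illustration.

```shell
# Hypothetical helper: reads `iostat -x` output on stdin and prints the
# device name and %util for every sample at or above the given limit.
iostat_busy() {
    awk -v limit="${1:-80}" '
        # Header line: map column names to positions.
        /^Device/ { for (i = 1; i <= NF; i++) col[$i] = i; nf = NF; next }
        # Device lines have as many fields as the header line.
        nf && NF == nf && $(col["%util"]) + 0 >= limit + 0 {
            print $1, $(col["%util"])
        }'
}

# Possible usage:
#   iostat -mx 1 3 | iostat_busy 80
```

Keep in mind that the first report iostat prints covers the time since boot, so only the later reports reflect the current load.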

The “pidstat” command reports CPU usage per process.
[11] Output the CPU usage of process ID “1169” 3 times at 1-second intervals.


[root@dlp ~]# pidstat -p 1169 1 3


Linux 2.6.32-358.6.2.el6.x86_64 (dlp.server.world)      06/25/2013      _x86_64_        (2 CPU)

04:09:09 PM       PID    %usr %system  %guest    %CPU   CPU  Command
04:09:10 PM      1169    0.00    0.00    0.00    0.00     1  bash
04:09:11 PM      1169    0.00    0.00    0.00    0.00     1  bash
04:09:12 PM      1169    0.00    0.00    0.00    0.00     1  bash
Average:         1169    0.00    0.00    0.00    0.00     -  bash

Find out if your server is affected by Heartbleed

You can test a server online at http://filippo.io/Heartbleed/ or check the installed OpenSSL version locally.
Run the command:

[root@austin ~]# openssl version
OpenSSL 1.0.1e-fips 11 Feb 2013

to get the version number of OpenSSL. On RPM-based systems, a package query shows the exact package release, e.g.:

[root@austin ~]# rpm -qa | grep openssl
openssl-1.0.1e-16.el6_5.7.x86_64

Your server might be vulnerable if the version is in the 1.0.1 series and below 1.0.1g. However, some Linux distributions patch their packages without changing the upstream version number; see below for instructions to find out whether the package on your server has been patched.

If your server uses a 0.9.8 release, as Debian squeeze does, then it is not vulnerable, because the heartbeat function exists only in OpenSSL 1.0.1 and later versions.

Fix the vulnerability

To fix the vulnerability, install the latest updates for your server.

Debian

apt-get update
apt-get upgrade

Ubuntu

apt-get update
apt-get upgrade

Fedora and CentOS

yum update

OpenSuSE

zypper update

Then restart all services that use OpenSSL.

On an ISPConfig 3 server, restart e.g. these services (if they are installed): sshd, apache, nginx, postfix, dovecot, courier, pure-ftpd, bind and mysql. If you want to be absolutely sure that you did not miss a service, restart the whole server by running “reboot” on the shell.

Check if the Linux update installed the correct package

After you have installed the Linux updates, check that the openssl package has been upgraded correctly. Some Linux distributions patch packages, so “openssl version” does not always show whether the patch that fixes the vulnerability has been installed.

Check the package on Debian and Ubuntu:

dpkg-query -l 'openssl'

Here is the output for a correctly patched Debian 7 (Wheezy) server:

dpkg-query -l 'openssl'
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-===================-===============-==============-============================================
ii openssl 1.0.1e-2+deb7u5 amd64 Secure Socket Layer (SSL) binary and related

For Fedora and CentOS, use this command to find the installed package name:

rpm -qa | grep openssl

The release notes at the following link list the package names of the fixed versions:

https://rhn.redhat.com/errata/RHSA-2014-0376.html

Affected versions of OpenSSL

Status of different versions:

OpenSSL 1.0.1 through 1.0.1f (inclusive) are vulnerable
OpenSSL 1.0.1g is NOT vulnerable
OpenSSL 1.0.0 branch is NOT vulnerable
OpenSSL 0.9.8 branch is NOT vulnerable
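The version list above can be turned into a quick check. This is a sketch that looks only at the upstream version string, so it cannot tell whether a distribution has backported the fix into an older version number. The function name heartbleed_range is made up for this example.

```shell
# Hypothetical helper: tells whether an upstream OpenSSL version string
# falls in the vulnerable range 1.0.1 through 1.0.1f listed above.
heartbleed_range() {
    v=${1%%-*}    # strip build suffixes such as "-fips"
    case "$v" in
        1.0.1|1.0.1[a-f]) echo "in vulnerable range" ;;
        *)                echo "outside vulnerable range" ;;
    esac
}

# Possible usage:
#   heartbleed_range "$(openssl version | awk '{print $2}')"
```

A result of "in vulnerable range" still needs the package check, because a patched distribution package can keep reporting a version such as 1.0.1e.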

The bug was introduced into OpenSSL in December 2011 and has been out in the wild since the release of OpenSSL 1.0.1 on 14 March 2012. OpenSSL 1.0.1g, released on 7 April 2014, fixes the bug.

Operating Systems

Some operating system distributions that have shipped with a potentially vulnerable OpenSSL version:

Debian Wheezy (stable), OpenSSL 1.0.1e-2+deb7u4
Ubuntu 12.04.4 LTS, OpenSSL 1.0.1-4ubuntu5.11
CentOS 6.5, OpenSSL 1.0.1e-15
Fedora 18, OpenSSL 1.0.1e-4
OpenBSD 5.3 (OpenSSL 1.0.1c 10 May 2012) and 5.4 (OpenSSL 1.0.1c 10 May 2012)
FreeBSD 8.4 (OpenSSL 1.0.1e) and 9.1 (OpenSSL 1.0.1c)
NetBSD 5.0.2 (OpenSSL 1.0.1e)
OpenSUSE 12.2 (OpenSSL 1.0.1c)

Operating system distributions with versions that are not vulnerable:

Debian Squeeze (oldstable), OpenSSL 0.9.8o-4squeeze14
SUSE Linux Enterprise Server

How to fix:

Even though the actual code fix may appear trivial, the OpenSSL team is the expert in fixing it properly, so the fixed version 1.0.1g or newer should be used. If this is not possible, software developers can recompile OpenSSL with the heartbeat handshake removed from the code using the compile-time option -DOPENSSL_NO_HEARTBEATS.

With regard to the OpenSSL Heartbleed issue and its resolution, should I revoke or re-key my existing SSL cert?

Any certificate that was ever hosted on an internet-facing vulnerable version of OpenSSL should be revoked and replaced. The cost of exhaustively evaluating whether a certificate was in jeopardy is almost certainly going to be higher than the cost of simply replacing the certificate. This is also a good opportunity to make sure that your certificate key length and signature algorithms are ‘up to code.’

Because the private key might be compromised, you need to re-key the certificate instead of just renewing it, i.e. use a new public/private key pair rather than the existing one. The compromised certificate needs to be revoked too. This may happen automatically if you create the new certificate with the same CA, but you should check this with the issuer (CA).

Note that revocation handling in the current browser PKI is poor: some browsers don’t check revocation at all, some ignore OCSP errors, and so on. It is even worse outside the browsers (e.g. scripts, mobile apps…). That’s why, after the last big compromises or misbehavior of CAs (Comodo, DigiNotar, FGC/A …), you always got a new browser version 🙁

As noted on the Heartbleed site, appropriate response steps are broadly:

  • Patch vulnerable systems.
  • Regenerate new private keys.
  • Submit new CSR to your CA.
  • Obtain and install new signed certificate.
  • Revoke old certificates.
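The key and CSR steps from the list above might look like this with the openssl command line. Treat it as a sketch: the 2048-bit RSA key size, the file names and the helper name new_key_and_csr are assumptions made for this example, and your CA may ask for more subject fields than just the common name.

```shell
# Hypothetical helper: creates a brand new private key and a CSR for the
# given domain in the current directory (<domain>.key and <domain>.csr).
new_key_and_csr() {
    # Generate the key in a subshell with a restrictive umask so the file
    # is never readable by other users. Do not reuse the old key.
    ( umask 077 && openssl genrsa -out "$1.key" 2048 ) &&
    # Build the CSR from the new key; only the common name is set here.
    openssl req -new -key "$1.key" -out "$1.csr" -subj "/CN=$1"
}

# Possible usage:
#   new_key_and_csr example.com
```

Submit the resulting .csr file to your CA, and revoke the old certificate once the new one is installed.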

 

How to locate the top scripts on your server that send out email

You can search the Exim mail log for the scripts that send the most email to determine whether the mail looks like spam, and check your Apache access logs to find out how a spammer might be using your scripts to send it. Log in to your server via SSH as the root user, then run the following command to pull the locations of the most used mailing scripts from the Exim mail log:

grep cwd /var/log/exim_mainlog | grep -v /var/spool | awk -F"cwd=" '{print $2}' | awk '{print $1}' | sort | uniq -c | sort -n

Code breakdown:

grep cwd /var/log/exim_mainlog 	

Use the grep command to locate mentions of cwd from the Exim mail log. This stands for current working directory.

grep -v /var/spool

Use grep with the -v flag, which inverts the match, so we drop any lines that contain /var/spool. These are normal Exim deliveries, not mail sent in from a script.

   
awk -F"cwd=" '{print $2}' | awk '{print $1}' 	

Use the awk command with the field separator (-F) set to cwd=, and print the second field ($2), which is everything after cwd=. Then pipe that to awk again and print only the first column ($1), so that we get back just the script path.

sort | uniq -c | sort -n 	

Sort the script paths by their name, uniquely count them, then sort them again numerically from lowest to highest.
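The same counting can be done in a single awk pass, which is handy if you want to reuse it on rotated logs. This is a sketch rather than a drop-in replacement: it matches on "cwd=" instead of plain "cwd", prints the counts without padding, and the function name top_mail_dirs is invented here.

```shell
# Hypothetical helper: counts how often each working directory appears in
# "cwd=" lines of an Exim main log, skipping /var/spool deliveries.
# The log file defaults to /var/log/exim_mainlog.
top_mail_dirs() {
    awk -F'cwd=' '
        /cwd=/ && !/\/var\/spool/ {
            split($2, rest, " ")    # first word after "cwd=" is the directory
            count[rest[1]]++
        }
        END { for (dir in count) print count[dir], dir }
    ' "${1:-/var/log/exim_mainlog}" | sort -n
}

# Possible usage:
#   top_mail_dirs /var/log/exim_mainlog
```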

You should see something like this:


    15 /home/username/public_html/about-us
    25 /home/username/public_html
    7866 /home/username/public_html/data

Here we can see that the /home/username/public_html/data directory has by far more deliveries coming in than any other. Now we can run the following command to see what scripts are located in that directory:

ls -lahtr /home/username/public_html/data

In this case we got back:

    drwxr-xr-x 17 username username 4.0K Jan 20 10:25 ../
    -rw-r--r-- 1 username username 5.6K Jan 20 11:27 mailer.php
    drwxr-xr-x 2 username username 4.0K Jan 20 11:27 ./

So we can see there is a script called mailer.php in this directory. Knowing the mailer.php script was sending mail into Exim, we can now take a look at our Apache access log to see what IP addresses are accessing this script using the following command:

grep "mailer.php" /home/username/access-logs/example.com | awk '{print $1}' | sort -n | uniq -c | sort -n

You should get back something similar to this:


    2 123.123.123.126
    2 123.123.123.125
    2 123.123.123.124
    7860 123.123.123.123

So we can clearly see that the IP address 123.123.123.123 was responsible for using our mailer script maliciously. If you find a malicious IP address sending out a large volume of messages from a script on your server, you’ll probably want to block it at your server’s firewall so that it can’t connect again.
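If the list of IP addresses is long, you can generate the firewall commands from the counted output and review them before running anything. A sketch, assuming the "count address" format shown above and an iptables-based firewall; the threshold of 1000 requests and the name block_rules are arbitrary choices for this example.

```shell
# Hypothetical helper: reads "count ip" lines on stdin and prints (does not
# run) an iptables DROP command for every address at or above the limit.
block_rules() {
    awk -v limit="${1:-1000}" '
        $1 + 0 >= limit + 0 { print "iptables -I INPUT -s " $2 " -j DROP" }'
}

# Possible usage, printing the commands for review first:
#   grep "mailer.php" /home/username/access-logs/example.com \
#       | awk '{print $1}' | sort -n | uniq -c | sort -n | block_rules 1000
```

Printing the rules instead of applying them gives you a chance to make sure you are not about to block your own address.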

Remove all messages from the Exim mail queue:

# exim -bp | exiqgrep -i | xargs exim -Mrm

Are rkhunter report emails failing to reach you? Here is how to configure rkhunter to send them to the correct address.

Edit /etc/sysconfig/rkhunter:


nano /etc/sysconfig/rkhunter

# System configuration file for Rootkit Hunter which
# stores RPM system specifics for cron run, etc.
#
#    MAILTO= <email address to send scan report>
# DIAG_SCAN= no  - perform  normal  report scan
#            yes - perform detailed report scan
#                  (includes application check)

MAILTO=root@localhost
DIAG_SCAN=no

Change the MAILTO value to your own email address:

# System configuration file for Rootkit Hunter which
# stores RPM system specifics for cron run, etc.
#
#    MAILTO= <email address to send scan report>
# DIAG_SCAN= no  - perform  normal  report scan
#            yes - perform detailed report scan
#                  (includes application check)

MAILTO=admin@yourdomain.com
DIAG_SCAN=no

Save and you are all set!
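If you manage several servers, the same change can be scripted instead of edited by hand. This is a minimal sketch: the function name set_rkhunter_mailto is invented for this example, and it assumes the address contains no "|" or "&" characters, which would confuse the sed expression.

```shell
# Hypothetical helper: sets the MAILTO value in an rkhunter sysconfig file.
# Usage: set_rkhunter_mailto ADDRESS [FILE]
set_rkhunter_mailto() {
    addr=$1
    file=${2:-/etc/sysconfig/rkhunter}
    tmp=$(mktemp) || return 1
    # Only lines that start with MAILTO= are rewritten; the commented
    # description lines are left alone. Writing back with cat keeps the
    # original file's owner and permissions.
    sed "s|^MAILTO=.*|MAILTO=$addr|" "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Possible usage (as root):
#   set_rkhunter_mailto admin@yourdomain.com
```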

Root Cause Analysis

Root cause analysis (RCA) is a method of problem solving that tries to identify the root causes of faults or problems.

RCA practice tries to solve problems by attempting to identify and correct the root causes of events, as opposed to simply addressing their symptoms. Focusing correction on root causes has the goal of preventing problem recurrence. RCFA (Root Cause Failure Analysis) recognizes that complete prevention of recurrence by one corrective action is not always possible.

http://en.wikipedia.org/wiki/Root_cause_analysis

If you cannot send emails to Outlook, Hotmail or MSN, then your server’s IP address may be blacklisted. Here are some tips to get removed from the MSN blacklist.

Before jumping through the blacklist removal hoops, you may want to double-check that your emails are not simply going into the spam folder. This process will not help you with emails being dropped into the spam folder. This is for getting off of MSN’s blacklist. I am going to outline 3 steps.

Verify you are on the MSN blacklist.
Perform preliminary blacklist removal checks.
Submit MSN blacklist delisting request.

Delist Here – Sender Information for Outlook.com Delivery –

https://support.microsoft.com/en-us/getsupport?oaspworkflow=start_1.0.0.0&wfname=capsub&productkey=edfsmsbl3&locale=en-us&ccsid=635808707851479494&wa=wsignin1.0

MSN Blacklist Check

If MSN has blacklisted your IP, you will receive a delivery rejection notice from MSN or Hotmail. If you check your server’s logs or your email bounce you may see something like this:

SMTP error from remote mail server after end of data:
host mx1.hotmail.com [65.54.188.94]: 550 SC-001 (BAY0-MC2-F59) Unfortunately, messages from 216.55.xxx.xxx weren't sent. Please contact your Internet service provider since part of their network is on our block list. You can also refer your provider to http://mail.live.com/mail/troubleshooting.aspx#errors.

If you are seeing this or a similar email error, then your server’s IP has likely been blocked by MSN/Hotmail. There could be other response codes, but typically all MSN blacklist notifications will include a 500-series (5xx) error. MSN’s postmaster service has a list of MSN’s blacklist codes.

MSN Blacklist Codes

I suggest you check this list to find the exact reason Hotmail or MSN is rejecting your emails.

There are some 400 series errors that deal with email volume rather than suspected spam. If you are sending high volumes of email to MSN, you may need to sign up for their bulk sender’s program.
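To sort bounces into these two groups quickly, you can pull the three-digit reply code out of the log line and classify it. A sketch, assuming bounce lines shaped like the example above (host name, bracketed IP, colon, then the code); both helper names are made up for illustration.

```shell
# Hypothetical helper: extracts the SMTP reply code that follows
# "[ip-address]: " in a bounce line read from stdin.
bounce_code() {
    sed -n 's/.*]: \([0-9][0-9][0-9]\) .*/\1/p'
}

# Hypothetical helper: classifies a reply code. 4xx replies are temporary
# (for example volume limits), 5xx replies are permanent rejections.
smtp_reply_class() {
    case "$1" in
        4[0-9][0-9]) echo "temporary" ;;
        5[0-9][0-9]) echo "permanent" ;;
        *)           echo "other" ;;
    esac
}

# Possible usage:
#   smtp_reply_class "$(grep hotmail.com /var/log/exim_mainlog | tail -n 1 | bounce_code)"
```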

If you are not seeing 500 errors, then you may not have an email blacklist problem but some other email delivery issue.

Preliminary Blacklist Delisting Tasks

Before requesting removal from MSN’s blacklist, you will want to take some steps to stop whatever caused the listing.

Make sure there is no unauthorized email going from your server.

  • Check the daily volume of email going to Hotmail, MSN or Outlook
  • Look for compromised user accounts.
  • Look for people forwarding email to Hotmail, MSN, or Outlook.com.
  • Do you have SPF and rDNS records set up?

If someone is forwarding email to Hotmail-related addresses and then marking it as spam, Hotmail will lower your server’s sender reputation. Windows Live and related email services such as Hotmail and MSN.com work with Return Path to filter email, so email server reputation is more important for sending to these accounts than for some of the other ISPs covered in this series.
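For the SPF and rDNS item in the checklist above, it helps to know what is actually being looked up. A reverse DNS (PTR) query reverses the octets of the IP address under in-addr.arpa. The sketch below builds that name; ptr_name is a made-up helper, and the dig commands in the comments assume dig is installed and use placeholder names.

```shell
# Hypothetical helper: prints the reverse-DNS (PTR) lookup name for an
# IPv4 address, e.g. 65.54.188.94 -> 94.188.54.65.in-addr.arpa
ptr_name() {
    echo "$1" | awk -F. 'NF == 4 { print $4 "." $3 "." $2 "." $1 ".in-addr.arpa" }'
}

# Possible usage (needs network access, so left commented out):
#   dig +short PTR "$(ptr_name 65.54.188.94)"   # reverse DNS; dig -x does the same
#   dig +short TXT yourdomain.com               # look for a "v=spf1 ..." record
```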

Hotmail/MSN Blacklist Removal Process

To start the process of getting removed from Hotmail’s blacklist, you will need to complete their sender information form.

Unfortunately since Microsoft maintains their own blacklist they have no obligation to accept email from anyone. Please have a look at some of their suggestions located at https://mail.live.com/mail/services.aspx

I would suggest signing up for both SNDS and Microsoft’s Junk Mail Reporting Program.

Submit to get Delisted!

Sender Information for Outlook.com Delivery – https://support.microsoft.com/en-us/getsupport?oaspworkflow=start_1.0.0.0&wfname=capsub&productkey=edfsmsbl3&ccsid=636529520240187401&wa=wsignin1.0

Provide all of the requested information. Unlike some other ISPs, MSN Support requires you to run some telnet tests from the command line on your server. If you do not know how to run these tests, you will need to get someone to help you.

In working with MSN, I have found it very important to provide accurate email headers. If you provide reliable information and are truly not spamming their systems, you will typically see removal in 2-3 business days. MSN is very picky about DNS. So be sure your DNS, PTR and SPF/SenderID records are in order before requesting removal.

If you have root access and need to send email now, try the following.

Partial Solution:

If you have a Linux box with another IP address that is not blacklisted, re-route outbound SMTP (port 25) through that address.

Apply an iptables rule like the one below, replacing 216.55.xxx.xxx with the IP address that is not blacklisted:


# iptables -t nat -A POSTROUTING -p tcp --dport 25 -j SNAT --to-source 216.55.xxx.xxx