A NetBackup admin test toolbox

As a NetBackup admin, you are probably no stranger to solving other people's problems. The most common one is very bad backup performance. This article describes the test tools I use in my day-to-day work.

vxbench
A utility created by Symantec (previously Veritas). It’s a splendid tool for finding bad disk performance, read or write. It works best on a VxFS file system (obviously). Vxbench has different workloads built in (sequential read/write and random read/write), and you can specify the block size as well. I always check new disk storage units with vxbench before putting them into production. Vxbench is available for Solaris, AIX, HP-UX and Linux. The package is called VRTSspt and can be downloaded from the Symantec site.

Examples:
vxbench -w write -i iosize=128,iocount=262144 /diskstu4/dsu/testfile1

output:
total: 111.531 sec 300852.32 KB/s cpu: 48.65 sys 0.04 user

You can get the VRTSspt package here: http://www.symantec.com/docs/TECH27451
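
To sanity-check a vxbench result, the arithmetic behind the example above can be sketched in shell. This is a minimal sketch, assuming iosize is given in KB (the reported KB/s figure is consistent with that); the function names are my own and not part of vxbench.

```shell
# Total data moved by a vxbench run, in GiB.
# Assumption: iosize is in KB, which matches the KB/s figure vxbench reported.
vxbench_total_gib() {
    iosize_kb=$1
    iocount=$2
    awk -v s="$iosize_kb" -v c="$iocount" \
        'BEGIN { printf "%.0f\n", s * c / 1024 / 1024 }'
}

# Average throughput in KB/s for a run that took the given number of seconds.
vxbench_kb_per_sec() {
    iosize_kb=$1
    iocount=$2
    seconds=$3
    awk -v s="$iosize_kb" -v c="$iocount" -v t="$seconds" \
        'BEGIN { printf "%.0f\n", s * c / t }'
}

# Values taken from the example run above.
vxbench_total_gib 128 262144
vxbench_kb_per_sec 128 262144 111.531
```

With the example values this prints 32 (GiB written) and 300853 (KB/s), which agrees with the 300852.32 KB/s vxbench reported, allowing for the rounded elapsed time.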

tcpdump
Whenever a firewall closes in on you, tcpdump is your friend. You don’t need to understand all the output; it’s reasonably easy to see connections in and out.

NetBackup only uses ports 1556 and 13724 (the latter for backward compatibility).

Here is a list of my most often used tcpdump commands. I always use the following arguments:

-i Specifies which interface to listen on, e.g. eth6

-f Prints foreign IPv4 addresses in numerical notation rather than as names

-n Prevents addresses and service ports from being translated into names (prints 1556 instead of VRTSpbx).

Listen for traffic for an entire network
# tcpdump -n -f -i eth6 net 10.10.10.0/24

Listen for traffic for just one host
# tcpdump -n -f -i eth6 host 10.1.1.1

Or just one service port.
# tcpdump -n -f -i eth2 port ssh

You can also trace traffic for two hosts at the IP layer only.
# tcpdump -n -f -i eth1 ip host 10.224.13.1 or 10.224.13.2

Listen for traffic but don’t clutter the picture with your own SSH traffic
# tcpdump -n -f -i eth5 ip and not port 22
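
The same building blocks can be combined to watch only the NetBackup ports mentioned earlier. The helper below is a minimal sketch; nbu_filter is a hypothetical name of my own, and the interface and IP address in the usage comment are just the sample values from the commands above.

```shell
# Build a tcpdump filter that matches only the NetBackup ports (1556 and 13724).
# Optional first argument: an IP address to narrow the capture to one host.
nbu_filter() {
    filter='(port 1556 or port 13724)'
    if [ -n "${1:-}" ]; then
        filter="host $1 and $filter"
    fi
    printf '%s\n' "$filter"
}

# Usage (needs root; eth6 and 10.1.1.1 are sample values):
# tcpdump -n -f -i eth6 "$(nbu_filter 10.1.1.1)"
```

Quoting the command substitution matters here, because the parentheses in the filter would otherwise be interpreted by the shell.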

Using NetBackup bpbkar as a test tool
You can run bpbkar (the process responsible for reading from disk) by hand to see how it performs when the network and tape drive layers are cut out. When you run bpbkar by hand, data is read from disk and thrown into the bit bucket. This lets the admin find out whether the problem is on the client side or the server side.

# Windows
[INSTALL_PATH]\NetBackup\bin\bpbkar32.exe -nocont D:\ 1> nul 2> nul

# Unix
/usr/openv/netbackup/bin/bpbkar -nocont -nofileinfo -nokeepalives /var > /dev/null 2> /tmp/file.out

Make sure you have created the bpbkar debug directory in [INSTALL_PATH]\NetBackup\logs before starting. On Windows the command returns immediately, but the process will be visible in Task Manager, and the debug log will grow in size as well.

If running bpbkar by hand takes the same amount of time as a “real” backup to tape or disk, you know the problem is on the client, and you know where to chase the next bottleneck.
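
The comparison can be written down as a small helper. This is only a sketch: bottleneck_side is a hypothetical name, and the 10 percent tolerance is an arbitrary assumption of mine, not a NetBackup rule.

```shell
# Compare the elapsed seconds of a manual bpbkar run with the elapsed seconds
# of the real backup of the same data, and suggest where to look next.
# Assumption: "the same time" means within 10 percent (arbitrary tolerance).
bottleneck_side() {
    manual_secs=$1
    backup_secs=$2
    awk -v m="$manual_secs" -v b="$backup_secs" 'BEGIN {
        if (m >= b * 0.9) print "client"
        else              print "server or network"
    }'
}

# One way to get manual_secs on Unix (date +%s is widely available):
# start=$(date +%s)
# /usr/openv/netbackup/bin/bpbkar -nocont -nofileinfo -nokeepalives /var > /dev/null 2> /tmp/file.out
# end=$(date +%s); manual_secs=$((end - start))
```

For example, a manual run of 3500 seconds against a 3600 second backup points at the client, while a manual run of 600 seconds points downstream of it.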

nbperfchk
NetBackup includes a disk performance test tool.

The tool is described in this tech note:
http://www.symantec.com/docs/HOWTO94369

NB_ORA_PLOG

There is a new environment variable in NetBackup 7.6 called NB_ORA_PLOG. The variable is not documented in the Symantec NetBackup 7.6 for Oracle Administrator’s Guide.

Nevertheless, I was curious about what the new variable is used for. This is the answer I got:

NB_ORA_PLOG parameter is a NetBackup internal mechanism to identify the progress log to be shared/referenced by multiple processes involved in template/Guided Recovery/Intelligent Policy operations. Users should never configure this setting, hence the reason it is not included in the NetBackup for Oracle Admin Guide.

So now you know 🙂

ddboost storage-unit show compression

Be aware that “ddboost storage-unit show compression {storage-unit}” only shows information about 16384 backup files. According to EMC this is “by design”.

With Data Domain OS 5.3 or newer you can instead use this command, which displays all files in an MTree:
# filesys show compression /data/col1/{mtree name} recursive no-sync

Setting the Instance or Database field in the NetBackup 7.6 Activity Monitor from RMAN

You may have noticed the new field called “Instance or Database” in the NetBackup 7.6 Activity Monitor. This field is populated with the SID during backups that use Intelligent Policies. But the joy does not stop here: you can also populate this field from custom scripts. Just set NB_ORA_SID= in the RMAN SEND command in the RCV script.

This is a sample script; the database SID is NMATEST:
# -----------------------------------------------------------------
# RMAN command section
# -----------------------------------------------------------------
RUN {
ALLOCATE CHANNEL ch00
TYPE 'SBT_TAPE'
PARMS 'SBT_LIBRARY=/usr/openv/netbackup/bin/libobk.so64';
SEND 'NB_ORA_CLIENT=ora1.mass.dk,NB_ORA_SID=NMATEST,NB_ORA_SERV=srv1.mass.dk,NB_ORA_POLICY=ORA_MANUAL,NB_ORA_SCHED=daily';
BACKUP
INCREMENTAL LEVEL=1
FORMAT 'bk_d%d_u%u_s%s_p%p_t%t'
DATABASE;
RELEASE CHANNEL ch00;
# Backup Archived Logs
sql 'alter system archive log current';
ALLOCATE CHANNEL ch00
TYPE 'SBT_TAPE'
PARMS 'SBT_LIBRARY=/usr/openv/netbackup/bin/libobk.so64';
ALLOCATE CHANNEL ch01
TYPE 'SBT_TAPE'
PARMS 'SBT_LIBRARY=/usr/openv/netbackup/bin/libobk.so64';
SEND 'NB_ORA_CLIENT=ora1.mass.dk,NB_ORA_SID=NMATEST,NB_ORA_SERV=srv1.mass.dk,NB_ORA_POLICY=ORA_MANUAL,NB_ORA_SCHED=daily';
BACKUP
FORMAT 'arch_d%d_u%u_s%s_p%p_t%t'
ARCHIVELOG
ALL;
DELETE ARCHIVELOG ALL BACKED UP 2 TIMES to DEVICE TYPE sbt;
RELEASE CHANNEL ch00;
RELEASE CHANNEL ch01;
# Control file backup
ALLOCATE CHANNEL ch00
TYPE 'SBT_TAPE'
PARMS 'SBT_LIBRARY=/usr/openv/netbackup/bin/libobk.so64';
SEND 'NB_ORA_CLIENT=ora1.mass.dk,NB_ORA_SID=NMATEST,NB_ORA_SERV=srv1.mass.dk,NB_ORA_POLICY=ORA_MANUAL,NB_ORA_SCHED=daily';
BACKUP
FORMAT 'ctrl_d%d_u%u_s%s_p%p_t%t'
CURRENT CONTROLFILE;
RELEASE CHANNEL ch00;
}
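
Since the same SEND line appears three times in the script, a wrapper that generates the RCV file could build it once. A minimal sketch in shell; nb_ora_send is a hypothetical helper of my own, not something shipped with NetBackup.

```shell
# Print an RMAN SEND command that sets the NetBackup variables, including
# NB_ORA_SID, which is what populates the "Instance or Database" field.
# Arguments: client, SID, master server, policy, schedule.
nb_ora_send() {
    printf "SEND 'NB_ORA_CLIENT=%s,NB_ORA_SID=%s,NB_ORA_SERV=%s,NB_ORA_POLICY=%s,NB_ORA_SCHED=%s';\n" \
        "$1" "$2" "$3" "$4" "$5"
}

# Reproduces the SEND line used in the sample script above.
nb_ora_send ora1.mass.dk NMATEST srv1.mass.dk ORA_MANUAL daily
```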

Clustered NetBackup 7.6 master server softlinks

Clustered NetBackup 7.6 master servers use an undocumented link that will cause issues when performing a catalog restore if the nbu_server resource is stopped or in a failed state.

VCS creates a link on startup from /usr/openv/var/global to /opt/VRTSnbu/var/global and removes it again when the nbu_server resource is taken offline.

Before performing any restore of the NBDB/EMM database, create the link manually:

# cd /usr/openv/var
# ln -s /opt/VRTSnbu/var/global global

Catalog restores will then work as described in the troubleshooting manual.
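
If you script the restore preparation, it can be worth checking whether the link is already there before creating it. The following is a minimal sketch; ensure_symlink is a hypothetical helper name, and the paths in the usage comment are the ones from this section.

```shell
# Create link_path -> target only if link_path does not already exist.
# Returns non-zero if something that is not a symlink is in the way.
ensure_symlink() {
    target=$1
    link_path=$2
    if [ -L "$link_path" ]; then
        return 0    # link already present, for example created by VCS
    fi
    if [ -e "$link_path" ]; then
        echo "$link_path exists and is not a symlink" >&2
        return 1
    fi
    ln -s "$target" "$link_path"
}

# Usage (as root on the master server):
# ensure_symlink /opt/VRTSnbu/var/global /usr/openv/var/global
```

The check for an existing symlink makes the helper safe to run while the nbu_server resource is online and VCS has already created the link.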

NetBackup FT debugging

1: On the SAN client, set the debug logging:

vxlogcfg -a -p 51216 -o 200 -s DebugLevel=6 -s DiagnosticLevel=6
vxlogcfg -a -p 51216 -o 137 -s DebugLevel=6 -s DiagnosticLevel=6
vxlogcfg -a -p 51216 -o 156 -s DebugLevel=6 -s DiagnosticLevel=6

2: Stop and start the “nbftclnt” service.

3: Capture the debug logging from the SAN client console:

nbftclnt -console “monitor
Successful discovery:
DeviceInquiry: EVPD Page 0x83 “SYMANTECFATPIPE 0.0 tamar”
GetScsiAddress:GetScsiAddress: m_DeviceName = (/dev/sg435)
AddDevice:/dev/sg435
Inquiry “SYMANTECFATPIPE 0.0 tamar”
TargetHBA:LUN:InitiatorHBA = 0:0:0x10 State = 1 RefCount = 0
ClosePTDeviceHandle:/dev/sg435 m_HandleOpenCount 0
DeviceInquiry: EVPD Page 0x83 “SYMANTECFATPIPE 0.1 tamar”
GetScsiAddress:GetScsiAddress: m_DeviceName = (/dev/sg436)
AddDevice:/dev/sg436
Inquiry “SYMANTECFATPIPE 0.1 tamar”
TargetHBA:LUN:InitiatorHBA = 0:1:0x10 State = 1 RefCount = 0
ClosePTDeviceHandle:/dev/sg436 m_HandleOpenCount 0
DeviceInquiry: EVPD Page 0x83 “0”