Tuesday, October 4, 2016

Upgrading HP DL580 G7 from 7500 Xeons to E7s.

Every once in a blue moon I jot something down on this blog that I spent a great deal of time fruitlessly researching, in the hope it might help others. I've had one of those moments, and this time it involves the HP DL580 G7, a 4U rack server featuring 4 sockets for Intel Xeon 7500 and E7 (series I) processors.

I recovered this server from a non-production environment and discovered it had lived an unloved VM life - it was still running the original firmware for all of its components. At the time, it was also configured with Xeon E7530 CPUs (Nehalem architecture, 6 cores/per, HT, 2.0GHz).

These processors are unimpressive and lack a few modern features that the "tock"-cycle Westmere CPUs have. The Westmere CPUs are also more power-efficient and offer quite a few more cores per socket!

However, when researching the complications of this switchover, I was all but convinced that the swap required a different motherboard. I've seen this happen before - the Xeon Tulsa CPUs had different FSB clocks within the same CPU family; some motherboards could run one speed, others could run both.

https://community.hpe.com/t5/ProLiant-Servers-ML-DL-SL/DL580-G7-E7-Upgrade-issue/td-p/6841870

http://serverfault.com/questions/764327/proliant-dl580-g7-e7-upgrade-issue

Between these two threads, I had considered all hope lost. This machine was built in 2010 as an early release, and it had never been serviced. Hopeless, right? Well, I caught a deal on eBay for 4x E7-4850 CPUs (Westmere architecture, 10 cores/per, HT, 2.0GHz) and figured the worst I could do was break even on the CPUs if they didn't work.

So I did some prodding. Like many of you, I discovered that HP has paywalled access to firmware patches, even for machines that would have been covered under a service agreement at the time the firmware was released. It's horrid, I hate them for it, but they've been cruising under this modus operandi for a few years now and they're still making bank.

http://h20564.www2.hpe.com/hpsc/swd/public/detail?sp4ts.oid=4142793&swItemId=MTX_0f761666783e4f7fae26bc8bab&swEnvOid=4168#tab-history

Under the BIOS section for the DL580 G7, nested under Revision history, we discover that in 2011/03 HP released updates that support the E7 processors. However, paywalled.

http://h20564.www2.hpe.com/hpsc/swd/public/detail?swItemId=MTX_8680da63a0df49c5893917591b

So, what could we do to escape the paywall for hardware that may be worth a grand all-in? I tried googling for cp015261.exe but between dead FTPs and Russian indexes, I don't believe anyone has it any more. However, if you're HP smart, you know there's another way. HP bundles ALL of their firmware updates into handy "Smart Update" fix packs. I found two that were "accessible" via methods that may or may not work for you - it depends on your appetite and accessibility.

HP ProLiant Smart Update Firmware DVD version 10.10
HPE Service Pack for ProLiant DVD release 2016.04

The 10.10 release is pretty dated, but it worked just fine in Windows 2012 R2. This release patched the firmware to August 2011, supporting the E7s but without a quantum leap. It also performed lots of other, smaller firmware patches that simply downloading cp015261 wouldn't have taken into account. Once this package fully installed and reboots were had, I installed my E7 processors. They worked on the first boot, no complaints, no errors, no grief. Once I gave those a moment to roast, I went ahead with the 2016.04 release. The BIOS update gave me grief, saying there was a problem applying. However, after rebooting, the BIOS was updated and a subsequent run of 2016.04 ended up patching the iLO.

This is my story of upgrading a DL580 G7 with low-SKU 7500s to high-SKU E7s with minimal work on the box and a little bit of background effort making all of the right media available. If your shop has numerous HP boxes in production, especially if you've got an in-support G6/G7 around, getting your hands on that 10.10 release shouldn't be all that difficult. If for some reason you have a hard time getting a hold of it, I recommend modifying your Google query in such a way that alternate resources are presented. It's out there, I promise!

Thursday, September 12, 2013

Keeping track of DBCC SHRINKFILE, EMPTYFILE

The EMPTYFILE option is meant to give you a means to remove extra data files you added, particularly ones that have data in them. Executing this command causes SQL Server to rebalance the data to the other existing files within the filegroup. It's unusual to need this with tiny databases - you're more likely to need it on existing, legacy systems that exploded in growth, survived waves of technological innovation, and are probably more complicated now than they need to be.

When executing this command, there is a litany of things you must consider - IO contention, user access, latching/locking/blocking, and given that you're likely running this process on big databases - progress. I use Adam Machanic's sp_WhoIsActive for so many things and this is no exception. One of the columns in his script's output is percent_complete. In many cases, this column is unpopulated, likely due to the unpredictable nature of many SQL operations. Fortunately, percent complete is valid for the shrinkfile process! Use it to get a better handle on just when your shrinkfile will finish!
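As a sketch, here's how the two sessions fit together. The database and file names are placeholders, and the `command` value filter is the one shrink operations typically report in `sys.dm_exec_requests`:

```sql
-- Session 1: kick off the emptyfile (names are hypothetical)
USE BigLegacyDB;
GO
DBCC SHRINKFILE (N'ExtraDataFile', EMPTYFILE);
GO

-- Session 2: watch progress. sp_WhoIsActive surfaces this same
-- percent_complete column; the underlying DMV query looks roughly like:
SELECT session_id,
       command,
       percent_complete,
       estimated_completion_time / 1000 / 60 AS est_minutes_remaining
FROM sys.dm_exec_requests
WHERE command LIKE 'Dbcc%';  -- shrinks usually show as DbccFilesCompact
```

Poll the second query while the emptyfile runs and you'll see percent_complete tick upward.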

Bringing it back

I originally created this blog as an opportunity to voice my experiences, attempt to research the undocumented, and perhaps give the interwebs a chance to learn from things I spent way too much time on. Like most good intentions, these efforts went stale rather quickly and I left my name associated with not much more than a stub. Today, I'm going to reignite that energy.

These days I work as a Sr. DBA for a very large healthcare company. I don't spend as much time fiddling with hardware and quite a lot more time dealing with application nuances and how hardware can be used to solve those challenges.

Thursday, January 21, 2010

SQL Server - show list of all database files

SELECT database_id, type_desc, name, physical_name, state_desc, size FROM sys.master_files;

You can play with this query to display more or fewer columns, of course. This will give you a list of all files, including LOG files, which may not be your cup o' tea (WHERE type_desc <> 'LOG').
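One variant I find handy - note that the size column is in 8KB pages, so it needs converting to get readable numbers:

```sql
-- Data files only, with size converted from 8KB pages to MB
SELECT DB_NAME(database_id) AS database_name,
       name,
       physical_name,
       state_desc,
       size * 8 / 1024 AS size_mb
FROM sys.master_files
WHERE type_desc <> 'LOG';
```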

Cheers

Wednesday, January 20, 2010

SQL Server Shrinkage

Sometimes you run into log files (and sometimes data files) that run away from you in size. To shrink or reduce the size of a SQL Server file, especially logs, you need to trick the system into thinking the data has been written elsewhere. Here's how I do it.

USE [{database name}]
GO
-- Note: WITH TRUNCATE_ONLY was removed in SQL Server 2008;
-- this works on 2005 and earlier.
BACKUP LOG [{database name}] WITH TRUNCATE_ONLY;
CHECKPOINT;
DBCC SHRINKFILE ('{database filename}', {new file size});
USE [master]
GO
ALTER DATABASE [{database name}] MODIFY FILE ( NAME = N'{database filename}', MAXSIZE = {new max size}KB )
GO
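On SQL Server 2008 and later, TRUNCATE_ONLY is gone, so the equivalent trick is to flip the database into SIMPLE recovery, shrink, and flip it back. A minimal sketch, with placeholder names:

```sql
-- Switching to SIMPLE recovery truncates the log at the next checkpoint
USE [master];
GO
ALTER DATABASE [MyDatabase] SET RECOVERY SIMPLE;
GO
USE [MyDatabase];
GO
DBCC SHRINKFILE (N'MyDatabase_log', 1024);  -- target size in MB
GO
USE [master];
GO
ALTER DATABASE [MyDatabase] SET RECOVERY FULL;  -- put your recovery model back
GO
```

If you restore FULL recovery, take a full (or at least log) backup afterward so the log chain starts over.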

Oracle 10G startup problems - dbstart

When Oracle doesn't want to start, or simply doesn't start at all, here are some things that work for me:

Check $ORACLE_HOME/bin/dbstart -- can you start it manually and everything go smoothly?
Check $ORACLE_HOME/startup.log -- did something bomb?
Check /etc/oratab -- are your desired databases flagged with Y to indicate your interest in starting them?
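The oratab check is easy to script. Here's a minimal sketch - the SIDs and paths below are made up for illustration; in real life you'd read /etc/oratab instead of a sample string:

```shell
#!/bin/sh
# Sample oratab content (hypothetical entries); the real format is SID:ORACLE_HOME:Y|N
oratab_sample='# comment lines start with a hash
db10gR2:/oracle/product/10.2.0/db10gR2:Y
scratch:/oracle/product/10.2.0/db10gR2:N'

# Print the SIDs whose third field is Y (skipping comments)
autostart_sids=$(printf '%s\n' "$oratab_sample" | \
    awk -F: '!/^#/ && NF >= 3 && $3 == "Y" {print $1}')
echo "$autostart_sids"
```

Anything that prints here is what dbstart will try to bring up; if your database is missing from the list, that's your problem right there.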

There is a bug/QA opportunity in Oracle 10.2 for Linux in $ORACLE_HOME/bin/dbstart. Evidently a developer left a hardcoded path in the LISTNER (yes, also spelled wrong) variable around line 78. The path in place is /ade/vikrkuma or something to that effect. Make sure the entire path is replaced with $ORACLE_HOME.

Out of the box, Oracle does not provide an auto startup script to bring Oracle up at boot. There are several examples on the internet of creating an init.d script. The goal is to make sure it runs $ORACLE_HOME/bin/dbstart for your database. If you have multiple instances, your script has to be smart enough to declare its environment variables as the oracle user prior to running dbstart. I generally wrote a single shell script for each startup process and an init.d script to call each one individually. It made for a lot cleaner init.d script.
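For reference, a skeleton of the init.d wrapper might look like this - treat it as a sketch, since the owner, home path, and SID are assumptions you'd replace with your own:

```shell
#!/bin/sh
# Hypothetical /etc/init.d/oracle-db10gR2 - one wrapper per instance
ORACLE_OWNER=oracle
ORACLE_HOME=/oracle/product/10.2.0/db10gR2   # assumed path
ORACLE_SID=db10gR2                           # assumed SID
export ORACLE_HOME ORACLE_SID

case "$1" in
  start)
    # dbstart reads /etc/oratab and starts everything flagged Y
    su - "$ORACLE_OWNER" -c "$ORACLE_HOME/bin/dbstart $ORACLE_HOME"
    ;;
  stop)
    su - "$ORACLE_OWNER" -c "$ORACLE_HOME/bin/dbshut $ORACLE_HOME"
    ;;
  *)
    echo "Usage: $0 {start|stop}"
    exit 1
    ;;
esac
```

With one of these per instance, the main init.d script just calls each wrapper in order (ASM wrappers first, as noted below).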

Also check running processes for your oracle instance. Run cd $ORACLE_HOME and then type "pwd" to see what directory you're in. Your directory should have a unique name at some point (some structures are like /oracle/product/10.2.0/db10gR2). In this case, I would use db10gR2 as my unique name. Take this name and grep it from the running process output to see if anything at all ran. ( ps ax | grep db10gR2 ) If you have any processes running but do not have tnslsnr, you likely cannot connect to your database. I would recommend running a $ORACLE_HOME/bin/dbshut and closing everything out. Check your oratab file to make sure you have the necessary databases flagged to start, and try again. Also, your oratab should list ASM instances prior to databases since the databases rely on ASM!

Upgrading Dell 6850 to Xeon Tulsa 7130M (Req 800T)

This blog is for my own notes. If you run into it from Google and get an answer or have a question, well that's cool too.

The Dell PowerEdge 6850 server comes in many variants, but only two primary variants actually matter: the original series and Series II. The only meaningful difference is support for the 800MHz bus. Systems that support the 800MHz bus are Series II, are referred to as 800T systems, and use separate motherboards, memory risers, voltage regulators, and of course CPUs. Various BIOS revisions also dictate CPU support based on the date of release; a fully updated system should support any CPU designed for the socket, including Paxville and Tulsa.

The trick for 800T-based systems is to have the right VRMs in place for multiple processors. If you only intend to have 2 CPUs, you need nothing more than an 800T motherboard and 2 667/800 memory risers to accomplish your goal. If you want 4 CPUs, you'll need SPECIFIC YC902 A01 CPU VRMs and 1 PD838 cache VRM. Your system WILL NOT post without these 3 VRMs in place when Tulsa CPUs are used. If you have a Paxville 7041, the cache VRM is not required!

Beware of K5331 VRMs for sale masked as PD838. The PD838 has a heatsink on only one side -- I purchased one from Dell directly to confirm. Purchasing from Dell is almost as cheap as third party for brand new; I might even recommend it.

Below are some known part numbers to make life easier (and searchable!)

Dell PowerEdge 6850 Motherboard Planar 800T - RD318
Dell PowerEdge 6850 Memory Riser 667/800 - ND891
Dell PowerEdge 6850 CPU VRM 800T - YC902 (version 10.2, important)
Dell PowerEdge 6850 Cache VRM 800T - PD838 (Tulsa Support for big cache)

Awesome CPUs supported: 7041 7120M 7130M 7140M 7150M
I have NOT tested 667 support for rock and roll CPUs except for 7040.
However, they are: 7040 7110N 7120N 7130N 7140N 7150N.

That's pretty much everything I know about the 6850. A note: there are TWO firmwares for this system - one for the motherboard and a second auxiliary system firmware. Be sure to keep both updated for proper CPU support. When in doubt, buy two low-MHz pre-dual-core Xeon MPs for Socket 604 for a couple bucks and you should be able to boot no problem for BIOS fixes and the like.