GPU Rig

Submitted by Xilodyne on Mon, 10/30/2017 - 16:10
GTX 1060

I recently visited my friend MJ in Taipei City, Taiwan, and we made our way to Guanghua Digital Plaza, an awesome place for new and cheap electronics. I ended up walking out with 18 kg of gear for a cryptocurrency mining platform, which I figured would be easy to convert to a neural network server.

 

Initial Setup

For about $500 I came home with the beginning of a GPU server:

Segotep case front with fans    Segotep front

Segotep without gpu    ROG Strix H270F motherboard

 

ROG Strix H270F

I'm not a hardware guy and it's been literally years since I've looked at a BIOS, but with no OS installed the Strix BIOS pops right up. It looks amazing, with lots of great information including fan speed and CPU temperature.

only two matching drive holes

Once back in Tokyo, I bought a 6TB HD for about $180. It turns out the drive mount that came with the case only had two holes that matched the drive. I would have preferred four screws holding the drive, but two will do. I then installed Ubuntu 17.03. Up and running!

Pretty loud server

Still not sure how I'm going to run this inside our apartment. At 78 dB it is pretty loud.

Adding a GPU

I wasn't sure how the GPU was supposed to connect to the motherboard, as the PCIEX16 slots on the motherboard are far from the card mounts on the case. After some googling, it was apparent I needed a riser, something that connects a PCIEX slot on the motherboard to the PCIEX16 connector of the GPU. After some more searching, I saw that Tsukumo eX in the Akihabara district of Tokyo might have a riser.

Luckily Tats Beniya works there. He is a fluent English speaker and a super knowledgeable hardware guru. With his help I obtained a CUDA 8 compatible GPU and a riser. It turns out that Nvidia doesn't sell any cards in Japan, so I bought a locally produced GPU and riser for 18,700 yen. Tats also suggested that I spend an extra 1,000 yen for an option that would allow me to exchange the GPU for another GPU of greater value for up to one month. I'm glad I did that, as I started to have problems later on.

1050 gpu    1050 gpu

PCI Express riser   riser

I installed the riser and GPU. I could see right away that there might be some issues with the riser separating from the GPU, as there is space between the bottom of the GPU and the case. I could just imagine that over time the riser would become loose and cause problems. The riser uses a USB 3.0 cable to connect to the PCIEX slot.

USB 3.0 connector to PCIEX     gpu

After I installed the GPU with the riser, about every third reboot I would see a massive number of PCI errors appear on the screen, which would prevent booting into the OS. When I googled the problem, a lot of people said it could be related to the riser.

pci errors

Powering down, removing the USB cable from the PCIEX slot, powering up, then powering down again and reattaching the USB cable would only fix the problem occasionally. Sometimes it would work for days, only for the problem to rear its ugly head again.
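
Since the errors only showed up on some boots, it can help to count them per device in a saved kernel log rather than reading them off the screen. The following is a minimal sketch, not what I actually ran: the function name `count_pcie_errors` is made up for illustration, and the patterns assume the messages look like the kernel's usual "PCIe Bus Error" / "AER:" lines, which may not match what appeared on my screen.

```python
import re

# Assumed message fragments; the real text on screen may differ.
PCIE_ERROR_PATTERN = re.compile(r"PCIe Bus Error|AER:", re.IGNORECASE)
# PCI addresses look like 0000:00:1c.4 (domain:bus:device.function).
DEVICE_PATTERN = re.compile(r"\b([0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\b")


def count_pcie_errors(log_text):
    """Return {pci_address: line_count} for lines that look like PCIe errors."""
    counts = {}
    for line in log_text.splitlines():
        if not PCIE_ERROR_PATTERN.search(line):
            continue  # not a PCIe error line
        match = DEVICE_PATTERN.search(line)
        address = match.group(1) if match else "unknown"
        counts[address] = counts.get(address, 0) + 1
    return counts
```

Feeding it the text of `dmesg` (or a saved boot log) after a good boot and after a bad one would show whether the errors always come from the same address, which is a hint about which slot or riser is involved.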

 

 

 

motherboard gpu preference

Looking through the BIOS I saw this message:


For best performance of your graphics card(s), use the following configuration according to the number of graphics card(s) you want to install:

To use 1 graphics card, we recommend you to install the graphics card on the PCIE_X16_1 slot.

To use 2 graphics cards, we recommend you to install the graphics cards onto the PCIE_X16_1 and PCIex16_2 slots.

So I wasn't sure if my PCI errors were coming from:

  • The riser card
  • Something with the motherboard (using the PCIEX1_1 slot and not the PCIEX16_1 slot)
  • A faulty GPU
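
One way to get some evidence on the slot question is to look at the link each PCI Express device actually negotiated. This is only a sketch under assumptions: it parses text captured from `lspci -vv` (which usually needs `sudo` to show link details), the helper name `link_status_by_device` is hypothetical, and it does not by itself tell a bad riser from a bad GPU.

```python
import re

# Matches e.g. "LnkSta: Speed 2.5GT/s, Width x1, ..." and also newer
# lspci output such as "Speed 8GT/s (ok), Width x16 (ok)".
LINK_STATUS = re.compile(r"LnkSta:\s+Speed\s+([\d.]+GT/s)[^,]*,\s+Width\s+(x\d+)")


def link_status_by_device(lspci_text):
    """Map each device header line to (speed, width) from `lspci -vv` text."""
    result = {}
    device = None
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            device = line.strip()  # unindented lines start a new device section
            continue
        match = LINK_STATUS.search(line)
        if match and device is not None:
            result[device] = (match.group(1), match.group(2))
    return result
```

Comparing the reported width for the GPU when it sits behind the PCIEX1_1 slot versus the PCIEX16_1 slot shows what the motherboard is really giving the card in each configuration.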

 

New GPU

Back at Tsukumo eX, I spoke with Tats, who was again super helpful.

He had nice solutions to my problems:

USB in PCIEX16_1 slot

To use the PCIEX16_1 slot, the riser attachment card meant for the PCIEX1_1 slot could be reversed and plugged into PCIEX16_1 instead.

To test whether my riser was the problem, I bought a slightly more expensive PCIEX16 to PCIEX16 ribbon. I figured that, as I would be installing up to five more GPUs, it would be best to have some way of testing whether problems are with the risers or the GPUs.

new riser    pci ribbon

 

I also traded in my GTX1050 for a GTX1060. I decided to get an ASUS GPU so it would be the same brand as the motherboard manufacturer (though Tats said it probably didn't matter). Even better, this GPU has two fans that only turn on at 60 degrees Celsius. Thus by turning off the three fans on the case and letting the GPU decide when to cool itself, I dramatically reduced the amount of noise the rig produces. It does mean that I leave the top off the case so that the heat can dissipate more easily. Nice to have some quiet finally.
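
With the case fans off, it is worth keeping an eye on the GPU temperature. Here is a rough sketch of how that could be polled from Python. It assumes the NVIDIA driver is installed so that `nvidia-smi` is available (which is not the case yet at this point in the build), and the function names are just illustrative.

```python
import subprocess

# nvidia-smi query for temperature (Celsius) and fan speed (percent).
QUERY = [
    "nvidia-smi",
    "--query-gpu=temperature.gpu,fan.speed",
    "--format=csv,noheader,nounits",
]


def parse_gpu_status(csv_text):
    """Parse 'temperature, fan' rows into a list of (temp_c, fan_percent or None)."""
    rows = []
    for line in csv_text.strip().splitlines():
        temp_field, fan_field = [field.strip() for field in line.split(",")]
        # Some cards report a non-numeric placeholder instead of a fan speed.
        fan = int(fan_field) if fan_field.isdigit() else None
        rows.append((int(temp_field), fan))
    return rows


def read_gpu_status():
    """Run nvidia-smi and return one (temp_c, fan_percent) tuple per GPU."""
    output = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_gpu_status(output)
```

Calling `read_gpu_status()` every few seconds while the card is under load would show whether the fans really stay off below 60 degrees and how hot the card gets with the top of the case removed.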

GTX 1060    gtx 1060

Tats also pointed out that this GPU needs its own power supply in addition to the riser power supply.  As he expected, my power supply had all the cables I needed.

gpu power input    gpu power cable

Finally, to close the gap between the bottom of the riser board and the case, Tats suggested adding screws to the bottom of the riser board to rest against the case. As he didn't have non-conducting screws in stock, he advised insulating them with electrical tape.

screw supports  insulated screws

Altogether the new GPU and riser cost me 34,600 yen.

Alright! A quiet server, up and running without any errors. Now on to getting CUDA installed...
