Folding@Home

CougTek

Hairy Aussie
Joined
Jan 21, 2002
Messages
8,729
Location
Québec, Québec
Coug,

One thing I'd be interested in is what kind of PPD one of your super-servers gets on the Chrome app as opposed to bigadv.
That's the first thing I tried before I resumed bigadv earlier this week. I got the failed ID request error that I complained about a few days ago. On both servers. It's probably something in our firewall that is blocking it, but I won't open ports in our corporate firewall just to let the NaCl client run.
 

Howell

Storage? I am Storage!
Joined
Feb 24, 2003
Messages
4,740
Location
Chattanooga, TN
Looking in my proxy logs, there is nothing exotic there. All port 80, and all to either folding.Stanford.edu or assign5.Stanford.edu. I assume the assign server is part of a cluster and can be different for others.
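
If you want to rule the firewall in or out, here's a minimal reachability check against the two hosts from my logs (a Python sketch; your assign server may well be a different one):

import socket

# Hosts seen in my proxy logs; the assignment server may differ for you.
HOSTS = ["folding.stanford.edu", "assign5.stanford.edu"]
PORT = 80

for host in HOSTS:
    try:
        with socket.create_connection((host, PORT), timeout=5):
            print(f"{host}:{PORT} reachable")
    except OSError as exc:
        print(f"{host}:{PORT} blocked or down: {exc}")

If both connect from the server but the NaCl client still fails its ID request, the block is probably happening at the application layer rather than on the port.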
 

P5-133XL

Xmas '97
Joined
Jan 15, 2002
Messages
3,173
Location
Salem, Or
Depending upon the proxy server, it could also be looking at the application layer and filtering based on that...
 

CougTek

Hairy Aussie
Joined
Jan 21, 2002
Messages
8,729
Location
Québec, Québec
Anyone know what's up with ExtremeOverclocking's stats?
At this point, I think it's fair to assume that whoever takes care of the stats there either gave up or screwed up in the process. Maybe he did a Davin/Eugene special. Or maybe he concluded that the income from the site's ads wasn't worth the trouble of maintaining the stats system. I'm no expert in database management, but I doubt that it takes something like four days to purge old data, even with truly limited hardware.

There's still Kakao...
 

P5-133XL

Xmas '97
Joined
Jan 15, 2002
Messages
3,173
Location
Salem, Or
My PPD is going to drop significantly, and permanently. I've agreed to give a bunch of my surplus machines away to nieces and nephews so that they can game with them (i.e., using the games off my Steam account). Their parents do not have the money to buy their own machines, and this is really just an excuse to get a computer in each of their households. I agreed to give one away, and now that the news has gotten around I have a flood of requests; I've decided to say yes to all of them.

While I can set the machines up originally to fold, gaming and GPU folding are mutually exclusive. Further, these are children, so the likelihood that they will consistently turn folding on and off is small to nil.
 

CougTek

Hairy Aussie
Joined
Jan 21, 2002
Messages
8,729
Location
Québec, Québec
Thanks for taking the time to ask. I thought about doing it yesterday in Extreme Oc's forum, but their forum doesn't seem to receive a lot of traffic and I thought I wouldn't get an answer quickly.

It is a good gesture to give your systems away. It also means our global contribution will collapse very soon. I'll continue to use two servers for the FAH cause until early Wednesday; after that, they will be reassigned permanently to their originally intended purpose. My personal contribution (the one with my own systems) won't go back to what it was for the better part of last year, since I've lost my most powerful system to a GPU failure (which I haven't bothered to RMA yet). I also doubt I'll put money into computer gear for the remainder of 2014. I have other priorities right now (like replacing my car, which is long overdue).
 

P5-133XL

Xmas '97
Joined
Jan 15, 2002
Messages
3,173
Location
Salem, Or
A couple of high end video cards can easily make up for my loss. Handruin is likely to more than make up what is lost from my actions when he switches back from mining to folding.
 

CougTek

Hairy Aussie
Joined
Jan 21, 2002
Messages
8,729
Location
Québec, Québec
My team's new setup came in today with the 16 UCS blades, and we got a VNX 7600 storage array with 377 drives (50 of which are SSDs and the rest 10K SAS!). It was a massive 6'4" display of blinking blue LEDs as it initialized the drives this afternoon. It's almost ready for me to get my paws on it to test it out. I wish it were a viable option to run FAH over a weekend to stress the drives in. My plan for now is to use Prime95 with a local stress test; otherwise I'd do like you've done and burn it in with some FAH.
So, how were your tests? What was the top electrical consumption of the blade chassis when all 8 blades were running Prime 95 (I assume two 5108 chassis)?
 

Handruin

Administrator
Joined
Jan 13, 2002
Messages
13,926
Location
USA
So, how were your tests? What was the top electrical consumption of the blade chassis when all 8 blades were running Prime 95 (I assume two 5108 chassis)?

We ended up with two 5108 chassis, two 5596UP switches, two 6296UP fabric interconnects, and 16 B200 M3 blades. We weren't able to get the optimized RAM config of 256GB, so they came with 192GB in a non-optimal DIMM arrangement. I'm going to run some performance benchmarks to determine what impact this has and compare it to an optimal config by shuffling RAM around. If it's significant, I'll make our setup half 256GB and half 128GB rather than all 192GB.
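
The real benchmark will come later, but the quick sanity check I have in mind is a poor man's STREAM copy test: time large array copies to approximate sustained memory bandwidth, since an unbalanced DIMM population usually shows up as lower GB/s. A minimal sketch (assuming Python and numpy are available on the host; this isn't our actual test suite):

import time
import numpy as np

# 256 MiB of float64, large enough to blow past the CPU caches.
N = 256 * 1024 * 1024 // 8
src = np.random.rand(N)
dst = np.empty_like(src)

runs = []
for _ in range(10):
    start = time.perf_counter()
    np.copyto(dst, src)       # one read + one write over the memory bus
    runs.append(time.perf_counter() - start)

best = min(runs)
# A copy moves the data twice (read src, write dst), hence the factor of 2.
gib_s = 2 * src.nbytes / best / 2**30
print(f"best copy: {best * 1e3:.1f} ms  ~ {gib_s:.1f} GiB/s")

Run it once with the DIMMs as shipped and again after rebalancing them; the delta in GiB/s is the penalty I'm trying to quantify.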

The burn-in hasn't happened yet. We are still waiting for some of the final lab configuration to complete before I can begin my stress testing. We have a dedicated lab team that oversees a lot of the deployment (such as finding us floor space, power connections, and network uplinks). Hardware management isn't my full-time role so they help us with a lot of the process. Right now they're helping me with configuring the Nexus 5K/6K switches for network uplink connectivity, vlan and vsan management, and failover management.

I get involved once some of the lower-level configurations are completed and then I'll get into the UCS template and blade configurations which include the storage configuration (RAID types, RAID groups/pools, LUN organization, and SAN zoning management). From there I can then begin to roll out the bare metal OS installations (ESXi 5.5 in this case) along with accompanying vCenter deployments. When I get the stress test up and running, I'll give you some feedback on the consumption figures.
 

CougTek

Hairy Aussie
Joined
Jan 21, 2002
Messages
8,729
Location
Québec, Québec
Since management still hasn't provided me with the software licenses needed to configure the servers properly for production use, I've decided, in protest, to relaunch the FAH client. Only during the weekend, and perhaps during off-business hours too, until I finally receive what's missing to finalize the servers' software setup. And only on the two servers from the backup site; the other three already have partial production virtual machines on them and I don't want to disturb that setup.
 

CougTek

Hairy Aussie
Joined
Jan 21, 2002
Messages
8,729
Location
Québec, Québec
A new client has been available since March 19th: version 7.4.4. The previous version, the one most of us still use, is 7.3.6.
 

P5-133XL

Xmas '97
Joined
Jan 15, 2002
Messages
3,173
Location
Salem, Or
The new client is incompatible with HFM.net. If that is important, then don't upgrade. Other than that one big problem, it is an improvement.
 

Handruin

Administrator
Joined
Jan 13, 2002
Messages
13,926
Location
USA
Seems my GTX 780 is producing crap for points these days. It's averaging like 40K. Did something change with F@H or is there a way for me to see if something is wrong on my end of things?
 

P5-133XL

Xmas '97
Joined
Jan 15, 2002
Messages
3,173
Location
Salem, Or
PG has only a limited number of Core_17 WUs, so the vast majority of WUs available are old Core_15 non-QRB (quick return bonus) p8018s or p76xx, which really suck PPD-wise. At some point they will create some new projects and all will be good again. When, I can't say.

The other problem is that they have a new Core_17 version (v0.0.54) available for internal (the p8902s were my bread and butter for solid PPD) and beta, but it fails the vast majority of WUs on Nvidia GPUs, and that will also kill productivity if you are using those client types.

For GPUs, the only good solution currently is to run Linux with the non-beta/internal client, since Linux is the only platform getting good Core_17 projects right now.
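
To put numbers on why non-QRB WUs hurt so much: the quick-return bonus, as I understand it, multiplies base points by max(1, sqrt(k * deadline / elapsed)), while a non-QRB WU only ever pays the base. A rough sketch in Python; the project numbers are made up, and k = 0.75 is just a commonly cited value, not any specific project's constant:

import math

def ppd(base_points, days_per_wu, k=None, deadline_days=None):
    """Estimated points per day; applies the QRB when k and deadline are given."""
    points = base_points
    if k is not None and deadline_days is not None:
        # Quick-return bonus: faster returns earn a sqrt-shaped multiplier.
        points *= max(1.0, math.sqrt(k * deadline_days / days_per_wu))
    return points / days_per_wu

# Hypothetical WU returned in half a day against a 6-day deadline:
print(f"with QRB:    {ppd(5000, 0.5, k=0.75, deadline_days=6):,.0f} PPD")  # 30,000
print(f"without QRB: {ppd(5000, 0.5):,.0f} PPD")                           # 10,000

Same WU, same hardware, three times the PPD, which is why losing the Core_17 QRB projects stings.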
 

P5-133XL

Xmas '97
Joined
Jan 15, 2002
Messages
3,173
Location
Salem, Or
Stanford seems to be giving out Core_17 projects, once again. At least I'm getting some now.
 

P5-133XL

Xmas '97
Joined
Jan 15, 2002
Messages
3,173
Location
Salem, Or
To anyone folding with the NaCl client:

It is useful to upgrade Chrome to the 34.1847.92+ beta-m version. It gets rid of several major bugs in the NaCl engine: a memory leak; an accumulation of files in the temp folder (the temp folder could get enormous, to the point of filling up a drive, unless it was cleaned out occasionally); and a limit on how long you could fold without needing to restart Chrome. The previous beta had fixed those but added a new bug that kept it from being recommended; that too has been fixed in the most recent version, so it is worth upgrading to the newest beta. Even if you don't choose to run a beta version of Chrome, these fixes will eventually percolate to the standard release.
 

timwhit

Hairy Aussie
Joined
Jan 23, 2002
Messages
5,278
Location
Chicago, IL
I've had a few issues with stability. I don't think the temp file issue affects Linux.

The WU points seem really low right now for NaCl. I did between 54 and 56 per 3 hours overnight, and that netted me 10 points per WU. Is this what I can expect from now on, or was that a fluke?
 

P5-133XL

Xmas '97
Joined
Jan 15, 2002
Messages
3,173
Location
Salem, Or
They created a new WU but forgot to add QRB to it, so everyone was getting just the base 10 points. Stanford knows and will correct it.
 

CougTek

Hairy Aussie
Joined
Jan 21, 2002
Messages
8,729
Location
Québec, Québec
I don't know why, but my PPD collapsed for the virtual machines computing big units. The only difference I can see is the recent batch of Windows Updates. It looks like I lost half of the performance I had before.
 

CougTek

Hairy Aussie
Joined
Jan 21, 2002
Messages
8,729
Location
Québec, Québec
It wasn't like the issue discussed in the link you gave. Anyway, I upgraded the guest OS, the FAH client and FAHControl, and installed one update on the host OS, and one of the two virtual machines went back to its original performance. The other one became slower. Way slower (8.2 days to complete a project 8575, for instance). I tried reinstalling the client and deleting the work folder; no improvement. So I erased the VM and I'm now creating a new one. The UI seems faster than the old one was. I'll see later whether I can compute big units at the expected performance, or whether I'll go with just one VM from now on.
 

CougTek

Hairy Aussie
Joined
Jan 21, 2002
Messages
8,729
Location
Québec, Québec
I've lost several hours on this, for what was probably a non-issue. When I attempted to reinstall the VM's OS, it didn't work; the unit still showed as taking several days to complete. Pissed, I decided to format the entire server (the host, not only the VM). I even modified the BIOS settings to switch everything to performance mode. A few hours later, when everything was reinstalled, I switched from Lubuntu to the Alpha 1 release of Debian Jessie because, you know, I like to play it safe. Once the VM was all configured, I restarted the FAH client, and the first unit still showed as needing 7.7 days to complete. Then I went looking for answers on the web, but when I returned to the FAHControl screen, the time-per-frame had returned to the expected values. It probably would have done the same if I had stayed with the original configuration, too.

So, while I don't know what created this oddity on this particular server, everything is back to normal. I've simply lost about two days' worth of production.
 

P5-133XL

Xmas '97
Joined
Jan 15, 2002
Messages
3,173
Location
Salem, Or
Sorry about that,

One thing of note is that the recent beta v7.4+ estimates of PPD are much more accurate than v7.3.6's were.
 