Folding@Home

CougTek

Hairy Aussie
Joined
Jan 21, 2002
Messages
8,729
Location
Québec, Québec
Coug,

One thing I'd be interested in is what kind of PPD one of your super-servers gets on the Chrome app as opposed to bigadv.
That's the first thing I tried before I resumed bigadv earlier this week. I got the failed ID request error that I complained about a few days ago. On both servers. It's probably something in our firewall that is blocking it, but I won't open ports in our corporate firewall just to let the NaCl client run.
 

Howell

Storage? I am Storage!
Joined
Feb 24, 2003
Messages
4,740
Location
Chattanooga, TN
Looking in my proxy logs, there is nothing exotic there. All port 80, and all to either folding.Stanford.edu or assign5.Stanford.edu. I assume the assign server is part of a cluster and can be different for others.
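
If you want to rule the firewall in or out, here's a minimal reachability check against the two hosts from my logs (a Python sketch; your assign server may well be a different one):

import socket

# Hosts seen in my proxy logs; the assignment server may differ for you.
HOSTS = ["folding.stanford.edu", "assign5.stanford.edu"]
PORT = 80

for host in HOSTS:
    try:
        with socket.create_connection((host, PORT), timeout=5):
            print(f"{host}:{PORT} reachable")
    except OSError as exc:
        print(f"{host}:{PORT} blocked or down: {exc}")

If both connect from the server but the NaCl client still fails its ID request, the block is probably happening at the application layer rather than on the port.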
 

P5-133XL

Xmas '97
Joined
Jan 15, 2002
Messages
3,173
Location
Salem, Or
Depending upon the proxy server, it could also be looking at the application layer and filtering based on that...
 

CougTek

Hairy Aussie
Joined
Jan 21, 2002
Messages
8,729
Location
Québec, Québec
Anyone know what's up with ExtremeOverclocking's stats?
At this point, I think it's fair to assume that whoever takes care of the stats there either gave up or screwed up in the process. Maybe he did a Davin/Eugene special. Or maybe he concluded that the income from the site's ads wasn't worth the trouble of maintaining the stats system. I'm no expert in database management, but I doubt that it takes something like four days to purge old data, even with truly limited hardware.

There's still Kakao...
 

P5-133XL

Xmas '97
Joined
Jan 15, 2002
Messages
3,173
Location
Salem, Or
My PPD is going to drop significantly, and permanently. I've agreed to give a bunch of my surplus machines away to nieces and nephews so that they can game with them (i.e., using the games off my Steam account). Their parents do not have the money to buy their own machines, and this is really just an excuse to get a computer in each of their households. I agreed to give one away, and now that the news has gotten around I have a flood of requests; I've decided to say yes to all of them.

While I can set the machines up originally to fold, gaming and GPU folding are mutually exclusive. Further, these are children, so the likelihood that they will consistently turn folding on and off is small to nil.
 

CougTek

Hairy Aussie
Joined
Jan 21, 2002
Messages
8,729
Location
Québec, Québec
Thanks for taking the time to ask. I thought about doing it yesterday in Extreme Oc's forum, but their forum doesn't seem to receive a lot of traffic and I thought I wouldn't get an answer quickly.

It is a good gesture to give your systems away. It also means our global contribution will collapse very soon. I'll continue to use two servers for the FAH cause until early Wednesday; after that, they will be reassigned permanently to their originally intended purpose. My personal contribution (the one with my own systems) won't go back to what it was for the better part of last year, since I've lost my most powerful system to a GPU failure (which I haven't bothered to RMA yet). I also doubt I'll put money into computer gear for the remainder of 2014. I have other priorities right now (like replacing my car, which is long overdue).
 

P5-133XL

Xmas '97
Joined
Jan 15, 2002
Messages
3,173
Location
Salem, Or
A couple of high end video cards can easily make up for my loss. Handruin is likely to more than make up what is lost from my actions when he switches back from mining to folding.
 

CougTek

Hairy Aussie
Joined
Jan 21, 2002
Messages
8,729
Location
Québec, Québec
My team's new setup came in today with the 16 UCS blades, and we got a VNX 7600 storage array with 377 drives (50 of which are SSDs and the rest 10K SAS!). It was a massive 6'4" display of blinking blue LEDs as it initialized the drives this afternoon. It's almost ready for me to get my paws on it to test it out. I wish it were a viable option to run FAH over a weekend to stress the drives in. My plan for now is to use Prime95 with a local stress test; otherwise I'd do like you've done and burn it in with some FAH.
So, how were your tests? What was the top electrical consumption of the blade chassis when all 8 blades were running Prime 95 (I assume two 5108 chassis)?
 

Handruin

Administrator
Joined
Jan 13, 2002
Messages
13,926
Location
USA
So, how were your tests? What was the top electrical consumption of the blade chassis when all 8 blades were running Prime 95 (I assume two 5108 chassis)?

We ended up with two 5108 chassis, two 5596UP switches, two 6296UP fabric interconnects, and 16 B200 M3 blades. We weren't able to get the optimized RAM config of 256GB, so they came with 192GB in a non-optimal DIMM arrangement. I'm going to run some performance benchmarks to determine what impact this has and compare it to an optimal config by shuffling RAM around. If it's significant, I'll make our setup half 256GB and half 128GB rather than all 192GB.
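
The real benchmark will come later, but the quick sanity check I have in mind is a poor man's STREAM copy test: time large array copies to approximate sustained memory bandwidth, since an unbalanced DIMM population usually shows up as lower GB/s. A minimal sketch (assuming Python and numpy are available on the host; this isn't our actual test suite):

import time
import numpy as np

# 256 MiB of float64, large enough to blow past the CPU caches.
N = 256 * 1024 * 1024 // 8
src = np.random.rand(N)
dst = np.empty_like(src)

runs = []
for _ in range(10):
    start = time.perf_counter()
    np.copyto(dst, src)       # one read + one write over the memory bus
    runs.append(time.perf_counter() - start)

best = min(runs)
# A copy moves the data twice (read src, write dst), hence the factor of 2.
gib_s = 2 * src.nbytes / best / 2**30
print(f"best copy: {best * 1e3:.1f} ms  ~ {gib_s:.1f} GiB/s")

Run it once with the DIMMs as shipped and again after rebalancing them; the delta in GiB/s is the penalty I'm trying to quantify.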

The burn-in hasn't happened yet. We are still waiting for some of the final lab configuration to complete before I can begin my stress testing. We have a dedicated lab team that oversees a lot of the deployment (such as finding us floor space, power connections, and network uplinks). Hardware management isn't my full-time role so they help us with a lot of the process. Right now they're helping me with configuring the Nexus 5K/6K switches for network uplink connectivity, vlan and vsan management, and failover management.

I get involved once some of the lower-level configurations are completed and then I'll get into the UCS template and blade configurations which include the storage configuration (RAID types, RAID groups/pools, LUN organization, and SAN zoning management). From there I can then begin to roll out the bare metal OS installations (ESXi 5.5 in this case) along with accompanying vCenter deployments. When I get the stress test up and running, I'll give you some feedback on the consumption figures.
 

CougTek

Hairy Aussie
Joined
Jan 21, 2002
Messages
8,729
Location
Québec, Québec
Since management still hasn't provided me with the software licenses needed to configure the servers properly for production use, I've decided, in protest, to relaunch the FAH client. Only during the weekend, and perhaps during off-business hours too, until I finally receive what's missing to finalize the servers' software setup. And only on the two servers from the backup site; the other three already have partial production virtual machines on them and I don't want to disturb that setup.
 

CougTek

Hairy Aussie
Joined
Jan 21, 2002
Messages
8,729
Location
Québec, Québec
A new client has been available since March 19th: version 7.4.4. The previous version, the one most of us still use, is 7.3.6.
 

P5-133XL

Xmas '97
Joined
Jan 15, 2002
Messages
3,173
Location
Salem, Or
The new client is incompatible with HFM.net. If that is important, then don't upgrade. Other than that one big problem, it is an improvement.
 

Handruin

Administrator
Joined
Jan 13, 2002
Messages
13,926
Location
USA
Seems my GTX 780 is producing crap for points these days. It's averaging like 40K. Did something change with F@H or is there a way for me to see if something is wrong on my end of things?
 

P5-133XL

Xmas '97
Joined
Jan 15, 2002
Messages
3,173
Location
Salem, Or
PG has only a limited number of Core_17 WUs, so the vast majority of WUs available are old Core_15 non-QRB (quick return bonus) p8018s or p76xx, which really suck PPD-wise. At some point they will create some new projects and all will be good again. When, I can't say.

The other problem is that they have a new Core_17 version (v0.0.54) available for internal (the p8902s were my bread and butter for solid PPD) and beta, but it fails the vast majority of WUs on Nvidia GPUs, and that will also kill productivity if you are using those client types.

For GPUs, the only good solution currently is to run Linux with the non-beta/internal client, since Linux is the only platform getting good Core_17 projects right now.
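
To put numbers on why non-QRB WUs hurt so much: the quick-return bonus, as I understand it, multiplies base points by max(1, sqrt(k * deadline / elapsed)), while a non-QRB WU only ever pays the base. A rough sketch in Python; the project numbers are made up, and k = 0.75 is just a commonly cited value, not any specific project's constant:

import math

def ppd(base_points, days_per_wu, k=None, deadline_days=None):
    """Estimated points per day; applies the QRB when k and deadline are given."""
    points = base_points
    if k is not None and deadline_days is not None:
        # Quick-return bonus: faster returns earn a sqrt-shaped multiplier.
        points *= max(1.0, math.sqrt(k * deadline_days / days_per_wu))
    return points / days_per_wu

# Hypothetical WU returned in half a day against a 6-day deadline:
print(f"with QRB:    {ppd(5000, 0.5, k=0.75, deadline_days=6):,.0f} PPD")  # 30,000
print(f"without QRB: {ppd(5000, 0.5):,.0f} PPD")                           # 10,000

Same WU, same hardware, three times the PPD, which is why losing the Core_17 QRB projects stings.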
 

P5-133XL

Xmas '97
Joined
Jan 15, 2002
Messages
3,173
Location
Salem, Or
Stanford seems to be giving out Core_17 projects, once again. At least I'm getting some now.
 

P5-133XL

Xmas '97
Joined
Jan 15, 2002
Messages
3,173
Location
Salem, Or
To anyone folding with the NaCl client:

It is useful to upgrade Chrome to the 34.1847.92+ beta-m version. It gets rid of several major bugs in the NaCl engine: a memory leak; an accumulation of files in the temp folder (the temp folder could get enormous, to the point of filling up a drive, unless it was cleaned out occasionally); and a limit on how long you could fold without needing to restart Chrome. The previous beta had fixed those but added a new bug that kept it from being recommended; that too has been fixed in the most recent version, so it is worth upgrading to the newest beta. Even if you don't choose to run a beta version of Chrome, these fixes will eventually percolate to the standard release.
 

timwhit

Hairy Aussie
Joined
Jan 23, 2002
Messages
5,278
Location
Chicago, IL
I've had a few issues with stability. I don't think the temp file issue affects Linux.

The WU points seem really low right now for NaCl. I did between 54 and 56 per 3 hours overnight, and that netted me 10 points per WU. Is this what I can expect from now on, or was that a fluke?
 

P5-133XL

Xmas '97
Joined
Jan 15, 2002
Messages
3,173
Location
Salem, Or
They created a new WU but forgot to add QRB to it, so everyone was getting just the base 10 points. Stanford knows and will correct it.
 

CougTek

Hairy Aussie
Joined
Jan 21, 2002
Messages
8,729
Location
Québec, Québec
I don't know why, but my PPD collapsed for the virtual machines computing big units. The only difference I can see is the recent batch of Windows Updates. It looks like I lost half of the performance I had before.
 

CougTek

Hairy Aussie
Joined
Jan 21, 2002
Messages
8,729
Location
Québec, Québec
It wasn't like the issue discussed in the link you gave. Anyway, I upgraded the guest OS, the FAH client and FAHControl, and installed one update on the host OS, and one of the two virtual machines went back to its original performance. The other one became slower. Way slower (8.2 days to complete a project 8575, for instance). I tried reinstalling the client and deleting the work folder; no improvement. So I erased the VM and I'm now creating a new one. The UI seems faster than the old one was. I'll see later whether I can compute big units at the expected performance, or whether I'll go with just one VM from now on.
 

CougTek

Hairy Aussie
Joined
Jan 21, 2002
Messages
8,729
Location
Québec, Québec
I've lost several hours on this, for what was probably a non-issue. When I attempted to reinstall the VM's OS, it didn't work; the unit still showed as taking several days to complete. Pissed, I decided to format the entire server (the host, not only the VM). I even modified the BIOS settings to switch everything to performance mode. A few hours later, when everything was reinstalled, I switched from Lubuntu to the Alpha 1 release of Debian Jessie because, you know, I like to play it safe. Once the VM was all configured, I restarted the FAH client, and the first unit still showed as needing 7.7 days to complete. Then I went looking for answers on the web, but when I returned to the FAHControl screen, the time-per-frame had returned to the expected values. It probably would have done the same if I had stayed with the original configuration, too.

So, while I don't know what created this oddity on this particular server, everything is back to normal. I've simply lost about two days' worth of production.
 

P5-133XL

Xmas '97
Joined
Jan 15, 2002
Messages
3,173
Location
Salem, Or
Sorry about that,

One thing of note is that the recent beta v7.4+ estimates of PPD are much more accurate than v7.3.6's were.
 