Note: the steps below are how I diagnosed the problem I describe. I'm providing them here in case they help you narrow down your issue somewhere along the way.
I have an Antminer S7 (batch 8) unit that has been giving me problems. It worked fine for a week or so then started slowing down. I could hear the fan slowing down and speeding up and if I checked the web GUI of the unit I could see that “chains” 1 and 3 often had very cool temperatures, as if they weren’t even running.
It would then try to get those chains engaged again, speed up, and then the cycle would repeat.
Eventually I’d get a high-pitched, piercing alarm sound, and at that point if I checked the GUI I’d see Xs in the ASIC display instead of Os (image below).
Above is how it was typically running. After a while an alarm would sound and it would look like below:
PSU: The first thing I tried was to swap power supplies with a working unit. A slightly underpowered PSU that can’t handle the demands of the S7 is a very likely culprit. That wasn’t the problem in my case.
FREQUENCY: The next thing I tried was to underclock it. Maybe Bitmain was pushing the boards too hard at 700M. You can go into the GUI and, under Miner Configuration – Advanced Settings, set the frequency to something slower like 650M. You’ll void the warranty if you overclock, so don’t go higher than 700M. I went down to 650M and the unit actually worked for a while and I thought that was it, but it eventually failed in the same way. Put the frequency back up to 700M before continuing to diagnose.
At that point I contacted Bitmain with screenshots and, even though it was a holiday there, they responded within hours:
Dear customer,
One of your hash boards is defective. Please separately test
your hash board and find out which one did not work.
Please create the repair ticket as below if you know which part
is broken.
How to test hash board separately: keep the cable on one of the
hash boards linked and the other two unlinked.
Be sure disconnect the PSU from the other hash boards when you test only one hash board, or the miner will get burnt.
Best Regards,
Bitmain
I appreciated the quick response, but since I had never opened an Antminer, the instructions seemed kind of vague. Here’s what I learned from the process and how I worked toward a resolution.
I suggest the first thing you do is open a ticket with Bitmain. They are touchy about the warranty, so if you are under warranty (90 days) you want to make sure you get on the clock. Also, you can’t open the box until they tell you to. Their tamper-resistant sticker will tell them you opened the box.
It’s somewhat hard to find on the site but what you are
looking for is under the Support section and it is called “Submit a request”. You should log in to your Bitmain account
first, then here’s a direct link to the request form:
http://support.bitmain.com/hc/en-us/requests/new
That being said, unless you are going to return a board there is no reason to open the box. There is nothing inside. It is simply an aluminum shell holding the 3 boards in parallel, with the fan(s) at the ends. ALL the connections are outside. There are NO connections inside. I thought maybe the boards were in PCIe slots or something like that, but that’s not the case. Here's opening the box, just for grins:
To test the boards separately, what they mean is to power
down the unit and disconnect the ribbon cable to one of the boards. Also disconnect the 3 PCI power connectors to
that same board. Then power the unit back up
as normal. Give it at least 5 minutes and check
the GUI.
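If you are going to repeat this check many times, counting the Os and Xs by eye gets tedious. Here is a minimal sketch that tallies them from a status string you paste in from the ASIC display. The exact string format is an assumption (o for a working chip, x for a failed one, anything else ignored), and the function name is made up for illustration:

```python
from typing import Tuple


def count_asics(status: str) -> Tuple[int, int]:
    """Return (working, failed) chip counts from a pasted ASIC status string.

    Assumption: 'o' marks a working chip and 'x' a failed one, in either
    case; spaces and any other characters are ignored.
    """
    lowered = status.lower()
    return lowered.count("o"), lowered.count("x")


# Example with a made-up status string: two failed chips out of sixteen.
working, failed = count_asics("oooooooo oooxxooo")
print(f"working={working} failed={failed}")
```

Any nonzero failed count after the 5 minute wait is worth a screenshot for your ticket.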
In my case I didn't disconnect 2 boards and leave one running; I disconnected one board and left 2 running. I first disconnected the right-most board. I consider the FRONT of the unit to be where the Ethernet connector is, so this is the board furthest to the right when you are facing the front of the unit. I'm sure it's just coincidence, but I've heard from at least 2 other people who had problems with this rightmost board.
Attach the remaining 7 PCI power connectors to your PSU(s) and turn it on. I powered up the unit and it worked perfectly. It was only showing 2 chains and only hashing at about 3200 GH/s, both as expected with one board disconnected. But it was rock solid. I let it run for an hour with no issues.
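As a rough sanity check on that number: if you assume the three boards contribute equally, the expected hashrate scales with the number of boards connected. The sketch below uses the roughly 4.7 TH/s this unit reaches with all three boards; the even split is an assumption, not a Bitmain spec, and the names are illustrative only.

```python
FULL_HASHRATE_GHS = 4700.0  # approximate rate with all 3 boards, from this post
TOTAL_BOARDS = 3


def expected_hashrate_ghs(active_boards: int) -> float:
    """Estimate hashrate assuming each board contributes an equal share."""
    if not 0 <= active_boards <= TOTAL_BOARDS:
        raise ValueError("active_boards must be between 0 and 3")
    return FULL_HASHRATE_GHS * active_boards / TOTAL_BOARDS


for boards in (1, 2, 3):
    print(boards, "board(s):", round(expected_hashrate_ghs(boards)), "GH/s")
```

With two boards this comes out a little over 3100 GH/s, in the same ballpark as the roughly 3200 GH/s the GUI reported.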
So it would appear that rightmost hashboard was bad. But, for grins, I reconnected that board and
disconnected the middle board. I
expected it to fail, but again it worked fine for as long as I let it run.
Hmm, what’s with that?
That would point back to a bad PSU.
One that doesn’t deliver enough power for the 2 hashing boards. I have 2 high end PSUs, so I took them off a
working unit and put them on this unit and the problem remained. WTF.
It would seem the issue is the unit can run with any 2 hashing boards
but not 3 boards at once. Eventually I
tried 3 PSUs, one on each board, and no matter what I did, when I had boards 2
& 3 connected it would shut off both of those boards.
So, I reread their instructions and did what they actually
said, which was to test each board separately.
Instead of disconnecting one board and running the other 2, I disconnected 2 boards and ran on one board at a time. The results were as expected: each board ran fine when run as a stand-alone board.
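With singles and pairs both in play, it helps to write down every board combination before you start so none gets skipped. A small sketch that lists them along with how many power connectors should be attached for each. It assumes 3 connectors per hash board plus 1 that is not on a hash board, which is inferred from the 7 connectors left after unplugging one board; the function name is hypothetical.

```python
from itertools import combinations

BOARDS = (1, 2, 3)
CONNECTORS_PER_BOARD = 3
OTHER_CONNECTORS = 1  # assumption: the one connector not on a hash board


def build_test_plan():
    """List every non-empty board combination with its connector count."""
    plan = []
    for size in range(1, len(BOARDS) + 1):
        for combo in combinations(BOARDS, size):
            connectors = CONNECTORS_PER_BOARD * len(combo) + OTHER_CONNECTORS
            plan.append((combo, connectors))
    return plan


for combo, connectors in build_test_plan():
    print("boards", combo, "->", connectors, "power connectors attached")
```

That gives 7 runs in total: three single-board runs, three pairs, and the full set. Note the result of each run next to its line.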
Now I’m thinking that it is either low line voltage, or that
the IO board (or the BB) on the miner is bad.
To test that I decided to try switching the ribbon cables to different
boards. It comes set up with RA -> B1
(ribbon “A” connected to board “1”), RB->B2 and RC->B3.
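If you want to work through the ribbon arrangements systematically rather than picking one swap, there are only six ways to connect three ribbons to three boards. This sketch just enumerates them, using the RA/RB/RC and B1/B2/B3 labels from above; the function name is made up for illustration.

```python
from itertools import permutations

RIBBONS = ("RA", "RB", "RC")
BOARDS = ("B1", "B2", "B3")


def ribbon_layouts():
    """Return every ribbon-to-board assignment; the first is the factory one."""
    return [dict(zip(RIBBONS, order)) for order in permutations(BOARDS)]


for layout in ribbon_layouts():
    print(", ".join(f"{ribbon}->{board}" for ribbon, board in layout.items()))
```

Power the unit down before moving any ribbon, and change only one thing between runs so you can tell what made the difference.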
So I decided to separate the troublesome boards by putting RA on B2 and RB on B1. That way boards 2 & 3 weren’t next to each other from the IO board’s point of view. To my surprise that worked. That fixed the problem. The unit was back up to hashing at 4.7 TH/s and ran for about 12 hours and then crapped out.
After working for 12 hours, now it is even worse than
before. No matter what I do I can’t get
it to run on more than one board.
BTW, I added a high load UPS to watch the line voltage and
wattage and nothing surprising there.
Everything was within specs at about 111-112 volts.
Conclusion: I think
my IO board is bad (or the BB board). It’s
either that or the software, but I find that hard to believe, unless it is some
config issue from their original setup.
Bitmain thinks it is the IO board so they are sending me a new one. I’ll update this once I receive it.
What happened? Did you receive the new board?
I have an S7 with a very similar issue. Is it possibly caused by a faulty ribbon cable?
What happened? I have the same issue on a couple of my miners. I received a new board but it's still the same. Could you please tell me the solution?
Can you come back here with updates?
I had the exact same problem. What should I do now?
Hi, have you got updates about this? I have the same problem with one board and I don't know whether to send it to Bitmain or not. Many times they can't fix it. What about you? What happened after?