Trident Z F4-3200C14D-32GTZ on an ASUS Z170-Deluxe: Problems and Questions

  • Trident Z F4-3200C14D-32GTZ on an ASUS Z170-Deluxe: Problems and Questions

    Hello @all,

    I recently built a new machine and initially went with a Corsair CMK32GX4M2B3200C16 kit, which caused all kinds of trouble (and I am far from the only one having such problems with that kit across current ASUS boards right now). So I decided to buy my first memory from G.SKILL -- and, lo and behold, most of the problems simply went away.

    But a few questions and problems remain, so I would kindly ask for advice and any help would be greatly appreciated.

    The kit I have is a Trident Z F4-3200C14D-32GTZ, which is running on an ASUS Z170-Deluxe (BIOS 1801) with a 6700K at stock clocks (no intention of overclocking since I need this machine to be stable for my work as a self-employed software engineer). Since the BIOS by default pretty much overvolted the CPU (with >1.40v spikes under load, thus heavy LLC) and overclocked it, I deactivated the Overclock AI and put the Vcore on offset (with SVID enabled). The system is powered by a Corsair HX850i, by the way.

    When I load the XMP profile, even though the system boots/reboots fine, it is not stable. Usually a few minutes into Prime95 28.9 (Blend), it hangs and reboots itself... no BSOD or anything. That happened multiple times without exception.

    When I changed the "DRAM Current Capability" to 120% and increased the DRAM voltage to 1.3550v, Prime95 did not cause a reboot in two consecutive tries but failed with rounding errors (usually also after a few minutes in).

    Talking about the DRAM voltage: a curious thing, which others have also noticed on the Z170-Deluxe, is that even though 1.3500v has been entered, the BIOS monitor and any software monitor show only 1.3440v. If that is upped to 1.3550v, the voltage constantly alternates (as in: jumps, no transitions) between 1.3440v and 1.1360v -- if other values are entered (1.35XXv), those are rounded up or down to 1.3440v or 1.1360v. Like I said, this is not a faulty board but something also seen by others, so I am not alone with it. Just thought I would mention it in case it matters or is a known bug with a workaround.
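
One possible reading of the voltage behaviour described above: every reading quoted in this thread (1.3440, 1.1360, 1.208, 1.144, 1.200, 1.216, 1.160) happens to be a whole multiple of 8 mV. The sketch below only checks that arithmetic. The 8 mV step is an assumption inferred from those numbers, not something taken from ASUS documentation, and the helper names are made up for illustration.

```python
# Readings quoted in the thread, in millivolts (integers avoid float rounding).
OBSERVED_MV = [1344, 1136, 1208, 1144, 1200, 1216, 1160]

# ASSUMPTION: the monitor reports in 8 mV steps. This is inferred from the
# readings above only; it is not a documented property of the board.
STEP_MV = 8


def floor_to_step(requested_mv: int, step_mv: int = STEP_MV) -> int:
    """Largest multiple of step_mv that does not exceed requested_mv."""
    return (requested_mv // step_mv) * step_mv


def neighbours(requested_mv: int, step_mv: int = STEP_MV) -> tuple:
    """Grid values directly below and above a requested voltage."""
    low = floor_to_step(requested_mv, step_mv)
    high = low if requested_mv % step_mv == 0 else low + step_mv
    return low, high


# True if every quoted reading sits exactly on the assumed grid.
all_on_grid = all(mv % STEP_MV == 0 for mv in OBSERVED_MV)

print("all readings on an 8 mV grid:", all_on_grid)
print("grid values around a requested 1350 mV:", neighbours(1350))
```

If the assumption holds, a requested 1.3500v cannot be displayed exactly, and the nearest value below it is 1.3440v, which matches what the monitor shows. It says nothing about what the regulator actually delivers.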

    Right now I am running the kit at 3000 @ 1.3440v (see above) with 14-14-14-34-2N. The "DRAM Current Capability" is set back to 100%. I did a Prime95 (Blend) stability test for 26h straight while still using the machine and thus causing secondary load: No rounding errors, hangs or reboots.

    Before I forget: VCCSA and VCCIO are both set to Auto, which results in 1.208v for VCCSA and 1.144v for VCCIO, at least according to the BIOS monitor. Both fluctuate a bit under load: the former between 1.200v and 1.216v, and the latter can go up to 1.160v. So, based on the standard values in Intel's datasheet, the board already overvolts by default, most likely because of the higher memory frequency or such. "MCH Full Check" is enabled and "MRC Fast Boot" is disabled.

    So, with all that preamble out of the way, here are my questions, finally:

    1)
    I understand that according to the forum's sticky, with a 6700K, 1500 MHz is guaranteed to work. Out of experience though, should the kit work at its tested XMP settings with a 6700K on the Z170-Deluxe with the voltages stated above?

    2)
    I understand that Skylake's IMC is only specified for 2133 MHz and everything north of that is overclocking, but what DRAM speeds are supposed to just work at stock voltages and when (at what speeds) does one have to start increasing which voltages? Or does this really heavily depend on board, CPU quality and memory quality?

    My impression right now is that 1500 MHz is some kind of barrier that needs extra fiddling.

    3)
    What are the safe ranges for VCCSA and VCCIO? Naturally there is nothing to be found in the Intel datasheet. :-) And on the net, there are magic numbers floating around that are more like people's opinions than anything else.

    4)
    What is "DRAM Current Capability"? I searched for it but I actually never found a good explanation nor why it is needed and when it is exactly needed.

    5)
    I heard everywhere that the ASUS Z170-Deluxe is *the* board to get for memory compatibility, but wherever I look now it seems this board is one of the pickiest ones around. Is this just my impression or really the case?

    Thanks again in advance for any advice, explanations and help. And sorry for the huge wall of text. But you made it till the end. Thanks for sticking with me...

    Have a nice day,
    Matthias

  • #2
    1) Yes, no problem, it should be a great combination

    2) DDR4-3000 (1500MHz) is usually around the turning point. Intel standard is 2133, so you are right, 3000+ is when manual VCCSA Voltage may be necessary.

    3) It's safe as long as the system is using the additional voltage and the hardware is cool and not overheating. Numbers seem magic because it varies depending on hardware combo, so every person has their own opinion and experience.

    4) Don't worry, it is not necessary to tweak unless for DDR4-4000+

    5) Deluxe and ROG will always get attention and good OC ability because they are specifically designed for it. Most memory kits can work in these two motherboards.


    • #3
      Originally posted by BinaryKhaos View Post
      I understand that Skylake's IMC is only specified for 2133 MHz and everything north of that is overclocking, but what DRAM speeds are supposed to just work at stock voltages and when (at what speeds) does one have to start increasing which voltages? Or does this really heavily depend on board, CPU quality and memory quality?
      From what point onwards you need to increase VCCSA and IO depends on IMC quality of your specific CPU, but also varies from one board to another and also the memory itself. Some kits might need one and/or the other a bit higher for a certain memory ratio. The Vdimm is usually not the problem, when people have POST issues with their memory at XMP settings.

      Originally posted by BinaryKhaos View Post
      I heard everywhere that the ASUS Z170-Deluxe is *the* board to get for memory compatibility, but wherever I look now it seems this board is one of the pickiest ones around. Is this just my impression or really the case?
      The ASUS ROG models are probably their best tier memclockers, closely followed by their other higher end boards. The lower end models and 4 layer PCB designs like the Z170-A tend to struggle a bit earlier but can usually still hit DDR4-2800/3000.
      Team HardwareLUXX | Show off your G.SKILL products!


      • #4
        Hello @all,

        I am very sorry I did not answer earlier but I did not get a notification from the board that new replies were there, so I missed them entirely.

        Originally posted by GSKILL TECH View Post
        1) Yes, no problem, it should be a great combination
        Emphasis on the "should", unfortunately. This board is giving me quite some headache, actually. After I had a hang yesterday (Linux) and the kernel hung at a very unusual place, I decided to run another Prime95 v28.9 test (blend) and after 13 hours, I got a very nice:

        Code:
        Self-test 28K passed!
        Self-test 12K passed!
        FATAL ERROR: Final result was 00000000, expected: 7CD3183C.
        Now I am back to the drawing board since I need this machine to be stable. I have now adjusted the DIMM voltage up to 1.36V, since previously it was "only" at 1.344V and there is no way to get it at 1.35V fixed.

        In the meantime I have also updated to UEFI 1902 (from 1801). But the Prime95 error already happened on 1902.

        Originally posted by GSKILL TECH View Post
        2) DDR4-3000 (1500MHz) is usually around the turning point. Intel standard is 2133, so you are right, 3000+ is when manual VCCSA Voltage may be necessary.
        You mean "manual" as-in: Additional VCCSA Voltage adjustment above what the firmware already applies -- or in general, above the Intel standard voltage?

        Are timings also important for the IMC? Could the very low timings of the TridentZ modules I chose cause the trouble -- or is it usually simply the frequency the DIMMs are operated at?
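
To put numbers on the timing question above, here is a minimal sketch of the usual conversion from data rate and CAS latency to absolute latency. It only shows the arithmetic for the settings mentioned in this thread (CL14 at DDR4-3200 and at DDR4-3000). It makes no claim about how much either setting stresses the IMC, and the function names are made up for illustration.

```python
def memory_clock_mhz(data_rate_mts: float) -> float:
    """DDR transfers twice per clock, so the clock is half the data rate."""
    return data_rate_mts / 2.0


def cas_latency_ns(cas_cycles: int, data_rate_mts: float) -> float:
    """CAS latency in nanoseconds: cycles divided by clock frequency."""
    return cas_cycles / memory_clock_mhz(data_rate_mts) * 1000.0


# The kit's rated XMP speed and the speed it is currently run at, both CL14.
for rate in (3200, 3000):
    clock = memory_clock_mhz(rate)
    latency = cas_latency_ns(14, rate)
    print(f"DDR4-{rate} CL14: {clock:.0f} MHz clock, {latency:.2f} ns CAS latency")
```

The same number of cycles at a lower clock means a longer absolute latency, so 14-14-14 at DDR4-3000 is a slightly more relaxed setting in nanoseconds than 14-14-14 at DDR4-3200.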

        Originally posted by GSKILL TECH View Post
        4) Don't worry, it is not necessary to tweak unless for DDR4-4000+
        Ok... I set it to 120% now, but gonna revert it later then.

        Originally posted by emissary42 View Post
        From what point onwards you need to increase VCCSA and IO depends on IMC quality of your specific CPU, but also varies from one board to another and also the memory itself. Some kits might need one and/or the other a bit higher for a certain memory ratio. The Vdimm is usually not the problem, when people have POST issues with their memory at XMP settings.
        Had I only known this all before I assembled the new system, I would have made a few different choices. I usually research everything into the smallest detail, but with the memory I missed (initially) that the Skylake IMC was only rated @ 2133 MHz and everything above is OC and as such: luck.

        I only wish I could more easily test and trigger an error here instead of having Prime95 stress the system for countless hours. I also tried (and purchased) both HCI MemTest and AIDA64 Extreme... but the errors are sporadic and as such hard to replicate and test (and fix).

        Also, without knowing exactly if it is the CPU or the RAM, this is even more difficult.

        If anyone has any suggestions or ideas, that would be very much appreciated... I feel like I am slowly going insane over this machine. I never thought it would be so much trouble (I ran into other things as well).

        Thanks,
        Matthias


        • #5
          Hello @all,

          so I upped the VDIMM to 1.36V, changed the VCore back to Auto (from Offset w/ SVID enabled) and let Prime95 v28.9 run (with VCCSA and VCCIO at Auto, same as stated in my initial post, see above):

          Code:
          [Aug 1 11:13] Self-test 21K passed!
          [Aug 1 11:13] Test 1, 21000 Lucas-Lehmer iterations of M12969343 using FMA3 FFT length 800K, Pass1=320, Pass2=2560.
          [Aug 1 11:16] FATAL ERROR: Final result was 00000000, expected: 477EEFE4.
          [Aug 1 11:16] Hardware failure detected, consult stress.txt file.
          [Aug 1 11:16] Torture Test completed 966 tests in 39 hours, 7 minutes - 1 errors, 0 warnings.
          [Aug 1 11:16] Worker stopped.
          This is really frustrating, especially since the modules are set at 1500 MHz (DDR4-3000). :-( So I guess I could either play with VCCSA and VCCIO -- or run them at 1066.67 MHz (DDR4-2133) and call it a day. Naturally there is also the option that either the DIMMs or the CPU is defective. I guess I will let Prime95 run at stock settings and see if the same error pops up then as well.

          Any advice or suggestions would be very much appreciated -- especially on whether it is normal for these low-latency DIMMs @ DDR4-3000 to cause trouble with a 6700K at the voltages mentioned.

          Thanks again,
          Matthias


          • #6
            Hello @all,

            so I decided to return to stock (as-in: no XMP and run the RAM @ 2133) while I stress test the system for 72 hours straight with Prime95 blend at ~70% memory utilization (I still need to be able to use the machine).

            If this does not reveal any errors, then I hope the machine is stable and I guess I will just call it a day, even though I could have saved quite a bit of money on the RAM. Well... it is lottery after all, so... I learned my lesson.

            But at the end of the day, I need this machine to be rock-solid and dependable. And even if I find a combination that works at higher speeds, there is still no guarantee that data corruption won't happen due to the OC (and go unnoticed while screwing up my data); I would only have made it much less likely. And that is simply not acceptable to me.

            Still, if anyone has any more suggestions, hints and/or opinions, I would gladly hear them.

            Thanks a lot again,
            Matthias


            • #7
              Hello again,

              and unfortunately I just got this after 49 hours of Prime95 28.9 blend:

              Code:
              [Aug 3 17:10] Test 1, 4000000 Lucas-Lehmer iterations of M138527 using FMA3 FFT length 8K, Pass1=128, Pass2=64.
              [Aug 3 17:15] FATAL ERROR: Final result was 00000000, expected: D2E65D57.
              [Aug 3 17:15] Hardware failure detected, consult stress.txt file.
              [Aug 3 17:15] Torture Test completed 796 tests in 49 hours, 8 minutes - 1 errors, 0 warnings.
              [Aug 3 17:15] Worker stopped.
              Let me emphasize that the system is now running completely stock and I still ran into that problem again. It shouldn't be the infamous Skylake / Prime95 bug since the microcode (MC) revision is 74.
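
For comparing runs like the two quoted in this thread, a small parser for the torture-test summary line can help. This is a sketch only: the pattern is written against the exact wording of the two summary lines posted above and may not match other Prime95 versions or wordings (for example a singular "1 hour"). The function and field names are made up for illustration.

```python
import re

# Matches only the summary wording quoted in this thread.
SUMMARY = re.compile(
    r"Torture Test completed (?P<tests>\d+) tests in "
    r"(?P<hours>\d+) hours, (?P<minutes>\d+) minutes - "
    r"(?P<errors>\d+) errors, (?P<warnings>\d+) warnings"
)


def parse_summary(line: str):
    """Return the summary fields as integers, or None if the line does not match."""
    match = SUMMARY.search(line)
    if match is None:
        return None
    fields = {key: int(value) for key, value in match.groupdict().items()}
    fields["runtime_hours"] = fields["hours"] + fields["minutes"] / 60
    return fields


# The two summary lines posted in this thread.
LINES = [
    "[Aug 1 11:16] Torture Test completed 966 tests in 39 hours, 7 minutes - 1 errors, 0 warnings.",
    "[Aug 3 17:15] Torture Test completed 796 tests in 49 hours, 8 minutes - 1 errors, 0 warnings.",
]

for line in LINES:
    result = parse_summary(line)
    print(f"{result['tests']} tests, {result['errors']} error(s) "
          f"in {result['runtime_hours']:.1f} h")
```

Collecting the runtime until the first error for each configuration makes it easier to see whether a settings change moved anything or whether the failures are spread roughly the same way as before.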

              I am really out of ideas.

              I would really appreciate any help, suggestions, opinions and/or hints...

              Thanks so much,
              Matthias


              • #8
                An Error after 50 hours of prime blend does not necessarily have to hint at a problem with the memory itself. It could also be an issue related to voltage regulation or temperatures. Also if you are really paranoid about single errors it might be worth considering workstation class hardware with ECC memory.

                In the meantime why not test the memory with the current memtest86 v7 instead? That is basically the standard nowadays. A few loops don't take that long to complete and one night should be more than enough.


                • #9
                  Hello...

                  Originally posted by emissary42 View Post
                  An Error after 50 hours of prime blend does not necessarily have to hint at a problem with the memory itself.
                  That is exactly the point, and it is what is driving me insane: locating the culprit without a "fast" and reliable way to trigger this is a nightmare.

                  Originally posted by emissary42 View Post
                  It could also be an issue related to voltage regulation or temperatures.
                  Temperatures are all in the green, that is for sure. Voltage regulation would be the board. In all honesty, I wouldn't be surprised... but there is no way to prove this at all.

                  Originally posted by emissary42 View Post
                  Also if you are really paranoid about single errors it might be worth considering workstation class hardware with ECC memory.
                  This really doesn't sit too well with me, I am afraid. I decided in favor of Skylake and Z170 for various reasons... and yes, I really would have liked ECC memory. But all in all, a machine has to be stable... period. If you know there is data corruption going on that is not as unlikely as a bit switched due to cosmic rays, that is absolutely unacceptable.

                  I know, there are a lot of people out there who OC their system to the maximum, never ever pass an hour of Prime95 but still use the machine. I personally think that is insane but to each his own.

                  Originally posted by emissary42 View Post
                  In the meantime why not test the memory with the current memtest86 v7 instead? That is basically the standard nowadays. A few loops don't take that long to complete and one night should be more than enough.
                  Ran it for almost 13 hours (w/ multithreading), no errors. I cannot have it run for 72 hours or whatever unfortunately, since I do need to use the machine.
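
A rough way to judge how much a clean 13-hour run says: the sketch below takes the three Prime95 failure times reported in this thread (13 h, 39 h 7 min, 49 h 8 min) and asks how likely an error-free 13-hour window would be if errors arrived at that average rate. It rests on strong assumptions that are not established here: that failures are random and memoryless (exponential model), that the three runs are comparable even though their settings differed, and that memtest86 would trigger the fault as readily as Prime95 does.

```python
import math

# Hours until the first error in the three Prime95 runs reported in this thread.
FAILURE_HOURS = [13.0, 39 + 7 / 60, 49 + 8 / 60]


def mean_time_to_failure(hours):
    """Crude estimate: plain average of the observed times to first error."""
    return sum(hours) / len(hours)


def prob_clean_run(run_hours: float, mttf_hours: float) -> float:
    """Chance of seeing no error in run_hours under an exponential model."""
    return math.exp(-run_hours / mttf_hours)


mttf = mean_time_to_failure(FAILURE_HOURS)
p_clean = prob_clean_run(13.0, mttf)

print(f"estimated mean time to failure: {mttf:.1f} h")
print(f"chance of a clean 13 h run at that rate: {p_clean:.0%}")
```

Under these assumptions a 13-hour run would come back clean roughly two times out of three even on a machine that still has the fault, so a clean result of that length is weak evidence either way.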

                  Thanks,
                  Matthias


                  • #10
                    Originally posted by BinaryKhaos View Post
                    Voltage regulation would be the board. In all honesty, I wouldn't be surprised... but there is no way to prove this at all.
                    Yes, that would be the mainboard but a low quality power supply could also be a factor in it.

                    There are ways to measure this, but logging MB voltages with oscilloscope probes right at the VRMs is not for everyone.

                    Originally posted by BinaryKhaos View Post
                    If you know there is data corruption going on that is not as unlikely as a bit switched due to cosmic rays, that is absolutely unacceptable.
                    Well, if you test long enough you will eventually encounter a bit flip. There are many other possible triggers to it besides cosmic rays. That is why ECC memory and error correction in data storage is still useful even for data centres that are sealed off to any kind of radiation from the outside.

                    Originally posted by BinaryKhaos View Post
                    I know, there are a lot of people out there who OC their system to the maximum, never ever pass an hour of Prime95 but still use the machine. I personally think that is insane but to each his own.
                    I do, so please keep that in mind^^

                    Originally posted by BinaryKhaos View Post
                    Ran it for almost 13 hours (w/ multithreading), no errors. I cannot have it run for 72 hours or whatever unfortunately, since I do need to use the machine.
                    Then the memory is probably just fine (by my standards anyways).


                    • #11
                      Originally posted by emissary42 View Post
                      Yes, that would be the mainboard but a low quality power supply could also be a factor in it.
                      I wouldn't call a Corsair HX850i low quality. ;-)

                      Originally posted by emissary42 View Post
                      There are ways to measure this, but logging MB voltages with oscilloscope probes right at the VRMs is not for everyone.
                      That necessitates the proper equipment though that not everyone has... at least not me. But otherwise, really testing at that level would probably reveal what is the culprit and save me a lot of gray hair.

                      Originally posted by emissary42 View Post
                      Well, if you test long enough you will eventually encounter a bit flip.
                      I agree. But 19-49 hours should definitely not be it, imho -- especially not if it is reproducible every time... even though with some time variation.

                      Originally posted by emissary42 View Post
                      I do, so please keep that in mind^^
                      I am sorry, I meant no disrespect at all. Like I said, to each his own.

                      Originally posted by emissary42 View Post
                      Then the memory is probably just fine (by my standards anyways).
                      Maybe. I have seen faulty memory that had very sporadic "failure rates", so right now I am thinking either memory, cpu or board. Again, this feels like lottery... and I never win anything, so... sigh.

                      Thanks,
                      Matthias


                      • #12
                        Originally posted by BinaryKhaos View Post
                        I wouldn't call a Corsair HX850i low quality. ;-)
                        Yes, that is not low quality at all. I would rank it pretty high even among my personal favourites.

                        Originally posted by BinaryKhaos View Post
                        I am sorry, I meant no disrespect at all. Like I said, to each his own.
                        It's all good, don't worry about it.

                        Originally posted by BinaryKhaos View Post
                        Maybe. I have seen faulty memory that had very sporadic "failure rates", so right now I am thinking either memory, cpu or board.
                        So let's just pretend your power supply and memory are not faulty. On the hardware side that leaves the CPU and MB as possible culprits. The first things that come to mind:

                        1) Does your MB apply some kind of automatic turbo enhancement? If that is the case, disable that. If you want to go a step further, even disable the CPU turbo altogether, but that will have an even bigger impact on performance.

                        2) To improve power regulation you could raise the switching frequency of the VRM, for the CPU and the memory by a step. However keep in mind that while this will help to smooth the corresponding voltage output, it does lower VRM efficiency and does produce additional heat. Since you already tried increasing Vdimm above spec, it is maybe worth a try nonetheless.


                        • #13
                          Hi...

                          Originally posted by emissary42 View Post
                          1) Does your MB apply some kind of automatic turbo enhancement? If that is the case, disable that. If you want to go a step further, even disable the CPU turbo altogether, but that will have an even bigger impact on performance.
                          The magic performance enhancements are all off... and should be by default, if you ask me. The turbo boost is on, and I don't think this is causing any trouble here... it never kicks in while Prime95 is running -- ever.

                          Originally posted by emissary42 View Post
                          2) To improve power regulation you could raise the switching frequency of the VRM, for the CPU and the memory by a step.
                          Sorry to ask but what exactly is that for? And what would that imply if I could fix/mask the problem with that?

                          Originally posted by emissary42 View Post
                          However keep in mind that while this will help to smooth the corresponding voltage output, it does lower VRM efficiency and does produce additional heat. Since you already tried increasing Vdimm above spec, it is maybe worth a try nonetheless.
                          Heat on the VRM that is?

                          Thanks,
                          Matthias


                          • #14
                            Originally posted by BinaryKhaos View Post
                            Heat on the VRM that is?
                            Yes.

                            Originally posted by BinaryKhaos View Post
                            Sorry to ask but what exactly is that for? And what would that imply if I could fix/mask the problem with that?
                            All VRM basics explained, including switching frequency: http://sinhardware.com/index.php/vrm...s/82-vrm-guide



                            • #15
                              Hey...

                              Thanks a lot for the fantastic link -- that will take me a while to read though. ;-)

                              Just a quick question: Wouldn't it imply a fault either in the CPU, RAM or mobo, if increasing the switching frequency actually fixed this problem? And should it even be necessary at stock settings at all?

                              So long,
                              Matthias
