Thursday, May 16, 2013

Korean IPS Monitors and Nvidia under Linux

The business is doing well, so I thought I'd upgrade my computer. Bye bye old laptop plus extra 23" monitor; hello new desktop system, SSD, a meaty Nvidia graphics card and best of all, a pair of awesome Korean 26" IPS monitors. Cool!

Actually, not cool - the displays wouldn't work due to what turned out to be a stupid bug in the Nvidia drivers for Linux. Read on for the story and a solution...

At first, I used the open-source Nouveau drivers, and the displays worked, but were very slow - slower than my old laptop, which made a bit of a mockery of the Nvidia graphics card. I tried the proprietary Nvidia drivers, but both screens blanked as soon as I logged in so I switched back.

But when I upgraded to the latest version of Ubuntu (from 12.10 to 13.04), it became obvious that the new Nouveau drivers were unstable. The machine would randomly crash several times a day, displaying a minced version of the desktop and requiring a hard reset. So I tried the proprietary Nvidia drivers again.

Same result... booting fine to the login screen, then two blank screens as soon as I logged in. I managed to get a terminal session going by hitting Ctrl-Alt-F1 at the login screen, and went searching for clues to what was going wrong. The X server writes a log file, /var/log/Xorg.0.log, and it contained the following entries:

[     4.380] (**) NVIDIA(0): Depth 24, (--) framebuffer bpp 32
[     4.380] (==) NVIDIA(0): RGB weight 888
[     4.380] (==) NVIDIA(0): Default visual is TrueColor
[     4.380] (==) NVIDIA(0): Using gamma correction (1.0, 1.0, 1.0)
[     4.380] (**) NVIDIA(0): Enabling 2D acceleration
[     5.140] (WW) NVIDIA(GPU-0): The EDID read for display device DFP-0 is invalid: the
[     5.140] (WW) NVIDIA(GPU-0):     checksum for EDID version 1 extension is invalid.
[     5.140] (--) NVIDIA(GPU-0): 
[     5.140] (--) NVIDIA(GPU-0): Raw EDID bytes:
[     5.140] (--) NVIDIA(GPU-0): 
[     5.140] (--) NVIDIA(GPU-0):   00 ff ff ff ff ff ff 00  04 62 9b 04 00 00 00 00
...
[     5.140] (--) NVIDIA(GPU-0):   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
[     5.140] (--) NVIDIA(GPU-0): 
[     5.159] (WW) NVIDIA(GPU-0): The EDID read for display device DFP-2 is invalid: the
[     5.159] (WW) NVIDIA(GPU-0):     checksum for EDID version 1 extension is invalid.
[     5.159] (--) NVIDIA(GPU-0): 
[     5.159] (--) NVIDIA(GPU-0): Raw EDID bytes:
[     5.159] (--) NVIDIA(GPU-0): 
[     5.159] (--) NVIDIA(GPU-0):   00 ff ff ff ff ff ff 00  5d 34 fa 00 00 00 00 00
...
[     5.159] (--) NVIDIA(GPU-0):   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00

After hitting Google, I found that the EDID is a block of data that a monitor passes to the graphics card, containing its operational parameters (name, frequencies, resolution etc.). The data is protected by a checksum because corrupted data could physically damage the card or the monitor. The Nvidia driver decided the checksum was wrong, so it ignored the data - and therefore the monitor. Fair enough, if the EDID checksum actually was wrong...
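For the curious, the checksum rule itself is simple: an EDID is made of 128-byte blocks (a base block plus optional extension blocks), and the last byte of each block is chosen so that all 128 bytes add up to zero modulo 256. Here is a minimal Python sketch of that rule. The function names are made up for illustration, and this is certainly not the Nvidia driver's code:

```python
def edid_block_checksums_ok(edid):
    """Return one True/False per 128-byte block of an EDID blob."""
    if len(edid) == 0 or len(edid) % 128 != 0:
        raise ValueError("EDID length must be a non-zero multiple of 128 bytes")
    # A block is valid when the sum of all 128 bytes is 0 modulo 256.
    return [sum(edid[i:i + 128]) % 256 == 0 for i in range(0, len(edid), 128)]


def checksum_byte(first_127):
    """Compute the final byte that makes a 128-byte block sum to 0 mod 256."""
    if len(first_127) != 127:
        raise ValueError("expected exactly 127 bytes")
    return (-sum(first_127)) % 256


if __name__ == "__main__":
    # Synthetic block: the standard 8-byte EDID header followed by zeros.
    body = bytes.fromhex("00ffffffffffff00") + bytes(119)
    block = body + bytes([checksum_byte(body)])
    print(edid_block_checksums_ok(block))
```

The log above complains about the "version 1 extension" block, so with a real dump from one of these monitors the interesting entry would be the second one in the returned list.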

But then why did the monitors work with other graphics cards, and under Windows and IOS? I wrote a small Python program to grab the EDID data from the log, and fed that into a useful utility for reading and parsing EDID data. According to this, the checksum was correct!
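A log scraper along those lines might look like the sketch below. It is a reconstruction rather than the original program, and it assumes the log format seen in the excerpt above: an "is invalid" warning naming the display device, followed by rows of 16 hex bytes. The names extract_edids, INVALID and HEX_ROW are hypothetical.

```python
import re

# Warning line that names the display device, e.g. "... display device DFP-0 is invalid ..."
INVALID = re.compile(r"display device (\S+) is invalid")
# A dump row: 16 two-digit hex bytes after the "NVIDIA(GPU-n):" prefix.
HEX_ROW = re.compile(
    r"NVIDIA\(GPU-\d+\):\s+((?:[0-9a-fA-F]{2}\s+){15}[0-9a-fA-F]{2})\s*$"
)


def extract_edids(log_text):
    """Map each display device name to the raw EDID bytes dumped in the log."""
    edids = {}
    current = None
    for line in log_text.splitlines():
        warning = INVALID.search(line)
        if warning:
            current = warning.group(1)
            edids[current] = bytearray()
            continue
        row = HEX_ROW.search(line)
        if row and current is not None:
            # Strip all whitespace before decoding the hex digits.
            edids[current] += bytes.fromhex("".join(row.group(1).split()))
    return {name: bytes(data) for name, data in edids.items()}


if __name__ == "__main__":
    with open("/var/log/Xorg.0.log") as log:
        for name, data in extract_edids(log.read()).items():
            print(name, len(data), "bytes")
```

Writing each value out to a binary file gives you something you can hand to an EDID parser, or check by hand against the checksum rule.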

More research found that there was an xorg.conf option to tell the Nvidia driver to ignore the checksum when reading the EDID, although this came with ominous warnings about how using the option could damage your hardware. To use it, you first need to create an xorg.conf file using the following command:

sudo nvidia-xconfig

Now you need to edit the new xorg.conf file:

sudo vi /etc/X11/xorg.conf

Find the section below, and add the Option "IgnoreEDIDChecksum" line:


Section "Device"
   Identifier     "Device0"
   Driver         "nvidia"
   VendorName     "NVIDIA Corporation"
   Option         "IgnoreEDIDChecksum" "DFP"
EndSection

Save the file, and reboot.
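If you would rather script the edit than open vi every time you reinstall, something like the following Python sketch could do it. It assumes the xorg.conf layout generated above (a Section "Device" block closed by EndSection), and add_device_option is a hypothetical helper name, not part of any Nvidia tool:

```python
def add_device_option(conf_text, name="IgnoreEDIDChecksum", value="DFP"):
    """Return conf_text with the Option line added to every Device section."""
    out = []
    in_device = False
    has_option = False
    for line in conf_text.splitlines():
        lower = line.strip().lower()
        if lower.startswith("section") and '"device"' in lower:
            # Entering a Device section; start tracking whether the option exists.
            in_device, has_option = True, False
        elif in_device and lower.startswith("option") and '"%s"' % name.lower() in lower:
            has_option = True
        elif in_device and lower == "endsection":
            # Insert just before EndSection unless the option is already there.
            if not has_option:
                out.append('    Option         "%s" "%s"' % (name, value))
            in_device = False
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    example = (
        'Section "Device"\n'
        '    Identifier     "Device0"\n'
        '    Driver         "nvidia"\n'
        'EndSection\n'
    )
    print(add_device_option(example))
```

The function only transforms text. Reading and writing /etc/X11/xorg.conf itself needs root, and it is worth keeping a backup copy of the file before overwriting it.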

After doing this, the monitors fired up beautifully after logging in. There are some warnings about ignoring the EDID in the log, but the Nvidia card is obviously reading the data correctly. So this means there is a bug in the Nvidia driver that incorrectly calculates the checksum for some monitors - particularly these Korean IPS screens, it seems.


Open Source the Drivers, Nvidia!

What happens next is a classic example of the advantages of open-sourcing your code. 

What should happen is that I download the source for the driver, figure out why it's getting the wrong checksum, and let the maintainer know exactly what the problem is. I then feel great that I've solved this problem for lots of other people, Nvidia has a better driver, and a lot of other people don't have this problem. Awesome - everyone loves Nvidia!

But Nvidia have kept their driver proprietary, for whatever reason. So what will actually happen is that I will write a blog post criticising Nvidia, rather than submitting a bug report. (Actually I will report the bug, but I'm much less motivated to do this because their development process is so opaque that I'm not at all confident it will be acted on.) So the end result is I feel frustrated with Nvidia, they still have a buggy driver, and a lot of other people still have this problem. Everyone thinks Nvidia stinks and will buy a different graphics card next time!

Do the right thing, Nvidia.



5 comments:

  1. Fuck Nvidia, I finally got my monitor working after a long time messing with xorg.conf.

  2. Thanks a lot for this article. This did the trick for me. Also, this is really bizarre. I'm thinking the Korean monitor is setting some bits in reserved space or something that are normally zero on other monitors, so the Nvidia drivers use these bits when they shouldn't, and work with other monitors but not these Korean ones. Anyhow, we may never know, since the drivers aren't open.

  3. THANK YOU! After messing with my monitor (X-Star DP2710) and after struggling for weeks to get it to work with the Nvidia drivers, this is what finally did it!

    The monitor would flash an RGB cycle as soon as the Nvidia drivers were installed.

    A few other posts suggest changing the monitor's settings, like the timings, in xorg.conf, but it's unnecessary and might make your monitor work incorrectly (the first solution I tried got me away from the RGB flashes and displayed the desktop, but the backlighting was completely wrong and the left side of the monitor was brighter than the right... not good!) This solution fixes everything in one line.

    Awesome, thanks again for this.

  4. Thank you so much. I always refer to this post whenever I reinstall Linux on my machine. I am using a Crossover 27Q LED-P and a GTX 670.

  5. Hi, I understand you, cuz half a year ago I bought a monitor with game technology and really it needs to install drivers, so I decided to spend a lot of time to do it. Thanks to NVidia drivers download for Windows http://bitdrivers.com/manufacturers/nvidia here I was able to find exactly what was needed.
