Losing link to the FPGA device

Developer https://www.devze.com 2023-01-25 02:53 Source: Internet
I am trying to debug a somewhat strange problem in the device driver for a PCIe FPGA device. Both the device driver and the FPGA image are developed in-house.

The target system is x86, and the OS is Fedora 9. It has a PCIe card with the FPGA plugged into its only PCIe slot. The FPGA image is loaded from the EEPROM after boot.

The driver is written in such a way that it uses the /sys/bus/pci/devices/0000:02:00.0/ resource files (where 0000:02:00.0 is the PCI slot of the card containing the FPGA) to configure the FPGA.
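For reference, a quick user-space check of whether those resource files are present could look like the sketch below. It is only an illustration: the helper name `list_resource_files` and the `sysfs_root` parameter are made up for this example and are not part of the driver described here.

```python
import os

def list_resource_files(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Return the sorted names of the resource* entries for one PCI device.

    An empty list means the device directory, or its resource files,
    are missing (the symptom described in this question).
    """
    dev_dir = os.path.join(sysfs_root, bdf)
    if not os.path.isdir(dev_dir):
        return []
    return sorted(name for name in os.listdir(dev_dir)
                  if name.startswith("resource"))

if __name__ == "__main__":
    # 0000:02:00.0 is the slot address quoted in the question.
    print(list_resource_files("0000:02:00.0"))
```

Running this right after boot and again after resume would show whether the whole device directory is gone or only the resource entries.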

When the system boots (or when it returns from hibernation), the FPGA link seems to be lost, and the resource files are missing. When the FPGA boots properly, everything works fine (the resource files are there). When the system enters hibernation, the FPGA is powered off. When it returns from hibernation, the FPGA is powered on before the driver initialization starts.

I suspect one of the following:

  • a bug in the firmware - something related to PCI plug-in?
  • a bug in the kernel - least likely, because other PCI cards are recognized fine. Only this PCI card causes problems.

My questions are:

  • Has anyone had similar problems?
  • What else could be wrong?
  • Any suggestions on how to debug this issue?

EDIT

I just found this bug, which is very similar to the problem I am seeing.


I finally managed to debug my problem. Just before the system enters hibernation, all processes that are still using the resource files are killed. For some unknown reason, one process didn't release its resources and was killed. We have a watchdog that respawns any process that is not running.

When the system came back from hibernation, the watchdog respawned this process. Since the process couldn't open the resource files yet, it died again, and a critical error was declared. A very short time later, the OS added the resource files, and the process could continue normally.
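One way to tolerate that short window is to have the respawned process wait for the resource file instead of failing on the first attempt. The following is a minimal sketch under that assumption; `wait_for_file` and its timeout values are hypothetical and not taken from our actual code.

```python
import os
import time

def wait_for_file(path, timeout_s=5.0, poll_s=0.1):
    """Poll until `path` exists or `timeout_s` seconds have elapsed.

    Returns True if the file appeared in time, False otherwise.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)

if __name__ == "__main__":
    # Path built from the slot address in the question; resource0 is
    # assumed here to be the BAR the process needs.
    path = "/sys/bus/pci/devices/0000:02:00.0/resource0"
    if not wait_for_file(path):
        raise SystemExit("resource file did not appear in time")
```

With a wait like this, the critical error is raised only if the file is still missing after the timeout, not because of a race against the OS on resume.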


A PCIe card has to reply to an "Is anybody there?" message within a certain time. Is it possible that your card is not responding quickly enough after hibernation / reset?

Without more details of your design, it is hard to do anything but guess.

Can you list the differences between the system working and not working, i.e. what do you do differently to get the card to work?
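If the card does turn out to answer too late for enumeration, one experiment is to ask the kernel to rescan the PCI bus once the FPGA is configured and see whether the device shows up. The sketch below assumes a kernel that exposes `/sys/bus/pci/rescan`, which the stock Fedora 9 kernel may be too old to provide, and it needs root. The function names and the `sysfs_pci` parameter are made up for illustration.

```python
import os

def rescan_pci_bus(sysfs_pci="/sys/bus/pci"):
    """Write "1" to the PCI rescan file, if the kernel provides one.

    Returns False when the rescan file does not exist (older kernels).
    """
    rescan = os.path.join(sysfs_pci, "rescan")
    if not os.path.exists(rescan):
        return False
    with open(rescan, "w") as f:
        f.write("1")
    return True

def device_present(bdf, sysfs_pci="/sys/bus/pci"):
    """Return True if the device directory for `bdf` exists in sysfs."""
    return os.path.isdir(os.path.join(sysfs_pci, "devices", bdf))

if __name__ == "__main__":
    bdf = "0000:02:00.0"  # slot address from the question
    print("before rescan:", device_present(bdf))
    if rescan_pci_bus():
        print("after rescan:", device_present(bdf))
    else:
        print("this kernel has no /sys/bus/pci/rescan")
```

If the device only appears after a manual rescan, that would point to the card not being ready when the bus was first enumerated.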
