rbd: sysfs write failed, lrbd.service: Unit entered failed state.

  • 7022004
  • 04-Oct-2017
  • 05-Oct-2017

Environment

SUSE Enterprise Storage 4

Situation

Running "systemctl start lrbd.service" fails with:

lrbd[81379]: ERROR: command failed
lrbd[81379]: rbd -p PoolName --name client.admin map VolumeName
lrbd[81379]: In some cases useful info is found in syslog - try "dmesg | tail" or so.
lrbd[81379]: rbd: sysfs write failed
lrbd[81379]: rbd: map failed: (110) Connection timed out
systemd[1]: lrbd.service: Main process exited, code=exited, status=1/FAILURE
systemd[1]: Failed to start configures target.service from Ceph.
systemd[1]: lrbd.service: Unit entered failed state.

Resolution

Use "ceph osd reweight-by-utilization" to have Ceph reweight the OSDs by utilization,
or
use "ceph osd reweight osd.ID Weight_Value", where osd.ID is the OSD number that "ceph osd df" displays and Weight_Value is a value less than 1.00. A lower weight moves data off that OSD.

After issuing the command(s), allow Ceph time to rebalance the cluster. Monitor the status with "ceph status" and "ceph osd df". When Ceph no longer reports any OSD at 95% usage or greater, start the lrbd service:

systemctl start lrbd.service
systemctl status lrbd.service

In some cases it will be necessary to add disks (OSDs) to the cluster so that the cluster has room to grow.

Cause

Ceph was reporting a full OSD, which caused the "rbd map" command run by lrbd to time out.

Additional Information

Use "ceph osd df" to show disk usage. The %USE column of the output gives the usage percentage of each OSD.
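
To pick out only the OSD number and its %USE value, the output can be filtered with a small helper. The following is a minimal sketch, not a supported tool. The function name osd_usage is hypothetical, and it assumes the header row and the data rows contain the same number of whitespace-separated fields, so that the position of the %USE header matches the position of the value. Verify this against the output on the cluster before relying on it.

```shell
# Print "ID %USE" for each OSD row of "ceph osd df" output read from stdin.
# The %USE column is located by its header name, not by a fixed position.
osd_usage() {
    awk '
        # Until the header row is found, look for the field labelled %USE.
        col == 0 {
            for (i = 1; i <= NF; i++) if ($i == "%USE") col = i
            next
        }
        # Data rows start with a numeric OSD ID; summary lines are skipped.
        $1 ~ /^[0-9]+$/ { print $1, $col }
    '
}
```

Possible usage: ceph osd df | osd_usage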

Also note that:

The mon osd full ratio defaults to 0.95, or 95% of capacity, at which point Ceph stops clients from writing data.
The mon osd nearfull ratio defaults to 0.85, or 85% of capacity, at which point Ceph generates a health warning.
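
As an illustration of how these two ratios relate to a %USE value, the following sketch classifies a percentage against them. The function name classify_usage is hypothetical. The default ratios are the ones stated above, and a cluster with changed ratios would need them passed as the second and third arguments.

```shell
# Classify a usage percentage against the nearfull and full ratios.
# Arguments: percent [nearfull_ratio] [full_ratio]
# The ratios default to 0.85 and 0.95, the defaults stated above.
classify_usage() {
    awk -v pct="$1" -v near="${2:-0.85}" -v full="${3:-0.95}" 'BEGIN {
        if (pct >= full * 100)      print "full"
        else if (pct >= near * 100) print "nearfull"
        else                        print "ok"
    }'
}
```

Possible usage: "classify_usage 90" compares 90% against the default ratios, and "classify_usage 90 0.80 0.90" compares it against custom ratios.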
