rbd: sysfs write failed, lrbd.service: Unit entered failed state.

SUSE Enterprise Storage 4


Running "systemctl start lrbd.service" fails with:

lrbd[81379]: ERROR: command failed
lrbd[81379]: rbd -p PoolName --name client.admin map VolumeName
lrbd[81379]: In some cases useful info is found in syslog - try "dmesg | tail" or so.
lrbd[81379]: rbd: sysfs write failed
lrbd[81379]: rbd: map failed: (110) Connection timed out
systemd[1]: lrbd.service: Main process exited, code=exited, status=1/FAILURE
systemd[1]: Failed to start configures target.service from Ceph.
systemd[1]: lrbd.service: Unit entered failed state.


To free space on the full OSD, either use "ceph osd reweight-by-utilization" to have Ceph reweight OSDs by utilization, or use "ceph osd reweight osd.ID Weight_Value" to reweight a single OSD, where osd.ID is the OSD number that "ceph osd df" displays and Weight_Value is a value less than 1.00.
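
The single-OSD form of the command can be wrapped in a small guard so that a mistyped ID or weight is rejected before it reaches the cluster. This is a minimal sketch, not part of the product: the function name reweight_osd is hypothetical, and it assumes the bare numeric ID from "ceph osd df" is passed in and that the weight is written as 0 or 0.NN.

```shell
# Hypothetical wrapper: lower the reweight value of one OSD.
# Usage: reweight_osd ID WEIGHT   (for example: reweight_osd 7 0.85)
reweight_osd() {
    # ID must be the bare number shown in the ID column of "ceph osd df".
    if ! printf '%s\n' "$1" | grep -Eq '^[0-9]+$'; then
        echo "OSD ID must be a number" >&2
        return 1
    fi
    # WEIGHT must be below 1.00, written as 0 or 0.NN.
    if ! printf '%s\n' "$2" | grep -Eq '^0(\.[0-9]+)?$'; then
        echo "weight must be a value less than 1.00, such as 0.85" >&2
        return 1
    fi
    ceph osd reweight "osd.$1" "$2"
}
```

The OSD number 7 and the weight 0.85 in the usage line are placeholders only; choose values based on what "ceph osd df" reports for the cluster.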

After issuing the command(s), allow Ceph time to rebalance the cluster. Monitor the status with "ceph status" and "ceph osd df". When Ceph no longer reports any OSD at 95% or greater usage, start the lrbd service:

systemctl start lrbd.service
systemctl status lrbd.service

In some cases it will be necessary to add disks (OSDs) to the cluster, so that the cluster has room to grow.


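Ceph was reporting a full OSD. Once an OSD reaches the full ratio, Ceph stops clients from writing data, so the "rbd map" command issued by lrbd fails.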

Additional Information

Use "ceph osd df" to show disk usage. The %USE column of the output shows the usage percentage of each OSD.

Also note that:

The mon osd full ratio defaults to 0.95, or 95% of capacity before it stops clients from writing data. 
The mon osd nearfull ratio defaults to 0.85, or 85% of capacity when it generates a health warning.
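
To pick out the OSDs that are close to these ratios, the %USE column can be filtered with awk. The following is a rough sketch under stated assumptions: full_osds is a hypothetical helper name, the default threshold of 85 is chosen to match the nearfull ratio above, and the plain-text "ceph osd df" layout is assumed to have one whitespace-separated value per header column, which may not hold on other Ceph releases.

```shell
# List OSDs whose %USE is at or above a threshold (default 85).
# Reads plain "ceph osd df" output on stdin; full_osds is a hypothetical name.
full_osds() {
    awk -v limit="${1:-85}" '
        # Until the header row is seen, look for the %USE column position.
        col == 0 { for (i = 1; i <= NF; i++) if ($i == "%USE") col = i; next }
        # Data rows start with a numeric OSD ID; TOTAL and MIN/MAX lines do not.
        $1 ~ /^[0-9]+$/ && $col + 0 >= limit + 0 { print "osd." $1, $col }
    '
}
```

Usage: "ceph osd df | full_osds 95" prints one line per OSD at or above 95% usage, and prints nothing when no OSD has reached that level.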

